Paper Title
Machine Learning in Sports: A Case Study on Using Explainable Models for Predicting Outcomes of Volleyball Matches
Paper Authors
Paper Abstract
Machine Learning has become an integral part of engineering design and decision making in several domains, including sports. Deep Neural Networks (DNNs) have been the state-of-the-art methods for predicting outcomes of professional sports events. However, apart from getting highly accurate predictions on the outcomes of these sports events, it is necessary to answer questions such as "Why did the model predict that Team A would win Match X against Team B?" DNNs are inherently black-box in nature. Therefore, high-quality, interpretable, and understandable explanations for a model's predictions in sports are required. This paper explores a two-phased Explainable Artificial Intelligence (XAI) approach to predict outcomes of matches in the Brazilian Volleyball League (SuperLiga). In the first phase, we directly use interpretable rule-based ML models that provide a global understanding of the model's behavior, based on Boolean Rule Column Generation (BRCG; extracts simple AND-OR classification rules) and Logistic Regression (LogReg; estimates feature importance scores). In the second phase, we construct non-linear models such as Support Vector Machine (SVM) and Deep Neural Network (DNN) to obtain predictive performance on the outcomes of volleyball matches. We construct "post-hoc" explanations for each data instance using ProtoDash, a method that finds prototypes in the training dataset that are most similar to the test instance, and SHAP, a method that estimates the contribution of each feature to the model's prediction. We evaluate the SHAP explanations using the faithfulness metric. Our results demonstrate the effectiveness of the explanations for the model's predictions.
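The abstract's second phase pairs a non-linear predictor with SHAP attributions and a faithfulness evaluation. The following is a minimal, self-contained sketch of that pattern, assuming scikit-learn's SVC and the shap library's model-agnostic KernelExplainer; the feature names and synthetic data are purely illustrative (not the paper's SuperLiga dataset), and the correlation-based faithfulness check follows a common formulation rather than necessarily the exact metric used in the paper.

```python
# Hedged sketch: explain a non-linear match-outcome model with SHAP and
# compute a simple correlation-based faithfulness score.
# Feature names and data are hypothetical placeholders.
import numpy as np
import shap
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Illustrative per-match features (e.g., differences in team set statistics).
feature_names = ["attack_pct_diff", "block_diff", "serve_ace_diff", "reception_err_diff"]
X = rng.normal(size=(400, 4))
y = (X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 3] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Non-linear model standing in for the paper's SVM predictor.
model = SVC(kernel="rbf", probability=True).fit(X_train, y_train)

# Explain the probability of class 1 ("Team A wins") with KernelSHAP.
predict_win = lambda data: model.predict_proba(data)[:, 1]
background = X_train[:50]                       # background sample for the explainer
explainer = shap.KernelExplainer(predict_win, background)
shap_values = explainer.shap_values(X_test[:5])  # shape: (5, n_features)

print("Per-feature SHAP contributions for the first test match:")
for name, value in zip(feature_names, shap_values[0]):
    print(f"  {name}: {value:+.3f}")

# Simple faithfulness-style check: correlate each feature's attribution with the
# prediction drop observed when that feature is replaced by its background mean.
x = X_test[0]
base = background.mean(axis=0)
drops = []
for j in range(x.shape[0]):
    x_pert = x.copy()
    x_pert[j] = base[j]
    drops.append(predict_win(x.reshape(1, -1))[0] - predict_win(x_pert.reshape(1, -1))[0])
faithfulness = np.corrcoef(shap_values[0], np.array(drops))[0, 1]
print(f"Faithfulness (Pearson correlation): {faithfulness:.3f}")
```

A higher correlation indicates that features assigned large attributions are also the ones whose removal most changes the prediction, which is the intuition behind faithfulness-based evaluation of explanations.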