The Evaluation of Rating Systems in Online Free-for-All Games

Paper Title

The Evaluation of Rating Systems in Online Free-for-All Games

Authors

Arman Dehpanah, Muheeb Faizan Ghori, Jonathan Gemmell, Bamshad Mobasher

Abstract

Online competitive games have become increasingly popular. To ensure an exciting and competitive environment, these games routinely attempt to match players with similar skill levels. Matching players is often accomplished through a rating system. There has been an increasing amount of research on developing such rating systems. However, less attention has been given to the evaluation metrics of these systems. In this paper, we present an exhaustive analysis of six metrics for evaluating rating systems in online competitive games. We compare traditional metrics such as accuracy. We then introduce other metrics adapted from the field of information retrieval. We evaluate these metrics against several well-known rating systems on a large real-world dataset of over 100,000 free-for-all matches. Our results show stark differences in their utility. Some metrics do not consider deviations between two ranks. Others are inordinately impacted by new players. Many do not capture the importance of distinguishing between errors in higher ranks and lower ranks. Among all metrics studied, we recommend Normalized Discounted Cumulative Gain (NDCG) because not only does it resolve the issues faced by other metrics, but it also offers flexibility to adjust the evaluations based on the goals of the system.
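
To illustrate why NDCG distinguishes errors in higher ranks from errors in lower ranks, the following is a minimal Python sketch of the standard NDCG computation applied to one free-for-all match. The abstract does not give the paper's exact formulation, so the linear gain (n - observed rank + 1), the log2 position discount, and the names dcg and ndcg are assumptions made for illustration only.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: the gain at position i (1-based) is divided by log2(i + 1)."""
    return sum(g / math.log2(pos + 1) for pos, g in enumerate(gains, start=1))


def ndcg(predicted_order, observed_rank):
    """NDCG of a predicted ordering of players against the observed match result.

    predicted_order: list of player ids, best predicted player first.
    observed_rank:   dict mapping player id -> observed rank (1 = winner).

    Assumption (not taken from the paper): a player's gain is
    n - observed_rank + 1, so the winner of an n-player match gets gain n.
    """
    n = len(predicted_order)
    gains = [n - observed_rank[p] + 1 for p in predicted_order]
    ideal = sorted(gains, reverse=True)  # gains in the best possible order
    return dcg(gains) / dcg(ideal)


# Hypothetical four-player match in which a finished 1st and d finished 4th.
observed = {"a": 1, "b": 2, "c": 3, "d": 4}

perfect = ndcg(["a", "b", "c", "d"], observed)
top_swap = ndcg(["b", "a", "c", "d"], observed)     # error among the top two
bottom_swap = ndcg(["a", "b", "d", "c"], observed)  # error among the bottom two

print(perfect, top_swap, bottom_swap)
```

Under these assumptions, swapping the top two players lowers the score more than swapping the bottom two, even though both predictions contain a single adjacent swap. Changing the gain or discount function is one way such a metric could be tuned to the goals of a particular system.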
