Paper title
Q-Learning in Regularized Mean-field Games
Paper authors
Paper abstract
In this paper, we introduce a regularized mean-field game and study learning in this game under an infinite-horizon discounted reward function. Regularization is introduced by adding a strongly concave regularization function to the one-stage reward function of the classical mean-field game model. We establish a value-iteration-based learning algorithm for this regularized mean-field game using fitted Q-learning. In general, the regularization term makes the reinforcement learning algorithm more robust to the system components. Moreover, it enables us to carry out an error analysis of the learning algorithm without imposing restrictive convexity assumptions on the system components, which are needed in the absence of a regularization term.
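To make the abstract's idea concrete, the following is a minimal sketch of a regularized value iteration for a toy mean-field game. It is not the paper's algorithm: it swaps fitted Q-learning (which learns from samples) for exact tabular iteration over a known model, and it assumes Shannon entropy as the strongly concave regularizer, which is only one common choice. The two-state model, the reward, the transition kernel, and the constants BETA and TAU are all hypothetical and chosen only for illustration.

```python
import math

STATES = (0, 1)
ACTIONS = (0, 1)   # 0 = stay, 1 = try to switch state
BETA = 0.9         # discount factor (assumed value)
TAU = 0.5          # weight of the entropy regularizer (assumed value)


def reward(s, a, mu):
    """Hypothetical one-stage reward: avoid crowded states, pay to switch."""
    return -mu[s] - 0.1 * a


def transition(s, a):
    """Hypothetical kernel p(. | s, a), returned as a list over next states."""
    if a == 0:
        return [1.0 if t == s else 0.0 for t in STATES]
    return [0.1 if t == s else 0.9 for t in STATES]


def soft_value(q_row):
    """max over policies u of <u, q> + TAU * entropy(u), in closed form."""
    m = max(q_row)
    return m + TAU * math.log(sum(math.exp((q - m) / TAU) for q in q_row))


def softmax_policy(q_row):
    """The maximizer of the entropy-regularized problem above."""
    m = max(q_row)
    w = [math.exp((q - m) / TAU) for q in q_row]
    z = sum(w)
    return [x / z for x in w]


def regularized_q(mu, sweeps=200):
    """Iterate the regularized Bellman operator with the mean-field mu fixed."""
    q = [[0.0 for _ in ACTIONS] for _ in STATES]
    for _ in range(sweeps):
        v = [soft_value(q[s]) for s in STATES]
        q = [[reward(s, a, mu)
              + BETA * sum(p * v[t] for t, p in enumerate(transition(s, a)))
              for a in ACTIONS] for s in STATES]
    return q


def next_mean_field(mu, q):
    """Push the population distribution one step forward under the policy."""
    new_mu = [0.0 for _ in STATES]
    for s in STATES:
        pi = softmax_policy(q[s])
        for a in ACTIONS:
            for t, p in enumerate(transition(s, a)):
                new_mu[t] += mu[s] * pi[a] * p
    return new_mu


def solve(mu0=(0.9, 0.1), rounds=100):
    """Alternate between the Q-function update and the mean-field update."""
    mu = list(mu0)
    q = regularized_q(mu)
    for _ in range(rounds):
        q = regularized_q(mu)
        mu = next_mean_field(mu, q)
    return mu, q
```

Calling solve() returns the final state distribution and the Q-table computed in the last round. Because the toy model is symmetric in its two states, the uniform distribution is a fixed point of the update. The regularizer is what makes the best response a unique, smooth softmax policy rather than a possibly discontinuous argmax, which is the kind of structure the abstract credits for the robustness and the error analysis.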