凸凸的匪徒学习非刻痕单调游戏

论文标题

凸凸的匪徒学习非刻痕单调游戏

Bandit Learning in Convex Non-Strictly Monotone Games

论文作者

Tatarenko, Tatiana, Kamgarpour, Maryam

论文摘要

我们在收益信息设置下解决凸台游戏中的NASH Equilibria。我们认为游戏伪级是单调的情况，但不一定严格单调。严格的单调性的放松使学习算法应用于更大类的游戏，例如，零和游戏仅具有凸连接成本功能。我们得出了一种算法，在这种情况下，其迭代可证明会收敛到最小 - 纳什的平衡。 {从单个播放器的角度使用建议的算法，我们将游戏视为在线优化的实例}。通过此镜头，我们量化了算法的遗憾率，并提供了一种选择算法参数以最大程度地降低后悔率的方法。

We address learning Nash equilibria in convex games under the payoff information setting. We consider the case in which the game pseudo-gradient is monotone but not necessarily strictly monotone. This relaxation of strict monotonicity enables application of learning algorithms to a larger class of games, such as, for example, a zero-sum game with a merely convex-concave cost function. We derive an algorithm whose iterates provably converge to the least-norm Nash equilibrium in this setting. {From the perspective of a single player using the proposed algorithm, we view the game as an instance of online optimization}. Through this lens, we quantify the regret rate of the algorithm and provide an approach to choose the algorithm's parameters to minimize the regret rate.

下载PDF全文

下载文献需遵守相关版权规定

论文标题