Paper title
Finite Approximations for Mean Field Type Multi-Agent Control and Their Near Optimality
Paper authors
Paper abstract
We study a multi-agent mean field type control problem in discrete time, where the agents aim to find a socially optimal strategy and where the state and action spaces of the agents are assumed to be continuous. The agents are only weakly coupled through the distribution of their state variables. The problem in its original form can be formulated as a classical Markov decision process (MDP); however, this formulation suffers from several practical difficulties. In this work, we attempt to overcome the curse of dimensionality, the coordination complexity between the agents, and the necessity of perfect feedback collection from all the agents (which might be hard to do for large populations). We provide several approximations. First, we establish the near optimality of the action and state space discretization of the agents under standard regularity assumptions for the considered formulation, by constructing and studying the measure-valued MDP counterpart for finite and infinite population settings. Considering the infinite population problem is a well-known approach for mean field type models, since it provides symmetric policies for the agents, which simplifies the coordination between them. However, the optimality analysis is harder, as the state space of the measure-valued infinite population MDP is continuous (even after space discretization of the agents). Therefore, as a final step, we provide further approximations for the infinite population problem by focusing on smaller-sized sub-population distributions.