Paper Title

Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions

Paper Authors

Jing, Gangshan; Bai, He; George, Jemin; Chakrabortty, Aranya; Sharma, Piyush K.

Paper Abstract

Achieving distributed reinforcement learning (RL) for large-scale cooperative multi-agent systems (MASs) is challenging because: (i) each agent has access to only limited information; (ii) issues with convergence or computational complexity arise due to the curse of dimensionality. In this paper, we propose a general, computationally efficient distributed framework for cooperative multi-agent reinforcement learning (MARL) that exploits the structures of the graphs involved in this problem. We introduce three coupling graphs describing three types of inter-agent couplings in MARL, namely, the state graph, the observation graph, and the reward graph. By further considering a communication graph, we propose two distributed RL approaches based on local value functions derived from the coupling graphs. The first approach significantly reduces sample complexity under specific conditions on the aforementioned four graphs. The second approach provides an approximate solution and can be efficient even for problems with dense coupling graphs; here there is a trade-off between minimizing the approximation error and reducing the computational complexity. Simulations show that our RL algorithms achieve significantly better scalability to large-scale MASs than centralized and consensus-based distributed RL algorithms.
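
To make the graph-induced idea concrete, below is a minimal, hypothetical Python sketch (an assumption for illustration, not the paper's exact algorithm) of how the three coupling graphs might be combined to determine, for each agent, the subset of agents its local value function needs to depend on. The toy graphs, the `relevant_set` rule, and the traversal logic are all illustrative assumptions.

```python
# Hypothetical illustration (assumed construction, not the paper's exact method):
# represent each coupling graph as a dict mapping agent i to the set of agents
# that influence it, then trace, for each agent, the set of agents its
# graph-induced local value function would depend on.

# Toy 4-agent example; edge sets are made up for illustration.
state_graph  = {0: {1}, 1: set(), 2: {1}, 3: {2}}   # state coupling: j -> i
obs_graph    = {0: {0}, 1: {0, 1}, 2: {2}, 3: {3}}  # which states agent i observes
reward_graph = {0: {0}, 1: {1}, 2: {1, 2}, 3: {3}}  # whose state enters reward r_i

def ancestors(graph, roots):
    """All agents that can reach `roots` via directed influence edges (incl. roots)."""
    reached, frontier = set(roots), set(roots)
    while frontier:
        frontier = {j for i in frontier for j in graph.get(i, set())} - reached
        reached |= frontier
    return reached

def relevant_set(i):
    """Assumed rule: start from the agents appearing in agent i's reward and
    observation, then close that set under state-graph influence."""
    direct = reward_graph[i] | obs_graph[i]
    return ancestors(state_graph, direct)

if __name__ == "__main__":
    local_sets = {i: sorted(relevant_set(i)) for i in range(4)}
    # Agent i's local value function only needs the states/actions of local_sets[i].
    print(local_sets)
```

When the coupling graphs are sparse, these local sets stay small and the corresponding local value functions are cheap to learn; when the graphs are dense, the sets approach the full agent population, which is where the paper's second, approximate approach trades accuracy for computational savings.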
