Learning Individually Inferred Communication for Multi-Agent Cooperation

Paper title

Learning Individually Inferred Communication for Multi-Agent Cooperation

Authors

Ding, Ziluo; Huang, Tiejun; Lu, Zongqing

Abstract

Communication lays the foundation for human cooperation, and it is equally crucial for multi-agent cooperation. However, existing work focuses on broadcast communication, which is not only impractical but also leads to information redundancy that can even impair the learning process. To tackle these difficulties, we propose Individually Inferred Communication (I2C), a simple yet effective model that enables agents to learn a prior for agent-agent communication. The prior is learned via causal inference and realized by a feed-forward neural network that maps an agent's local observation to a belief about whom to communicate with. The influence of one agent on another is inferred via the joint action-value function in multi-agent reinforcement learning and quantified to label the necessity of agent-agent communication. Furthermore, the agent policy is regularized to better exploit communicated messages. Empirically, we show that I2C not only reduces communication overhead but also improves performance in a variety of multi-agent cooperative scenarios compared with existing methods. The code is available at https://github.com/PKU-AI-Edge/I2C.
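
The abstract says that one agent's influence on another is inferred from the joint action-value function and quantified to label whether communication is necessary. The following is a minimal sketch of one plausible reading of that step for two agents, not the authors' implementation. It assumes a small tabular joint Q-function for a single fixed observation, a softmax to turn action values into a distribution, a uniform marginal over the other agent's actions, a KL divergence as the influence measure, and a fixed threshold. The names `influence` and `needs_communication`, the `temperature` and `threshold` parameters, and the example tables are all hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def influence(q_table, a_j, temperature=1.0):
    """Influence of agent j's action a_j on agent i (illustrative measure).

    q_table[a_i][a_j] holds joint action values for one fixed observation.
    """
    n_i = len(q_table)
    n_j = len(q_table[0])
    # Distribution over agent i's actions once agent j's action is known.
    conditional = softmax([row[a_j] for row in q_table], temperature)
    # Same distribution with agent j's action averaged out.
    # Assumption: agent j's actions are weighted uniformly.
    per_action = [
        softmax([row[k] for row in q_table], temperature) for k in range(n_j)
    ]
    marginal = [sum(dist[i] for dist in per_action) / n_j for i in range(n_i)]
    return kl_divergence(conditional, marginal)


def needs_communication(q_table, a_j, threshold=0.1):
    """Label communication as necessary when the influence exceeds a threshold."""
    return influence(q_table, a_j) > threshold


# Hypothetical tables: in the first, agent i's values ignore agent j's action;
# in the second, the best action for agent i depends on what agent j does.
q_independent = [[1.0, 1.0], [0.0, 0.0]]
q_coupled = [[5.0, 0.0], [0.0, 5.0]]

label_independent = needs_communication(q_independent, a_j=0)
label_coupled = needs_communication(q_coupled, a_j=0)
```

Under these assumptions, labels of this kind could serve as training targets for the feed-forward network that the abstract describes, which maps a local observation to a belief about whom to communicate with.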
