DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning

Paper title

DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning

Authors

Yuan, Tingting; Chung, Hwei-Ming; Yuan, Jie; Fu, Xiaoming

Abstract

Communication is supposed to improve multi-agent collaboration and overall performance in cooperative multi-agent reinforcement learning (MARL). However, such improvements are prevalently limited in practice since most existing communication schemes ignore communication overheads (e.g., communication delays). In this paper, we demonstrate that ignoring communication delays has detrimental effects on collaborations, especially in delay-sensitive tasks such as autonomous driving. To mitigate this impact, we design a delay-aware multi-agent communication model (DACOM) to adapt communication to delays. Specifically, DACOM introduces a component, TimeNet, that is responsible for adjusting the waiting time of an agent to receive messages from other agents such that the uncertainty associated with delay can be addressed. Our experiments reveal that DACOM has a non-negligible performance improvement over other mechanisms by making a better trade-off between the benefits of communication and the costs of waiting for messages.
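
The trade-off named in the abstract, between the benefit of received messages and the cost of waiting for them, can be illustrated with a toy calculation. The sketch below is not the paper's TimeNet, which is a learned component; it only assumes a fixed gain per received message and a linear cost per unit of waiting time, and picks the best waiting time from a list of candidates. All function names and parameters (received_messages, net_utility, best_wait, gain_per_message, cost_per_unit_wait) are hypothetical and chosen for illustration.

```python
from typing import Sequence


def received_messages(delays: Sequence[float], wait: float) -> int:
    """Count the messages whose delay does not exceed the waiting time."""
    return sum(1 for d in delays if d <= wait)


def net_utility(
    delays: Sequence[float],
    wait: float,
    gain_per_message: float = 1.0,    # assumed benefit of each received message
    cost_per_unit_wait: float = 0.5,  # assumed linear cost of waiting
) -> float:
    """Benefit of the messages received in time minus the cost of waiting."""
    gain = gain_per_message * received_messages(delays, wait)
    return gain - cost_per_unit_wait * wait


def best_wait(
    delays: Sequence[float],
    candidates: Sequence[float],
    gain_per_message: float = 1.0,
    cost_per_unit_wait: float = 0.5,
) -> float:
    """Return the candidate waiting time with the highest net utility."""
    return max(
        candidates,
        key=lambda w: net_utility(delays, w, gain_per_message, cost_per_unit_wait),
    )


if __name__ == "__main__":
    # Two messages arrive quickly; one is badly delayed.
    delays = [0.2, 0.5, 3.0]
    candidates = [0.0, 0.5, 3.0]
    for w in candidates:
        print(w, received_messages(delays, w), net_utility(delays, w))
    print("best waiting time:", best_wait(delays, candidates))
```

Under these assumed numbers, waiting 0.5 time units collects two of the three messages at a small cost, while waiting 3.0 units for the last message costs more than that message is worth. In the setting the abstract describes, delays are uncertain and not known in advance, which is why the waiting time is adjusted by a learned component instead of being computed in closed form as in this sketch.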
