Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems

Paper title

Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems

Authors

Tobias Enders, James Harrison, Marco Pavone, Maximilian Schiffer

Abstract

We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility on demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. We thereby factorize the operator's otherwise intractable action space while still obtaining a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state-of-the-art benchmarks with respect to performance, stability, and computational tractability.
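
The abstract names weighted bipartite matching as the step that turns per-agent outputs into one coordinated decision, but it does not give the formulation. The following is a minimal sketch of that general idea, not the paper's method. It assumes each vehicle supplies a hypothetical score for each request, that edges with non-positive scores are dropped, and that any request left unmatched counts as rejected. The function name `match_requests` and the example numbers are illustrative only, and exhaustive search stands in for a proper matching solver, so it is only practical for very small instances.

```python
from typing import List, Optional, Tuple


def match_requests(weights: List[List[float]]) -> Tuple[float, List[Optional[int]]]:
    """Exhaustive maximum-weight bipartite matching (illustrative only).

    weights[v][r] is a hypothetical score for giving request r to vehicle v.
    Returns (total_weight, assignment), where assignment[v] is the request
    index given to vehicle v, or None if the vehicle receives no request.
    """
    n_vehicles = len(weights)
    n_requests = len(weights[0]) if weights else 0

    best_total = 0.0
    best_assignment: List[Optional[int]] = [None] * n_vehicles
    current: List[Optional[int]] = [None] * n_vehicles
    used = [False] * n_requests

    def search(v: int, total: float) -> None:
        nonlocal best_total, best_assignment
        if v == n_vehicles:
            # All vehicles decided: keep this matching if it is strictly better.
            if total > best_total:
                best_total = total
                best_assignment = current.copy()
            return
        # Option 1: vehicle v receives no request.
        current[v] = None
        search(v + 1, total)
        # Option 2: vehicle v receives a free request with positive weight
        # (assumption: non-positive edges are never worth matching).
        for r in range(n_requests):
            if not used[r] and weights[v][r] > 0:
                used[r] = True
                current[v] = r
                search(v + 1, total + weights[v][r])
                used[r] = False
        current[v] = None

    search(0, 0.0)
    return best_total, best_assignment


# Two vehicles, three requests; the numbers are made up for illustration.
example_weights = [
    [4.0, 1.0, -2.0],
    [3.0, 2.0, -1.0],
]
total, assignment = match_requests(example_weights)
# Requests that no vehicle was matched to are treated as rejected.
rejected = sorted(set(range(3)) - {r for r in assignment if r is not None})
```

In this example both vehicles score request 0 highest, so independent per-vehicle choices would collide. The matching resolves the conflict by giving request 0 to vehicle 0 and request 1 to vehicle 1, and request 2 is left unmatched.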
