Flatland-RL: Multi-Agent Reinforcement Learning on Trains

Paper title

Flatland-RL: Multi-Agent Reinforcement Learning on Trains

Authors

Sharada Mohanty, Erik Nygren, Florian Laurent, Manuel Schneider, Christian Scheller, Nilabha Bhattacharya, Jeremy Watson, Adrian Egli, Christian Eichenberger, Christian Baumberger, Gereon Vienken, Irene Sturm, Guillaume Sartoretti, Giacomo Spigler

Abstract

Efficient automated scheduling of trains remains a major challenge for modern railway systems. The underlying vehicle rescheduling problem (VRSP) has been a major focus of Operations Research (OR) for decades. Traditional approaches use complex simulators to study the VRSP, where experimenting with a broad range of novel ideas is time-consuming and incurs a huge computational overhead. In this paper, we introduce a simplified two-dimensional grid environment called "Flatland" that allows for faster experimentation. Flatland not only reduces the complexity of the full physical simulation, but also provides an easy-to-use interface for testing novel approaches to the VRSP, such as Reinforcement Learning (RL) and Imitation Learning (IL). To probe the potential of Machine Learning (ML) research on Flatland, we (1) ran a first series of RL and IL experiments and (2) designed and executed a public benchmark at NeurIPS 2020 to engage a large community of researchers to work on this problem. On the one hand, our own experimental results demonstrate that ML has potential in solving the VRSP on Flatland. On the other hand, we identify key topics that need further research. Overall, the Flatland environment has proven to be a robust and valuable framework for investigating the VRSP for railway networks. Our experiments provide a good starting point for further research and for the participants of the NeurIPS 2020 Flatland Benchmark. Together, these efforts have the potential to substantially shape the mobility of the future.
