Paper Title

Leveraging the Capabilities of Connected and Autonomous Vehicles and Multi-Agent Reinforcement Learning to Mitigate Highway Bottleneck Congestion

Paper Authors

Paul Young Joun Ha, Sikai Chen, Jiqian Dong, Runjia Du, Yujie Li, Samuel Labi

Paper Abstract

Active Traffic Management strategies are often adopted in real time to address sudden flow breakdowns. When queuing is imminent, Speed Harmonization (SH), which adjusts speeds in upstream traffic to mitigate traffic shockwaves downstream, can be applied. However, because SH depends on driver awareness and compliance, it may not always be effective in mitigating congestion. The use of multi-agent reinforcement learning for collaborative learning is a promising solution to this challenge. By incorporating this technique into the control algorithms of connected and autonomous vehicles (CAVs), it may be possible to train the CAVs to make joint decisions that mitigate highway bottleneck congestion without relying on human drivers' compliance with altered speed limits. In this regard, we present an RL-based multi-agent CAV control model that operates in mixed traffic (both CAVs and human-driven vehicles (HDVs)). The results suggest that even at a CAV percent share of corridor traffic as low as 10%, CAVs can significantly mitigate bottlenecks in highway traffic. Another objective was to assess the efficacy of the RL-based controller vis-à-vis that of a rule-based controller. In addressing this objective, we duly recognize that one of the main challenges of RL-based CAV controllers is the variety and complexity of inputs that exist in the real world, such as the information provided to the CAV by other connected entities and sensed information. These translate into dynamic-length inputs, which are difficult to process and learn from. For this reason, we propose the use of Graph Convolutional Networks (GCN), a network architecture that operates on graph-structured data, to preserve the information network topology and handle the corresponding dynamic-length inputs. We then combine this with Deep Deterministic Policy Gradient (DDPG) to carry out multi-agent training of the CAV controllers for congestion mitigation.
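The paper's controller is not reproduced here, but the core idea the abstract describes, a GCN layer that digests a variable number of sensed and connected vehicles into a fixed-size embedding that a DDPG-style deterministic actor can consume, can be sketched as follows. All shapes, weight matrices, the mean-pooling step, and the single bounded action output are illustrative assumptions for this sketch, not the authors' implementation:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^-1 (A + I) H W).
    A: (n, n) adjacency among n vehicles; H: (n, f) node features;
    W: (f, f') learnable weights. Works for any number of nodes n."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                       # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))    # row-normalize neighbor sums
    return np.maximum(0.0, D_inv @ A_hat @ H @ W)

def actor_action(A, H, W_gcn, W_act):
    """DDPG-style deterministic actor head: pool node embeddings into a
    fixed-size vector, then map to a bounded action in [-1, 1] via tanh."""
    Z = gcn_layer(A, H, W_gcn)
    pooled = Z.mean(axis=0)                     # pooling removes dependence on n
    return np.tanh(pooled @ W_act)              # e.g. a scaled speed/accel command

rng = np.random.default_rng(0)
W_gcn = rng.standard_normal((3, 4)) * 0.1       # 3 features -> 4-dim embedding
W_act = rng.standard_normal((4, 1)) * 0.1       # embedding -> 1 action

# The same weights handle scenes with different numbers of surrounding vehicles,
# which is how the GCN absorbs dynamic-length inputs:
for n in (2, 5, 9):
    A = (rng.random((n, n)) < 0.5).astype(float)    # hypothetical communication graph
    H = rng.standard_normal((n, 3))                 # e.g. position, speed, gap
    a = actor_action(A, H, W_gcn, W_act)
    assert a.shape == (1,) and -1.0 <= a[0] <= 1.0
```

In a full DDPG setup this actor would be paired with a critic and trained from a replay buffer; the sketch only shows why graph convolution plus pooling yields a fixed-size policy input regardless of how many vehicles are currently sensed or connected.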
