Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning

Paper Title

Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning

Authors

Fei Ye, Xuxin Cheng, Pin Wang, Ching-Yao Chan, Jiucai Zhang

Abstract

Drivers commonly execute lane-change maneuvers to follow a routing plan, overtake a slower vehicle, adapt to a merging lane ahead, and so on. However, improper lane-change behavior can be a major cause of traffic flow disruptions and even crashes. While many rule-based methods have been proposed to solve lane-change problems for autonomous driving, they tend to exhibit limited performance due to the uncertainty and complexity of the driving environment. Machine learning-based methods offer an alternative approach, as deep reinforcement learning (DRL) has shown promising success in many application domains, including robotic manipulation, navigation, and playing video games. However, applying DRL to autonomous driving still faces many practical challenges, namely slow learning, sample inefficiency, and safety concerns. In this study, we propose an automated lane-change strategy using proximal policy optimization-based deep reinforcement learning, which shows great advantages in learning efficiency while still maintaining stable performance. The trained agent is able to learn a smooth, safe, and efficient driving policy to make lane-change decisions (i.e., when and how) in challenging situations such as dense traffic scenarios. The effectiveness of the proposed policy is validated using the metrics of task success rate and collision rate. The simulation results demonstrate that lane-change maneuvers can be efficiently learned and executed in a safe, smooth, and efficient manner.
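For readers unfamiliar with proximal policy optimization, the following is a minimal sketch of the standard clipped surrogate objective that gives the method its stable updates. It is not taken from the paper: the function name `ppo_clip_objective`, its argument names, and the default `clip_eps=0.2` are illustrative assumptions, and the abstract does not state the paper's actual network, reward, or hyperparameters.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    All names and the clip_eps default are hypothetical, chosen for
    illustration only; they are not the paper's settings.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Taking the minimum of the two terms removes the incentive to move the policy far from the one that collected the data in a single update, which is the usual explanation for how PPO combines learning efficiency with stable performance.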
