A DRL-based Multiagent Cooperative Control Framework for CAV Networks: a Graphic Convolution Q Network

Paper title

A DRL-based Multiagent Cooperative Control Framework for CAV Networks: a Graphic Convolution Q Network

Authors

Dong, Jiqian; Chen, Sikai; Ha, Paul Young Joun; Li, Yujie; Labi, Samuel

Abstract

A Connected Autonomous Vehicle (CAV) network can be defined as a collection of CAVs operating at different locations on a multilane corridor, which provides a platform to facilitate the dissemination of operational information as well as control instructions. Cooperation is crucial in CAV operating systems since it can greatly enhance operation in terms of safety and mobility, and high-level cooperation between CAVs can be expected through joint planning and control within the CAV network. However, due to the highly dynamic and combinatorial nature of a multiagent driving task, such as the dynamic number of agents (CAVs) and the exponentially growing joint action space, achieving cooperative control is NP-hard and cannot be governed by any simple rule-based method. In addition, the existing literature contains abundant information on autonomous driving's sensing technology and control logic but relatively little guidance on how to fuse the information acquired from collaborative sensing and build a decision processor on top of the fused information. In this paper, a novel Deep Reinforcement Learning (DRL) based approach combining a Graphic Convolution Neural Network (GCN) and a Deep Q Network (DQN), namely the Graphic Convolution Q network (GCQ), is proposed as the information fusion module and decision processor. The proposed model can aggregate the information acquired from collaborative sensing and output safe and cooperative lane-changing decisions for multiple CAVs so that individual intentions can be satisfied even in highly dynamic and partially observed mixed traffic. The proposed algorithm can be deployed on centralized control infrastructure such as road-side units (RSUs) or cloud platforms to improve CAV operation.
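
The idea of pairing a graph convolution (information fusion) with a Q-value head (decision processor) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single-layer architecture, the symmetric adjacency normalization, the three-action set (left, keep, right), the random weights, and all names (`normalize_adjacency`, `gcq_forward`, `select_actions`, `cav_mask`) are assumptions made purely for illustration, and no training loop is shown.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])      # self-loops keep each vehicle's own features
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcq_forward(features, adj, w_gcn, w_q):
    """Fuse neighbor information with one graph convolution, then map to Q-values.

    Returns an array of shape (n_vehicles, n_actions).
    """
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w_gcn, 0.0)  # graph convolution + ReLU
    return hidden @ w_q                                   # linear Q-value head

def select_actions(q_values, cav_mask):
    """Greedy action per CAV; human-driven vehicles get -1 (not controlled)."""
    return np.where(cav_mask, q_values.argmax(axis=1), -1)

# Hypothetical toy scene: 4 vehicles, 3 features each (assumed, e.g. speed, lane, position).
rng = np.random.default_rng(0)
n_vehicles, n_features, n_hidden, n_actions = 4, 3, 8, 3
features = rng.normal(size=(n_vehicles, n_features))

# Assumed communication/sensing links; vehicle 3 is isolated.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 0]], dtype=float)
cav_mask = np.array([True, True, False, True])  # vehicle 2 is human-driven

# Untrained random weights stand in for learned parameters.
w_gcn = rng.normal(size=(n_features, n_hidden))
w_q = rng.normal(size=(n_hidden, n_actions))

q_values = gcq_forward(features, adj, w_gcn, w_q)
actions = select_actions(q_values, cav_mask)
```

Because the adjacency matrix is an input rather than a fixed part of the model, the same weights apply to any number of vehicles, which is one way to read the abstract's point about a dynamic number of agents; in a real system the weights would be learned with a DQN-style update rather than drawn at random.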
