Paper Title

Real-time Full-stack Traffic Scene Perception for Autonomous Driving with Roadside Cameras

Authors

Zhengxia Zou, Rusheng Zhang, Shengyin Shen, Gaurav Pandey, Punarjay Chakravarty, Armin Parchami, Henry X. Liu

Abstract

We propose a novel and pragmatic framework for traffic scene perception with roadside cameras. The proposed framework covers the full stack of the roadside perception pipeline for infrastructure-assisted autonomous driving, including object detection, object localization, object tracking, and multi-camera information fusion. Unlike previous vision-based perception frameworks that rely on depth offsets or 3D annotations during training, we adopt a modular decoupling design and introduce a landmark-based 3D localization method, in which detection and localization are well decoupled so that the model can be trained easily from 2D annotations alone. The proposed framework applies to both optical and thermal cameras with either pinhole or fish-eye lenses. Our framework is deployed at a two-lane roundabout at Ellsworth Rd. and State St., Ann Arbor, MI, USA, providing 24/7 real-time traffic flow monitoring and high-precision vehicle trajectory extraction. The whole system runs efficiently on a low-power edge computing device, with an all-component end-to-end delay of less than 20 ms.
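The landmark-based localization idea in the abstract can be illustrated with a small sketch. The paper itself does not publish its implementation here, so the following is only an assumed, minimal version of the standard technique: surveyed roadside landmarks give pixel-to-ground correspondences, a planar homography is fit to them (Direct Linear Transform), and each 2D detection is localized by projecting its ground-contact pixel (e.g. the bounding-box bottom-center) onto the road plane. Function names, landmark values, and the flat-ground assumption are all illustrative, not taken from the paper.

```python
import numpy as np

def estimate_homography(pixels, ground):
    """Fit a 3x3 homography H mapping image pixels to ground-plane
    coordinates from >= 4 surveyed landmark correspondences (DLT).
    Assumes an (approximately) planar road surface."""
    A = []
    for (x, y), (X, Y) in zip(pixels, ground):
        # Each correspondence contributes two linear constraints on H.
        A.append([x, y, 1, 0, 0, 0, -X * x, -X * y, -X])
        A.append([0, 0, 0, x, y, 1, -Y * x, -Y * y, -Y])
    # The homography is the null vector of A (last right singular vector).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H[2,2] == 1

def localize(H, u, v):
    """Project a detection's ground-contact pixel (u, v) onto the road
    plane; returns metric ground coordinates (X, Y)."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

Because the homography is estimated once offline from landmarks, the detector needs only 2D boxes at training time, which is exactly the decoupling of detection from localization that the abstract describes.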
