Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph

Paper Title

Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph

Authors

Honghui Yang, Zili Liu, Xiaopei Wu, Wenxiao Wang, Wei Qian, Xiaofei He, Deng Cai

Abstract

Two-stage detectors have gained much popularity in 3D object detection. Most two-stage 3D detectors use grid points, voxel grids, or sampled keypoints for RoI feature extraction in the second stage. Such methods, however, handle unevenly distributed and sparse outdoor points inefficiently. This paper addresses the problem in three ways. 1) Dynamic Point Aggregation. We propose patch search to quickly find the points in a local region for each 3D proposal. Dynamic farthest voxel sampling is then applied to sample the points evenly. In particular, the voxel size varies with distance to accommodate the uneven distribution of points. 2) RoI-graph Pooling. We build local graphs on the sampled points to better model contextual information and mine point relations through iterative message passing. 3) Visual Features Augmentation. We introduce a simple yet effective fusion strategy to compensate for sparse LiDAR points with limited semantic cues. Based on these modules, we construct Graph R-CNN as a second stage that can be applied to existing one-stage detectors to consistently improve detection performance. Extensive experiments show that Graph R-CNN outperforms state-of-the-art 3D detection models by a large margin on both KITTI and the Waymo Open Dataset. We also rank first on the KITTI BEV car detection leaderboard. Code will be available at https://github.com/Nightmare-n/GraphRCNN.
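The sampling idea in point 1) of the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see their repository for that). It assumes a simple rule in which points are grouped into range bins, each bin uses a larger voxel size, every occupied voxel is reduced to its centroid, and farthest point sampling is then run on the centroids. The function names, the bin edges, and the voxel sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def dynamic_voxelize(points, bin_edges=(20.0, 40.0), voxel_sizes=(0.2, 0.4, 0.8)):
    """Reduce points to one centroid per voxel, with coarser voxels at longer range.

    bin_edges and voxel_sizes are illustrative assumptions, not values from the paper.
    """
    points = np.asarray(points, dtype=np.float64)
    # Horizontal distance from the sensor decides which voxel size applies.
    dist = np.linalg.norm(points[:, :2], axis=1)
    bins = np.digitize(dist, bin_edges)            # 0 .. len(bin_edges)
    sizes = np.asarray(voxel_sizes)[bins]
    coords = np.floor(points[:, :3] / sizes[:, None]).astype(np.int64)
    # A voxel is identified by its range bin plus its integer grid coordinates.
    keys = np.column_stack([bins, coords])
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_voxels = int(inverse.max()) + 1
    sums = np.zeros((n_voxels, 3))
    np.add.at(sums, inverse, points[:, :3])
    counts = np.bincount(inverse, minlength=n_voxels)
    return sums / counts[:, None]


def farthest_point_sampling(points, k):
    """Greedy farthest point sampling; returns indices into points, starting at index 0."""
    points = np.asarray(points, dtype=np.float64)
    n = len(points)
    k = min(k, n)
    chosen = np.zeros(k, dtype=np.int64)
    min_sq_dist = np.full(n, np.inf)
    current = 0
    for i in range(k):
        chosen[i] = current
        sq_dist = np.sum((points - points[current]) ** 2, axis=1)
        # Track each point's squared distance to its nearest already-chosen point.
        min_sq_dist = np.minimum(min_sq_dist, sq_dist)
        current = int(np.argmax(min_sq_dist))
    return chosen


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.uniform(-50.0, 50.0, size=(2000, 3))
    centroids = dynamic_voxelize(cloud)
    sampled = centroids[farthest_point_sampling(centroids, 64)]
    print(sampled.shape)
```

Running farthest point sampling on voxel centroids rather than on raw points is one plausible way to keep the cost down while still spreading samples evenly; coarser voxels at long range keep sparse, distant regions from being under-represented relative to dense, nearby ones.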
