Paper Title
Boosting 3D Object Detection by Simulating Multimodality on Point Clouds
Paper Authors
Paper Abstract
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate the features and responses of a multi-modality (LiDAR-image) detector. The approach needs LiDAR-image data only when training the single-modality detector; once well trained, it needs only LiDAR data at inference. We design a novel framework to realize the approach: response distillation to focus on the crucial response samples and avoid the background samples; sparse-voxel distillation to learn voxel semantics and relations from the estimated crucial voxels; fine-grained voxel-to-point distillation to better attend to features of small and distant objects; and instance distillation to further enhance deep-feature consistency. Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors and even surpasses the baseline LiDAR-image detector on the key NDS metric, closing 72% of the mAP gap between the single- and multi-modality detectors.
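The abstract does not give the concrete loss formulations, but the shared idea behind the distillation terms is penalizing the student's (LiDAR-only) features for deviating from the teacher's (LiDAR-image) features on a selected set of crucial positions rather than on the background. A minimal, hypothetical sketch of such a masked feature-distillation loss (the function name, shapes, and MSE choice are illustrative assumptions, not the authors' actual design):

```python
import numpy as np

def masked_feature_distillation_loss(student_feat, teacher_feat, crucial_mask):
    """Generic masked feature-distillation loss (illustrative sketch).

    Computes the mean-squared error between student (LiDAR-only) and
    teacher (LiDAR-image) features, restricted to a boolean mask of
    'crucial' voxels so that background positions are ignored.

    student_feat, teacher_feat: (N, C) arrays of per-voxel features.
    crucial_mask: (N,) boolean array marking crucial (foreground) voxels.
    """
    diff = student_feat[crucial_mask] - teacher_feat[crucial_mask]
    n = max(diff.size, 1)  # guard against an empty mask
    return float((diff ** 2).sum() / n)

# Toy example: 4 voxels, 2 feature channels; only voxels 0 and 2 are crucial.
student = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 1.0], [5.0, 5.0]])
teacher = np.array([[1.0, 3.0], [9.0, 9.0], [3.0, 0.0], [0.0, 0.0]])
mask = np.array([True, False, True, False])
loss = masked_feature_distillation_loss(student, teacher, mask)
# Squared differences on the crucial voxels sum to 2 over 4 elements → 0.5
```

The mask plays the role described in the abstract: the response and sparse-voxel distillation terms act only on estimated crucial samples/voxels, so mismatches on background regions contribute nothing to the training signal.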