Paper Title
Boosting 3D Object Detection by Simulating Multimodality on Point Clouds
Paper Authors
Paper Abstract
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate the features and responses of a multi-modality (LiDAR-image) detector. The approach needs LiDAR-image data only when training the single-modality detector; once well trained, it needs only LiDAR data at inference. We design a novel framework to realize the approach: response distillation to focus on the crucial response samples and avoid the background samples; sparse-voxel distillation to learn voxel semantics and relations from the estimated crucial voxels; fine-grained voxel-to-point distillation to better attend to features of small and distant objects; and instance distillation to further enhance deep-feature consistency. Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors and even surpasses the baseline LiDAR-image detector on the key NDS metric, closing 72% of the mAP gap between the single- and multi-modality detectors.
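The abstract does not give the concrete loss formulations, but the shared idea behind the distillation terms is penalizing the student's (LiDAR-only) features for deviating from the teacher's (LiDAR-image) features on a selected set of crucial positions rather than on the background. A minimal, hypothetical sketch of such a masked feature-distillation loss (the function name, shapes, and MSE choice are illustrative assumptions, not the authors' actual design):

```python
import numpy as np

def masked_feature_distillation_loss(student_feat, teacher_feat, crucial_mask):
    """Generic masked feature-distillation loss (illustrative sketch).

    Computes the mean-squared error between student (LiDAR-only) and
    teacher (LiDAR-image) features, restricted to a boolean mask of
    'crucial' voxels so that background positions are ignored.

    student_feat, teacher_feat: (N, C) arrays of per-voxel features.
    crucial_mask: (N,) boolean array marking crucial (foreground) voxels.
    """
    diff = student_feat[crucial_mask] - teacher_feat[crucial_mask]
    n = max(diff.size, 1)  # guard against an empty mask
    return float((diff ** 2).sum() / n)

# Toy example: 4 voxels, 2 feature channels; only voxels 0 and 2 are crucial.
student = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 1.0], [5.0, 5.0]])
teacher = np.array([[1.0, 3.0], [9.0, 9.0], [3.0, 0.0], [0.0, 0.0]])
mask = np.array([True, False, True, False])
loss = masked_feature_distillation_loss(student, teacher, mask)
# Squared differences on the crucial voxels sum to 2 over 4 elements → 0.5
```

The mask plays the role described in the abstract: the response and sparse-voxel distillation terms act only on estimated crucial samples/voxels, so mismatches on background regions contribute nothing to the training signal.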