Paper Title

2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds

Paper Authors

Xu Yan, Jiantao Gao, Chaoda Zheng, Chao Zheng, Ruimao Zhang, Shuguang Cui, Zhen Li

Paper Abstract

As camera and LiDAR sensors capture complementary information used in autonomous driving, great efforts have been made to develop semantic segmentation algorithms through multi-modality data fusion. However, fusion-based approaches require paired data, i.e., LiDAR point clouds and camera images with strict point-to-pixel mappings, as the inputs in both training and inference, which seriously hinders their application in practical scenarios. Thus, in this work, we propose 2D Priors Assisted Semantic Segmentation (2DPASS), a general training scheme, to boost representation learning on point clouds by fully taking advantage of 2D images with rich appearance. In practice, by leveraging an auxiliary modal fusion and multi-scale fusion-to-single knowledge distillation (MSFSKD), 2DPASS acquires richer semantic and structural information from the multi-modal data, which is then distilled online to the pure 3D network. As a result, equipped with 2DPASS, our baseline shows significant improvement with only point cloud inputs. Specifically, it achieves state-of-the-art results on two large-scale benchmarks (i.e., SemanticKITTI and NuScenes), including top-1 results in both the single-scan and multi-scan competitions of SemanticKITTI.
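
To make the training scheme described in the abstract more concrete, below is a minimal, hypothetical PyTorch sketch: a fusion branch combines 2D image features (gathered via the point-to-pixel mapping) with 3D point features, and its predictions are distilled online into the pure 3D branch, which is the only branch used at inference. All module names, feature dimensions, and the single-scale KL-divergence loss are illustrative assumptions, not the authors' actual MSFSKD implementation.

```python
# Illustrative sketch of 2D-priors-assisted distillation (hypothetical, simplified to
# one scale and MLP encoders; not the authors' actual 2DPASS/MSFSKD implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointBranch(nn.Module):
    """Pure 3D branch: per-point features and semantic logits."""
    def __init__(self, in_dim=4, feat_dim=64, num_classes=20):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU(),
                                     nn.Linear(feat_dim, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, points):                    # points: (N, in_dim)
        feats = self.encoder(points)               # (N, feat_dim)
        return feats, self.head(feats)             # per-point logits: (N, num_classes)

class FusionBranch(nn.Module):
    """Auxiliary branch: fuses 2D image features (sampled at projected pixels) with 3D features."""
    def __init__(self, img_dim=64, feat_dim=64, num_classes=20):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(img_dim + feat_dim, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, point_feats, img_feats):     # img_feats gathered by point-to-pixel mapping
        fused = self.fuse(torch.cat([point_feats, img_feats], dim=-1))
        return self.head(fused)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between softened fusion (teacher) and pure-3D (student) predictions."""
    p_teacher = F.softmax(teacher_logits.detach() / T, dim=-1)   # detach: no gradient into teacher
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

# Conceptual training step: both branches are supervised by the labels, and the
# fusion branch additionally distills its knowledge into the 3D branch.
points = torch.randn(1024, 4)                      # x, y, z, intensity (dummy data)
img_feats = torch.randn(1024, 64)                  # 2D features at projected pixel locations (dummy)
labels = torch.randint(0, 20, (1024,))

point_branch, fusion_branch = PointBranch(), FusionBranch()
feats3d, logits3d = point_branch(points)
logits_fused = fusion_branch(feats3d, img_feats)

loss = (F.cross_entropy(logits3d, labels)
        + F.cross_entropy(logits_fused, labels)
        + distillation_loss(logits3d, logits_fused))
loss.backward()
# At inference, only point_branch is run, so no paired camera image is required.
```

Because the distillation target is detached, the camera-assisted branch only guides the 3D branch during training; at test time the model runs on point clouds alone, consistent with the abstract's claim.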
