Paper Title
Multi Projection Fusion for Real-time Semantic Segmentation of 3D LiDAR Point Clouds
Paper Authors
Paper Abstract
Semantic segmentation of 3D point cloud data is essential for enhanced high-level perception in autonomous platforms. Furthermore, given the increasing deployment of LiDAR sensors onboard cars and drones, special emphasis is placed on computationally lightweight algorithms that can run on mobile GPUs. Previous efficient state-of-the-art methods relied on 2D spherical projection of point clouds as input to 2D fully convolutional neural networks to balance the accuracy-speed trade-off. This paper introduces a novel approach to 3D point cloud semantic segmentation that exploits multiple projections of the point cloud to mitigate the loss of information inherent in single-projection methods. Our Multi-Projection Fusion (MPF) framework analyzes spherical and bird's-eye view projections using two separate, highly efficient 2D fully convolutional models, then combines the segmentation results of both views. The proposed framework is validated on the SemanticKITTI dataset, where it achieves a mIoU of 55.5, higher than that of the state-of-the-art projection-based methods RangeNet++ and PolarNet, while running 1.6x faster than the former and 3.1x faster than the latter.
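As a rough illustration of the projection-and-fuse pipeline the abstract describes, the sketch below computes per-point pixel coordinates for a spherical (range-image) view and a bird's-eye view, then fuses per-point class probabilities gathered back from the two views. The sensor field of view, image resolutions, grid extent, and the simple averaging fusion rule are all assumptions for illustration, not the paper's exact settings.

```python
import numpy as np

def spherical_projection(points, H=64, W=2048, fov_up=3.0, fov_down=-25.0):
    """Map an (N, 3) point cloud to row/column indices of an H x W range
    image. The field of view matches a typical 64-beam LiDAR (assumed)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                   # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * W                        # column from azimuth
    v = (1.0 - (pitch - fov_down_r) / (fov_up_r - fov_down_r)) * H  # row from elevation
    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)
    return v, u

def bev_projection(points, grid=512, extent=50.0):
    """Map points to cell indices of a grid x grid bird's-eye-view image
    covering [-extent, extent] meters in x and y (sizes are assumed);
    points outside the extent are clamped to the border cells."""
    u = np.clip((points[:, 0] + extent) / (2 * extent) * grid, 0, grid - 1).astype(np.int32)
    v = np.clip((points[:, 1] + extent) / (2 * extent) * grid, 0, grid - 1).astype(np.int32)
    return v, u

def fuse_predictions(prob_sph, prob_bev):
    """Fuse (N, num_classes) per-point probabilities gathered from each
    view's 2D model output by simple averaging, then pick the argmax."""
    return np.argmax((prob_sph + prob_bev) / 2.0, axis=1)
```

Averaging the two probability maps is the simplest possible fusion rule; the actual MPF framework may weight or combine the views differently.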