Paper Title
Scalable SoftGroup for 3D Instance Segmentation on Point Clouds
Paper Authors
Paper Abstract
This paper presents SoftGroup, a network for accurate and scalable 3D instance segmentation. Existing state-of-the-art methods produce hard semantic predictions and then group points to obtain instance segmentation results. Unfortunately, errors stemming from these hard decisions propagate into the grouping, resulting in poor overlap between predicted instances and the ground truth, as well as substantial false positives. To address these problems, SoftGroup allows each point to be associated with multiple classes to mitigate the uncertainty stemming from semantic prediction. It also suppresses false positive instances by learning to categorize them as background. Regarding scalability, existing fast methods require computation time on the order of tens of seconds on large-scale scenes, which is unsatisfactory and far from real-time. We find that the $k$-Nearest Neighbor ($k$-NN) module, which serves as the prerequisite of grouping, introduces a computational bottleneck. To resolve this bottleneck, SoftGroup is extended to SoftGroup++. The proposed SoftGroup++ reduces time complexity with octree $k$-NN and reduces the search space with class-aware pyramid scaling and late devoxelization. Experimental results on various indoor and outdoor datasets demonstrate the efficacy and generality of the proposed SoftGroup and SoftGroup++. Their performance surpasses the best-performing baseline by a large margin (6\% $\sim$ 16\%) in terms of AP$_{50}$. On datasets with large-scale scenes, SoftGroup++ achieves a 6$\times$ speed boost on average compared to SoftGroup. Furthermore, SoftGroup can be extended to perform object detection and panoptic segmentation with nontrivial improvements over existing methods. The source code and trained models are available at \url{https://github.com/thangvubk/SoftGroup}.
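The contrast between hard and soft semantic assignment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `hard_assign` and `soft_assign`, the score threshold of 0.2, and the toy score matrix are all assumptions made for illustration. The sketch only shows how thresholding per-class scores lets an ambiguous point stay a candidate for more than one class, whereas an argmax commits it to exactly one.

```python
import numpy as np


def hard_assign(scores):
    """Hard decision: each point is assigned only to its highest-scoring class."""
    mask = np.zeros_like(scores, dtype=bool)
    mask[np.arange(scores.shape[0]), scores.argmax(axis=1)] = True
    return mask


def soft_assign(scores, threshold=0.2):
    """Soft decision: a point is kept for every class whose score exceeds
    the threshold (the value 0.2 is an illustrative assumption)."""
    return scores > threshold


# Toy per-point class scores for 3 points and 3 classes (hypothetical values).
scores = np.array([
    [0.90, 0.05, 0.05],  # confident point
    [0.45, 0.50, 0.05],  # ambiguous between class 0 and class 1
    [0.10, 0.10, 0.80],  # confident point
])

hard = hard_assign(scores)
soft = soft_assign(scores)

print(hard.astype(int))
print(soft.astype(int))
```

In this toy input the second point is ambiguous. The hard mask places it in class 1 only, so a wrong argmax would exclude it from class 0 before grouping starts. The soft mask keeps it in both classes, so grouping can still consider it for either one.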