CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection

Paper Title

CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection

Authors

Dhanalaxmi Gaddam, Jean Lahoud, Fahad Shahbaz Khan, Rao Muhammad Anwer, Hisham Cholakkal

Abstract

Existing deep learning-based 3D object detectors typically rely on the appearance of individual objects and do not explicitly attend to the rich contextual information of the scene. In this work, we propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework, which takes a 3D scene as input and explicitly integrates useful contextual information of the scene at multiple levels to predict a set of object bounding boxes along with their corresponding semantic labels. To this end, we propose a context enhancement network that captures contextual information at different levels of granularity, followed by a multi-stage refinement module that progressively refines the box positions and class predictions. Extensive experiments on the large-scale ScanNetV2 benchmark reveal the benefits of our proposed method, leading to an absolute improvement of 2.0% over the baseline. In addition to 3D object detection, we investigate the effectiveness of our CMR3D framework for the problem of 3D object counting. Our source code will be publicly released.
