Paper Title
TiG-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning
Paper Authors
Paper Abstract
To achieve accurate and low-cost 3D object detection, existing methods propose to benefit camera-based multi-view detectors with spatial cues provided by the LiDAR modality, e.g., dense depth supervision and bird's-eye-view (BEV) feature distillation. However, they directly conduct point-to-point mimicking from LiDAR to camera, which neglects the inner-geometry of foreground targets and suffers from the modal gap between 2D and 3D features. In this paper, we propose a learning scheme, termed TiG-BEV, that transfers Target Inner-Geometry from the LiDAR modality into camera-based BEV detectors for both dense depth and BEV features. First, we introduce an inner-depth supervision module to learn the low-level relative depth relations between different foreground pixels. This enables the camera-based detector to better understand the object-wise spatial structures. Second, we design an inner-feature BEV distillation module to imitate the high-level semantics of different keypoints within foreground targets. To further alleviate the BEV feature gap between the two modalities, we adopt both inter-channel and inter-keypoint distillation for feature-similarity modeling. With our target inner-geometry distillation, TiG-BEV effectively boosts BEVDepth by +2.3% NDS and +2.4% mAP, and BEVDet by +9.1% NDS and +10.3% mAP, on the nuScenes val set. Code will be available at https://github.com/ADLab3Ds/TiG-BEV.
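The two ideas named in the abstract, relative depth inside a foreground target and similarity-based feature distillation over keypoints, can be illustrated with a small NumPy sketch. This is not the released TiG-BEV code. The abstract does not specify the reference pixel, the loss form, or the similarity measure, so the choices here (a caller-supplied reference index, mean squared error, cosine similarity) and all function names are assumptions made for illustration only.

```python
import numpy as np


def relative_depth(depth, ref_index):
    """Depth of each foreground pixel relative to one reference pixel.

    depth: 1-D sequence of depths for the pixels of a single target.
    ref_index: index of the reference pixel (assumed to be given).
    """
    depth = np.asarray(depth, dtype=float)
    return depth - depth[ref_index]


def inner_depth_loss(pred_depth, lidar_depth, ref_index):
    """MSE between predicted and LiDAR relative depths (assumed loss form).

    Only the depth differences inside the target are compared, so a
    constant offset in the prediction does not contribute to the loss.
    """
    diff = (relative_depth(pred_depth, ref_index)
            - relative_depth(lidar_depth, ref_index))
    return float(np.mean(diff ** 2))


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between rows: (N, C) -> (N, N)."""
    f = np.asarray(features, dtype=float)
    # Small epsilon avoids division by zero for all-zero rows.
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)
    return f @ f.T


def inner_feature_losses(student_kp, teacher_kp):
    """Inter-keypoint and inter-channel similarity losses for one target.

    student_kp, teacher_kp: arrays of shape (N keypoints, C channels),
    assumed here to share the same N and C. The features themselves are
    never compared directly; only their similarity structure is matched.
    """
    student_kp = np.asarray(student_kp, dtype=float)
    teacher_kp = np.asarray(teacher_kp, dtype=float)
    # Inter-keypoint: (N, N) similarity between keypoints.
    kp = np.mean((cosine_similarity_matrix(student_kp)
                  - cosine_similarity_matrix(teacher_kp)) ** 2)
    # Inter-channel: (C, C) similarity between channels.
    ch = np.mean((cosine_similarity_matrix(student_kp.T)
                  - cosine_similarity_matrix(teacher_kp.T)) ** 2)
    return float(kp), float(ch)
```

In this sketch, a predicted depth map that is shifted by a constant but preserves the target's shape yields zero inner-depth loss, and a student feature map that is a rescaled copy of the teacher's yields near-zero similarity losses. Both behaviors reflect the contrast the abstract draws with direct point-to-point mimicking.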