Paper title
Hyperbolic Cosine Transformer for LiDAR 3D Object Detection
Paper authors
Paper abstract
Recently, Transformers have achieved great success in computer vision. However, their use in 3D object detection is constrained because the space and time complexity of attention grows quadratically with the number of points, which is large in these applications. Previous point-wise methods suffer from high time consumption and limited receptive fields for capturing information among points. In this paper, we propose a two-stage hyperbolic cosine transformer (ChTR3D) for 3D object detection from LiDAR point clouds. The proposed ChTR3D refines proposals by applying cosh-attention, which has linear computational complexity, to encode rich contextual relationships among points. The cosh-attention module reduces the space and time complexity of the attention operation: the traditional softmax operation is replaced by a non-negative ReLU activation and a hyperbolic-cosine-based operator with a re-weighting mechanism. Extensive experiments on the widely used KITTI dataset demonstrate that, compared with vanilla attention, cosh-attention significantly improves inference speed while maintaining competitive performance. Experimental results show that, among state-of-the-art two-stage methods using point-level features, the proposed ChTR3D is the fastest.
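To illustrate how a ReLU feature map combined with a hyperbolic-cosine re-weighting can yield attention that is linear in the number of points, the following is a minimal NumPy sketch. It is not the ChTR3D implementation: the abstract does not give the exact operator, so the weight `cosh(t_i - t_j)`, the per-point scalar `t`, and both function names are assumptions made for illustration. The sketch relies only on the identity cosh(a - b) = cosh(a)cosh(b) - sinh(a)sinh(b), which lets the pairwise weight be split into per-query and per-key factors so that the N x N score matrix is never formed.

```python
import numpy as np


def cosh_linear_attention(q, k, v, t, eps=1e-6):
    """Linear-time attention with an assumed cosh(t_i - t_j) re-weighting.

    q, k: (N, d) queries and keys; v: (N, d_v) values.
    t:    (N,) hypothetical per-point scalar (e.g. a normalized index).
    """
    # Non-negative feature map standing in for softmax.
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    ch, sh = np.cosh(t)[:, None], np.sinh(t)[:, None]
    q_c, q_s = q * ch, q * sh
    k_c, k_s = k * ch, k * sh
    # cosh(a - b) = cosh(a)cosh(b) - sinh(a)sinh(b), so keys and values
    # are aggregated first: (d, d_v) summaries instead of an (N, N) matrix.
    num = q_c @ (k_c.T @ v) - q_s @ (k_s.T @ v)
    den = q_c @ k_c.sum(axis=0) - q_s @ k_s.sum(axis=0)
    return num / (den[:, None] + eps)


def cosh_quadratic_attention(q, k, v, t, eps=1e-6):
    """Reference version that materializes the (N, N) score matrix."""
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    scores = (q @ k.T) * np.cosh(t[:, None] - t[None, :])
    return (scores @ v) / (scores.sum(axis=1, keepdims=True) + eps)
```

Under these assumptions both functions compute the same output; the first costs O(N d d_v) time and O(d d_v) extra memory, whereas the reference version costs O(N^2 d) time and O(N^2) memory, which is the kind of saving the abstract attributes to cosh-attention.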