R2-MLP: Round-Roll MLP for Multi-View 3D Object Recognition

Paper Title

R2-MLP: Round-Roll MLP for Multi-View 3D Object Recognition

Authors

Shuo Chen, Tan Yu, Ping Li

Abstract

Recently, vision architectures based exclusively on multi-layer perceptrons (MLPs) have gained much attention in the computer vision community. MLP-like models achieve competitive performance on single 2D image classification with less inductive bias, since they use no hand-crafted convolution layers. In this work, we explore the effectiveness of MLP-based architectures for the view-based 3D object recognition task. We present an MLP-based architecture termed Round-Roll MLP (R$^2$-MLP). It extends the spatial-shift MLP backbone by considering communication between patches from different views. R$^2$-MLP rolls part of the channels along the view dimension, which promotes information exchange between neighboring views. We benchmark MLP results on the ModelNet10 and ModelNet40 datasets, with ablations in various aspects. The experimental results show that, with a conceptually simple structure, our R$^2$-MLP achieves competitive performance compared with existing state-of-the-art methods.
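
To make the "roll part of the channels along the view dimension" idea concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `round_roll`, the `(views, patches, channels)` tensor layout, the split into three equal channel groups, and the shift amounts `(0, +1, -1)` are all hypothetical choices made for this example. The roll is circular, so the last view wraps around to the first.

```python
import numpy as np


def round_roll(x, shifts=(0, 1, -1)):
    """Circularly roll channel groups along the view axis.

    x: array of shape (views, patches, channels).
    shifts: one shift per channel group (hypothetical choice for this
        sketch); a shift of 0 leaves that group untouched.
    """
    x = np.asarray(x)
    out = np.empty_like(x)
    # Split channel indices into len(shifts) nearly equal groups.
    groups = np.array_split(np.arange(x.shape[-1]), len(shifts))
    for idx, shift in zip(groups, shifts):
        # np.roll wraps around, so every view receives features
        # from a neighboring view, including the first and last.
        out[..., idx] = np.roll(x[..., idx], shift=shift, axis=0)
    return out


# Toy input: 4 views, 1 patch per view, 3 channels.
# Entry value = 10 * view_index + channel_index, so the origin is readable.
x = 10 * np.arange(4)[:, None, None] + np.arange(3)[None, None, :]
rolled = round_roll(x)
print(rolled[:, 0, :])
# [[ 0 31 12]
#  [10  1 22]
#  [20 11 32]
#  [30 21  2]]
```

After the roll, each view's patch holds its own features in the first channel, the previous view's features in the second, and the next view's features in the third. A per-patch MLP applied afterwards can therefore mix information across neighboring views without any explicit cross-view operation.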
