Paper Title

3D-LatentMapper: View Agnostic Single-View Reconstruction of 3D Shapes

Paper Authors

Alara Dirik, Pinar Yanardag

Paper Abstract

Computer graphics, 3D computer vision and robotics communities have produced multiple approaches to represent and generate 3D shapes, as well as a vast number of use cases. However, single-view reconstruction remains a challenging topic that can unlock various interesting use cases such as interactive design. In this work, we propose a novel framework that leverages the intermediate latent spaces of Vision Transformer (ViT) and a joint image-text representational model, CLIP, for fast and efficient Single View Reconstruction (SVR). More specifically, we propose a novel mapping network architecture that learns a mapping between deep features extracted from ViT and CLIP, and the latent space of a base 3D generative model. Unlike previous work, our method enables view-agnostic reconstruction of 3D shapes, even in the presence of large occlusions. We use the ShapeNetV2 dataset and perform extensive experiments with comparisons to SOTA methods to demonstrate our method's effectiveness.
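
The abstract describes a mapping network that takes deep image features extracted by ViT and CLIP and regresses a latent code for a pretrained 3D generative model. The PyTorch sketch below only illustrates that idea; the class name LatentMapper, the feature dimensions (768 for a ViT-B backbone, 512 for a CLIP image embedding), and the MLP depth are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class LatentMapper(nn.Module):
    """Illustrative mapping network (not the paper's exact design):
    fuses ViT and CLIP image features and maps them into the latent
    space of a frozen, pretrained 3D generative model."""

    def __init__(self, vit_dim=768, clip_dim=512, latent_dim=256, hidden_dim=1024):
        super().__init__()
        self.mapper = nn.Sequential(
            nn.Linear(vit_dim + clip_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )

    def forward(self, vit_feat, clip_feat):
        # Concatenate the two feature vectors and regress a latent code
        # that the base 3D generator can decode into a shape.
        fused = torch.cat([vit_feat, clip_feat], dim=-1)
        return self.mapper(fused)

# Usage sketch: in practice the features would come from frozen ViT and
# CLIP encoders applied to a single input view; random tensors stand in here.
mapper = LatentMapper()
vit_feat = torch.randn(4, 768)   # placeholder ViT features
clip_feat = torch.randn(4, 512)  # placeholder CLIP image embeddings
latent = mapper(vit_feat, clip_feat)  # shape: (4, 256)
```

Because only the lightweight mapper is trained while the feature extractors and the 3D generator stay fixed, this kind of setup is what makes fast single-view reconstruction plausible.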
