Paper Title
Novel View Synthesis from only a 6-DoF Camera Pose by Two-stage Networks
Paper Authors
Abstract
Novel view synthesis is a challenging problem in computer vision and robotics. Unlike existing works, which require reference images or 3D models of the scene to generate images under novel views, we propose a new paradigm for this problem: we synthesize a novel view directly from only a 6-DoF camera pose. Although this setting is the most straightforward one, few works have addressed it. Our experiments demonstrate that, with a concise CNN, we can obtain a meaningful parametric model that reconstructs the correct scenery image from the 6-DoF pose alone. To this end, we propose a two-stage learning strategy consisting of two consecutive CNNs: GenNet and RefineNet. GenNet generates a coarse image from the camera pose. RefineNet is a generative adversarial network that refines the coarse image. In this way, we decouple the geometric mapping from texture-detail rendering. Extensive experiments on public datasets demonstrate the effectiveness of our method. We believe this paradigm is of high research and application value and could become an important direction in novel view synthesis.
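The two-stage decomposition described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the single linear layers standing in for GenNet's CNN and RefineNet's GAN generator, and all parameter names are assumptions chosen only to show the pose-to-coarse-image-to-refined-image data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration (not from the paper).
POSE_DIM = 6   # 6-DoF camera pose: translation (3) + rotation (3)
H, W = 8, 8    # tiny image resolution for the sketch

class GenNet:
    """Stage 1 (sketch): map a 6-DoF pose to a coarse image.
    A single linear layer + tanh stands in for the paper's CNN."""
    def __init__(self):
        self.w = rng.standard_normal((POSE_DIM, H * W)) * 0.1

    def __call__(self, pose):
        return np.tanh(pose @ self.w).reshape(H, W)

class RefineNet:
    """Stage 2 (sketch): refine the coarse image with texture detail.
    The paper uses a GAN; here a residual linear map stands in."""
    def __init__(self):
        self.w = rng.standard_normal((H * W, H * W)) * 0.01

    def __call__(self, coarse):
        residual = (coarse.reshape(-1) @ self.w).reshape(H, W)
        return coarse + residual  # coarse geometry + rendered detail

pose = np.array([0.1, -0.2, 1.5, 0.0, 0.3, -0.1])  # [tx, ty, tz, rx, ry, rz]
coarse = GenNet()(pose)          # geometric mapping: pose -> coarse image
refined = RefineNet()(coarse)    # detail rendering: coarse -> refined image
print(coarse.shape, refined.shape)  # (8, 8) (8, 8)
```

The point of the sketch is the decoupling the abstract claims: the first stage alone carries the pose-to-image geometric mapping, so the second stage never sees the pose and only has to add appearance detail.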