Paper Title


Structure-Aware NeRF without Posed Camera via Epipolar Constraint

Authors

Shu Chen, Yang Zhang, Yaxin Xu, Beiji Zou

Abstract


The neural radiance field (NeRF) for realistic novel view synthesis requires camera poses to be pre-acquired by a structure-from-motion (SfM) approach. This two-stage strategy is inconvenient and degrades performance because errors in the pose extraction can propagate to the view synthesis. We integrate pose extraction and view synthesis into a single end-to-end procedure so they can benefit from each other. For training NeRF models, only RGB images are given, without pre-known camera poses. The camera poses are obtained via the epipolar constraint, in which an identical feature observed in different views has the same world coordinates when transformed from the local camera coordinates according to the extracted poses. The epipolar constraint is jointly optimized with the pixel color constraint. The poses are represented by a CNN-based deep network whose input is the related frames. This joint optimization makes NeRF aware of the scene's structure, which improves its generalization performance. Extensive experiments on a variety of scenes demonstrate the effectiveness of the proposed approach. Code is available at https://github.com/XTU-PR-LAB/SaNerf.
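The world-coordinate consistency described in the abstract can be sketched roughly as follows: a feature matched across two views is back-projected through each camera's depth and pose, and the distance between the two resulting world points serves as the constraint to minimize. This is a minimal NumPy sketch under a simplified pinhole model (pose given as a camera-to-world rotation `R` and translation `t`); all function names and the loss form are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def backproject(pixel, depth, K_inv, R, t):
    """Lift a pixel (u, v) to world coordinates: X_w = R @ (depth * K^-1 [u, v, 1]^T) + t."""
    cam = depth * (K_inv @ np.array([pixel[0], pixel[1], 1.0]))
    return R @ cam + t

def world_consistency_loss(matches, depths1, depths2, K_inv, pose1, pose2):
    """Mean distance between the world points of matched features in two views.

    If the extracted poses (and depths) are correct, a matched feature maps to
    the same world point from either view, so this loss approaches zero.
    matches: list of ((u1, v1), (u2, v2)) pixel pairs
    pose1, pose2: (R, t) camera-to-world transforms for each view
    """
    total = 0.0
    for (p1, p2), d1, d2 in zip(matches, depths1, depths2):
        X1 = backproject(p1, d1, K_inv, *pose1)
        X2 = backproject(p2, d2, K_inv, *pose2)
        total += np.linalg.norm(X1 - X2)
    return total / len(matches)
```

For a synthetic world point seen by two cameras related by a pure translation, the loss vanishes when the correct poses are supplied, which is the signal the joint optimization exploits.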
