UNeRF: Time and Memory Conscious U-Shaped Network for Training Neural Radiance Fields

Paper title

UNeRF: Time and Memory Conscious U-Shaped Network for Training Neural Radiance Fields

Authors

Abiramy Kuganesan, Shih-Yang Su, James J. Little, Helge Rhodin

Abstract

Neural Radiance Fields (NeRFs) increase reconstruction detail for novel view synthesis and scene reconstruction, with applications ranging from large static scenes to dynamic human motion. However, the increased resolution and model-free nature of such neural fields come at the cost of high training times and excessive memory requirements. Recent advances improve the inference time by using complementary data structures yet these methods are ill-suited for dynamic scenes and often increase memory consumption. Little has been done to reduce the resources required at training time. We propose a method to exploit the redundancy of NeRF's sample-based computations by partially sharing evaluations across neighboring sample points. Our UNeRF architecture is inspired by the UNet, where spatial resolution is reduced in the middle of the network and information is shared between adjacent samples. Although this change violates the strict and conscious separation of view-dependent appearance and view-independent density estimation in the NeRF method, we show that it improves novel view synthesis. We also introduce an alternative subsampling strategy which shares computation while minimizing any violation of view invariance. UNeRF is a plug-in module for the original NeRF network. Our major contributions include reduction of the memory footprint, improved accuracy, and reduced amortized processing time both during training and inference. With only weak assumptions on locality, we achieve improved resource utilization on a variety of neural radiance fields tasks. We demonstrate applications to the novel view synthesis of static scenes as well as dynamic human shape and motion.
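
The U-shaped sharing idea described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify layer sizes, the pooling operator, or how features are merged, so the pairwise averaging, the repeat-based upsampling, the concatenating skip connection, and every name below (`linear`, `make_layer`, `pool_pairs`, `unpool_pairs`, `u_shaped_forward`) are assumptions chosen for illustration. It only shows how the middle of a per-sample network can run on half as many points when adjacent samples along a ray share features.

```python
import random

def make_layer(in_dim, out_dim, rng):
    """Create a random dense layer (hypothetical sizes, illustration only)."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [0.0] * out_dim
    return w, b

def linear(x, w, b):
    """Apply one dense layer followed by ReLU to a single feature vector."""
    return [max(0.0, sum(xi * wi for xi, wi in zip(x, row)) + bj)
            for row, bj in zip(w, b)]

def pool_pairs(feats):
    """Average the features of adjacent samples, halving the sample count.

    Assumes an even number of samples along the ray.
    """
    return [[(a + b) / 2.0 for a, b in zip(feats[i], feats[i + 1])]
            for i in range(0, len(feats), 2)]

def unpool_pairs(feats):
    """Repeat each pooled feature twice to restore the original sample count."""
    return [f for f in feats for _ in range(2)]

def u_shaped_forward(samples, enc, mid, dec):
    """Run a toy U-shaped network over the samples of one ray."""
    full = [linear(s, *enc) for s in samples]             # full resolution
    shared = [linear(f, *mid) for f in pool_pairs(full)]  # half as many evaluations
    up = unpool_pairs(shared)
    # Skip connection: concatenate per-sample and shared features before decoding.
    return [linear(f + u, *dec) for f, u in zip(full, up)]

rng = random.Random(0)
enc = make_layer(3, 8, rng)
mid = make_layer(8, 8, rng)
dec = make_layer(16, 4, rng)
ray = [[0.1 * i, 0.0, 1.0] for i in range(6)]  # six sample points on one ray
out = u_shaped_forward(ray, enc, mid, dec)
```

In this sketch the middle layer is evaluated three times for six samples, which is the source of the memory and time savings the abstract refers to; a real network would stack several such layers and pool more aggressively.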
