Paper title
EVA3D: Compositional 3D Human Generation from 2D Image Collections
Paper authors
Paper abstract
Inverse graphics aims to recover 3D models from 2D observations. Utilizing differentiable rendering, recent 3D-aware generative models have shown impressive results in rigid object generation using 2D images. However, it remains challenging to generate articulated objects, such as human bodies, due to their complexity and diversity in poses and appearances. In this work, we propose EVA3D, an unconditional 3D human generative model learned from 2D image collections only. EVA3D can sample 3D humans with detailed geometry and render high-quality images (up to 512x256) without bells and whistles (e.g., super resolution). At the core of EVA3D is a compositional human NeRF representation, which divides the human body into local parts. Each part is represented by an individual volume. This compositional representation enables 1) inherent human priors, 2) adaptive allocation of network parameters, and 3) efficient training and rendering. Moreover, to accommodate the characteristics of sparse 2D human image collections (e.g., imbalanced pose distribution), we propose a pose-guided sampling strategy for better GAN learning. Extensive experiments validate that EVA3D achieves state-of-the-art 3D human generation performance regarding both geometry and texture quality. Notably, EVA3D demonstrates great potential and scalability to "inverse-graphics" diverse human bodies with a clean framework.
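To make the compositional idea concrete, the following is a minimal sketch of querying a body that is split into local parts, each with its own volume. It is an illustration only, not the authors' implementation: the box layout in `PART_BOXES`, the tiny random networks from `make_part_field`, the function `query_composite`, and the choice to average outputs where parts overlap are all assumptions introduced here, since the abstract does not specify these details.

```python
import numpy as np

# Hypothetical part layout: axis-aligned boxes (min corner, max corner).
# The abstract does not say how parts are defined; these numbers are made up.
PART_BOXES = {
    "head":  (np.array([-0.15, 0.55, -0.15]), np.array([0.15, 0.90, 0.15])),
    "torso": (np.array([-0.30, 0.00, -0.20]), np.array([0.30, 0.60, 0.20])),
}


def make_part_field(seed, hidden=16):
    """Tiny random MLP standing in for one part's volume: xyz -> (density, rgb)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(size=(3, hidden))
    w2 = rng.normal(size=(hidden, 4))

    def field(x_local):
        h = np.tanh(x_local @ w1)
        out = h @ w2
        density = np.log1p(np.exp(out[:, :1]))   # softplus: non-negative density
        rgb = 1.0 / (1.0 + np.exp(-out[:, 1:]))  # sigmoid: colour in [0, 1]
        return density, rgb

    return field


# One independent field per part, so capacity can differ from part to part.
PART_FIELDS = {name: make_part_field(i) for i, name in enumerate(PART_BOXES)}


def query_composite(points):
    """Query each part whose box contains a point; average where boxes overlap."""
    n = points.shape[0]
    density = np.zeros((n, 1))
    rgb = np.zeros((n, 3))
    hits = np.zeros((n, 1))
    for name, (lo, hi) in PART_BOXES.items():
        inside = np.all((points >= lo) & (points <= hi), axis=1)
        if not inside.any():
            continue
        # Map the contained points to the box's local [-1, 1] coordinates.
        x_local = 2.0 * (points[inside] - lo) / (hi - lo) - 1.0
        d, c = PART_FIELDS[name](x_local)
        density[inside] += d
        rgb[inside] += c
        hits[inside] += 1.0
    covered = hits[:, 0] > 0
    density[covered] /= hits[covered]
    rgb[covered] /= hits[covered]
    return density, rgb, covered


if __name__ == "__main__":
    pts = np.array([[0.0, 0.3, 0.0], [0.0, 0.58, 0.0], [2.0, 2.0, 2.0]])
    density, rgb, covered = query_composite(pts)
    print(covered)
```

In this sketch, points outside every box are never passed to a network, which is one plausible reading of how a part-based layout could save computation during training and rendering.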