Paper Title

Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

Authors

Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo

Abstract

This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars represented as neural radiance fields. A significant challenge in generating such avatars is that the memory and processing costs in 3D are prohibitive for producing the rich details required for high-quality avatars. To tackle this problem we propose the roll-out diffusion network (Rodin), which represents a neural radiance field as multiple 2D feature maps and rolls out these maps into a single 2D feature plane within which we perform 3D-aware diffusion. The Rodin model brings the much-needed computational efficiency while preserving the integrity of diffusion in 3D by using 3D-aware convolution that attends to projected features in the 2D feature plane according to their original relationship in 3D. We also use latent conditioning to orchestrate the feature generation for global coherence, leading to high-fidelity avatars and enabling their semantic editing based on text prompts. Finally, we use hierarchical synthesis to further enhance details. The 3D avatars generated by our model compare favorably with those produced by existing generative techniques. We can generate highly detailed avatars with realistic hairstyles and facial hair like beards. We also demonstrate 3D avatar generation from image or text as well as text-guided editability.
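The roll-out step mentioned in the abstract can be illustrated with a toy sketch. The abstract does not state how many feature maps are used or how they are arranged, so the following assumes three equally sized axis-aligned planes placed side by side along the width; the function names `roll_out` and `roll_in` are hypothetical, and each cell stands in for a feature vector. It is meant only to show the data layout, not the authors' implementation.

```python
def roll_out(planes):
    """Lay equally sized 2D feature planes side by side, row by row."""
    height = len(planes[0])
    if any(len(p) != height for p in planes):
        raise ValueError("all planes must have the same height")
    # Row r of the result is row r of every plane, joined left to right.
    return [[cell for p in planes for cell in p[r]] for r in range(height)]


def roll_in(rolled, num_planes):
    """Inverse of roll_out: split each row back into num_planes equal slices."""
    width = len(rolled[0]) // num_planes
    return [[row[i * width:(i + 1) * width] for row in rolled]
            for i in range(num_planes)]


# Three toy 2x2 planes (assumed layout); each string stands in for a feature vector.
xy = [["xy00", "xy01"], ["xy10", "xy11"]]
yz = [["yz00", "yz01"], ["yz10", "yz11"]]
xz = [["xz00", "xz01"], ["xz10", "xz11"]]

rolled = roll_out([xy, yz, xz])   # one 2x6 plane
recovered = roll_in(rolled, 3)    # back to three 2x2 planes
```

Because the roll-out is a pure rearrangement, the separate planes can be recovered exactly, which is why a 2D network can operate on the single plane without losing information about the original maps.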
