Paper Title

Face Identity Disentanglement via Latent Space Mapping

Paper Authors

Yotam Nitzan, Amit Bermano, Yangyan Li, Daniel Cohen-Or

Paper Abstract

Learning disentangled representations of data is a fundamental problem in artificial intelligence. Specifically, disentangled latent representations allow generative models to control and compose the disentangled factors in the synthesis process. Current methods, however, require extensive supervision and training, or instead, noticeably compromise quality. In this paper, we present a method that learns how to represent data in a disentangled way, with minimal supervision, manifested solely using available pre-trained networks. Our key insight is to decouple the processes of disentanglement and synthesis, by employing a leading pre-trained unconditional image generator, such as StyleGAN. By learning to map into its latent space, we leverage both its state-of-the-art quality, and its rich and expressive latent space, without the burden of training it. We demonstrate our approach on the complex and high dimensional domain of human heads. We evaluate our method qualitatively and quantitatively, and exhibit its success with de-identification operations and with temporal identity coherency in image sequences. Through extensive experimentation, we show that our method successfully disentangles identity from other facial attributes, surpassing existing methods, even though they require more training and supervision.
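
The sketch below is a minimal PyTorch illustration of the idea the abstract describes: a pre-trained unconditional generator (standing in for StyleGAN) is kept frozen, and only a mapping network is trained to combine an identity code and an attribute code into a latent vector for that generator. The module names, dimensions, toy generator, and random codes are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch: frozen pre-trained generator + trainable latent mapper.
# All shapes and modules here are assumptions for illustration only.

import torch
import torch.nn as nn


class FrozenGenerator(nn.Module):
    """Stand-in for a pre-trained generator such as StyleGAN (weights frozen)."""

    def __init__(self, latent_dim: int = 512, image_size: int = 64):
        super().__init__()
        self.image_size = image_size
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 3 * image_size * image_size),
            nn.Tanh(),
        )
        for p in self.parameters():  # freeze: only the mapper is trained
            p.requires_grad = False

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        x = self.net(w)
        return x.view(-1, 3, self.image_size, self.image_size)


class LatentMapper(nn.Module):
    """Maps (identity code, attribute code) -> latent code of the frozen generator."""

    def __init__(self, id_dim: int = 512, attr_dim: int = 512, latent_dim: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(id_dim + attr_dim, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, latent_dim),
        )

    def forward(self, z_id: torch.Tensor, z_attr: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([z_id, z_attr], dim=1))


if __name__ == "__main__":
    generator = FrozenGenerator()
    mapper = LatentMapper()

    # Identity and attribute codes would come from pre-trained encoders
    # (e.g. a face-recognition network for identity); random here for illustration.
    z_id = torch.randn(4, 512)
    z_attr = torch.randn(4, 512)

    w = mapper(z_id, z_attr)   # latent code fed to the frozen generator
    images = generator(w)
    print(images.shape)        # torch.Size([4, 3, 64, 64])
```

Because the generator is never updated, training cost and supervision are limited to the mapper, which is the decoupling of disentanglement from synthesis that the abstract emphasizes.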
