Everything is There in Latent Space: Attribute Editing and Attribute Style Manipulation by StyleGAN Latent Space Exploration

Paper title

Everything is There in Latent Space: Attribute Editing and Attribute Style Manipulation by StyleGAN Latent Space Exploration

Authors

Rishubh Parihar, Ankit Dhiman, Tejan Karmali, R. Venkatesh Babu

Abstract

Unconstrained image generation with high realism is now possible using recent Generative Adversarial Networks (GANs). However, it is quite challenging to generate images with a given set of attributes. Recent methods use style-based GAN models to perform image editing by leveraging the semantic hierarchy present in the layers of the generator. We present Few-shot Latent-based Attribute Manipulation and Editing (FLAME), a simple yet effective framework to perform highly controlled image editing by latent space manipulation. Specifically, we estimate linear directions in the latent space (of a pre-trained StyleGAN) that control semantic attributes in the generated image. In contrast to previous methods that rely on either large-scale attribute-labeled datasets or attribute classifiers, FLAME uses minimal supervision of a few curated image pairs to estimate disentangled edit directions. FLAME can perform both individual and sequential edits with high precision on a diverse set of images while preserving identity. Further, we propose a novel task of Attribute Style Manipulation to generate diverse styles for attributes such as eyeglasses and hair. We first encode a set of synthetic images of the same identity but having different attribute styles in the latent space to estimate an attribute style manifold. Sampling a new latent from this manifold will result in a new attribute style in the generated image. We propose a novel sampling method to sample latents from the manifold, enabling us to generate a diverse set of attribute styles beyond the styles present in the training set. FLAME can generate diverse attribute styles in a disentangled manner. We illustrate the superior performance of FLAME against previous image editing methods by extensive qualitative and quantitative comparisons. FLAME also generalizes well on multiple datasets such as cars and churches.
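
The abstract does not say how the linear edit directions are estimated from the image pairs. As a rough illustration only, the sketch below assumes one plausible scheme: average the latent differences over a few pairs, normalize the result, and move a latent code along it. The function names, the toy 8-dimensional latents, and the averaging step itself are hypothetical and are not taken from the paper.

```python
import math
import random


def estimate_direction(pairs):
    """Estimate a unit edit direction from (w_without, w_with) latent pairs.

    Assumption: the direction is the normalized mean of the pairwise
    latent differences. The paper's actual estimator may differ.
    """
    dim = len(pairs[0][0])
    mean_diff = [0.0] * dim
    for w_without, w_with in pairs:
        for i in range(dim):
            mean_diff[i] += (w_with[i] - w_without[i]) / len(pairs)
    norm = math.sqrt(sum(x * x for x in mean_diff))
    if norm == 0.0:
        raise ValueError("latent pairs are identical; no direction to estimate")
    return [x / norm for x in mean_diff]


def apply_edit(w, direction, strength):
    """Move latent code w along the edit direction by the given strength."""
    return [wi + strength * di for wi, di in zip(w, direction)]


def make_toy_pairs(num_pairs=5, dim=8, seed=0):
    """Build toy pairs whose true attribute direction is the first axis."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        w_without = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Shift along axis 0 by 2.0 and add a little noise on every axis.
        w_with = [
            w_without[i] + (2.0 if i == 0 else 0.0) + rng.gauss(0.0, 0.05)
            for i in range(dim)
        ]
        pairs.append((w_without, w_with))
    return pairs


toy_pairs = make_toy_pairs()
direction = estimate_direction(toy_pairs)
edited = apply_edit(toy_pairs[0][0], direction, strength=1.5)
```

In a real pipeline the latent codes would come from inverting images into the latent space of a pre-trained StyleGAN, and the edited code would be passed back through the generator. Both steps are outside the scope of this sketch.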
