LT-GAN: Self-Supervised GAN with Latent Transformation Detection

Paper Title

LT-GAN: Self-Supervised GAN with Latent Transformation Detection

Authors

Parth Patel, Nupur Kumari, Mayank Singh, Balaji Krishnamurthy

Abstract

Generative Adversarial Networks (GANs) coupled with self-supervised tasks have shown promising results in unconditional and semi-supervised image generation. We propose a self-supervised approach (LT-GAN) to improve the generation quality and diversity of images by estimating the GAN-induced transformation (i.e., the transformation induced in the generated images by perturbing the latent space of the generator). Specifically, given two pairs of images, where each pair comprises a generated image and its transformed version, the self-supervision task aims to identify whether the latent transformation applied in the given pair is the same as that of the other pair. Hence, this auxiliary loss encourages the generator to produce images that are distinguishable by the auxiliary network, which in turn promotes the synthesis of semantically consistent images with respect to latent transformations. We show the efficacy of this pretext task by improving the image generation quality in terms of FID on state-of-the-art models for both conditional and unconditional settings on the CIFAR-10, CelebA-HQ and ImageNet datasets. Moreover, we empirically show that LT-GAN helps improve controlled image editing for CelebA-HQ and ImageNet over baseline models. We experimentally demonstrate that our proposed LT self-supervision task can be effectively combined with other state-of-the-art training techniques for added benefits. Consequently, we show that our approach achieves a new state-of-the-art FID score of 9.8 on conditional CIFAR-10 image generation.
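
The pair construction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the generator, the perturbation distribution, or the auxiliary loss, so everything below is an assumption made for illustration. In particular, `toy_generator` is a hypothetical linear stand-in for a real GAN generator, the Gaussian perturbation with scale 0.5 is an arbitrary choice, and all function names are invented here. The sketch only shows how two (image, transformed image) pairs and a same/different label might be assembled for the pretext task.

```python
import random


def toy_generator(z, weights):
    """Hypothetical stand-in for a GAN generator: a fixed linear map of z."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]


def sample_latent(dim, rng, scale=1.0):
    """Draw a Gaussian vector; used for latent codes and for perturbations."""
    return [rng.gauss(0.0, scale) for _ in range(dim)]


def make_pair(z, eps, weights):
    """Return (G(z), G(z + eps)): a generated image and its transformed version."""
    shifted = [zi + ei for zi, ei in zip(z, eps)]
    return toy_generator(z, weights), toy_generator(shifted, weights)


def make_lt_example(weights, latent_dim, rng, same):
    """Build two pairs and a label: 1 if both share the latent perturbation."""
    z1 = sample_latent(latent_dim, rng)
    z2 = sample_latent(latent_dim, rng)
    # Assumed perturbation scheme: small Gaussian shift in latent space.
    eps1 = sample_latent(latent_dim, rng, scale=0.5)
    eps2 = eps1 if same else sample_latent(latent_dim, rng, scale=0.5)
    pair_a = make_pair(z1, eps1, weights)
    pair_b = make_pair(z2, eps2, weights)
    return pair_a, pair_b, int(same)


def build_batch(batch_size=8, latent_dim=4, image_dim=6, seed=0):
    """Alternate 'same' and 'different' examples for the detection task."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
               for _ in range(image_dim)]
    return [make_lt_example(weights, latent_dim, rng, same=(i % 2 == 0))
            for i in range(batch_size)]


if __name__ == "__main__":
    for pair_a, pair_b, label in build_batch(batch_size=4):
        print(label, [round(v, 3) for v in pair_a[0]])
```

In a full training setup, an auxiliary network would receive both pairs and predict the label, and its loss would be added to the GAN objective; a binary cross-entropy term is one plausible choice, though the abstract does not say which loss is used. With the linear toy generator, the image-space change is identical whenever the latent perturbation is shared, which makes the task trivial here. A real generator is nonlinear, so the same latent shift can produce different image changes at different latent codes, and that is where the auxiliary signal becomes informative.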
