Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation

Paper Title

Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation

Authors

Raul Gomez, Yahui Liu, Marco De Nadai, Dimosthenis Karatzas, Bruno Lepri, Nicu Sebe

Abstract

Image-to-image translation aims to learn a mapping that transforms an image from one visual domain to another. Recent works assume that image descriptors can be disentangled into a domain-invariant content representation and a domain-specific style representation. Thus, translation models seek to preserve the content of source images while changing the style to that of a target visual domain. However, synthesizing new images is extremely challenging, especially in multi-domain translation, as the network has to compose content and style to generate reliable and diverse images in multiple domains. In this paper, we propose the use of an image retrieval system to assist the image-to-image translation task. First, we train an image-to-image translation model to map images to multiple domains. Then, we train an image retrieval model using real and generated images to find images that are similar to a query image in content but belong to a different domain. Finally, we exploit the image retrieval system to fine-tune the image-to-image translation model and generate higher-quality images. Our experiments show the effectiveness of the proposed solution and highlight the contribution of the retrieval network, which can benefit from additional unlabeled data and help image-to-image translation models when data is scarce.
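
The retrieval step described in the abstract (finding images similar to a query in content but from a different domain) can be illustrated with a minimal sketch. This is not the authors' retrieval network: it assumes content embeddings have already been computed by some model, and it ranks candidates by plain cosine similarity. The function names, the gallery layout, and the domain labels ("photo", "sketch") are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical gallery entry: (image_id, domain_label, content_embedding).
GalleryItem = Tuple[str, str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_cross_domain(
    query_embedding: Sequence[float],
    query_domain: str,
    gallery: List[GalleryItem],
    k: int = 1,
) -> List[str]:
    """Return ids of the k gallery images most similar to the query,
    considering only images whose domain differs from the query's."""
    scored = [
        (cosine_similarity(query_embedding, embedding), image_id)
        for image_id, domain, embedding in gallery
        if domain != query_domain  # keep cross-domain candidates only
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Toy data (assumed, not from the paper): 3-dimensional content embeddings.
gallery = [
    ("photo_a", "photo", [0.9, 0.1, 0.0]),
    ("sketch_a", "sketch", [0.8, 0.2, 0.1]),
    ("sketch_b", "sketch", [0.0, 0.1, 0.9]),
]

top = retrieve_cross_domain([1.0, 0.0, 0.0], "photo", gallery, k=1)
print(top)
```

With the toy gallery above, the query from the "photo" domain is matched only against the two "sketch" entries, and the one whose embedding points in nearly the same direction is returned. In the setting the abstract describes, the embeddings would come from a retrieval model trained on real and generated images rather than being fixed by hand.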
