Smooth Image-to-Image Translations with Latent Space Interpolations

Paper title

Smooth image-to-image translations with latent space interpolations

Authors

Liu, Yahui, Sangineto, Enver, Chen, Yajing, Bao, Linchao, Zhang, Haoxian, Sebe, Nicu, Lepri, Bruno, De Nadai, Marco

Abstract

Multi-domain image-to-image (I2I) translations can transform a source image according to the style of a target domain. One important, desired characteristic of these transformations is their graduality, which corresponds to a smooth change between the source and the target image when their respective latent-space representations are linearly interpolated. However, state-of-the-art methods usually perform poorly when evaluated using inter-domain interpolations, often producing abrupt changes in appearance or non-realistic intermediate images. In this paper, we argue that one of the main reasons behind this problem is the lack of sufficient inter-domain training data, and we propose two different regularization methods to alleviate this issue: a new shrinkage loss, which compacts the latent space, and a Mixup data-augmentation strategy, which flattens the style representations between domains. We also propose a new metric to quantitatively evaluate the degree of interpolation smoothness, an aspect that is not sufficiently covered by the existing I2I translation metrics. Using both our proposed metric and standard evaluation protocols, we show that our regularization techniques can improve the state-of-the-art multi-domain I2I translations by a large margin. Our code will be made publicly available upon the acceptance of this article.
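
The abstract refers to linearly interpolating latent-space representations and to a Mixup-style augmentation of style codes. The following is a minimal sketch of those two operations only, not the authors' implementation: the function names (`lerp`, `interpolation_path`, `mixup_style`), the use of plain Python lists as latent codes, and the uniform sampling of the mixing coefficient are all assumptions made for illustration. The shrinkage loss and the proposed smoothness metric are not sketched, because the abstract does not define them.

```python
import random


def lerp(z_src, z_tgt, alpha):
    """Linear interpolation of two latent codes.

    alpha = 0 returns z_src, alpha = 1 returns z_tgt.
    """
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z_src, z_tgt)]


def interpolation_path(z_src, z_tgt, steps):
    """Return steps + 1 evenly spaced codes from z_src to z_tgt.

    In an I2I model, each code would be decoded into one intermediate image.
    """
    return [lerp(z_src, z_tgt, i / steps) for i in range(steps + 1)]


def mixup_style(z_a, z_b, rng):
    """Mixup-style augmentation (assumed form): a random convex
    combination of two style codes, here with a uniform coefficient."""
    lam = rng.random()
    return lerp(z_a, z_b, lam), lam


# Hypothetical 3-dimensional style codes for a source and a target domain.
z_src = [0.0, 1.0, -2.0]
z_tgt = [4.0, 1.0, 2.0]

path = interpolation_path(z_src, z_tgt, 4)
mixed, lam = mixup_style(z_src, z_tgt, random.Random(0))
```

With these inputs, `path` holds five codes running from `z_src` to `z_tgt`, and `mixed` is a single code lying on the segment between them. A smooth translation, in the sense described in the abstract, would decode consecutive entries of `path` into images that change gradually.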
