Paper Title
DGFont++: Robust Deformable Generative Networks for Unsupervised Font Generation
Paper Authors
Abstract
Automatic font generation without human experts is a practical and significant problem, especially for languages that consist of a large number of characters. Existing font generation methods are typically supervised: they require large amounts of paired data, which are labor-intensive and expensive to collect. In contrast, common unsupervised image-to-image translation methods are not applicable to font generation, as they usually define style as a set of textures and colors. In this work, we propose a robust deformable generative network for unsupervised font generation (abbreviated as DGFont++). We introduce a feature deformation skip connection (FDSC) to learn local patterns and geometric transformations between fonts. The FDSC predicts pairs of displacement maps and employs the predicted maps to apply deformable convolution to the low-level content feature maps. The output of the FDSC is fed into a mixer to generate the final results. Moreover, we introduce contrastive self-supervised learning to learn a robust style representation of fonts by modeling the similarities and dissimilarities between them. To distinguish different styles, we train our model with a multi-task discriminator, which ensures that each style can be discriminated independently. In addition to the adversarial loss, two reconstruction losses are adopted to constrain the domain-invariant characteristics between the generated images and the content images. Taking advantage of the FDSC and the adopted loss functions, our model is able to maintain spatial information and generate high-quality character images in an unsupervised manner. Experiments demonstrate that our model generates character images of higher quality than state-of-the-art methods.
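To make the FDSC idea concrete, below is a minimal PyTorch sketch of one plausible realization: a convolution predicts per-position (x, y) displacement pairs, one pair per kernel sampling point, which then steer a deformable convolution (here `torchvision.ops.DeformConv2d`) over the low-level content feature map. The module name, channel sizes, and the choice to predict offsets from the content features alone are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class FDSCSketch(nn.Module):
    """Illustrative feature deformation skip connection (assumed layout).

    Predicts 2 * k * k displacement values per spatial position (an (x, y)
    pair for each of the k x k kernel sampling points) and uses them to
    deformably convolve the low-level content feature map.
    """

    def __init__(self, channels: int = 64, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # Offset predictor: one (dx, dy) pair per kernel sampling location.
        self.offset_pred = nn.Conv2d(
            channels, 2 * kernel_size * kernel_size, kernel_size, padding=pad
        )
        self.deform_conv = DeformConv2d(
            channels, channels, kernel_size, padding=pad
        )

    def forward(self, content_feat: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_pred(content_feat)        # (B, 2*k*k, H, W)
        return self.deform_conv(content_feat, offsets)  # deformed features


# The deformed features would then be passed to the mixer as a skip input:
fdsc = FDSCSketch(channels=64)
low_level = torch.randn(2, 64, 32, 32)  # dummy low-level content features
skip = fdsc(low_level)                  # shape: (2, 64, 32, 32)
```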
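The contrastive style objective can likewise be sketched as a standard InfoNCE loss: style embeddings of characters rendered in the same font are pulled together, while embeddings from other fonts are pushed apart. The function below assumes precomputed style embeddings and a temperature hyperparameter; both are illustrative, as the abstract does not fix these details.

```python
import torch
import torch.nn.functional as F


def style_contrastive_loss(
    anchor: torch.Tensor,       # (B, D) style embedding of a character in font f
    positive: torch.Tensor,     # (B, D) embedding of another character in font f
    negatives: torch.Tensor,    # (B, K, D) embeddings of characters in other fonts
    temperature: float = 0.07,  # assumed hyperparameter
) -> torch.Tensor:
    """InfoNCE-style loss: same-font pairs are positives, other fonts negatives."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_logit = (anchor * positive).sum(dim=-1, keepdim=True)   # (B, 1)
    neg_logits = torch.einsum("bd,bkd->bk", anchor, negatives)  # (B, K)
    logits = torch.cat([pos_logit, neg_logits], dim=1) / temperature

    # The positive sits at index 0 of every row.
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```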
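Finally, a multi-task discriminator is commonly implemented as a shared backbone with one real/fake logit per font, where only the logit of the font under consideration contributes to the loss, so each style is discriminated independently. The sketch below follows that common pattern; the backbone depth and channel counts are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn


class MultiTaskDiscriminatorSketch(nn.Module):
    """Shared backbone, one real/fake logit per font style (assumed layout)."""

    def __init__(self, num_fonts: int, in_channels: int = 1, width: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, width, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(width, width * 2, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.heads = nn.Linear(width * 2, num_fonts)  # one logit per font

    def forward(self, image: torch.Tensor, font_idx: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(image).flatten(1)  # (B, width*2)
        logits = self.heads(feat)               # (B, num_fonts)
        # Pick only the logit of each sample's target font.
        return logits[torch.arange(image.size(0)), font_idx]
```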