Paper Title

Semantically Multi-modal Image Synthesis

Paper Authors

Zhen Zhu, Zhiliang Xu, Ansheng You, Xiang Bai

Paper Abstract

In this paper, we focus on the semantically multi-modal image synthesis (SMIS) task, namely, generating multi-modal images at the semantic level. Previous work seeks to use multiple class-specific generators, constraining its usage to datasets with a small number of classes. We instead propose a novel Group Decreasing Network (GroupDNet) that leverages group convolutions in the generator and progressively decreases the number of groups of the convolutions in the decoder. Consequently, GroupDNet gains much more controllability in translating semantic labels to natural images and produces plausible, high-quality results on datasets with many classes. Experiments on several challenging datasets demonstrate the superiority of GroupDNet in performing the SMIS task. We also show that GroupDNet is capable of a wide range of interesting synthesis applications. Code and models are available at: https://github.com/Seanseattle/SMIS.
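The abstract describes the key architectural idea: the decoder starts from grouped (class-wise) convolutions and progressively reduces the number of groups until the per-class features are fused into a single image. Below is a minimal, hypothetical PyTorch sketch of such a group-decreasing decoder; the class name, channel widths, and group schedule (8, 4, 2, 1) are illustrative assumptions and not the authors' implementation (the released code at the URL above is the reference).

```python
import torch
import torch.nn as nn

class GroupDecreasingDecoder(nn.Module):
    """Illustrative sketch of a group-decreasing decoder: a stack of
    upsampling convolution blocks whose number of groups shrinks layer
    by layer, gradually fusing per-group features into one image."""

    def __init__(self, in_channels=512, out_channels=3, group_schedule=(8, 4, 2, 1)):
        super().__init__()
        layers = []
        channels = in_channels
        for groups in group_schedule:
            next_channels = max(channels // 2, 64)
            layers += [
                nn.Upsample(scale_factor=2, mode="nearest"),
                # group convolution: channel counts must be divisible by `groups`
                nn.Conv2d(channels, next_channels, kernel_size=3, padding=1, groups=groups),
                nn.BatchNorm2d(next_channels),
                nn.ReLU(inplace=True),
            ]
            channels = next_channels
        # final ungrouped convolution maps the fused features to an RGB image
        layers.append(nn.Conv2d(channels, out_channels, kernel_size=3, padding=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return torch.tanh(self.net(x))

# usage: latent feature map -> image
decoder = GroupDecreasingDecoder()
z = torch.randn(1, 512, 8, 8)
img = decoder(z)
print(img.shape)  # (1, 3, 128, 128)
```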
