Paper Title
Efficient Semantic Image Synthesis via Class-Adaptive Normalization
Paper Authors
Paper Abstract
Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis \cite{park2019semantic}. It modulates the normalized activations with spatially-varying transformations learned from semantic layouts, to prevent the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of the advantages inside the box is still needed to help reduce the significant computation and parameter overhead introduced by this novel structure. In this paper, from a return-on-investment point of view, we conduct an in-depth analysis of the effectiveness of this spatially-adaptive normalization and observe that its modulation parameters benefit more from semantic awareness than from spatial adaptiveness, especially for high-resolution input masks. Inspired by this observation, we propose class-adaptive normalization (CLADE), a lightweight but equally effective variant that is adaptive only to semantic class. To further improve spatial adaptiveness, we introduce an intra-class positional map encoding calculated from semantic layouts to modulate the normalization parameters of CLADE, yielding a truly spatially-adaptive variant, namely CLADE-ICPE. Through extensive experiments on multiple challenging datasets, we demonstrate that the proposed CLADE can be generalized to different SPADE-based methods while achieving generation quality comparable to SPADE, but it is much more efficient, with fewer extra parameters and lower computational cost. The code and pretrained models are available at \url{https://github.com/tzt101/CLADE.git}.
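To make the core idea concrete, the following is a minimal NumPy sketch of class-adaptive normalization as described in the abstract: features are normalized channel-wise, and the modulation parameters `gamma`/`beta` are simply looked up per semantic class from the input mask, rather than predicted per pixel by convolutions as in SPADE. The function name and shapes are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def clade_norm(x, mask, gamma, beta, eps=1e-5):
    """Illustrative sketch of class-adaptive normalization (CLADE).

    x:     (N, C, H, W) feature maps
    mask:  (N, H, W) integer semantic-label map
    gamma, beta: (num_classes, C) learnable per-class modulation parameters
    """
    # Channel-wise normalization (batch-norm style statistics over N, H, W).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_norm = (x - mean) / np.sqrt(var + eps)

    # Look up modulation parameters by class label: unlike SPADE, gamma and
    # beta vary only with the semantic class, not freely at every position.
    g = gamma[mask].transpose(0, 3, 1, 2)  # (N, H, W, C) -> (N, C, H, W)
    b = beta[mask].transpose(0, 3, 1, 2)
    return g * x_norm + b
```

Because the per-class parameters replace SPADE's mask-conditioned convolutions, the modulation cost reduces to a table lookup, which is where the parameter and computation savings claimed in the abstract come from.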