Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches

Paper Title

Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches

Authors

Shuai Yang, Zhangyang Wang, Jiaying Liu, Zongming Guo

Abstract

Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by human-drawn sketches. Because sketches are difficult to collect, previous methods mainly train on edge maps instead of sketches (we refer to these as edge-based models). However, sketches differ greatly in structure from edge maps, which causes edge-based models to fail. Moreover, sketches vary widely across users, demanding even higher generalizability and robustness from the editing model. In this paper, we propose Deep Plastic Surgery, a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs. We present a sketch refinement strategy, inspired by the coarse-to-fine drawing process of artists, which we show helps our model adapt well to casual and varied sketches without the need for real sketch training data. Our model further provides a refinement-level control parameter that lets users flexibly define how "reliable" the input sketch should be considered for the final output, balancing sketch faithfulness against output verisimilitude (the two goals may conflict if the input sketch is drawn poorly). To achieve multi-level refinement, we introduce a style-based module for level conditioning, which allows adaptive feature representations for different levels in a single network. Extensive experimental results demonstrate the superiority of our approach over state-of-the-art methods in improving the visual quality and user controllability of image editing.
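The abstract does not spell out the form of the style-based module for level conditioning. As a rough illustration only, the following sketch assumes an AdaIN-like design in which a scalar refinement level is mapped linearly to a per-channel scale and shift that are applied to normalized features. The function names, the [0, 1] range of the level, and the linear mapping are all assumptions made for this example and are not taken from the paper.

```python
import math


def normalize(channel, eps=1e-5):
    """Zero-mean, (approximately) unit-variance normalization of one channel."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return [(x - mean) / math.sqrt(var + eps) for x in channel]


def level_modulate(features, level, w_scale, b_scale, w_shift, b_shift):
    """Modulate each channel with a scale and shift that depend on `level`.

    features: list of channels, each a flat list of floats.
    level:    refinement level, assumed here to lie in [0, 1].
    w_*, b_*: per-channel parameters of a linear map from the level to
              (scale, shift); in a trained network these would be learned.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")
    out = []
    for c, channel in enumerate(features):
        scale = w_scale[c] * level + b_scale[c]
        shift = w_shift[c] * level + b_shift[c]
        out.append([scale * x + shift for x in normalize(channel)])
    return out


if __name__ == "__main__":
    # Two toy channels and hand-picked (not learned) parameters.
    features = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
    params = dict(w_scale=[1.0, 0.5], b_scale=[1.0, 1.0],
                  w_shift=[0.0, 2.0], b_shift=[0.0, 0.0])
    for level in (0.0, 0.5, 1.0):
        print(level, level_modulate(features, level, **params))
```

Because the same weights serve every level and only the scale and shift change, one network can produce different feature statistics for different refinement levels, which is the property the abstract attributes to the module.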
