Paper Title

Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models

Authors

Adham Elarabawy, Harish Kamath, Samuel Denton

Abstract

With the rise of large, publicly-available text-to-image diffusion models, text-guided real image editing has garnered much research attention recently. Existing methods tend to either rely on some form of per-instance or per-task fine-tuning and optimization, require multiple novel views, or they inherently entangle preservation of real image identity, semantic coherence, and faithfulness to text guidance. In this paper, we propose an optimization-free and zero fine-tuning framework that applies complex and non-rigid edits to a single real image via a text prompt, avoiding all the pitfalls described above. Using widely-available generic pre-trained text-to-image diffusion models, we demonstrate the ability to modulate pose, scene, background, style, color, and even racial identity in an extremely flexible manner through a single target text detailing the desired edit. Furthermore, our method, which we name $\textit{Direct Inversion}$, proposes multiple intuitively configurable hyperparameters to allow for a wide range of types and extents of real image edits. We prove our method's efficacy in producing high-quality, diverse, semantically coherent, and faithful real image edits through applying it on a variety of inputs for a multitude of tasks. We also formalize our method in well-established theory, detail future experiments for further improvement, and compare against state-of-the-art attempts.
