Title
Denoising MCMC for Accelerating Diffusion-Based Generative Models
Authors
Abstract
Diffusion models are powerful generative models that synthesize data from noise by simulating the reverse of a diffusion process using score functions. The sampling process of diffusion models can be interpreted as solving the reverse stochastic differential equation (SDE) or the ordinary differential equation (ODE) of the diffusion process, which often requires up to thousands of discretization steps to generate a single image. This has sparked great interest in developing efficient integration techniques for reverse-S/ODEs. Here, we propose an orthogonal approach to accelerating score-based sampling: Denoising MCMC (DMCMC). DMCMC first uses MCMC to produce samples in the product space of data and variance (or diffusion time). Then, a reverse-S/ODE integrator is used to denoise the MCMC samples. Since MCMC traverses close to the data manifold, the computation cost of producing a clean sample for DMCMC is much less than that of producing a clean sample from noise. To verify the proposed concept, we show that Denoising Langevin Gibbs (DLG), an instance of DMCMC, successfully accelerates all six reverse-S/ODE integrators considered in this work on the tasks of CIFAR10 and CelebA-HQ-256 image generation. Notably, combined with the integrators of Karras et al. (2022) and the pre-trained score models of Song et al. (2021b), DLG achieves SOTA results. In the limited number of score function evaluation (NFE) settings on CIFAR10, we have $3.86$ FID with $\approx 10$ NFE and $2.63$ FID with $\approx 20$ NFE. On CelebA-HQ-256, we have $6.99$ FID with $\approx 160$ NFE, which beats the current best record of Kim et al. (2022) among score-based models, $7.16$ FID with $4000$ NFE. Code: https://github.com/1202kbs/DMCMC
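The two-phase idea described in the abstract (MCMC that stays near the data manifold at a moderate noise level, followed by a short reverse-ODE integration that denoises the chain's samples) can be sketched on a toy 1-D Gaussian, where the noise-perturbed score is known in closed form. This is only an illustrative sketch under that toy assumption; the function names, step sizes, and fixed noise level below are not the paper's actual DLG implementation, which learns the score with a network and also samples the variance dimension.

```python
import numpy as np

def score(x, sigma):
    # Toy assumption: data ~ N(0, 1), so data smoothed with noise of std
    # sigma is N(0, 1 + sigma^2) and its score is -x / (1 + sigma^2).
    return -x / (1.0 + sigma ** 2)

def langevin_step(x, sigma, step_size, rng):
    # One Langevin MCMC update at noise level sigma: the chain moves
    # near the smoothed data manifold instead of starting from pure noise.
    noise = rng.standard_normal(x.shape)
    return x + step_size * score(x, sigma) + np.sqrt(2.0 * step_size) * noise

def denoise_ode(x, sigma_start, n_steps=20):
    # Euler integration of the probability-flow ODE in the sigma
    # parametrization, dx/dsigma = -sigma * score(x, sigma),
    # from sigma_start down to ~0. Because sigma_start is small,
    # few steps (NFE) are needed compared to starting from pure noise.
    sigmas = np.linspace(sigma_start, 1e-3, n_steps + 1)
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        dx = -s_cur * score(x, s_cur)
        x = x + dx * (s_next - s_cur)
    return x

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)      # chain state: 1000 toy 1-D "samples"
sigma = 0.5                        # moderate noise level, far from pure noise
for _ in range(50):                # phase 1: MCMC near the data manifold
    x = langevin_step(x, sigma, step_size=0.01, rng=rng)
x_clean = denoise_ode(x, sigma)    # phase 2: short reverse-ODE denoising
```

In this toy setup the clean data has unit variance, so after denoising, `x_clean` should again have variance close to 1: the ODE contracts the extra variance `sigma**2` that the Langevin phase equilibrated around. The key point the sketch illustrates is that the denoising integration starts from a small `sigma`, so it needs far fewer score evaluations than integrating all the way from pure noise.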