Fast Sampling of Diffusion Models via Operator Learning

Paper Title

Fast Sampling of Diffusion Models via Operator Learning

Authors

Hongkai Zheng, Weili Nie, Arash Vahdat, Kamyar Azizzadenesheli, Anima Anandkumar

Abstract

Diffusion models have found widespread adoption in various areas. However, their sampling process is slow because it requires hundreds to thousands of network evaluations to emulate a continuous process defined by differential equations. In this work, we use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models. Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method that generates images with only one model forward pass. We propose diffusion model sampling with neural operator (DSNO) that maps the initial condition, i.e., Gaussian distribution, to the continuous-time solution trajectory of the reverse diffusion process. To model the temporal correlations along the trajectory, we introduce temporal convolution layers that are parameterized in the Fourier space into the given diffusion model backbone. We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
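To make the idea of a temporal convolution parameterized in Fourier space more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `temporal_spectral_conv`, the `(T, C)` layout, and the mode-truncation scheme are assumptions chosen for illustration, and the spatial dimensions of a real image backbone are omitted. The sketch transforms features along the time axis, mixes channels with one complex weight matrix per retained frequency, and transforms back.

```python
import numpy as np

def temporal_spectral_conv(x, weights):
    """Hypothetical Fourier-space convolution along the time axis.

    x:       real array of shape (T, C_in), features at T trajectory time points.
    weights: complex array of shape (K, C_in, C_out), one channel-mixing
             matrix per retained frequency, with K <= T // 2 + 1.
    Returns a real array of shape (T, C_out).
    """
    T = x.shape[0]
    K, _, c_out = weights.shape

    # Real FFT along time: shape (T // 2 + 1, C_in).
    x_hat = np.fft.rfft(x, axis=0)

    # Keep only the K lowest frequencies; higher modes are set to zero.
    out_hat = np.zeros((x_hat.shape[0], c_out), dtype=complex)
    out_hat[:K] = np.einsum("kc,kcd->kd", x_hat[:K], weights)

    # Back to the time domain, restoring the original length T.
    return np.fft.irfft(out_hat, n=T, axis=0)
```

Because the weights act on frequencies rather than on individual time steps, every output time point depends on all input time points, which is one way to couple the points of a trajectory produced in a single forward pass. With identity weights on all modes the layer returns its input unchanged; with only the zero-frequency mode retained it returns the temporal mean at every time point.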
