Paper Title
Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler
Paper Authors
Paper Abstract
Due to the intractable partition function, training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo (MCMC) sampling to approximate the gradient of the Kullback-Leibler divergence between the data and model distributions. However, sampling from an EBM is non-trivial because of the difficulty of mixing between modes. In this paper, we propose to learn a variational auto-encoder (VAE) to initialize finite-step MCMC, such as Langevin dynamics derived from the energy function, for efficient amortized sampling of the EBM. With these amortized MCMC samples, the EBM can be trained by maximum likelihood, following an "analysis by synthesis" scheme, while the VAE learns from the same MCMC samples via variational Bayes. We call this joint training algorithm variational MCMC teaching, in which the VAE chases the EBM toward the data distribution. We interpret the learning algorithm as a dynamic alternating projection in the context of information geometry. Our proposed model can generate samples comparable to those of GANs and EBMs. Additionally, we demonstrate that our model can learn effective probabilistic distributions for supervised conditional learning tasks.
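To make the training loop concrete, below is a minimal sketch (not the authors' released code) of one variational-MCMC-teaching step in PyTorch. It assumes toy MLP networks on low-dimensional data; the network sizes, step size `STEP`, and number of Langevin steps `K` are illustrative placeholders, not values from the paper.

```python
# Minimal sketch of one variational MCMC teaching step.
# Assumptions: 2-D data, small MLPs; all hyperparameters are illustrative.
import torch
import torch.nn as nn

D, ZDIM, K, STEP = 2, 2, 20, 0.1  # data dim, latent dim, Langevin steps, step size

energy = nn.Sequential(nn.Linear(D, 64), nn.SiLU(), nn.Linear(64, 1))        # EBM energy E(x)
enc = nn.Sequential(nn.Linear(D, 64), nn.SiLU(), nn.Linear(64, 2 * ZDIM))    # VAE encoder -> (mu, logvar)
dec = nn.Sequential(nn.Linear(ZDIM, 64), nn.SiLU(), nn.Linear(64, D))        # VAE decoder / generator

opt_e = torch.optim.Adam(energy.parameters(), lr=1e-4)
opt_v = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-4)

def langevin(x):
    """Finite-step Langevin dynamics derived from the energy function."""
    x = x.clone().detach().requires_grad_(True)
    for _ in range(K):
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - 0.5 * STEP ** 2 * grad + STEP * torch.randn_like(x)
        x = x.detach().requires_grad_(True)
    return x.detach()

def train_step(x_data):
    # 1) The VAE acts as an amortized sampler: its ancestral samples
    #    initialize the short MCMC chains.
    x_init = dec(torch.randn(x_data.size(0), ZDIM))
    x_mcmc = langevin(x_init)

    # 2) EBM update: approximate maximum-likelihood gradient, pushing
    #    energy down on data and up on the synthesized samples.
    loss_e = energy(x_data).mean() - energy(x_mcmc).mean()
    opt_e.zero_grad(); loss_e.backward(); opt_e.step()

    # 3) VAE update via variational Bayes on the refined MCMC samples,
    #    so the VAE "chases" the EBM toward the data distribution.
    mu, logvar = enc(x_mcmc).chunk(2, dim=1)
    z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
    recon = ((dec(z) - x_mcmc) ** 2).sum(1).mean()
    kl = 0.5 * (mu ** 2 + logvar.exp() - 1 - logvar).sum(1).mean()
    loss_v = recon + kl
    opt_v.zero_grad(); loss_v.backward(); opt_v.step()
```

Because the chains start from the VAE's proposals rather than from noise, only a short run of Langevin dynamics is needed per step, which is what makes the sampling amortized.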