Paper Title
A Unified Perspective on Natural Gradient Variational Inference with Gaussian Mixture Models
Paper Authors
Paper Abstract
Variational inference with Gaussian mixture models (GMMs) enables learning highly tractable yet multi-modal approximations of intractable target distributions with up to a few hundred dimensions. The two currently most effective methods for GMM-based variational inference, VIPS and iBayes-GMM, both employ independent natural gradient updates for the individual components and their weights. We show, for the first time, that their derived updates are equivalent, although their practical implementations and theoretical guarantees differ. We identify several design choices that distinguish the two approaches, namely with respect to sample selection, natural gradient estimation, stepsize adaptation, and whether trust regions are enforced or the number of components is adapted. We argue that for both approaches, the quality of the learned approximation can suffer heavily from the respective design choices: by updating the individual components using samples from the mixture model, iBayes-GMM often fails to produce meaningful updates for low-weight components, and by using a zero-order method to estimate the natural gradient, VIPS scales badly to higher-dimensional problems. Furthermore, we show that information-geometric trust regions (used by VIPS) are effective even with first-order natural gradient estimates and often outperform the improved Bayesian learning rule (iBLR) update used by iBayes-GMM. We systematically evaluate the effects of these design choices and show that a hybrid approach significantly outperforms both prior works. Along with this work, we publish a highly modular and efficient implementation of natural gradient variational inference with Gaussian mixture models that supports 432 different combinations of design choices, facilitates the reproduction of all our experiments, and may prove valuable for practitioners.
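For context (this sketch is not part of the original abstract): the independent natural gradient updates mentioned above take a simple closed form for a single Gaussian component. In standard NGVI notation of our own choosing (stepsize $\beta$, precision $S = \Sigma^{-1}$, unnormalized target $\tilde{p}$), one iteration of the well-known update for $q_t = \mathcal{N}(\mu_t, S_t^{-1})$ reads

$$
S_{t+1} = (1-\beta)\, S_t \;-\; \beta\, \mathbb{E}_{q_t}\!\left[\nabla_x^2 \log \tilde{p}(x)\right],
\qquad
\mu_{t+1} = \mu_t \;+\; \beta\, S_{t+1}^{-1}\, \mathbb{E}_{q_t}\!\left[\nabla_x \log \tilde{p}(x)\right],
$$

where the expectations are estimated from samples: zero-order variants (as in VIPS) fit local surrogates to function evaluations of $\log \tilde{p}$, whereas first-order variants estimate them from gradients of $\log \tilde{p}$ via Stein's lemma. For a mixture, each component's update replaces $\log \tilde{p}$ with a component-specific objective that also accounts for the remaining components, and the weights receive an analogous closed-form natural gradient update on their softmax parameters driven by each component's expected reward $\mathbb{E}_{q_o}[\log \tilde{p}(x) - \log q(x)]$.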