Rethinking Sharpness-Aware Minimization as Variational Inference

Paper Title

Rethinking Sharpness-Aware Minimization as Variational Inference

Authors

Szilvia Ujváry, Zsigmond Telek, Anna Kerekes, Anna Mészáros, Ferenc Huszár

Abstract

Sharpness-aware minimization (SAM) aims to improve the generalisation of gradient-based learning by seeking out flat minima. In this work, we establish connections between SAM and Mean-Field Variational Inference (MFVI) of neural network parameters. We show that both these methods have interpretations as optimizing notions of flatness, and when using the reparametrisation trick, they both boil down to calculating the gradient at a perturbed version of the current mean parameter. This thinking motivates our study of algorithms that combine or interpolate between SAM and MFVI. We evaluate the proposed variational algorithms on several benchmark datasets, and compare their performance to variants of SAM. Taking a broader perspective, our work suggests that SAM-like updates can be used as a drop-in replacement for the reparametrisation trick.
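
The abstract's central observation, that SAM and MFVI with the reparametrisation trick both evaluate the gradient at a perturbed copy of the current mean parameter, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy quadratic loss, the function names, and the values of `rho`, `sigma`, and `lr` are hypothetical, and the MFVI step updates only the mean, leaving out the KL term and the variance update.

```python
import math
import random


def loss_grad(theta):
    """Gradient of a toy loss L(theta) = 0.5 * sum(a_i * theta_i**2)."""
    curvatures = [10.0, 1.0]  # hypothetical: one sharp and one flat direction
    return [a * t for a, t in zip(curvatures, theta)]


def sam_perturbation(theta, rho):
    """SAM-style perturbation: length rho along the normalised gradient."""
    g = loss_grad(theta)
    norm = math.sqrt(sum(x * x for x in g)) + 1e-12  # avoid division by zero
    return [rho * x / norm for x in g]


def mfvi_perturbation(sigma, rng):
    """Reparametrisation trick: eps = sigma * z with z ~ N(0, I)."""
    return [s * rng.gauss(0.0, 1.0) for s in sigma]


def perturbed_grad(theta, eps):
    """Shared step: evaluate the loss gradient at theta + eps."""
    return loss_grad([t + e for t, e in zip(theta, eps)])


def step(theta, eps, lr):
    """Update the mean parameter using the gradient taken at theta + eps."""
    g = perturbed_grad(theta, eps)
    return [t - lr * x for t, x in zip(theta, g)]


theta = [1.0, 1.0]
rng = random.Random(0)  # fixed seed so the random perturbation is reproducible

# Deterministic, gradient-aligned perturbation (SAM-like).
sam_theta = step(theta, sam_perturbation(theta, rho=0.05), lr=0.01)

# Random Gaussian perturbation (MFVI-like, mean update only).
mfvi_theta = step(theta, mfvi_perturbation([0.05, 0.05], rng), lr=0.01)
```

In this sketch the two updates differ only in how `eps` is chosen, which is one way to read the abstract's suggestion that SAM-like updates could stand in for the reparametrisation trick.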
