Paper Title


Optimizing Mode Connectivity via Neuron Alignment

Authors

N. Joseph Tatro, Pin-Yu Chen, Payel Das, Igor Melnyk, Prasanna Sattigeri, Rongjie Lai

Abstract


The loss landscapes of deep neural networks are not well understood due to their high nonconvexity. Empirically, the local minima of these loss functions can be connected by a learned curve in model space, along which the loss remains nearly constant, a property known as mode connectivity. Yet, current curve-finding algorithms do not consider the influence of symmetry in the loss surface created by model weight permutations. We propose a more general framework to investigate the effect of symmetry on landscape connectivity by accounting for the weight permutations of the networks being connected. To approximate the optimal permutation, we introduce an inexpensive heuristic referred to as neuron alignment. Neuron alignment promotes similarity between the distributions of intermediate activations of models along the curve. We provide theoretical analysis establishing the benefit of alignment to mode connectivity based on this simple heuristic. We empirically verify that the permutation given by alignment is locally optimal via a proximal alternating minimization scheme. Empirically, optimizing the weight permutation is critical for efficiently learning a simple, planar, low-loss curve between networks that successfully generalizes. Our alignment method can significantly alleviate the recently identified robust loss barrier on the path connecting two adversarially robust models, and it finds more robust and accurate models on the path.
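The abstract describes neuron alignment as matching neurons across two networks so that the distributions of their intermediate activations agree. A minimal sketch of that idea, not the paper's exact implementation: for one layer, normalize each neuron's activations over a batch, build a cross-correlation matrix between the two models' neurons, and solve a linear assignment problem to pick the permutation maximizing total correlation. The function name `align_neurons` is hypothetical, and SciPy's Hungarian solver stands in for whatever assignment step the paper actually uses.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def align_neurons(acts_a, acts_b):
    """Permute model B's neurons to match model A's at one layer.

    acts_a, acts_b: arrays of shape (n_samples, n_neurons) holding
    the intermediate activations of each model on the same batch.
    Returns an index array `col` such that B's neuron col[i] is
    matched to A's neuron i.
    """
    # Standardize each neuron's activations over the batch so the
    # inner product below becomes a Pearson correlation.
    a = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    b = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)

    # corr[i, j] = correlation between A's neuron i and B's neuron j.
    corr = a.T @ b / a.shape[0]

    # Assignment that maximizes total correlation (negate the cost).
    _, col = linear_sum_assignment(-corr)
    return col
```

In a full pipeline, the returned permutation would be applied to the weights feeding into and out of that layer of model B (exploiting the permutation symmetry mentioned in the abstract) before training the connecting curve.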
