Paper Title
K-SAM: Sharpness-Aware Minimization at the Speed of SGD
Paper Authors
Paper Abstract
Sharpness-Aware Minimization (SAM) has recently emerged as a robust technique for improving the accuracy of deep neural networks. However, SAM incurs a high computational cost in practice, requiring up to twice as much computation as vanilla SGD. The computational challenge posed by SAM arises because each iteration requires both ascent and descent steps, doubling the number of gradient computations. To address this challenge, we propose to compute gradients in both stages of SAM on only the top-k samples with the highest loss. K-SAM is simple and extremely easy to implement while providing significant generalization gains over vanilla SGD at little to no additional cost.
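
As a rough illustration of the idea described above, the following is a minimal sketch of one K-SAM training step in PyTorch. It is a sketch under stated assumptions, not the authors' reference implementation: the function name `ksam_step`, the hyperparameters `rho` and `k`, and the use of cross-entropy loss are illustrative choices.

```python
# Minimal sketch of one K-SAM step: both the SAM ascent and descent gradients
# are computed only on the top-k highest-loss samples of the mini-batch.
import torch
import torch.nn.functional as F

def ksam_step(model, optimizer, inputs, targets, rho=0.05, k=16):
    k = min(k, inputs.size(0))

    # 1. Cheap no-grad forward pass to select the top-k highest-loss samples.
    with torch.no_grad():
        per_sample_loss = F.cross_entropy(model(inputs), targets, reduction="none")
    topk_idx = per_sample_loss.topk(k).indices
    x_k, y_k = inputs[topk_idx], targets[topk_idx]

    # 2. Ascent step: gradient on the top-k subset only.
    optimizer.zero_grad()
    F.cross_entropy(model(x_k), y_k).backward()
    grad_norm = torch.norm(
        torch.stack([p.grad.norm() for p in model.parameters() if p.grad is not None])
    )

    # Perturb the weights toward higher loss (SAM ascent), remembering the offsets.
    offsets = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                offsets.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            offsets.append(e)

    # 3. Descent step: gradient at the perturbed weights, again on the top-k subset.
    optimizer.zero_grad()
    F.cross_entropy(model(x_k), y_k).backward()

    # 4. Restore the original weights and apply the descent gradient with the base optimizer.
    with torch.no_grad():
        for p, e in zip(model.parameters(), offsets):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
```

Because steps 2 and 3 run only on k samples rather than the full batch, the two extra passes cost a small fraction of a full-batch gradient computation, which is how the method approaches the per-iteration speed of vanilla SGD.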