Paper Title

Towards Understanding the Regularization of Adversarial Robustness on Neural Networks

Paper Authors

Yuxin Wen, Shuai Li, Kui Jia

Paper Abstract

The problem of adversarial examples has shown that modern Neural Network (NN) models could be rather fragile. Among the more established techniques to solve the problem, one is to require the model to be {\it $ε$-adversarially robust} (AR); that is, to require the model not to change predicted labels when any given input examples are perturbed within a certain range. However, it is observed that such methods would lead to standard performance degradation, i.e., the degradation on natural examples. In this work, we study the degradation through the regularization perspective. We identify quantities from generalization analysis of NNs; with the identified quantities we empirically find that AR is achieved by regularizing/biasing NNs towards less confident solutions by making the changes in the feature space (induced by changes in the instance space) of most layers smoother uniformly in all directions; so to a certain extent, it prevents sudden change in prediction w.r.t. perturbations. However, the end result of such smoothing concentrates samples around decision boundaries, resulting in less confident solutions, and leads to worse standard performance. Our studies suggest that one might consider ways that build AR into NNs in a gentler way to avoid the problematic regularization.
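For reference, the $ε$-adversarial robustness requirement described in the abstract can be written out as follows (an illustrative formalization in our own notation, not taken from the paper): a classifier $f$ is $ε$-adversarially robust at an input $x$ if

$$\arg\max_k f_k(x + \delta) = \arg\max_k f_k(x) \quad \text{for all } \|\delta\| \le ε,$$

i.e., no perturbation $\delta$ within the $ε$-ball around $x$ changes the predicted label.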
