Paper Title

Soft Adversarial Training Can Retain Natural Accuracy

Paper Authors

Abhijith Sharma, Apurva Narayan

Paper Abstract

Adversarial training for neural networks has been in the limelight in recent years. Advances in neural network architectures over the last decade have led to significant improvements in their performance, which has sparked interest in deploying them for real-time applications. This, in turn, has created a need to understand how vulnerable these models are to adversarial attacks, an understanding that is instrumental in designing models that are robust against adversaries. Recent works have proposed novel techniques to counter adversaries, most often by sacrificing natural accuracy. Most suggest training with an adversarial version of the inputs, constantly moving away from the original distribution. The focus of our work is to use abstract certification to extract a subset of inputs for adversarial training (hence we call it 'soft'). We propose a training framework that can retain natural accuracy without sacrificing robustness in a constrained setting. Our framework specifically targets moderately critical applications which require a reasonable balance between robustness and accuracy. The results support the idea of soft adversarial training as a defense against adversarial attacks. Finally, we outline the scope of future work to further improve this framework.
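
The abstract describes selecting, via abstract certification, which inputs receive adversarial perturbation during training. Below is a minimal sketch of one such training step in PyTorch, assuming inputs normalized to [0, 1], an FGSM attack, and a hypothetical certified(model, x, y, eps) predicate (e.g., based on interval bound propagation) standing in for the paper's certifier; the authors' actual framework may differ:

    # A sketch of one "soft" adversarial training step. The function
    # `certified` is a hypothetical placeholder for an abstract-certification
    # check that returns a boolean mask of inputs whose predictions are
    # provably stable within an eps-ball; it is not from the paper.
    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps):
        """Generate FGSM adversarial examples within an eps L-inf ball."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        # Assumes inputs live in [0, 1]; adjust the clamp for other ranges.
        return (x + eps * x.grad.sign()).clamp(0, 1).detach()

    def soft_adv_step(model, optimizer, x, y, eps, certified):
        """Train naturally on certified inputs, adversarially on the rest."""
        mask = certified(model, x, y, eps)   # boolean mask, one entry per input
        x_train = x.clone()
        if (~mask).any():                    # perturb only the uncertified subset
            x_train[~mask] = fgsm(model, x[~mask], y[~mask], eps)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_train), y)
        loss.backward()
        optimizer.step()
        return loss.item()

The design point this illustrates is the one the abstract states: certified inputs keep their natural distribution, so only the uncertified subset drifts toward adversarial examples.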
