Paper Title
A2: Efficient Automated Attacker for Boosting Adversarial Training
Authors
Abstract
Building on the significant improvement in model robustness achieved by adversarial training (AT), various variants have been proposed to further boost performance. Well-recognized methods have focused on different components of AT (e.g., designing loss functions and leveraging additional unlabeled data). It is generally accepted that stronger perturbations yield more robust models. However, how to generate stronger perturbations efficiently is still overlooked. In this paper, we propose an efficient automated attacker called A2 to boost AT by generating the optimal perturbations on-the-fly during training. A2 is a parameterized automated attacker that searches the attacker space for the best attacker against the defense model and examples. Extensive experiments across different datasets demonstrate that A2 generates stronger perturbations at low extra cost and reliably improves the robustness of various AT methods against different attacks.
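For readers unfamiliar with how perturbations are generated during AT, the following is a minimal sketch of a conventional, hand-configured attacker: L-infinity projected gradient descent (PGD) against a toy logistic-regression model. It is not the A2 method, whose attacker space and search procedure are not described in the abstract. It only illustrates the kind of fixed-schedule attacker that an automated attacker would presumably replace. All function names, the toy model, and the values of `epsilon`, `step_size`, and `steps` are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy of a linear classifier on one example (y in {0, 1})."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    tiny = 1e-12  # avoids log(0)
    return -(y * math.log(p + tiny) + (1 - y) * math.log(1 - p + tiny))

def input_grad(w, b, x, y):
    """Gradient of the loss with respect to the input x: (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_attack(w, b, x, y, epsilon=0.1, step_size=0.025, steps=10):
    """Fixed-schedule L-infinity PGD (illustrative baseline, not A2)."""
    x_adv = list(x)
    for _ in range(steps):
        g = input_grad(w, b, x_adv, y)
        # Ascend the loss along the sign of the input gradient.
        x_adv = [xa + step_size * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back into the epsilon ball around the clean input.
        x_adv = [min(max(xa, xi - epsilon), xi + epsilon)
                 for xa, xi in zip(x_adv, x)]
    return x_adv

# Hypothetical toy model and example.
w, b = [1.0, -2.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = pgd_attack(w, b, x, y)
print(loss(w, b, x, y), loss(w, b, x_adv, y))
```

In this sketch the step size, number of steps, and update rule are fixed by hand for every example. In an AT loop, the model would then be updated on `x_adv` instead of `x`.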