Exploring the Relationship between Architecture and Adversarially Robust Generalization

Paper Title

Exploring the Relationship between Architecture and Adversarially Robust Generalization

Authors

Aishan Liu, Shiyu Tang, Siyuan Liang, Ruihao Gong, Boxi Wu, Xianglong Liu, Dacheng Tao

Abstract

Adversarial training has been demonstrated to be one of the most effective remedies for defending against adversarial examples, yet it often suffers from a large robustness generalization gap on unseen testing adversaries, known as the adversarially robust generalization problem. Despite preliminary efforts to understand adversarially robust generalization, little is known from the architectural perspective. To bridge the gap, this paper systematically investigates, for the first time, the relationship between adversarially robust generalization and architectural design. In particular, we comprehensively evaluate 20 of the most representative adversarially trained architectures on the ImageNette and CIFAR-10 datasets against multiple ℓp-norm adversarial attacks. Based on extensive experiments, we find that, under aligned settings, Vision Transformers (e.g., PVT, CoAtNet) often yield better adversarially robust generalization, while CNNs tend to overfit on specific attacks and fail to generalize to multiple adversaries. To better understand the nature behind this, we conduct a theoretical analysis through the lens of Rademacher complexity. We reveal that higher weight sparsity contributes significantly to the better adversarially robust generalization of Transformers, and that this sparsity can often be achieved by specially designed attention blocks. We hope our paper helps to better understand the mechanisms for designing robust DNNs. Our model weights can be found at http://robust.art.
