Paper Title

Adversarial Robustness through the Lens of Convolutional Filters

Paper Authors

Paul Gavrikov, Janis Keuper

Paper Abstract

Deep learning models are intrinsically sensitive to distribution shifts in the input data. In particular, small, barely perceivable perturbations to the input data can force models to make wrong predictions with high confidence. A common defense mechanism is regularization through adversarial training, which injects worst-case perturbations back into training to strengthen the decision boundaries and to reduce overfitting. In this context, we perform an investigation of the 3×3 convolution filters that form in adversarially trained models. Filters are extracted from 71 public models of the ℓ∞ RobustBench CIFAR-10/100 and ImageNet1k leaderboards and compared to filters extracted from models built on the same architectures but trained without robust regularization. We observe that adversarially robust models appear to form more diverse, less sparse, and more orthogonal convolution filters than their normal counterparts. The largest differences between robust and normal models are found in the deepest layers and in the very first convolution layer, which consistently and predominantly forms filters that can partially eliminate perturbations, irrespective of the architecture. Data & project website: https://github.com/paulgavrikov/cvpr22w_RobustnessThroughTheLens
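
As context for the adversarial training mentioned in the abstract, the sketch below shows a minimal PGD-style training step in PyTorch, assuming an ℓ∞ budget of eps = 8/255 as is typical for the CIFAR-10 leaderboard. The helper names (`pgd_attack`, `adversarial_training_step`) and all hyperparameters are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    """Untargeted PGD: find an L-inf bounded perturbation that maximizes the loss."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                     # gradient ascent step
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project onto L-inf ball
            x_adv = x_adv.clamp(0, 1)                               # stay in valid image range
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One training step on worst-case perturbed inputs instead of clean ones."""
    model.eval()                        # generate the attack with frozen BN statistics
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The study's core operation is to treat every 3×3 kernel as a point in R^9 and compare per-layer statistics (diversity, sparsity, orthogonality) between robust and normal models. The following sketch shows one plausible way to extract such filters and compute two illustrative metrics; the function names and exact metric definitions are assumptions for illustration and may differ from the reference implementation in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

def extract_3x3_filters(model):
    """Collect all 3x3 convolution kernels, flattened to 9-dim vectors, per layer."""
    filters = []
    for m in model.modules():
        if isinstance(m, nn.Conv2d) and m.kernel_size == (3, 3):
            # weight shape: (out_channels, in_channels, 3, 3) -> (n_filters, 9)
            filters.append(m.weight.detach().reshape(-1, 9))
    return filters

def sparsity(f, eps=0.01):
    """Fraction of near-zero weights, relative to the layer's max magnitude."""
    thresh = eps * f.abs().max()
    return (f.abs() <= thresh).float().mean().item()

def mean_pairwise_orthogonality(f, max_filters=1024, seed=0):
    """1 - mean |cosine similarity| between distinct unit-normalized filters.

    Deep layers can hold >100k kernels, so we subsample before forming the
    pairwise similarity matrix.
    """
    if f.shape[0] > max_filters:
        g = torch.Generator().manual_seed(seed)
        f = f[torch.randperm(f.shape[0], generator=g)[:max_filters]]
    f = F.normalize(f, dim=1)
    cos = f @ f.T
    off_diag = cos[~torch.eye(f.shape[0], dtype=torch.bool)]
    return 1.0 - off_diag.abs().mean().item()

# Example: per-layer statistics for a randomly initialized ResNet-18
model = models.resnet18(weights=None)
for i, f in enumerate(extract_3x3_filters(model)):
    print(f"layer {i}: {f.shape[0]} filters, "
          f"sparsity={sparsity(f):.3f}, "
          f"orthogonality={mean_pairwise_orthogonality(f):.3f}")
```

Running the same loop over a robust checkpoint and its normally trained counterpart and comparing the per-layer numbers is, in spirit, the comparison the abstract describes.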
