Incorporating Hidden Layer Representation into Adversarial Attacks and Defences

Paper Title

Incorporating Hidden Layer Representation into Adversarial Attacks and Defences

Authors

Haojing Shen, Sihong Chen, Ran Wang, Xizhao Wang

Abstract

In this paper, we propose a defence strategy that improves adversarial robustness by incorporating hidden layer representation. The key idea of this strategy is to compress or filter the input information, including adversarial perturbations. The strategy can be regarded as an activation function and can therefore be applied to any kind of neural network. We also prove theoretically that the strategy is effective under certain conditions. In addition, by incorporating hidden layer representation, we propose three types of adversarial attacks, each of which generates its own type of adversarial example. Experiments show that our defence method significantly improves the adversarial robustness of deep neural networks and achieves state-of-the-art performance even without adversarial training.
