Paper Title
Perception Improvement for Free: Exploring Imperceptible Black-box Adversarial Attacks on Image Classification
Paper Authors
Paper Abstract
Deep neural networks are vulnerable to adversarial attacks. White-box adversarial attacks can fool neural networks with small adversarial perturbations, especially for large images. However, keeping successful adversarial perturbations imperceptible is especially challenging for transfer-based black-box adversarial attacks. Such adversarial examples can often be easily spotted due to their conspicuously poor visual quality, which compromises the threat of adversarial attacks in practice. In this study, to perceptually improve the image quality of black-box adversarial examples, we propose structure-aware adversarial attacks that generate adversarial images based on psychological perceptual models. Specifically, we allow larger perturbations in perceptually insignificant regions, while assigning smaller or no perturbations to visually sensitive regions. In addition to the proposed spatially constrained adversarial perturbations, we also propose a novel structure-aware frequency adversarial attack in the discrete cosine transform (DCT) domain. Since the proposed attacks are independent of gradient estimation, they can be directly combined with existing gradient-based attacks. Experimental results show that, at a comparable attack success rate (ASR), the proposed methods produce adversarial examples with considerably improved visual quality for free. At comparable perceptual quality, the proposed approaches achieve higher attack success rates: in particular, for the frequency structure-aware attacks, the average ASR improves by more than 10% over the baseline attacks.
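The two ideas in the abstract, modulating the perturbation by a perceptual-significance map in the spatial domain and constraining it in the DCT domain, can be illustrated with a toy sketch. This is not the paper's actual algorithm: the significance map here is a simple per-block variance heuristic (textured regions tolerate larger changes than smooth ones), the DCT masking strategy is a hypothetical stand-in, and the function names, `eps`, `block`, and `lowcut` parameters are all illustrative assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn

def local_variance(img, block=8):
    """Crude perceptual-significance map: per-block variance.
    Smooth (low-variance) blocks are visually sensitive; textured
    (high-variance) blocks can hide larger perturbations."""
    h, w = img.shape
    var = np.zeros_like(img)
    for i in range(0, h, block):
        for j in range(0, w, block):
            var[i:i + block, j:j + block] = img[i:i + block, j:j + block].var()
    return var

def spatial_structure_aware_step(img, grad, eps=8 / 255):
    """One sign-gradient step whose magnitude is scaled by the
    significance map, so perturbations concentrate in textured regions."""
    v = local_variance(img)
    weight = v / (v.max() + 1e-12)  # normalize to [0, 1]
    return np.clip(img + eps * weight * np.sign(grad), 0.0, 1.0)

def frequency_structure_aware_step(img, grad, eps=8 / 255, lowcut=4):
    """Toy DCT-domain step (hypothetical masking choice): transform the
    sign-gradient, zero out the lowest-frequency coefficients where
    changes are most visible, then transform back."""
    d = dctn(np.sign(grad), norm="ortho")
    d[:lowcut, :lowcut] = 0.0  # leave the perceptually salient low band untouched
    pert = idctn(d, norm="ortho")
    pert = pert / (np.abs(pert).max() + 1e-12)  # rescale to unit amplitude
    return np.clip(img + eps * pert, 0.0, 1.0)
```

Because both steps only reshape how a given gradient sign is spent, they can wrap any gradient-based attack iteration (e.g. replacing the uniform step of an FGSM/PGD-style update) without touching the gradient estimation itself, which is the "for free" composition the abstract describes.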