Paper Title

Backdoor Attacks Against Deep Learning Systems in the Physical World

Authors

Emily Wenger, Josephine Passananti, Arjun Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao

Abstract

Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific trigger. Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that use digitally generated patterns as triggers. A critical question remains unanswered: can backdoor attacks succeed using physical objects as triggers, thus making them a credible threat against deep learning systems in the real world? We conduct a detailed empirical study to explore this question for facial recognition, a critical deep learning task. Using seven physical objects as triggers, we collect a custom dataset of 3205 images of ten volunteers and use it to study the feasibility of physical backdoor attacks under a variety of real-world conditions. Our study reveals two key findings. First, physical backdoor attacks can be highly successful if they are carefully configured to overcome the constraints imposed by physical objects. In particular, the placement of successful triggers is largely constrained by the target model's dependence on key facial features. Second, four of today's state-of-the-art defenses against (digital) backdoors are ineffective against physical backdoors, because the use of physical objects breaks core assumptions used to construct these defenses. Our study confirms that (physical) backdoor attacks are not a hypothetical phenomenon but rather pose a serious real-world threat to critical classification tasks. We need new and more robust defenses against backdoors in the physical world.
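To make the digital-versus-physical distinction concrete, the sketch below illustrates the digital poisoning idea the abstract contrasts against: stamping a pixel-pattern trigger onto training images and relabeling them to an attacker-chosen class. This is a minimal illustration, not the paper's code; all names and shapes are assumptions. A physical backdoor instead photographs subjects wearing a real object (e.g. sunglasses or a sticker), so no pixel editing occurs at all.

```python
import numpy as np

def poison_images(images, labels, target_label, trigger, loc=(0, 0)):
    """Digitally stamp `trigger` onto each image at `loc` and relabel
    every poisoned image to `target_label` (the attacker's class).
    Illustrative sketch only; parameter names are assumptions."""
    poisoned = images.copy()
    th, tw = trigger.shape[:2]
    y, x = loc
    # Overwrite the patch region with the trigger pattern in every image.
    poisoned[:, y:y + th, x:x + tw] = trigger
    return poisoned, np.full(len(labels), target_label)

# Toy usage: four 8x8 grayscale "images", a white 2x2 trigger in the corner.
imgs = np.zeros((4, 8, 8))
lbls = np.array([0, 1, 2, 3])
trig = np.ones((2, 2))
p_imgs, p_lbls = poison_images(imgs, lbls, target_label=7, trigger=trig, loc=(6, 6))
```

Training on a mix of clean and poisoned samples teaches the model to associate the trigger pattern with the target class, which is the core assumption that the paper shows breaks down when the trigger is a variable, real-world object.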
