Paper Title
Backdoor Vulnerabilities in Normally Trained Deep Learning Models
Paper Authors
Abstract
We conduct a systematic study of backdoor vulnerabilities in normally trained Deep Learning models. They are as dangerous as backdoors injected through data poisoning, because both can be equally exploited. Using 20 different types of injected backdoor attacks from the literature as guidance, we study their correspondences in normally trained models, which we call natural backdoor vulnerabilities. We find that natural backdoors are widespread, with most injected backdoor attacks having natural correspondences. We categorize these natural backdoors and propose a general detection framework. It finds 315 natural backdoors in 56 normally trained models downloaded from the Internet, covering all the categories, whereas existing scanners designed for injected backdoors detect at most 65 of them. We also study the root causes of natural backdoors and defenses against them.
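To make the exploitation claim concrete, the following is a minimal illustrative sketch (a contrived toy, not the paper's detection framework): a linear "model" whose weights happen to be hypersensitive to a small input region behaves exactly like a backdoored model, in that stamping a trigger patch into that region flips the prediction to a target class. The weights, trigger shape, and input sizes below are all invented for illustration.

```python
import numpy as np

# Toy 2-class linear classifier over 8x8 grayscale images in [0, 1].
# Class 1's weights are hypersensitive to the top-left 2x2 patch; this
# sensitivity could equally have been injected by poisoning or have
# emerged from normal training -- the exploit is identical either way.
rng = np.random.default_rng(0)

W = np.zeros((2, 64))
W[0] = 0.1                          # class 0: mild preference for overall brightness
trigger_mask = np.zeros((8, 8))
trigger_mask[:2, :2] = 1.0          # the vulnerable (trigger) region
W[1] = 5.0 * trigger_mask.ravel()   # class 1: hypersensitive to that region

def predict(x):
    """Return the argmax class of the linear logits."""
    return int(np.argmax(W @ x.ravel()))

def stamp_trigger(x):
    """Exploit the vulnerability: overwrite the patch with maximum intensity."""
    x = x.copy()
    x[:2, :2] = 1.0
    return x

x = rng.uniform(0.2, 0.8, size=(8, 8))  # a benign input ...
x[:2, :2] = 0.0                          # ... with nothing in the trigger region

print(predict(x))                 # benign prediction: class 0
print(predict(stamp_trigger(x)))  # triggered prediction flips to class 1
```

The benign logit for class 0 dominates until the patch is stamped, at which point class 1's logit (5.0 per trigger pixel) overwhelms it; scanners for natural backdoors search for exactly this kind of low-magnitude perturbation that reliably flips predictions.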