One-pixel Signature: Characterizing CNN Models for Backdoor Detection

Paper Title

One-pixel Signature: Characterizing CNN Models for Backdoor Detection

Authors

Shanjiaoyang Huang, Weiqi Peng, Zhiwei Jia, Zhuowen Tu

Abstract

We tackle the backdoor detection problem for convolutional neural networks (CNNs) by proposing a new representation called the one-pixel signature. Our task is to detect/classify whether a CNN model has been maliciously implanted with an unknown Trojan trigger. Each CNN model is associated with a signature that is created by generating, pixel by pixel, the adversarial value that produces the largest change in the class prediction. The one-pixel signature is agnostic to the design choices of the CNN architecture and to how the model was trained. It can be computed efficiently for a black-box CNN model without access to the network parameters. Our proposed one-pixel signature demonstrates a substantial improvement (around 30% in absolute detection accuracy) over existing competing methods for backdoored CNN detection/classification. The one-pixel signature is a general representation that can be used to characterize CNN models beyond backdoor detection.
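The pixel-by-pixel construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grayscale input, the all-zero base image, the small grid of candidate pixel values, and the names `one_pixel_signature` and `toy_predict` are all assumptions made for illustration. The model is only queried through a black-box `predict` callable, matching the claim that no network parameters are needed.

```python
import numpy as np


def one_pixel_signature(predict, base_image, candidate_values):
    """Sketch of a per-class one-pixel signature (illustrative assumptions only).

    predict: black-box callable mapping an (H, W) array to (K,) class probabilities.
    Returns a (K, H, W) array holding, for each class and pixel, the candidate
    value that changed that class's probability the most.
    """
    base_probs = np.asarray(predict(base_image), dtype=float)
    num_classes = base_probs.shape[0]
    height, width = base_image.shape
    signature = np.zeros((num_classes, height, width), dtype=float)

    for i in range(height):
        for j in range(width):
            best_change = np.full(num_classes, -np.inf)
            for v in candidate_values:
                probe = base_image.copy()
                probe[i, j] = v  # perturb exactly one pixel
                probs = np.asarray(predict(probe), dtype=float)
                change = np.abs(probs - base_probs)
                improved = change > best_change
                signature[improved, i, j] = v
                best_change[improved] = change[improved]
    return signature


def toy_predict(image):
    """Hypothetical 2-class model whose output depends only on pixel (0, 0)."""
    logits = np.array([0.0, 4.0 * image[0, 0]])
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


signature = one_pixel_signature(toy_predict, np.zeros((3, 3)), [0.0, 0.5, 1.0])
print(signature[1])
```

In this toy setup only pixel (0, 0) influences the prediction, so it is the only position where the signature departs from the base value. A detector in the spirit of the abstract would then be a separate classifier trained on such signatures from known clean and backdoored models; that step is not shown here.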
