Paper Title

One-pixel Signature: Characterizing CNN Models for Backdoor Detection

Authors

Shanjiaoyang Huang, Weiqi Peng, Zhiwei Jia, Zhuowen Tu

Abstract

We tackle the convolutional neural network (CNN) backdoor detection problem by proposing a new representation called the one-pixel signature. Our task is to detect/classify whether or not a CNN model has been maliciously implanted with an unknown Trojan trigger. Here, each CNN model is associated with a signature that is created by generating, pixel by pixel, the adversarial value that results in the largest change to the class prediction. The one-pixel signature is agnostic to the design choice of the CNN architecture and to how the model was trained. It can be computed efficiently for a black-box CNN model without accessing the network parameters. Our proposed one-pixel signature demonstrates a substantial improvement (around 30% in absolute detection accuracy) over existing competing methods for backdoored CNN detection/classification. The one-pixel signature is a general representation that can be used to characterize CNN models beyond backdoor detection.
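The pixel-by-pixel construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the details here are assumptions: a blank base image, a small grid of candidate pixel values, and "change to the class prediction" measured as the absolute change in each class probability. The names `one_pixel_signature`, `predict`, and `toy_predict` are hypothetical and chosen only for illustration. The model is treated as a black box that maps a batch of images to class probabilities.

```python
import numpy as np


def one_pixel_signature(predict, height, width, num_classes,
                        candidate_values, base_value=0.0):
    """Sketch of a per-class one-pixel signature (assumed formulation).

    predict: hypothetical black-box callable mapping an array of shape
             (N, height, width) to class probabilities of shape (N, num_classes).
    Returns an array of shape (num_classes, height, width) where entry
    [k, i, j] is the candidate value at pixel (i, j) that changes the
    probability of class k the most relative to the blank base image.
    """
    candidate_values = np.asarray(candidate_values, dtype=float)
    base = np.full((height, width), base_value, dtype=float)
    base_probs = predict(base[None])[0]            # shape (num_classes,)

    signature = np.zeros((num_classes, height, width))
    for i in range(height):
        for j in range(width):
            # One query batch per pixel: the base image with only
            # pixel (i, j) set to each candidate value in turn.
            batch = np.repeat(base[None], len(candidate_values), axis=0)
            batch[:, i, j] = candidate_values
            probs = predict(batch)                 # shape (V, num_classes)
            change = np.abs(probs - base_probs)    # shape (V, num_classes)
            best = np.argmax(change, axis=0)       # best candidate per class
            signature[:, i, j] = candidate_values[best]
    return signature


def toy_predict(batch):
    """Toy 2-class stand-in for a CNN: only pixel (0, 0) affects the output."""
    logits = np.zeros((batch.shape[0], 2))
    logits[:, 1] = 5.0 * batch[:, 0, 0]
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


sig = one_pixel_signature(toy_predict, height=3, width=3, num_classes=2,
                          candidate_values=[0.0, 0.5, 1.0])
print(sig.shape)
print(sig[1])
```

Only calls to `predict` are needed, which matches the black-box setting in the abstract. In a detection pipeline, signatures computed this way for many clean and backdoored models would presumably serve as input features to a separate classifier; that step is not shown here.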
