Paper Title
CorruptEncoder: Data Poisoning based Backdoor Attacks to Contrastive Learning
Paper Authors
Paper Abstract
Contrastive learning (CL) pre-trains general-purpose encoders on an unlabeled pre-training dataset, which consists of images or image-text pairs. CL is vulnerable to data poisoning based backdoor attacks (DPBAs), in which an attacker injects poisoned inputs into the pre-training dataset so that the pre-trained encoder is backdoored. However, existing DPBAs achieve limited effectiveness. In this work, we take the first step to analyze the limitations of existing backdoor attacks and propose CorruptEncoder, a new family of DPBAs against CL. CorruptEncoder introduces a new strategy to create poisoned inputs and uses a theory-guided method to maximize attack effectiveness. Our experiments show that CorruptEncoder substantially outperforms existing DPBAs. In particular, CorruptEncoder is the first DPBA to achieve attack success rates above 90% with only a few (3) reference images and a small poisoning ratio of 0.5%. Moreover, we propose a defense, called localized cropping, against DPBAs. Our results show that our defense reduces the effectiveness of DPBAs, but it sacrifices the utility of the encoder, highlighting the need for new defenses.
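The abstract does not spell out how poisoned inputs are constructed. The sketch below illustrates the general idea behind this class of attacks: paste a reference-class object and a trigger patch onto a background image so that the random crops taken during CL pre-training can pair the trigger with the target object. This is a minimal illustration, not the paper's exact construction; the function name, image sizes, and left/right placement rule are all hypothetical assumptions.

```python
# Illustrative sketch of a DPBA-style poisoned image (assumptions, not
# the paper's exact recipe): place a reference-class object and a trigger
# patch on a background so random crops can associate trigger and object.
import random
from PIL import Image

def make_poisoned_image(background: Image.Image,
                        reference_obj: Image.Image,
                        trigger: Image.Image) -> Image.Image:
    poisoned = background.copy()
    bw, bh = poisoned.size
    ow, oh = reference_obj.size
    tw, th = trigger.size
    # Hypothetical placement rule: reference object somewhere on the
    # left half of the background.
    ox = random.randint(0, max(0, bw // 2 - ow))
    oy = random.randint(0, max(0, bh - oh))
    poisoned.paste(reference_obj, (ox, oy))
    # Trigger on the right half, so a random crop may contain the
    # object, the trigger, or both, creating the backdoor correlation.
    tx = random.randint(bw // 2, max(bw // 2, bw - tw))
    ty = random.randint(0, max(0, bh - th))
    poisoned.paste(trigger, (tx, ty))
    return poisoned
```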
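Likewise, the abstract names the localized cropping defense without detail. A plausible reading is that both contrastive views are sampled from one shared local region instead of the whole image, so spatially distant patches (e.g., trigger vs. object) are unlikely to be paired as positives. The sketch below follows that reading; the region size, crop size, and torchvision-based pipeline are illustrative assumptions.

```python
# Sketch of a "localized cropping" style augmentation (assumed reading):
# sample both contrastive views from one shared local region so that
# distant patches of a poisoned image are not paired as positives.
import random
from PIL import Image
from torchvision import transforms

def localized_two_crops(img: Image.Image,
                        region_size: int = 128,
                        crop_size: int = 96):
    w, h = img.size
    # Choose one local region shared by both views (sizes are assumptions).
    x = random.randint(0, max(0, w - region_size))
    y = random.randint(0, max(0, h - region_size))
    region = img.crop((x, y, min(x + region_size, w), min(y + region_size, h)))
    view_aug = transforms.Compose([
        transforms.RandomResizedCrop(crop_size, scale=(0.5, 1.0)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ])
    # Two independently augmented views of the SAME local region.
    return view_aug(region), view_aug(region)
```

As the abstract notes, restricting both views to one region plausibly reduces the attack's effectiveness at the cost of encoder utility, since the views no longer cover globally diverse parts of each image.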