Paper title
ConfounderGAN: Protecting Image Data Privacy with Causal Confounder
Paper authors
Paper abstract
The success of deep learning is partly attributed to the availability of massive data that can be downloaded freely from the Internet. However, this also means that users' private data may be collected by commercial organizations without consent and used to train their models. It is therefore important to develop a method or tool that prevents unauthorized data exploitation. In this paper, we propose ConfounderGAN, a generative adversarial network (GAN) that makes personal image data unlearnable in order to protect the data privacy of its owners. Specifically, the noise produced by the generator for each image has the confounder property: it builds spurious correlations between images and labels, so that a model trained on the noise-added dataset cannot learn the correct mapping from images to labels. Meanwhile, the discriminator is used to ensure that the generated noise is small and imperceptible, thereby preserving the normal utility of the encrypted image for humans. Experiments are conducted on six image classification datasets, consisting of three natural object datasets and three medical datasets. The results demonstrate that our method not only outperforms state-of-the-art methods in standard settings, but can also be applied to fast encryption scenarios. Moreover, we present a series of transferability and stability experiments to further illustrate the effectiveness and superiority of our method.
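To make the idea of small, label-correlated noise concrete, here is a minimal sketch. It is not the ConfounderGAN method: the abstract gives no architecture, loss, or noise budget, so the sketch uses a fixed random pattern per class in place of a trained generator, and a hard clipping bound in place of the discriminator. The names `EPSILON`, `make_class_patterns`, and `add_noise`, and the budget of 8/255, are assumptions made for illustration only.

```python
import random

# Assumed perturbation budget; the abstract does not state one.
EPSILON = 8 / 255


def make_class_patterns(num_classes, num_pixels, seed=0):
    """Build one fixed noise pattern per class (a stand-in for a trained generator).

    Every entry is +EPSILON or -EPSILON, so the noise is tied to the label
    and bounded in magnitude.
    """
    rng = random.Random(seed)
    return [
        [rng.choice((-EPSILON, EPSILON)) for _ in range(num_pixels)]
        for _ in range(num_classes)
    ]


def add_noise(image, label, patterns):
    """Add the pattern for `label` to a flat image with pixel values in [0, 1]."""
    noise = patterns[label]
    # Clip to the valid pixel range so the result is still a displayable image.
    return [min(1.0, max(0.0, p + n)) for p, n in zip(image, noise)]


patterns = make_class_patterns(num_classes=3, num_pixels=16)
image = [0.5] * 16  # toy 4x4 grayscale image, flattened
protected = add_noise(image, label=1, patterns=patterns)

# The largest per-pixel change never exceeds the budget.
max_change = max(abs(a - b) for a, b in zip(image, protected))
```

Because the pattern depends only on the label, a classifier trained on such data could in principle latch onto the pattern instead of the image content, which is the kind of spurious correlation the abstract describes. The paper's generator produces noise per image, so this class-level version should be read only as a rough illustration.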