DP-CGAN: Differentially Private Synthetic Data and Label Generation

Paper Title

DP-CGAN: Differentially Private Synthetic Data and Label Generation

Authors

Reihaneh Torkzadehmahani, Peter Kairouz, Benedict Paten

Abstract

Generative Adversarial Networks (GANs) are one of the well-known models to generate synthetic data including images, especially for research communities that cannot use original sensitive datasets because they are not publicly accessible. One of the main challenges in this area is to preserve the privacy of individuals who participate in the training of the GAN models. To address this challenge, we introduce a Differentially Private Conditional GAN (DP-CGAN) training framework based on a new clipping and perturbation strategy, which improves the performance of the model while preserving privacy of the training dataset. DP-CGAN generates both synthetic data and corresponding labels and leverages the recently introduced Renyi differential privacy accountant to track the spent privacy budget. The experimental results show that DP-CGAN can generate visually and empirically promising results on the MNIST dataset with a single-digit epsilon parameter in differential privacy.
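
The abstract names a clipping and perturbation strategy but does not spell it out. As a rough illustration only, the sketch below shows the generic clip-then-add-Gaussian-noise step used in differentially private gradient descent; it is not the specific DP-CGAN strategy, and it omits the Renyi differential privacy accounting. The function names `clip_gradient` and `private_mean_gradient` and the parameters `clip_norm` and `noise_multiplier` are hypothetical, chosen for this example.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Assumption: the noise standard deviation is noise_multiplier * clip_norm,
    the usual convention for the Gaussian mechanism in DP-SGD-style training.
    """
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    return [(s + rng.gauss(0.0, sigma)) / n for s in summed]


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy per-example gradients: one above the clipping bound, one below.
    grads = [[3.0, 4.0], [0.3, 0.4]]
    print(private_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single training example can move the update, and the added noise is scaled to that bound. A privacy accountant would then convert the noise level, sampling rate, and number of training steps into the epsilon value reported in the abstract.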
