Paper Title
A New Dimensionality Reduction Method Based on Hensel's Compression for Privacy Protection in Federated Learning
Paper Authors
Abstract
Differential privacy (DP) is considered a de facto standard for protecting users' privacy in data analysis, machine learning, and deep learning. Existing DP-based privacy-preserving training approaches add noise to the clients' gradients before sharing them with the server. However, applying DP to the gradients is not efficient: by the composition theorem, the privacy leakage grows as the number of synchronization training epochs increases. Recently, researchers were able to recover images used in the training dataset using a Generative Regression Neural Network (GRNN), even when the gradients were protected by DP. In this paper, we propose a two-layer privacy protection approach to overcome the limitations of the existing DP-based approaches. The first layer reduces the dimension of the training dataset based on Hensel's Lemma. We are the first to use Hensel's Lemma to reduce the dimension of (i.e., compress) a dataset. The new dimensionality reduction method reduces the dimension of a dataset without losing information, since Hensel's Lemma guarantees uniqueness. The second layer applies DP to the compressed dataset generated by the first layer. The proposed approach overcomes the problem of privacy leakage due to composition by applying DP only once, before training; clients train their local models on the privacy-preserving dataset generated by the second layer. Experimental results show that the proposed approach ensures strong privacy protection while achieving good accuracy. The new dimensionality reduction method achieves an accuracy of 97% with only 25% of the original data size.
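The contrast between per-round gradient noise and one-time data noise can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the use of the Laplace mechanism, the per-entry sensitivity of 1.0, and the placeholder array standing in for a compressed dataset are all assumptions made for illustration, and the Hensel-based compression step is not reproduced here. The sketch uses basic sequential composition, under which the privacy budgets of successive releases add up.

```python
import numpy as np


def basic_composition_epsilon(eps_per_round: float, rounds: int) -> float:
    """Total epsilon under basic sequential composition: budgets add up."""
    return eps_per_round * rounds


def laplace_perturb_once(data: np.ndarray, epsilon: float,
                         sensitivity: float = 1.0, seed: int = 0) -> np.ndarray:
    """Add Laplace noise with scale sensitivity/epsilon to every entry, one time.

    Assumption: each entry is bounded so that a per-entry sensitivity of 1.0
    is meaningful (for example, values scaled to [0, 1]).
    """
    rng = np.random.default_rng(seed)
    scale = sensitivity / epsilon
    return data + rng.laplace(loc=0.0, scale=scale, size=data.shape)


# Gradient-level DP: the budget is spent again in every synchronization round.
total_eps_gradient = basic_composition_epsilon(eps_per_round=0.5, rounds=100)

# Data-level DP: noise is added once, before any training takes place.
compressed = np.zeros((4, 8))  # hypothetical stand-in for a compressed dataset
noisy = laplace_perturb_once(compressed, epsilon=0.5)
total_eps_data = 0.5  # later training only post-processes the noisy data

print(total_eps_gradient, total_eps_data)
```

Under these assumptions the gradient-level budget scales linearly with the number of rounds, while the data-level budget stays fixed, because training on already-perturbed data is post-processing and consumes no further budget.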