Federated Learning for Tabular Data: Exploring Potential Risk to Privacy

Paper Title

Federated Learning for Tabular Data: Exploring Potential Risk to Privacy

Authors

Han Wu, Zilong Zhao, Lydia Y. Chen, Aad van Moorsel

Abstract

Federated Learning (FL) has emerged as a potentially powerful privacy-preserving machine learning methodology, because participants exchange model parameters rather than data. FL has traditionally been applied to image, voice, and similar data, but it has recently started to draw attention from domains such as financial services, where the data is predominantly tabular. However, work on tabular data has not yet considered potential attacks, in particular attacks using Generative Adversarial Networks (GANs), which have been successfully applied to FL for non-tabular data. This paper is the first to explore leakage of private data in FL systems that process tabular data. We design a GAN-based attack model that can be deployed on a malicious client to reconstruct data and its properties from other participants. As a side effect of considering tabular data, we are able to assess the efficacy of the attack statistically, without relying on human observation as is done for FL on images. We implement our attack model in a recently developed generic FL software framework for tabular data processing. The experimental results demonstrate the effectiveness of the proposed attack model, suggesting that further research is required to counter GAN-based privacy attacks.
