Paper Title
Privacy Preserving Vertical Federated Learning for Tree-based Models
Paper Authors
Paper Abstract
Federated learning (FL) is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data to each other. This paper studies {\it vertical} federated learning, which tackles the scenarios where (i) collaborating organizations own data of the same set of users but with disjoint features, and (ii) only one organization holds the labels. We propose Pivot, a novel solution for privacy preserving vertical decision tree training and prediction, ensuring that no intermediate information is disclosed other than what the clients have agreed to release (i.e., the final tree model and the prediction output). Pivot does not rely on any trusted third party and provides protection against a semi-honest adversary that may compromise $m-1$ out of $m$ clients. We further identify two privacy leakages when the trained decision tree model is released in plaintext and propose an enhanced protocol to mitigate them. The proposed solution can also be extended to tree ensemble models, e.g., random forest (RF) and gradient boosting decision tree (GBDT), by treating single decision trees as building blocks. Theoretical and experimental analyses suggest that Pivot is efficient for the privacy achieved.
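The vertical partitioning described above can be made concrete with a small sketch: every collaborating organization holds the same rows (users) but a disjoint block of feature columns, and only one organization (the label holder) has the labels. The snippet below is purely illustrative of this data layout, not of Pivot's actual cryptographic protocol; the function name vertical_split and the use of NumPy are assumptions introduced for the example.

import numpy as np

def vertical_split(X, y, m, seed=0):
    """Split the feature columns of X across m clients; only client 0 keeps the labels y."""
    rng = np.random.default_rng(seed)
    cols = rng.permutation(X.shape[1])      # shuffle feature indices
    blocks = np.array_split(cols, m)        # disjoint column blocks, one per client
    return [
        {"features": X[:, idx],             # same users, different features
         "labels": y if i == 0 else None}   # only the label holder sees y
        for i, idx in enumerate(blocks)
    ]

# Example: 1000 shared users, 12 features, m = 3 organizations.
X = np.random.rand(1000, 12)
y = np.random.randint(0, 2, size=1000)
clients = vertical_split(X, y, m=3)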