Incentivizing Data Contribution in Cross-Silo Federated Learning

Paper Title

Incentivizing Data Contribution in Cross-Silo Federated Learning

Authors

Chao Huang, Shuqi Ke, Charles Kamhoua, Prasant Mohapatra, Xin Liu

Abstract

In cross-silo federated learning, clients (e.g., organizations) train a shared global model using their local data. However, due to privacy concerns, clients may not contribute enough data points during training. To address this issue, we propose a general incentive framework in which the profit or benefit obtained from the global model is allocated to clients so as to incentivize data contribution. We formulate the clients' interactions as a data contribution game and study its equilibrium. We characterize the conditions under which an equilibrium exists, and prove that each client's equilibrium data contribution increases with its data quality and decreases with its privacy sensitivity. We further conduct experiments on CIFAR-10 and show that the results are consistent with the analysis. Moreover, we show that practical allocation mechanisms such as linearly proportional, leave-one-out, and Shapley value incentivize more data contribution from clients with higher-quality data, and that among these, leave-one-out tends to achieve the highest global model accuracy at equilibrium.
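The three allocation mechanisms named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the `utility` function (a concave stand-in for the value of the global model), the `(quality, num_points)` data layout, the choice to make the linearly proportional share depend on data amount only, and the normalization of each allocation to shares that sum to 1.

```python
from itertools import permutations
from math import sqrt


def utility(coalition, data):
    """Hypothetical value of a model trained on a coalition's data.

    Assumption: value is the square root of the quality-weighted data
    amount, a simple stand-in with diminishing returns.
    """
    return sqrt(sum(q * n for q, n in (data[i] for i in coalition)))


def linearly_proportional(data):
    """Share proportional to the number of contributed data points (assumed)."""
    total = sum(n for _, n in data.values())
    return {i: n / total for i, (_, n) in data.items()}


def leave_one_out(data):
    """Share proportional to the utility lost when a client is removed."""
    everyone = set(data)
    full = utility(everyone, data)
    marginal = {i: full - utility(everyone - {i}, data) for i in data}
    total = sum(marginal.values())
    return {i: m / total for i, m in marginal.items()}


def shapley_value(data):
    """Share proportional to the marginal gain summed over all join orders."""
    phi = dict.fromkeys(data, 0.0)
    for order in permutations(data):
        joined = set()
        for i in order:
            before = utility(joined, data)
            joined.add(i)
            phi[i] += utility(joined, data) - before
    total = sum(phi.values())
    return {i: v / total for i, v in phi.items()}


if __name__ == "__main__":
    # Hypothetical clients: (data quality, number of data points).
    clients = {"a": (1.0, 100), "b": (0.5, 100), "c": (0.2, 100)}
    for mechanism in (linearly_proportional, leave_one_out, shapley_value):
        print(mechanism.__name__, mechanism(clients))
```

With equal data amounts, the amount-based proportional rule in this sketch gives every client the same share, while leave-one-out and Shapley value give larger shares to clients with higher assumed quality. The exact Shapley computation enumerates all join orders, so it is only practical for a handful of clients.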
