Paper Title

Rethinking Normalization Methods in Federated Learning

Paper Authors

Du, Zhixu, Sun, Jingwei, Li, Ang, Chen, Pin-Yu, Zhang, Jianyi, Li, Hai "Helen", Chen, Yiran

Abstract

Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. In this work, we explicitly uncover the external covariate shift problem in FL, which is caused by the independent local training processes on different devices. We demonstrate that external covariate shift will lead to the obliteration of some devices' contributions to the global model. Further, we show that normalization layers are indispensable in FL since their inherent properties can alleviate the problem of obliterating some devices' contributions. However, recent works have shown that batch normalization, one of the standard components in many deep neural networks, incurs an accuracy drop of the global model in FL. The essential reason for the failure of batch normalization in FL is poorly studied. We unveil that external covariate shift is the key reason why batch normalization is ineffective in FL. We also show that layer normalization is a better choice in FL, as it can mitigate the external covariate shift and improve the performance of the global model. We conduct experiments on CIFAR10 under non-IID settings. The results demonstrate that models with layer normalization converge fastest and achieve the best or comparable accuracy for three different model architectures.
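As a rough illustration of the setting the abstract describes (not the authors' code), the sketch below shows where the normalization choice enters a FedAvg-style loop: each client trains independently and the server averages the resulting weights. The model, the `norm` switch, and the use of `GroupNorm(1, C)` as a channel-wise layer-norm substitute for convolutional features are assumptions made here for illustration only.

```python
# Minimal sketch (hypothetical model and training setup, not the paper's code).
import copy
import torch
import torch.nn as nn


def make_block(in_ch, out_ch, norm="layer"):
    """Conv block whose normalization layer is configurable."""
    if norm == "batch":
        norm_layer = nn.BatchNorm2d(out_ch)    # keeps running batch statistics
    else:
        norm_layer = nn.GroupNorm(1, out_ch)   # layer-norm-style, no running stats
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), norm_layer, nn.ReLU())


def make_model(norm="layer", num_classes=10):
    return nn.Sequential(
        make_block(3, 32, norm),
        make_block(32, 64, norm),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(64, num_classes),
    )


def local_train(global_model, data_loader, epochs=1, lr=0.01):
    """One client's independent local training pass on its own (non-IID) data."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()


def fedavg_round(global_model, client_loaders):
    """One communication round: average all client updates into the global model."""
    client_states = [local_train(global_model, dl) for dl in client_loaders]
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        stacked = torch.stack([s[key].float() for s in client_states])
        avg_state[key] = stacked.mean(dim=0).to(avg_state[key].dtype)
    global_model.load_state_dict(avg_state)
    return global_model
```

In this sketch, choosing `norm="batch"` means the server also averages each client's running mean and variance, so the aggregated statistics may match none of the clients' local feature distributions; the layer-norm-style option normalizes per sample and keeps no running statistics, which is one way to read the abstract's claim that layer normalization sidesteps the external covariate shift.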
