Paper Title

Federated Learning for Inference at Anytime and Anywhere

Paper Authors

Zicheng Liu, Da Li, Javier Fernandez-Marques, Stefanos Laskaridis, Yan Gao, Łukasz Dudziak, Stan Z. Li, Shell Xu Hu, Timothy Hospedales

Paper Abstract

Federated learning has been predominantly concerned with collaborative training of deep networks from scratch, and especially the many challenges that arise, such as communication cost, robustness to heterogeneous data, and support for diverse device capabilities. However, there is no unified framework that addresses all these problems together. This paper studies the challenges and opportunities of exploiting pre-trained Transformer models in FL. In particular, we propose to efficiently adapt such pre-trained models by injecting a novel attention-based adapter module at each transformer block that both modulates the forward pass and makes an early prediction. Training only the lightweight adapter by FL leads to fast and communication-efficient learning even in the presence of heterogeneous data and devices. Extensive experiments on standard FL benchmarks, including CIFAR-100, FEMNIST and SpeechCommandsv2, demonstrate that this simple framework provides fast and accurate FL while supporting heterogeneous device capabilities, efficient personalization, and scalable-cost anytime inference.
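To make the abstract's core mechanism concrete, below is a minimal PyTorch sketch of a lightweight, attention-based adapter attached to each frozen transformer block, which both modulates the block's output and produces an early (anytime) prediction. This is an illustrative sketch under our own assumptions, not the authors' implementation; the names `AttentionAdapter`, `AdaptedBackbone`, `exit_head`, and `max_blocks` are hypothetical.

```python
# Sketch only: assumes a ViT-style pre-trained backbone with frozen blocks.
import torch
import torch.nn as nn


class AttentionAdapter(nn.Module):
    """Lightweight adapter: modulates a block's output and yields an early prediction."""

    def __init__(self, dim, num_classes, bottleneck=64):
        super().__init__()
        # Small attention + bottleneck MLP; only these weights are trained in FL.
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.exit_head = nn.Linear(dim, num_classes)  # early-exit classifier

    def forward(self, x):
        # x: (batch, tokens, dim) hidden states from a frozen transformer block
        a, _ = self.attn(x, x, x)
        x = x + self.up(torch.relu(self.down(a)))  # modulate the forward pass
        logits = self.exit_head(x.mean(dim=1))     # anytime prediction at this depth
        return x, logits


class AdaptedBackbone(nn.Module):
    """Frozen pre-trained blocks interleaved with trainable adapters."""

    def __init__(self, blocks, dim, num_classes):
        super().__init__()
        self.blocks = blocks  # nn.ModuleList of pre-trained transformer blocks
        for p in self.blocks.parameters():
            p.requires_grad_(False)  # backbone stays fixed; only adapters are updated
        self.adapters = nn.ModuleList(
            AttentionAdapter(dim, num_classes) for _ in blocks
        )

    def forward(self, x, max_blocks=None):
        # max_blocks lets a weaker device stop early (scalable-cost anytime inference).
        exits = []
        n = max_blocks or len(self.blocks)
        for block, adapter in zip(self.blocks[:n], self.adapters[:n]):
            x = block(x)
            x, logits = adapter(x)
            exits.append(logits)
        return exits  # one prediction per executed block
```

Under this reading, a client would exchange only `model.adapters.state_dict()` with the FL server, so communication scales with the adapter size rather than the full backbone, and `max_blocks` caps compute on resource-constrained devices while still returning a usable prediction.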
