Paper Title
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Paper Authors
Paper Abstract
Fine-tuning of self-supervised models is a powerful transfer learning method in a variety of fields, including speech processing, since it can exploit generic feature representations learned from large amounts of unlabeled data. Fine-tuning, however, requires a new parameter set for each downstream task, which is parameter-inefficient. The adapter architecture was proposed to partially address this issue by inserting lightweight learnable modules into a frozen pre-trained model. However, existing adapter architectures fail to adaptively leverage the low- to high-level features stored in different layers, which is necessary for solving a wide variety of speech processing tasks. Thus, we propose a new adapter architecture that acquires feature representations more flexibly for various speech tasks. In experiments, we applied this adapter to WavLM on four speech tasks. It performed on par with or better than naive fine-tuning, with only 11% of the learnable parameters. It also outperformed an existing adapter architecture.
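The abstract does not give the exact design of the proposed adapter, so the sketch below only illustrates, in PyTorch, the two generic ideas it builds on: a bottleneck adapter module inserted into a frozen backbone, and a learnable weighting over the hidden states of different layers so that low- to high-level features can be combined per task. All class names, parameter names, and dimensions here are hypothetical, not taken from the paper.

```python
# Minimal sketch, assuming a standard bottleneck adapter and a softmax-weighted
# combination of per-layer hidden states. This is NOT the paper's architecture;
# it only illustrates the general mechanism described in the abstract.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Small trainable module inserted alongside a frozen transformer layer."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)  # project down to a small width
        self.up = nn.Linear(bottleneck, dim)    # project back up
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen model's features intact.
        return x + self.up(self.act(self.down(x)))


class LayerWeightedSum(nn.Module):
    """Learnable softmax weights over hidden states from all backbone layers."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states: list[torch.Tensor]) -> torch.Tensor:
        w = torch.softmax(self.weights, dim=0)
        return sum(w_i * h for w_i, h in zip(w, hidden_states))
```

With the pre-trained backbone's parameters frozen, only the adapter and layer-weight parameters receive gradients during fine-tuning, which is how the number of learnable parameters can be kept to a small fraction (here, 11%) of full fine-tuning.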