Paper Title

Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models

Paper Authors

Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Paola García, Yohei Kawaguchi

Paper Abstract

In this paper, we present an incremental domain adaptation technique to prevent catastrophic forgetting for an end-to-end automatic speech recognition (ASR) model. Conventional approaches require extra parameters of the same size as the model for optimization, and it is difficult to apply these approaches to end-to-end ASR models because they have a huge number of parameters. To solve this problem, we first investigate which parts of end-to-end ASR models contribute to high accuracy in the target domain while preventing catastrophic forgetting. We conduct experiments on incremental domain adaptation from the LibriSpeech dataset to the AMI meeting corpus with two popular end-to-end ASR models and find that adapting only the linear layers of their encoders can prevent catastrophic forgetting. Then, on the basis of this finding, we develop an element-wise parameter selection focused on specific layers to further reduce the number of fine-tuned parameters. Experimental results show that our approach consistently prevents catastrophic forgetting compared to parameter selection from the whole model.
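
The encoder-only adaptation described in the abstract can be illustrated with a short fine-tuning setup. The sketch below is a minimal PyTorch-style example, not the authors' released code: it freezes every parameter of an end-to-end ASR model and re-enables gradients only for the linear layers inside the encoder. The attribute name `model.encoder`, the optimizer choice, and the learning rate are assumptions made for illustration.

```python
# Minimal sketch (assumed PyTorch model layout, not the authors' implementation):
# freeze all parameters, then unfreeze only the nn.Linear modules inside the encoder.
import torch
import torch.nn as nn


def select_encoder_linear_params(model: nn.Module, encoder_attr: str = "encoder"):
    """Freeze the whole model, then re-enable gradients only for the
    parameters of nn.Linear modules inside the encoder submodule."""
    for p in model.parameters():
        p.requires_grad = False

    encoder = getattr(model, encoder_attr)  # assumed attribute name
    trainable = []
    for module in encoder.modules():
        if isinstance(module, nn.Linear):
            for p in module.parameters():
                p.requires_grad = True
                trainable.append(p)
    return trainable


# Hypothetical usage for incremental adaptation to a new domain (e.g., AMI):
# trainable = select_encoder_linear_params(asr_model)
# optimizer = torch.optim.Adam(trainable, lr=1e-4)
# ...then run an ordinary fine-tuning loop on target-domain data only...
```

The element-wise parameter selection in the paper goes one step further by choosing individual weights within such layers rather than whole modules; one common way to realize this is to mask gradients with a binary tensor of the same shape as each weight, but the exact selection criterion is specific to the paper and not reproduced here.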
