Improving Rare Word Recognition with LM-aware MWER Training

Paper Title

Improving Rare Word Recognition with LM-aware MWER Training

Authors

Wang, Weiran; Chen, Tongzhou; Sainath, Tara N.; Variani, Ehsan; Prabhavalkar, Rohit; Huang, Ronny; Ramabhadran, Bhuvana; Gaur, Neeraj; Mavandadi, Sepand; Peyser, Cal; Strohman, Trevor; He, Yanzhang; Rybach, David

Abstract

Language models (LMs) significantly improve the recognition accuracy of end-to-end (E2E) models on words rarely seen during training, when used in either the shallow fusion or the rescoring setup. In this work, we introduce LMs into the learning of hybrid autoregressive transducer (HAT) models within the discriminative training framework, to mitigate the gap between training and inference in the use of LMs. For the shallow fusion setup, we use LMs during both hypothesis generation and loss computation, and the LM-aware MWER-trained model achieves a 10% relative improvement over the model trained with standard MWER on voice search test sets containing rare words. For the rescoring setup, we learn a small neural module that generates per-token fusion weights in a data-dependent manner. This model achieves the same rescoring WER as the regular MWER-trained model, but without the need to sweep fusion weights.
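
As a rough illustration of what using an LM during loss computation could look like, the sketch below computes an MWER-style loss over a fixed N-best list, with hypothesis probabilities renormalized from fused scores (E2E log-probability plus a weighted LM log-probability). This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the single scalar `lm_weight`, and the toy scores are all hypothetical, and the sketch does not cover LM use during hypothesis generation or any HAT-specific details. Setting `lm_weight=0.0` gives the loss without LM scores.

```python
import math


def word_errors(hyp, ref):
    """Word-level edit distance between two whitespace-tokenized strings."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, start=1):
        cur = [i]
        for j, rw in enumerate(r, start=1):
            cur.append(min(prev[j] + 1,                # skip hypothesis word
                           cur[j - 1] + 1,             # skip reference word
                           prev[j - 1] + (hw != rw)))  # match or substitute
        prev = cur
    return prev[-1]


def mwer_loss(hyps, e2e_logps, lm_logps, ref, lm_weight=0.0):
    """Expected word errors over an N-best list, relative to the mean error.

    Hypothetical helper: lm_weight is an assumed scalar fusion weight.
    """
    # Fused score per hypothesis (assumed log-linear fusion).
    scores = [a + lm_weight * b for a, b in zip(e2e_logps, lm_logps)]
    # Softmax over the N-best list, shifted by the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Word errors of each hypothesis, centered on the N-best mean.
    errs = [word_errors(h, ref) for h in hyps]
    mean_err = sum(errs) / len(errs)
    return sum(p * (e - mean_err) for p, e in zip(probs, errs))


# Toy example (made-up scores): the E2E model prefers the wrong spelling,
# while the LM prefers the reference.
hyps = ["call ann", "call anne"]
ref = "call anne"
e2e_logps = [-1.0, -2.0]
lm_logps = [-5.0, -1.0]

loss_without_lm = mwer_loss(hyps, e2e_logps, lm_logps, ref, lm_weight=0.0)
loss_with_lm = mwer_loss(hyps, e2e_logps, lm_logps, ref, lm_weight=0.5)
```

With these toy numbers the loss is positive without the LM, because most probability mass sits on the erroneous hypothesis, and negative once the LM scores are fused in. The point of the sketch is only that the training loss changes when it sees the same fused scores that inference would use.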
