Paper Title

Robust Unstructured Knowledge Access in Conversational Dialogue with ASR Errors

Paper Authors

Yik-Cheung Tam, Jiacheng Xu, Jiakai Zou, Zecheng Wang, Tinglong Liao, Shuhan Yuan

Paper Abstract

Performance of spoken language understanding (SLU) can be degraded by automatic speech recognition (ASR) errors. We propose a novel approach to improve SLU robustness by randomly corrupting clean training text with an ASR error simulator, followed by self-correcting the errors and minimizing the target classification loss in a joint manner. In the proposed error simulator, we leverage confusion networks generated by an ASR decoder, without human transcriptions, to produce a variety of error patterns for model training. We evaluate our approach on the DSTC10 challenge targeting knowledge-grounded task-oriented conversational dialogues with ASR errors. Experimental results show the effectiveness of our proposed approach, boosting knowledge-seeking turn detection (KTD) F1 significantly from 0.9433 to 0.9904. Knowledge cluster classification is boosted from 0.7924 to 0.9333 in Recall@1. After knowledge document re-ranking, our approach shows significant improvement in all knowledge selection metrics on the test set: from 0.7358 to 0.7806 in Recall@1, from 0.8301 to 0.9333 in Recall@5, and from 0.7798 to 0.8460 in MRR@5. In the recent DSTC10 evaluation, our approach demonstrates significant improvement in knowledge selection, boosting Recall@1 from 0.495 to 0.7144 compared to the official baseline. Our source code is released on GitHub at https://github.com/yctam/dstc10_track2_task2.git.
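To make the corruption step concrete, below is a minimal sketch of confusion-network-based text corruption, assuming a toy slot structure in which each aligned position holds (word, posterior) candidates from an ASR decoder. The data, the `corrupt` helper, and the one-to-one alignment between clean tokens and slots are illustrative assumptions, not the paper's released implementation (see the GitHub repository for the actual code).

```python
import random

# Hypothetical confusion network: each slot holds (word, posterior)
# candidates emitted by an ASR decoder for one aligned position.
confusion_network = [
    [("i", 0.90), ("eye", 0.10)],
    [("need", 0.70), ("kneed", 0.20), ("knee", 0.10)],
    [("a", 0.95), ("uh", 0.05)],
    [("hotel", 0.80), ("motel", 0.15), ("total", 0.05)],
]

def corrupt(tokens, network, corrupt_prob=0.3):
    """Randomly replace clean tokens with confusable words sampled from
    the aligned confusion-network slot, weighted by ASR posteriors."""
    corrupted = []
    for i, token in enumerate(tokens):
        # Fall back to the clean token when no slot is aligned.
        slot = network[i] if i < len(network) else [(token, 1.0)]
        if random.random() < corrupt_prob:
            candidates, weights = zip(*slot)
            corrupted.append(random.choices(candidates, weights=weights)[0])
        else:
            corrupted.append(token)
    return corrupted

clean = ["i", "need", "a", "hotel"]
print(corrupt(clean, confusion_network))
# e.g. ['i', 'kneed', 'a', 'hotel']
```

In the paper's setup, text corrupted this way replaces the clean input during training, and the model is optimized to both recover the clean text (self-correction) and predict the target label (classification), with the two losses minimized jointly.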
