Paper title

Span Classification with Structured Information for Disfluency Detection in Spoken Utterances

Authors

Sreyan Ghosh, Sonal Kumar, Yaman Kumar Singla, Rajiv Ratn Shah, S. Umesh

Abstract

Existing approaches to disfluency detection focus on solving a token-level classification task to identify and remove disfluencies in text. Moreover, most works leverage only the contextual information captured by the linear sequence of the text, ignoring the structured information that dependency trees capture efficiently. In this paper, building on the span classification paradigm of entity recognition, we propose a novel architecture for detecting disfluencies in transcripts of spoken utterances. It incorporates both contextual information, through transformers, and long-distance structured information captured by dependency trees, through graph convolutional networks (GCNs). Experimental results show that our proposed model achieves state-of-the-art results on the widely used English Switchboard corpus for disfluency detection and outperforms prior art by a significant margin. We make all our code publicly available on GitHub (https://github.com/Sreyan88/Disfluency-Detection-with-Span-Classification).
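The two ingredients named in the abstract, span enumeration and graph convolution over a dependency tree, can be illustrated with a minimal sketch. This is not the authors' released implementation: the function names, the toy sentence, the hand-written dependency heads, the one-hot token features (standing in for transformer outputs), and the identity weight matrix are all assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def enumerate_spans(n_tokens: int, max_len: int) -> List[Tuple[int, int]]:
    """All (start, end) token spans, end inclusive, at most max_len tokens long."""
    return [(i, j)
            for i in range(n_tokens)
            for j in range(i, min(i + max_len, n_tokens))]


def normalized_adjacency(heads: List[int]) -> Matrix:
    """Row-normalized adjacency of an undirected dependency tree with self-loops.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    n, d_in, d_out = len(feats), len(weight), len(weight[0])
    # Aggregate each token's features over its dependency neighbours.
    agg = [[sum(adj[i][k] * feats[k][c] for k in range(n)) for c in range(d_in)]
           for i in range(n)]
    # Apply the linear map and the ReLU non-linearity.
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)]
            for i in range(n)]


def span_representation(hidden: Matrix, span: Tuple[int, int]) -> List[float]:
    """Concatenate the hidden vectors of the span's first and last tokens."""
    start, end = span
    return hidden[start] + hidden[end]


# Toy utterance with a repair ("to boston uh to denver"); heads are hand-written.
tokens = ["a", "flight", "to", "boston", "uh", "to", "denver"]
heads = [1, -1, 3, 1, 1, 6, 1]
n = len(tokens)

# One-hot features stand in for contextual (transformer) token embeddings.
feats = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

hidden = gcn_layer(normalized_adjacency(heads), feats, identity)
spans = enumerate_spans(n, max_len=3)
span_vectors = {span: span_representation(hidden, span) for span in spans}
```

In a full system, each vector in `span_vectors` would be passed to a classifier that labels the span as disfluent or fluent; that classifier and its training are omitted here.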

Paper title