The Influence of Dataset Partitioning on Dysfluency Detection Systems

Paper Title

The Influence of Dataset Partitioning on Dysfluency Detection Systems

Authors

Bayerl, Sebastian P.; Wagner, Dominik; Nöth, Elmar; Bocklet, Tobias; Riedhammer, Korbinian

Abstract

This paper empirically investigates the influence of different data splits and splitting strategies on the performance of dysfluency detection systems. To this end, we perform experiments using wav2vec 2.0 models with a classification head, as well as support vector machines (SVM) trained on features extracted from the wav2vec 2.0 model, to detect dysfluencies. We train and evaluate the systems with different non-speaker-exclusive and speaker-exclusive splits of the Stuttering Events in Podcasts (SEP-28k) dataset to shed some light on the variability of results with respect to the partitioning method used. Furthermore, we show that the SEP-28k dataset is dominated by only a few speakers, making it difficult to evaluate. To remedy this problem, we created SEP-28k-Extended (SEP-28k-E), containing semi-automatically generated speaker and gender information for the SEP-28k corpus, and suggest different data splits, each useful for evaluating other aspects of methods for dysfluency detection.
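
The distinction between speaker-exclusive and non-speaker-exclusive splits can be illustrated with a minimal sketch. This is not the partitioning procedure used for SEP-28k-E; the function names, the toy clip list, and the greedy assignment of whole speakers to the test set are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def speaker_exclusive_split(clips, test_fraction=0.2, seed=0):
    """Split (clip_id, speaker_id) pairs so that no speaker is in both sets.

    Hypothetical helper: whole speakers are assigned to the test set until it
    holds at least `test_fraction` of all clips; the rest go to training.
    """
    by_speaker = defaultdict(list)
    for clip_id, speaker in clips:
        by_speaker[speaker].append(clip_id)

    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    target = test_fraction * len(clips)
    train, test = [], []
    for speaker in speakers:
        # A speaker's clips are never divided between the two sets.
        bucket = test if len(test) < target else train
        bucket.extend(by_speaker[speaker])
    return train, test


def random_split(clips, test_fraction=0.2, seed=0):
    """Non-speaker-exclusive baseline: shuffle clips and ignore speakers."""
    ids = [clip_id for clip_id, _ in clips]
    random.Random(seed).shuffle(ids)
    cut = int(round(test_fraction * len(ids)))
    return ids[cut:], ids[:cut]


# Toy data (made up): speaker "A" dominates, as a few speakers do in SEP-28k.
clips = [(f"A{i}", "A") for i in range(6)] + [
    ("B0", "B"), ("B1", "B"), ("C0", "C"), ("C1", "C"),
]
speaker_of = dict(clips)

train, test = speaker_exclusive_split(clips)
print("exclusive test speakers:", sorted({speaker_of[c] for c in test}))

r_train, r_test = random_split(clips)
shared = {speaker_of[c] for c in r_train} & {speaker_of[c] for c in r_test}
print("speakers shared by random train/test:", sorted(shared))
```

With a dominant speaker, the speaker-exclusive split can produce a test set much larger or smaller than requested, depending on where that speaker lands. This is one way to see why a corpus dominated by a few speakers is hard to partition for evaluation.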
