Paper Title

A Comparison Study on Infant-Parent Voice Diarization

Authors

Junzhe Zhu, Mark Hasegawa-Johnson, Nancy McElwain

Abstract

We design a framework for studying prelinguistic child voice from 3 to 24 months based on state-of-the-art algorithms in diarization. Our system consists of a time-invariant feature extractor, a context-dependent embedding generator, and a classifier. We study the effect of swapping out different components of the system, as well as changing the loss function, to find the best performance. We also present a multiple-instance learning technique that allows us to pre-train our parameters on larger datasets with coarser segment boundary labels. We found that our best system achieved a 43.8% diarization error rate (DER) on the test dataset, compared to the 55.4% DER achieved by the LENA software. We also found that using a convolutional feature extractor instead of logmel features significantly increases the performance of neural diarization.
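
For readers unfamiliar with the metric, the sketch below shows one common way a diarization error rate can be computed at the frame level: missed speech, false-alarm speech, and label confusion are summed and divided by the number of reference speech frames. This is an illustration only, not the paper's scoring procedure. It assumes one label per frame, no overlapping speakers, and fixed class labels (so no speaker-mapping step is needed). The function name `frame_der` and the labels `SIL`, `INFANT`, and `PARENT` are hypothetical.

```python
def frame_der(reference, hypothesis, silence="SIL"):
    """Frame-level DER for two equal-length sequences of per-frame labels.

    Assumption: each frame carries exactly one label, and `silence`
    marks non-speech. Labels are compared directly (no speaker mapping).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    missed = false_alarm = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref != silence:
            speech += 1              # reference speech frame
            if hyp == silence:
                missed += 1          # speech labelled as silence
            elif hyp != ref:
                confusion += 1       # speech assigned to the wrong class
        elif hyp != silence:
            false_alarm += 1         # silence labelled as speech

    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / speech


# Toy sequences with hypothetical labels (not data from the paper).
reference = ["SIL", "INFANT", "INFANT", "PARENT", "PARENT", "SIL", "PARENT", "INFANT"]
hypothesis = ["SIL", "INFANT", "PARENT", "PARENT", "SIL", "INFANT", "PARENT", "INFANT"]
der = frame_der(reference, hypothesis)

# Relative DER reduction implied by the two figures quoted in the abstract.
relative_reduction = (0.554 - 0.438) / 0.554
```

On the toy sequences there is one confusion, one miss, and one false alarm over six reference speech frames, so `der` is 0.5. Because false alarms are counted in the numerator but not the denominator, DER can exceed 100%. The figures quoted in the abstract (43.8% versus 55.4%) correspond to a relative reduction of roughly 21%.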
