Paper Title

data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup

Paper Authors

Lodagala, Vasista Sai; Ghosh, Sreyan; Umesh, S.

Paper Abstract

In this paper, we propose a new Self-Supervised Learning (SSL) algorithm called data2vec-aqc for speech representation learning from unlabeled speech data. Our goal is to improve SSL for speech in domains where both unlabeled and labeled data are limited. Building on the recently introduced data2vec, we add modules to the data2vec framework that leverage the benefits of data augmentation, quantized representations, and clustering. The interaction between these modules helps solve a cross-contrastive loss as an additional self-supervised objective. data2vec-aqc achieves up to 14.1% and 20.9% relative WER improvement over the existing state-of-the-art data2vec system on the test-clean and test-other sets of LibriSpeech, respectively, without the use of any language model (LM). Our proposed model also achieves up to 17.8% relative WER gains over the baseline data2vec when fine-tuned on a subset of the Switchboard dataset. Code: https://github.com/Speech-Lab-IITM/data2vec-aqc.
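To make the cross-contrastive objective mentioned in the abstract more concrete, below is a minimal PyTorch sketch, not the authors' implementation (which is in the linked repository). It assumes the model has already produced time-aligned contextualized outputs c1, c2 and quantized targets q1, q2 for two augmented views of the same utterance; the negative-sampling scheme, tensor shapes, and temperature value are illustrative assumptions. The "cross" part is simply that each view's context vectors are scored against the quantized targets of the other view.

```python
# Illustrative sketch of a cross-contrastive loss in the spirit of data2vec-aqc.
# Assumptions (not taken from the paper's code): inputs are (T, D) tensors for a
# single utterance, negatives come from other time steps of the same utterance,
# and temperature / num_negatives are placeholder values.
import torch
import torch.nn.functional as F


def contrastive_loss(context, targets, num_negatives=100, temperature=0.1):
    """wav2vec 2.0-style InfoNCE loss between context vectors and targets."""
    T, _ = targets.shape
    # Sample negative indices for each time step, skipping the positive index.
    neg_idx = torch.randint(0, T - 1, (T, num_negatives))
    pos_idx = torch.arange(T).unsqueeze(1)
    neg_idx = neg_idx + (neg_idx >= pos_idx).long()   # shift past the positive
    negatives = targets[neg_idx]                      # (T, num_negatives, D)

    # Candidate set: the positive target first, then the sampled negatives.
    candidates = torch.cat([targets.unsqueeze(1), negatives], dim=1)  # (T, 1+K, D)
    logits = F.cosine_similarity(context.unsqueeze(1), candidates, dim=-1) / temperature
    labels = torch.zeros(T, dtype=torch.long)         # positive sits at index 0
    return F.cross_entropy(logits, labels)


def cross_contrastive_loss(c1, q1, c2, q2):
    """Each view's context is contrasted against the quantized targets of the
    *other* augmented view of the same utterance."""
    return contrastive_loss(c1, q2) + contrastive_loss(c2, q1)
```

In this sketch the cross-pairing is what distinguishes the loss from a plain per-view contrastive objective: augmentation noise differs between the two views, so matching c1 against q2 (and c2 against q1) encourages representations that are stable across augmentations.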
