Paper Title

KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics

Authors

Mussakhojayeva, Saida; Khassanov, Yerbolat; Varol, Huseyin Atakan

Abstract

We present an expanded version of our previously released Kazakh text-to-speech (KazakhTTS) synthesis corpus. In the new KazakhTTS2 corpus, the overall size has increased from 93 hours to 271 hours, the number of speakers has risen from two to five (three females and two males), and the topic coverage has been diversified with the help of new sources, including a book and Wikipedia articles. This corpus is necessary for building high-quality TTS systems for Kazakh, a Central Asian agglutinative language from the Turkic family, which presents several linguistic challenges. We describe the corpus construction process and provide the details of the training and evaluation procedures for the TTS system. Our experimental results indicate that the constructed corpus is sufficient to build robust TTS models for real-world applications, with a subjective mean opinion score ranging from 3.6 to 4.2 for all the five speakers. We believe that our corpus will facilitate speech and language research for Kazakh and other Turkic languages, which are widely considered to be low-resource due to the limited availability of free linguistic data. The constructed corpus, code, and pretrained models are publicly available in our GitHub repository.
