Paper title

Neural voice cloning with a few low-quality samples

Authors

Sunghee Jung, Hoirin Kim

Abstract

In this paper, we explore the possibility of speech synthesis from low-quality found data using only a limited number of samples from the target speaker. Unlike previous works, which try to train the entire text-to-speech system on found data, we try to extract only the speaker embedding from the target speaker's found data. In addition, the two speaker-mimicking approaches, adaptation-based and speaker-encoder-based, are applied to the newly released LibriTTS dataset and the previously released VCTK corpus to examine the impact of speaker variety on clarity and target-speaker similarity.
