RWCP-SSD-Onomatopoeia: Onomatopoeic Word Dataset for Environmental Sound Synthesis

Paper title

RWCP-SSD-Onomatopoeia: Onomatopoeic Word Dataset for Environmental Sound Synthesis

Authors

Yuki Okamoto, Keisuke Imoto, Shinnosuke Takamichi, Ryosuke Yamanishi, Takahiro Fukumori, Yoichi Yamashita

Abstract

Environmental sound synthesis is a technique for generating natural environmental sounds. Conventional work on environmental sound synthesis using sound event labels cannot finely control properties of the synthesized sounds, such as pitch and timbre. We consider that onomatopoeic words can be used for environmental sound synthesis, since they are effective for describing the features of sounds. We believe that using onomatopoeic words will enable us to control the fine time-frequency structure of synthesized sounds. However, no dataset of onomatopoeic words is available for environmental sound synthesis. In this paper, we therefore present RWCP-SSD-Onomatopoeia, a dataset consisting of 155,568 onomatopoeic words paired with audio samples for environmental sound synthesis. We also collected self-reported confidence scores and others-reported acceptance scores for the onomatopoeic words, to help investigate the difficulty of transcribing sounds and of selecting a suitable word for environmental sound synthesis.
