Paper title
Bridging the prosody GAP: Genetic Algorithm with People to efficiently sample emotional prosody
Paper authors
Abstract
The human voice effectively communicates a range of emotions through nuanced variations in acoustics. Existing emotional speech corpora are limited in that they are either (a) highly curated to induce specific emotions with predefined categories that may not capture the full extent of emotional experiences, or (b) entangled in their semantic and prosodic cues, limiting the ability to study these cues separately. To overcome this challenge, we propose a new approach called 'Genetic Algorithm with People' (GAP), which integrates human decision and production into a genetic algorithm. In our design, we allow creators and raters to jointly optimize the emotional prosody over generations. We demonstrate that GAP can efficiently sample from the emotional speech space and capture a broad range of emotions, and show comparable results to state-of-the-art emotional speech corpora. GAP is language-independent and supports crowdsourcing at scale, and can thus support future large-scale cross-cultural research.
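The creator/rater loop described in the abstract can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not specify how prosody is represented, how ratings are collected, or how selection works. Here a recording is assumed to be a vector of hypothetical prosodic parameters in [0, 1], and the human raters and creators are replaced by the stand-in functions `simulated_rater` and `simulated_creator`; the selection scheme (keep the top-rated candidates) is likewise an assumption.

```python
import random


def simulated_rater(candidate, target):
    """Stand-in for a human rating (assumption): higher means the
    candidate sounds closer to the hidden target emotion."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))


def simulated_creator(parent, rng, step=0.1):
    """Stand-in for a human creator (assumption): produce a new variant
    of a parent by perturbing each parameter, clipped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, step))) for p in parent]


def run_gap_sketch(target, n_generations=20, population_size=12,
                   n_survivors=4, seed=0):
    """Run a toy genetic algorithm in which rating and production are
    delegated to the (simulated) rater and creator functions."""
    rng = random.Random(seed)
    dim = len(target)
    # Initial population: random hypothetical prosody parameter vectors.
    population = [[rng.random() for _ in range(dim)]
                  for _ in range(population_size)]
    best_per_generation = []
    for _ in range(n_generations):
        # Raters score every candidate; the top-rated ones survive.
        ranked = sorted(population,
                        key=lambda c: simulated_rater(c, target),
                        reverse=True)
        survivors = ranked[:n_survivors]
        best_per_generation.append(simulated_rater(survivors[0], target))
        # Creators produce new variants based on surviving candidates.
        offspring = [simulated_creator(rng.choice(survivors), rng)
                     for _ in range(population_size - n_survivors)]
        population = survivors + offspring
    return population, best_per_generation


if __name__ == "__main__":
    # Hypothetical target: three prosodic parameters for one emotion.
    _, history = run_gap_sketch([0.2, 0.8, 0.5])
    print(history[0], history[-1])
```

Because the survivors are carried over unchanged into the next generation, the best rating in this sketch never decreases from one generation to the next. In a real study the two stand-in functions would be replaced by human participants, whose ratings are noisy, so that guarantee would not necessarily hold.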