EigenEmo: Spectral Utterance Representation Using Dynamic Mode Decomposition for Speech Emotion Classification

Paper Title

EigenEmo: Spectral Utterance Representation Using Dynamic Mode Decomposition for Speech Emotion Classification

Authors

Mao, Shuiyang; Ching, P. C.; Lee, Tan

Abstract

Human emotional speech is, by its very nature, a variant signal. This results in dynamics intrinsic to automatic emotion classification based on speech. In this work, we explore a spectral decomposition method stemming from fluid-dynamics, known as Dynamic Mode Decomposition (DMD), to computationally represent and analyze the global utterance-level dynamics of emotional speech. Specifically, segment-level emotion-specific representations are first learned through an Emotion Distillation process. This forms a multi-dimensional signal of emotion flow for each utterance, called Emotion Profiles (EPs). The DMD algorithm is then applied to the resultant EPs to capture the eigenfrequencies, and hence the fundamental transition dynamics of the emotion flow. Evaluation experiments using the proposed approach, which we call EigenEmo, show promising results. Moreover, due to the positive combination of their complementary properties, concatenating the utterance representations generated by EigenEmo with simple EPs averaging yields noticeable gains.
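
To make the DMD step in the abstract concrete, the following is a minimal sketch of the standard exact DMD computation applied to a matrix whose columns are per-segment emotion vectors. It is an illustration under stated assumptions, not the authors' implementation: the function name `dmd_eigs`, the parameters `dt` and `rank`, and the toy two-dimensional signal standing in for real Emotion Profiles are all hypothetical, and the abstract does not say how the resulting eigenfrequencies are assembled into the final utterance representation.

```python
import numpy as np

def dmd_eigs(X, dt=1.0, rank=None):
    """Exact DMD of a snapshot matrix X with shape (n_dims, n_segments).

    Returns the DMD eigenvalues, the DMD modes, and the oscillation
    frequency (cycles per unit time) associated with each eigenvalue.
    The name and signature are hypothetical, chosen for illustration.
    """
    # Pair each snapshot with its successor: X2 is approximately A @ X1.
    X1, X2 = X[:, :-1], X[:, 1:]

    # Reduced SVD of the earlier snapshots, optionally rank-truncated.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    if rank is not None:
        U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]

    # Project the linear operator A onto the leading left singular vectors.
    S_inv = np.diag(1.0 / s)
    A_tilde = U.conj().T @ X2 @ Vh.conj().T @ S_inv

    # Eigen-decomposition of the reduced operator gives the DMD spectrum.
    eigvals, W = np.linalg.eig(A_tilde)
    modes = X2 @ Vh.conj().T @ S_inv @ W

    # The phase of each eigenvalue gives its frequency.
    freqs = np.angle(eigvals) / (2.0 * np.pi * dt)
    return eigvals, modes, freqs

# Toy stand-in for an emotion profile (an assumption, not real data):
# a 2-dimensional signal oscillating at 0.1 cycles per segment.
t = np.arange(50)
toy_eps = np.vstack([np.cos(2 * np.pi * 0.1 * t),
                     np.sin(2 * np.pi * 0.1 * t)])

eigvals, modes, freqs = dmd_eigs(toy_eps)
print(np.sort(np.abs(freqs)))
```

On this toy input the dynamics are a pure rotation, so the eigenvalues lie on the unit circle and the recovered frequencies come in a conjugate pair at plus and minus 0.1. For real, noisy profiles, passing a small `rank` avoids dividing by near-zero singular values.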
