Paper Title
Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation
Authors
Abstract
Utterance classification performance in low-resource dialogue systems is constrained by an inevitably high degree of data imbalance in class labels. We present a new end-to-end pairwise learning framework designed specifically to tackle this phenomenon by inducing a few-shot classification capability in the utterance representations and by augmenting data through interpolation of utterance representations. Our approach is a general-purpose training methodology, agnostic to the neural architecture used for encoding utterances. We show significant improvements in macro-F1 score over standard cross-entropy training for three different neural architectures, on both a Virtual Patient dialogue dataset and a low-resource emulation of the Switchboard dialogue act classification dataset.
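The abstract reports macro-F1 rather than accuracy. The sketch below illustrates why that choice matters when class labels are imbalanced. It is not taken from the paper: the function names `macro_f1` and `accuracy` and the toy labels are assumptions made for this example, and macro-F1 is computed by the standard definition, the unweighted mean of per-class F1.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: nine "common" utterances and one "rare" one.
# The classifier ignores the rare class and always predicts "common".
y_true = ["common"] * 9 + ["rare"]
y_pred = ["common"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy data the majority-class predictor reaches 0.9 accuracy, while its macro-F1 is about 0.474, because the rare class contributes an F1 of zero with the same weight as the common class.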