Paper title
MSCTD: A Multimodal Sentiment Chat Translation Dataset
Paper authors
Paper abstract
Multimodal machine translation and textual chat translation have received considerable attention in recent years. Although conversation in its natural form is usually multimodal, there is still little work on multimodal machine translation in conversations. In this work, we introduce a new task named Multimodal Chat Translation (MCT), which aims to generate more accurate translations with the help of the associated dialogue history and visual context. To this end, we first construct a Multimodal Sentiment Chat Translation Dataset (MSCTD) containing 142,871 English-Chinese utterance pairs in 14,762 bilingual dialogues and 30,370 English-German utterance pairs in 3,079 bilingual dialogues. Each utterance pair is associated with a visual context that reflects the current conversational scene and is annotated with a sentiment label. Then, we benchmark the task by establishing multiple baseline systems that incorporate multimodal and sentiment features for MCT. Preliminary experiments on four language directions (both directions of English-Chinese and English-German) verify the potential of contextual and multimodal information fusion and the positive impact of sentiment on the MCT task. Additionally, as a by-product, MSCTD also provides two new benchmarks for multimodal dialogue sentiment analysis. Our work can facilitate research on both multimodal chat translation and multimodal dialogue sentiment analysis.
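As a rough illustration of the dataset description in the abstract, the sketch below models one utterance pair and derives the average dialogue length from the quoted counts. The class name `UtterancePair`, its field names, and the example sentiment value are assumptions made for illustration only; they are not the released dataset's actual schema or label set.

```python
from dataclasses import dataclass


@dataclass
class UtterancePair:
    """Hypothetical record layout; field names are illustrative assumptions."""
    dialogue_id: int   # which bilingual dialogue the pair belongs to
    source_text: str   # utterance in the source language
    target_text: str   # its translation in the target language
    image_path: str    # visual context for the current conversational scene
    sentiment: str     # sentiment label (label set assumed, not from the abstract)


# Counts quoted in the abstract: (utterance pairs, bilingual dialogues).
STATS = {
    "English-Chinese": (142_871, 14_762),
    "English-German": (30_370, 3_079),
}


def average_turns(pairs: int, dialogues: int) -> float:
    """Mean number of utterance pairs per dialogue."""
    return pairs / dialogues


AVERAGES = {name: average_turns(*counts) for name, counts in STATS.items()}

for name, avg in AVERAGES.items():
    print(f"{name}: {avg:.2f} utterance pairs per dialogue")
```

From the quoted counts, both language pairs work out to just under ten utterance pairs per dialogue (about 9.7 for English-Chinese and 9.9 for English-German).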