Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement

Paper Title

Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement

Authors

Hui Liu, Wenya Wang, Haoliang Li

Abstract

Sarcasm is a linguistic phenomenon indicating a discrepancy between literal meanings and implied intentions. Due to its sophisticated nature, it is usually challenging to detect from the text alone. As a result, multi-modal sarcasm detection has received more attention in both academia and industry. However, most existing techniques model only the atomic-level inconsistencies between the text input and its accompanying image, ignoring more complex compositions in both modalities. Moreover, they neglect the rich information contained in external knowledge, e.g., image captions. In this paper, we propose a novel hierarchical framework for sarcasm detection that explores both atomic-level congruity, based on a multi-head cross-attention mechanism, and composition-level congruity, based on graph neural networks, where a post with low congruity can be identified as sarcastic. In addition, we examine the effect of various knowledge resources on sarcasm detection. Evaluation results on a public multi-modal sarcasm detection dataset based on Twitter demonstrate the superiority of our proposed model.
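
To make the idea of atomic-level congruity more concrete, here is a minimal sketch of cross attention between text tokens and image patches. It is an illustration only, not the authors' implementation: it uses a single head with no learned projections, and the names `cross_attention` and `congruity_score`, as well as the averaged cosine similarity used as the score, are assumptions introduced here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_tokens, image_patches):
    """Single-head scaled dot-product cross attention (hypothetical helper).

    Each text token acts as a query; image patches act as keys and values.
    Returns one attended image vector per text token.
    """
    d = len(image_patches[0])
    attended = []
    for q in text_tokens:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in image_patches
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, image_patches)) for i in range(d)]
        )
    return attended


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def congruity_score(text_tokens, image_patches):
    """Assumed scoring rule: mean cosine similarity between each text token
    and the image content it attends to. Lower values suggest incongruity."""
    attended = cross_attention(text_tokens, image_patches)
    sims = [cosine(t, a) for t, a in zip(text_tokens, attended)]
    return sum(sims) / len(sims)
```

In this toy setting, a text whose token vectors point in the same direction as the image patch vectors receives a higher score than one whose vectors point the opposite way. A real model would learn the query, key, and value projections, use several heads, and feed the result to a classifier rather than thresholding a raw similarity.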
