Paper Title
Multilingual and Multimodal Abuse Detection
Paper Authors
Paper Abstract
The presence of abusive content on social media platforms is undesirable, as it severely impedes healthy and safe social media interactions. While automatic abuse detection has been widely explored in the textual domain, audio abuse detection remains unexplored. In this paper, we attempt abuse detection in conversational audio from a multimodal perspective in a multilingual social media setting. Our key hypothesis is that, along with modelling the audio, incorporating discriminative information from other modalities can be highly beneficial for this task. Our proposed method, MADA, explicitly focuses on two modalities besides the audio itself: the underlying emotions expressed in the abusive audio and the semantic information encapsulated in the corresponding textual form. MADA demonstrates gains over audio-only approaches on the ADIMA dataset. We test the proposed approach on 10 different languages and observe consistent gains in the range of 0.6%-5.2% from leveraging multiple modalities. We also perform extensive ablation experiments to study the contribution of each modality and observe the best results when leveraging all modalities together. Additionally, our experiments empirically confirm a strong correlation between underlying emotions and abusive behaviour.
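
As a rough illustration of the multimodal idea described in the abstract, the minimal PyTorch sketch below fuses per-modality embeddings (audio, emotion, text) for binary abuse classification. The embedding dimensions, concatenation-based late fusion, and all class and variable names are illustrative assumptions; the abstract does not specify MADA's actual encoders, architecture, or fusion mechanism.

import torch
import torch.nn as nn

class MultimodalAbuseClassifier(nn.Module):
    """Illustrative fusion classifier: concatenates audio, emotion,
    and text embeddings, then predicts abusive vs. non-abusive.
    Hypothetical sketch; not the paper's actual MADA architecture."""

    def __init__(self, audio_dim=512, emotion_dim=128, text_dim=768, hidden_dim=256):
        super().__init__()
        fused_dim = audio_dim + emotion_dim + text_dim
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, 2),  # abusive vs. non-abusive
        )

    def forward(self, audio_emb, emotion_emb, text_emb):
        # Late fusion: concatenate the per-modality embeddings
        fused = torch.cat([audio_emb, emotion_emb, text_emb], dim=-1)
        return self.classifier(fused)

# Toy usage with random tensors standing in for encoder outputs
model = MultimodalAbuseClassifier()
audio = torch.randn(4, 512)    # e.g. from a pretrained audio encoder
emotion = torch.randn(4, 128)  # e.g. from an emotion recognition model
text = torch.randn(4, 768)     # e.g. from a multilingual text encoder
logits = model(audio, emotion, text)
print(logits.shape)  # torch.Size([4, 2])

In practice, concatenation could be replaced with attention-based or gated fusion; the ablations mentioned in the abstract indicate that each modality contributes, with the best results obtained when all three are used together.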