Paper Title

Endowing Language Models with Multimodal Knowledge Graph Representations

Paper Authors

Ningyuan Huang, Yash R. Deshpande, Yibo Liu, Houda Alberts, Kyunghyun Cho, Clara Vania, Iacer Calixto

Paper Abstract

We propose a method to make natural language understanding models more parameter efficient by storing knowledge in an external knowledge graph (KG) and retrieving from this KG using a dense index. Given (possibly multilingual) downstream task data, e.g., sentences in German, we retrieve entities from the KG and use their multimodal representations to improve downstream task performance. We use the recently released VisualSem KG as our external knowledge repository, which covers a subset of Wikipedia and WordNet entities, and compare a mix of tuple-based and graph-based algorithms to learn entity and relation representations that are grounded on the KG multimodal information. We demonstrate the usefulness of the learned entity representations on two downstream tasks, and show improved performance on the multilingual named entity recognition task by $0.3\%$--$0.7\%$ F1, while we achieve up to $2.5\%$ improvement in accuracy on the visual sense disambiguation task. All our code and data are available at \url{https://github.com/iacercalixto/visualsem-kg}.
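The abstract's core mechanism is retrieving KG entities for an input sentence via a dense index and then using their multimodal representations downstream. The snippet below is a minimal sketch of that retrieval step only, assuming precomputed entity embeddings and cosine-similarity search; the function names (`build_index`, `retrieve_entities`) and the toy data are hypothetical and are not taken from the released visualsem-kg code.

```python
# Hedged sketch (not the paper's implementation): dense-index retrieval of KG
# entities for a sentence, via cosine similarity over precomputed embeddings.
import numpy as np

def build_index(entity_embeddings: np.ndarray) -> np.ndarray:
    """L2-normalize entity embeddings so dot products equal cosine similarities."""
    norms = np.linalg.norm(entity_embeddings, axis=1, keepdims=True)
    return entity_embeddings / np.clip(norms, 1e-12, None)

def retrieve_entities(sentence_embedding: np.ndarray, index: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the top-k most similar KG entities for one sentence."""
    query = sentence_embedding / np.linalg.norm(sentence_embedding)
    scores = index @ query
    return np.argsort(-scores)[:k]

# Toy usage: 1000 hypothetical KG entities with 512-d multimodal embeddings.
rng = np.random.default_rng(0)
index = build_index(rng.normal(size=(1000, 512)))
sentence_vec = rng.normal(size=512)  # e.g., from a multilingual sentence encoder
top_entities = retrieve_entities(sentence_vec, index, k=5)
print(top_entities)  # entity ids whose representations would augment the downstream model
```

In the paper's setting, the retrieved entities' multimodal (text and image) representations would be concatenated or otherwise fused with the task model's inputs; the sketch stops at retrieval because the fusion details are task-specific.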
