Paper Title
CoLAKE: Contextualized Language and Knowledge Embedding
Paper Authors
Paper Abstract
With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models. Few works explore the potential of deep contextualized knowledge representation when injecting knowledge. In this paper, we propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representations for both language and knowledge with an extended MLM objective. Instead of injecting only entity embeddings, CoLAKE extracts the knowledge context of an entity from large-scale knowledge bases. To handle the heterogeneity of knowledge context and language context, we integrate them into a unified data structure, the word-knowledge graph (WK graph). CoLAKE is pre-trained on large-scale WK graphs with a modified Transformer encoder. We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks. Experimental results show that CoLAKE outperforms previous counterparts on most of the tasks. Moreover, CoLAKE achieves surprisingly high performance on our synthetic task, word-knowledge graph completion, which demonstrates the advantage of simultaneously contextualizing language and knowledge representations.
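To make the WK graph concrete, below is a minimal sketch (in Python) of how a sentence's word nodes and a linked entity's knowledge context might be merged into one graph. The toy knowledge base, the simple chain of word-word edges, and all names (build_wk_graph, TOY_KB, entity_links) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of assembling a word-knowledge (WK) graph: the token
# sequence forms the word part of the graph (kept as a chain here for
# brevity), and each linked entity mention is expanded with a few
# neighboring triples from a toy knowledge base.

# Toy knowledge base: subject -> list of (relation, object) triples.
TOY_KB = {
    "Harry_Potter": [("author", "J._K._Rowling"), ("genre", "Fantasy")],
    "J._K._Rowling": [("birthplace", "Yate")],
}

def build_wk_graph(tokens, entity_links, kb=TOY_KB, max_triples=2):
    """Return (nodes, edges) of a small WK graph.

    tokens: list of tokens of the sentence.
    entity_links: dict mapping token index -> canonical entity name.
    """
    nodes = [("word", tok) for tok in tokens]
    edges = []

    # Word-word edges: connect adjacent tokens (a stand-in for the word
    # sequence processed by a Transformer encoder).
    for i in range(len(tokens) - 1):
        edges.append((i, i + 1, "word-word"))

    # Attach each linked mention's knowledge context: anchor an entity node
    # at the mention and add relation/object nodes from the knowledge base.
    for tok_idx, entity in entity_links.items():
        ent_node = len(nodes)
        nodes.append(("entity", entity))
        edges.append((tok_idx, ent_node, "mention-entity"))
        for rel, obj in kb.get(entity, [])[:max_triples]:
            rel_node = len(nodes)
            nodes.append(("relation", rel))
            obj_node = len(nodes)
            nodes.append(("entity", obj))
            edges.append((ent_node, rel_node, "entity-relation"))
            edges.append((rel_node, obj_node, "relation-entity"))
    return nodes, edges

if __name__ == "__main__":
    tokens = ["harry", "potter", "is", "a", "famous", "novel"]
    links = {0: "Harry_Potter"}  # token 0 anchors the entity mention
    nodes, edges = build_wk_graph(tokens, links)
    for n in nodes:
        print(n)
    print(edges)
```

Under the extended MLM objective described in the abstract, masking would then be applied to word, entity, and relation nodes alike; the chain-shaped word edges here only keep the sketch small, whereas a Transformer encoder's self-attention effectively connects all word nodes.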