Paper Title

End-to-End Entity Classification on Multimodal Knowledge Graphs

Authors

Wilcke, W. X., Bloem, P., de Boer, V., van 't Veer, R. H., van Harmelen, F. A. H.

Abstract

End-to-end multimodal learning on knowledge graphs has been left largely unaddressed. Instead, most end-to-end models, such as message passing networks, learn solely from the relational information encoded in graphs' structure: raw values, or literals, are either omitted completely or are stripped of their values and treated as regular nodes. In either case we lose potentially relevant information which could have otherwise been exploited by our learning methods. To avoid this, we must treat literals and non-literals as separate cases. We must also address each modality separately and accordingly: numbers, texts, images, geometries, et cetera. We propose a multimodal message passing network which not only learns end-to-end from the structure of graphs, but also from their possibly diverse set of multimodal node features. Our model uses dedicated (neural) encoders to naturally learn embeddings for node features belonging to five different types of modalities, including images and geometries, which are projected into a joint representation space together with their relational information. We demonstrate our model on a node classification task, and evaluate the effect that each modality has on the overall performance. Our results support our hypothesis that including information from multiple modalities can help our models obtain a better overall performance.
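To make the idea concrete, below is a minimal sketch (assuming PyTorch) of the setup the abstract describes: dedicated per-modality encoders project literal node features into one joint embedding space, after which a relational message passing layer produces node class predictions. All module names, dimensions, and the toy graph are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: hypothetical modules, not the paper's code.
import torch
import torch.nn as nn


class MultimodalNodeEncoder(nn.Module):
    """Per-modality encoders projecting raw literals into a joint space."""

    def __init__(self, dim: int = 16, vocab: int = 1000):
        super().__init__()
        self.numeric = nn.Linear(1, dim)                   # numerical literals
        self.text = nn.EmbeddingBag(vocab, dim)            # token ids -> text embedding
        self.image = nn.Sequential(                        # tiny CNN stand-in for an image encoder
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, dim))
        self.geometry = nn.Linear(2, dim)                  # e.g. point coordinates

    def forward(self, modality: str, x: torch.Tensor) -> torch.Tensor:
        return getattr(self, modality)(x)


class RelationalLayer(nn.Module):
    """One message passing step: a weight per relation plus a self-loop."""

    def __init__(self, in_dim: int, out_dim: int, num_relations: int):
        super().__init__()
        self.rel = nn.ModuleList([nn.Linear(in_dim, out_dim, bias=False)
                                  for _ in range(num_relations)])
        self.self_loop = nn.Linear(in_dim, out_dim)

    def forward(self, h, adjs):
        # adjs[r] is the (row-normalised) adjacency matrix of relation r
        out = self.self_loop(h)
        for r, adj in enumerate(adjs):
            out = out + self.rel[r](adj @ h)
        return out


if __name__ == "__main__":
    dim, num_nodes, num_classes = 16, 4, 3
    enc = MultimodalNodeEncoder(dim)
    # Toy node features: two numeric nodes, one image node, one geometry node.
    h = torch.stack([
        enc("numeric", torch.tensor([[3.5]]))[0],
        enc("numeric", torch.tensor([[7.0]]))[0],
        enc("image", torch.randn(1, 3, 32, 32))[0],
        enc("geometry", torch.tensor([[52.3, 4.9]]))[0],
    ])
    adjs = [torch.eye(num_nodes)]                          # one dummy relation
    logits = RelationalLayer(dim, num_classes, len(adjs))(h, adjs)
    print(logits.shape)                                    # torch.Size([4, 3])
```

In the paper the joint embeddings combine literal features with relational (structural) information and the whole pipeline is trained end-to-end for entity classification; the sketch above only shows the encode-then-propagate pattern on a toy graph.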
