Paper title
Towards Learning Instantiated Logical Rules from Knowledge Graphs
Paper authors
Paper abstract
Efficiently inducing high-level interpretable regularities from knowledge graphs (KGs) is an essential yet challenging task that benefits many downstream applications. In this work, we present GPFL, a probabilistic rule learner optimized to mine instantiated first-order logic rules from KGs. Instantiated rules contain constants extracted from KGs. Compared to abstract rules, which contain no constants, instantiated rules can explain and express concepts in more detail. GPFL uses a novel two-stage rule generation mechanism that first generalizes extracted paths into templates, which are acyclic abstract rules, until a certain degree of template saturation is reached, and then specializes the generated templates into instantiated rules. Unlike existing works, which ground every mined instantiated rule separately for evaluation, GPFL shares groundings between structurally similar rules so that they can be evaluated collectively. Moreover, we reveal the presence of overfitting rules, their impact on predictive performance, and the effectiveness of a simple validation method for filtering them out. Through extensive experiments on public benchmark datasets, we show that GPFL (1) significantly reduces the runtime of evaluating instantiated rules; (2) discovers many more high-quality instantiated rules than existing works; (3) improves the predictive performance of learned rules by removing overfitting rules via validation; and (4) is competitive with state-of-the-art baselines on the knowledge graph completion task.
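The two-stage idea in the abstract (generalize paths into templates, then specialize templates into instantiated rules that share groundings) can be illustrated with a toy sketch. This is not the GPFL implementation: the triples, the function names `generalize`, `index_by_object`, and `specialize`, the restriction to single-atom rule bodies, and the confidence measure (support divided by the number of body groundings) are all assumptions made for illustration, and template saturation is not modeled.

```python
from collections import defaultdict

# Toy knowledge graph of (subject, relation, object) triples. All data is invented.
TRIPLES = [
    ("alice", "lives_in", "paris"),
    ("bob", "lives_in", "paris"),
    ("dave", "lives_in", "paris"),
    ("carol", "lives_in", "berlin"),
    ("alice", "speaks", "french"),
    ("bob", "speaks", "french"),
    ("dave", "speaks", "english"),
    ("carol", "speaks", "german"),
]


def generalize(path):
    """Stage 1 (simplified): replace the entities of a two-triple path that
    shares a subject with variables, giving head_rel(X, Y) <- body_rel(X, Z)."""
    (head_subj, head_rel, _), (body_subj, body_rel, _) = path
    if head_subj != body_subj:
        raise ValueError("path triples must share their subject")
    return head_rel, body_rel


def index_by_object(relation, triples):
    """Map each object constant of `relation` to the set of its subjects."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        if rel == relation:
            index[obj].add(subj)
    return index


def specialize(template, triples):
    """Stage 2 (simplified): instantiate head_rel(X, c) <- body_rel(X, d) for
    constants c and d. The two indexes are built once per template and reused
    by every instantiated rule, instead of grounding each rule separately.
    Returns {(c, d): (support, confidence)} for rules with support > 0."""
    head_rel, body_rel = template
    head_index = index_by_object(head_rel, triples)
    body_index = index_by_object(body_rel, triples)
    rules = {}
    for d, body_subjects in body_index.items():
        for c, head_subjects in head_index.items():
            support = len(body_subjects & head_subjects)
            if support > 0:
                rules[(c, d)] = (support, support / len(body_subjects))
    return rules


template = generalize([("alice", "speaks", "french"), ("alice", "lives_in", "paris")])
rules = specialize(template, TRIPLES)
head_rel, body_rel = template
for (c, d), (support, conf) in sorted(rules.items()):
    print(f"{head_rel}(X, {c}) <- {body_rel}(X, {d})  support={support}  confidence={conf:.2f}")
```

In this sketch a rule such as `speaks(X, german) <- lives_in(X, berlin)` reaches full confidence from a single supporting entity, which hints at why rules that fit the training graph perfectly may still need to be checked against held-out data.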