论文标题
递归标签偶像图查询的可及性指数
A Reachability Index for Recursive Label-Concatenated Graph Queries
论文作者
论文摘要
可以查看从源节点到目标节点的路径的存在性查询是查询和处理图数据的基本操作员。当前基于指数的可及性查询评估的方法要么集中于平整可及性或基于约束的可及性。在本文中,我们第一次研究基于索引的递归标记符合性可及性查询的问题,称为RLC查询。这些查询检查了可以满足通过Kleene Plus下最多k边缘标签的串联定义的约束的路径的存在。许多实用的图形数据库和网络分析应用程序都显示RLC查询。但是,他们的评估在当前的图形数据库引擎中仍然令人难以置信。 我们介绍了RLC索引,这是有效处理RLC查询的第一个可及性索引。 RLC索引检查源顶点是否可以到达中间顶点,该顶点也可以在递归标签征收约束下达到目标顶点。我们提出了一种索引算法来构建RLC索引,该算法保证了查询执行的健全性和完整性,并避免记录冗余索引条目。关于现实世界图的全面实验表明,RLC指数可以显着降低离线处理成本和及时闭合的内存开销,同时改善查询处理比在线遍历上最多六个数量级。最后,我们对RLC索引的开源实施极大地胜过当前主流图引擎,用于评估RLC查询。
Reachability queries checking the existence of a path from a source node to a target node are fundamental operators for querying and processing graph data. Current approaches for index-based evaluation of reachability queries either focus on plain reachability or constraint-based reachability with alternation only. In this paper, for the first time we study the problem of index-based processing for recursive label-concatenated reachability queries, referred to as RLC queries. These queries check the existence of a path that can satisfy the constraint defined by a concatenation of at most k edge labels under the Kleene plus. Many practical graph database and network analysis applications exhibit RLC queries. However, their evaluation remains prohibitive in current graph database engines. We introduce the RLC index, the first reachability index to efficiently process RLC queries. The RLC index checks whether the source vertex can reach an intermediate vertex that can also reach the target vertex under a recursive label-concatenated constraint. We propose an indexing algorithm to build the RLC index, which guarantees the soundness and the completeness of query execution and avoids recording redundant index entries. Comprehensive experiments on real-world graphs show that the RLC index can significantly reduce both the offline processing cost and the memory overhead of transitive closure while improving query processing up to six orders of magnitude over online traversals. Finally, our open-source implementation of the RLC index significantly outperforms current mainstream graph engines for evaluating RLC queries.