大规模两分图的高效社区搜索

论文标题

大规模两分图的高效社区搜索

Efficient and Effective Community Search on Large-scale Bipartite Graphs

论文作者

Wang, Kai, Zhang, Wenjie, Lin, Xuemin, Zhang, Ying, Qin, Lu, Zhang, Yuting

论文摘要

两部分图被广泛用于建模两种类型的实体之间的关系。社区搜索检索了包含查询顶点的密集连接的子图，该子图已在单分段图上进行了广泛研究。但是，在两分图上进行社区搜索仍然没有探索。此外，在两分图上的所有现有凝聚力子图模型只能应用于测量两组顶点之间的结构凝聚力，同时忽略形成社区的边缘重量。在本文中，我们研究了加权两分图的重要（Alpha，beta） - 社区搜索问题。鉴于查询顶点Q，我们旨在找到Q的显着（Alpha，beta） - 社区r，该Q采用（Alpha，beta） - 核心来表征顶点的参与度，并最大程度地提高R。为了支持R的快速检索，我们首先检索（Alpha，beta）的最大连接子图，其中包含查询顶点（（Alpha，beta） - community），搜索空间仅限于此子图，其尺寸比原始图小得多。提出了一个新颖的索引结构，可以在O（delta * m）时间内构建，并占用O（delta * m）空间，其中m是G中的边缘数，Delta受M的平方根界定，并且在实践中较小。利用索引，可以在最佳时间内检索（alpha，beta） - 社区。为了进一步获得R，我们通过分别从（Alpha，beta） - 社区缩小并从查询顶点扩展来开发剥离和扩展算法来进行搜索。实际图上的实验结果不仅证明了显着（Alpha，beta） - 社区模型的有效性，而且还验证了我们的查询处理和索引技术的效率。

Bipartite graphs are widely used to model relationships between two types of entities. Community search retrieves densely connected subgraphs containing a query vertex, which has been extensively studied on unipartite graphs. However, community search on bipartite graphs remains largely unexplored. Moreover, all existing cohesive subgraph models on bipartite graphs can only be applied to measure the structure cohesiveness between two sets of vertices while overlooking the edge weight in forming the community. In this paper, we study the significant (alpha, beta)-community search problem on weighted bipartite graphs. Given a query vertex q, we aim to find the significant (alpha, beta)-community R of q which adopts (alpha, beta)-core to characterize the engagement level of vertices, and maximizes the minimum edge weight (significance) within R. To support fast retrieval of R, we first retrieve the maximal connected subgraph of (alpha, beta)-core containing the query vertex (the (alpha, beta)-community), and the search space is limited to this subgraph with a much smaller size than the original graph. A novel index structure is presented which can be built in O(delta * m) time and takes O(delta * m) space where m is the number of edges in G, delta is bounded by the square root of m and is much smaller in practice. Utilizing the index, the (alpha, beta)-community can be retrieved in optimal time. To further obtain R, we develop peeling and expansion algorithms to conduct searches by shrinking from the (alpha, beta)-community and expanding from the query vertex, respectively. The experimental results on real graphs not only demonstrate the effectiveness of the significant (alpha, beta)-community model but also validate the efficiency of our query processing and indexing techniques.

下载PDF全文

下载文献需遵守相关版权规定

论文标题