论文标题
近似时只需slaq:网络尺度图的准确光谱距离
Just SLaQ When You Approximate: Accurate Spectral Distances for Web-Scale Graphs
论文作者
论文摘要
图比较是数据挖掘和信息检索的基本操作。由于图的组合性质,很难平衡相似度度量及其可扩展性的表现力。光谱分析提供了用于研究图的多尺度结构的典型工具,并且是旨在推理图之间差异的基础。但是,计算全图的全频谱在计算上是过敏的。因此,光谱图比较方法通常依赖于具有弱误差保证的粗糙近似技术。在这项工作中,我们提出了Slaq,这是一种用于计算数十亿节点和边缘图之间光谱距离的有效近似技术。我们得出相应的误差界限,并证明在图边数中可能会在时间线性中进行准确的计算。在彻底的实验评估中,我们表明,Slaq的表现优于现有方法,通常以近似精度的几个数量级数量,并保持可比性的性能,从而可以在一台机器上在几分钟内比较数百万尺度的图形。
Graph comparison is a fundamental operation in data mining and information retrieval. Due to the combinatorial nature of graphs, it is hard to balance the expressiveness of the similarity measure and its scalability. Spectral analysis provides quintessential tools for studying the multi-scale structure of graphs and is a well-suited foundation for reasoning about differences between graphs. However, computing full spectrum of large graphs is computationally prohibitive; thus, spectral graph comparison methods often rely on rough approximation techniques with weak error guarantees. In this work, we propose SLaQ, an efficient and effective approximation technique for computing spectral distances between graphs with billions of nodes and edges. We derive the corresponding error bounds and demonstrate that accurate computation is possible in time linear in the number of graph edges. In a thorough experimental evaluation, we show that SLaQ outperforms existing methods, oftentimes by several orders of magnitude in approximation accuracy, and maintains comparable performance, allowing to compare million-scale graphs in a matter of minutes on a single machine.