论文标题

日志(图):近乎最佳的高性能图表

Log(Graph): A Near-Optimal High-Performance Graph Representation

论文作者

Besta, Maciej, Stanojevic, Dimitri, Zivic, Tijana, Singh, Jagpreet, Hoerold, Maurice, Hoefler, Torsten

论文摘要

当今在机器学习或社交网络分析之类的域中使用的图形可能包含数百亿个边缘。但是,它们不一定是有效存储的,诸如邻接之类的标准图表示列出了大量位,而图形压缩方案(例如WebGraph)通常需要耗时的减压。为了解决这个问题,我们提出了log(图):将高压缩比与非常低的压缩减压结合起来的图表表示,以实现更便宜,更快的图形处理。关键想法是编码图形,以便表示各个部分接近或匹配相应的存储下限。我们称我们的方法为“图对数”,因为这些边界通常是对数。我们的高性能日志(图)实现基于现代位置操作和最先进的简洁数据结构可实现高压缩比和性能。例如,与调谐的图算法处理基准套件(GAPB)相比,它将图形大小降低了20-35%,同时匹配GAPBS的性能,甚至由于减少了传输的数据而产生的加速。它接近已建立的WebGraph压缩库的压缩率,同时使高达2倍以上的加速度。日志(图)可以改善单个NUMA节点以及分布式内存系统上各种图形处理引擎或库的设计。

Today's graphs used in domains such as machine learning or social network analysis may contain hundreds of billions of edges. Yet, they are not necessarily stored efficiently, and standard graph representations such as adjacency lists waste a significant number of bits while graph compression schemes such as WebGraph often require time-consuming decompression. To address this, we propose Log(Graph): a graph representation that combines high compression ratios with very low-overhead decompression to enable cheaper and faster graph processing. The key idea is to encode a graph so that the parts of the representation approach or match the respective storage lower bounds. We call our approach "graph logarithmization" because these bounds are usually logarithmic. Our high-performance Log(Graph) implementation based on modern bitwise operations and state-of-the-art succinct data structures achieves high compression ratios as well as performance. For example, compared to the tuned Graph Algorithm Processing Benchmark Suite (GAPBS), it reduces graph sizes by 20-35% while matching GAPBS' performance or even delivering speedups due to reducing amounts of transferred data. It approaches the compression ratio of the established WebGraph compression library while enabling speedups of up to more than 2x. Log(Graph) can improve the design of various graph processing engines or libraries on single NUMA nodes as well as distributed-memory systems.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源