FPGA上以基因为中心的最大匹配

论文标题

FPGA上以基因为中心的最大匹配

Substream-Centric Maximum Matchings on FPGA

论文作者

Besta, Maciej, Fischer, Marc, Ben-Nun, Tal, Stanojevic, Dimitri, Licht, Johannes De Fine, Hoefler, Torsten

论文摘要

在社交网络分析，计算科学，调度和其他方面，开发高性能和节能算法变得越来越重要。在这项工作中，我们提出了为FPGA设计的第一个最大匹配算法。它是节能的，并且可以证明可以保证准确性，性能和存储利用率。为了实现这一目标，我们放弃了流行的图形处理范例，例如以顶点为中心的编程，通常需要大量的沟通成本。取而代之的是，我们提出了一种以基因为中心的方法，其中的数据流分为独立处理的子流，以使更多的并行性，同时降低通信成本。我们的工作基于流图算法的理论，并分析了14个模型和28个算法。我们使用此分析来提供与FPGA平台的物理约束相匹配的理论基础。我们的算法可提供高性能（超过4倍的速度超过调谐的并行CPU变体），低内存，高准确性和有效使用FPGA资源。以基因为中心的方法可以轻松地扩展到其他算法，以在FPGA上提供低功率和高性能图处理。

Developing high-performance and energy-efficient algorithms for maximum matchings is becoming increasingly important in social network analysis, computational sciences, scheduling, and others. In this work, we propose the first maximum matching algorithm designed for FPGAs; it is energy-efficient and has provable guarantees on accuracy, performance, and storage utilization. To achieve this, we forego popular graph processing paradigms, such as vertex-centric programming, that often entail large communication costs. Instead, we propose a substream-centric approach, in which the input stream of data is divided into substreams processed independently to enable more parallelism while lowering communication costs. We base our work on the theory of streaming graph algorithms and analyze 14 models and 28 algorithms. We use this analysis to provide theoretical underpinning that matches the physical constraints of FPGA platforms. Our algorithm delivers high performance (more than 4x speedup over tuned parallel CPU variants), low memory, high accuracy, and effective usage of FPGA resources. The substream-centric approach could easily be extended to other algorithms to offer low-power and high-performance graph processing on FPGAs.

下载PDF全文

下载文献需遵守相关版权规定

论文标题