Lessons Learned on MPI+Threads Communication

Paper Title

Lessons Learned on MPI+Threads Communication

Authors

Rohit Zambre, Aparna Chandramowlishwaran

Abstract

Hybrid MPI+threads programming is gaining prominence, but, in practice, applications perform slower with it compared to the MPI everywhere model. The most critical challenge to the parallel efficiency of MPI+threads applications is slow MPI_THREAD_MULTIPLE performance. MPI libraries have recently made significant strides on this front, but to exploit their capabilities, users must expose the communication parallelism in their MPI+threads applications. Recent studies show that MPI 4.0 provides users with new performance-oriented options to do so, but our evaluation of these new mechanisms shows that they pose several challenges. An alternative design is MPI Endpoints. In this paper, we present a comparison of the different designs from the perspective of MPI's end-users: domain scientists and application developers. We evaluate the mechanisms on metrics beyond performance such as usability, scope, and portability. Based on the lessons learned, we make a case for a future direction.
