Paper Title
MPIX Stream: An Explicit Solution to Hybrid MPI+X Programming
Paper Authors
Abstract
The hybrid MPI+X programming paradigm, where X refers to threads or GPUs, has gained prominence in the high-performance computing arena. This corresponds to a trend of system architectures growing more heterogeneous. The current MPI standard only specifies the compatibility levels between MPI and threading runtimes. No MPI concept or interface exists for applications to pass thread context or GPU stream context to MPI implementations explicitly. This lack has made performance optimization complicated in some cases and impossible in other cases. We propose a new concept in MPI, called MPIX stream, to represent the general serial execution context that exists in X runtimes. MPIX streams can be directly mapped to threads or GPU execution streams. Passing thread context into MPI allows implementations to precisely map the execution contexts to network endpoints. Passing GPU execution context into MPI allows implementations to directly operate on GPU streams, lowering the CPU/GPU synchronization cost.