Paper Title

A High-Performance Design for Hierarchical Parallelism in the QMCPACK Monte Carlo code

Authors

Ye Luo, Peter Doak, Paul Kent

Abstract

We introduce a new high-performance design for parallelism within the Quantum Monte Carlo code QMCPACK. We demonstrate that the new design is better able to exploit the hierarchical parallelism of heterogeneous architectures compared to the previous GPU implementation. The new version is able to achieve higher GPU occupancy via the new concept of crowds of Monte Carlo walkers, and by enabling more host CPU threads to effectively offload to the GPU. The higher performance is expected to be achieved independent of the underlying hardware, significantly improving developer productivity and reducing code maintenance costs. Scientific productivity is also improved with full support for fallback to CPU execution when GPU implementations are not available or CPU execution is more optimal.
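
The idea of grouping walkers into crowds can be illustrated with a toy sketch. This is not QMCPACK code: the function names, the round-robin partitioning, and the one-dimensional random move are all assumptions made for illustration. Each crowd is advanced by its own host thread in a single batched call, which stands in for the point where a real implementation might offload a batch of walkers to the GPU.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def split_into_crowds(walkers, num_crowds):
    """Deal walkers round-robin into num_crowds groups (hypothetical policy)."""
    return [walkers[i::num_crowds] for i in range(num_crowds)]


def advance_crowd(crowd, step_size, seed):
    """Move every walker in one crowd in a single batched call.

    In a real code this is where one host thread could launch a batched
    kernel for its whole crowd; here it is just a uniform random displacement.
    """
    rng = random.Random(seed)  # per-crowd generator, so threads share no state
    return [x + rng.uniform(-step_size, step_size) for x in crowd]


def advance_all(walkers, num_crowds, step_size=0.5, seed=0):
    """Advance all walkers by one step, one host thread per crowd."""
    crowds = split_into_crowds(walkers, num_crowds)
    with ThreadPoolExecutor(max_workers=num_crowds) as pool:
        futures = [
            pool.submit(advance_crowd, crowd, step_size, seed + i)
            for i, crowd in enumerate(crowds)
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    moved = advance_all([0.0] * 8, num_crowds=4)
    print([len(crowd) for crowd in moved])
```

In this sketch the crowd size controls how much work each batched call carries, while the number of crowds controls how many host threads issue work concurrently; those are the two levels of parallelism the abstract refers to.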
