Julia Cloud Matrix Machine: Dynamic Matrix Language Acceleration on Multicore Clusters in the Cloud

Paper Title

Julia Cloud Matrix Machine: Dynamic Matrix Language Acceleration on Multicore Clusters in the Cloud

Authors

Lee, Jay Hwan; Kim, Yeonsoo; Ryu, Younghyun; Sodsong, Wasuwee; Jeon, Hyunjun; Park, Jinsik; Burgstaller, Bernd; Scholz, Bernhard

Abstract

In emerging scientific computing environments, matrix computations of increasing size and complexity are becoming prevalent. However, contemporary matrix language implementations provide insufficient support for the efficient utilization of cloud computing resources, particularly on the user side. We thus developed an extension of the Julia high-performance computation language such that matrix computations are automatically parallelized in the cloud, shielding users from direct interaction with complex, explicitly parallel computations. We implement lazy evaluation semantics combined with directed graphs to optimize matrix operations on the fly, while dynamic simulation finds the optimal tile size and schedule for a given cluster of cloud nodes. A time model that predicts the cluster's performance capacity is constructed to enable the simulations. Automatic configuration of communication and worker processes on the cloud networks allows the framework to scale up automatically to clusters of heterogeneous nodes. Our framework's experimental evaluation comprises eleven benchmarks on a fourteen-node (564 CPUs) cluster in the AWS public cloud, revealing speedups of up to a factor of 5.1, with an average of 74.39% of the upper bound for speedups.
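
The abstract's combination of lazy evaluation with a directed graph can be illustrated with a minimal sketch. It is written in Python rather than Julia and is not the paper's implementation: the names `Lazy`, `leaf`, and `evaluate` are hypothetical, matrices are plain nested lists, and tiling, scheduling, and distribution across cloud nodes are left out entirely. Operators only record graph nodes; work happens when a result is requested, and a subexpression shared by several nodes is computed once.

```python
class Lazy:
    """Node in a directed acyclic graph of deferred matrix operations."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "leaf", "add", or "matmul" (hypothetical op names)
        self.inputs = inputs  # upstream nodes this node depends on
        self.value = value    # cached result; None until the node is evaluated

    def __add__(self, other):
        # Record the operation instead of computing it.
        return Lazy("add", (self, other))

    def __matmul__(self, other):
        return Lazy("matmul", (self, other))


def leaf(rows):
    """Wrap a concrete matrix (list of rows) as a graph leaf."""
    return Lazy("leaf", value=[list(r) for r in rows])


def evaluate(node, log=None):
    """Compute a node on demand, reusing cached results of shared inputs."""
    if node.value is not None:
        return node.value
    a, b = (evaluate(n, log) for n in node.inputs)
    if node.op == "add":
        result = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    elif node.op == "matmul":
        cols = list(zip(*b))
        result = [[sum(x * y for x, y in zip(row, col)) for col in cols]
                  for row in a]
    else:
        raise ValueError(f"unknown operation: {node.op}")
    if log is not None:
        log.append(node.op)  # record which operations actually ran
    node.value = result
    return result


A = leaf([[1, 2], [3, 4]])
B = leaf([[0, 1], [1, 0]])
S = A + B   # nothing is computed yet
E = S @ S   # S appears twice in the graph but is evaluated once
log = []
print(evaluate(E, log))  # [[13, 15], [20, 28]]
print(log)               # ['add', 'matmul']
```

Deferring execution in this way gives a runtime the whole expression graph before any work starts, which is the point at which decisions such as tile size and schedule could be made.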
