Paper Title

Enabling Dynamic and Intelligent Workflows for HPC, Data Analytics, and AI Convergence

Authors

Ejarque, Jorge; Badia, Rosa M.; Albertin, Loïc; Aloisio, Giovanni; Baglione, Enrico; Becerra, Yolanda; Boschert, Stefan; Berlin, Julian R.; D'Anca, Alessandro; Elia, Donatello; Exertier, François; Fiore, Sandro; Flich, José; Folch, Arnau; Gibbons, Steven J.; Koldunov, Nikolay; Lordan, Francesc; Lorito, Stefano; Løvholt, Finn; Macías, Jorge; Marozzo, Fabrizio; Michelini, Alberto; Monterrubio-Velasco, Marisol; Pienkowska, Marta; de la Puente, Josep; Queralt, Anna; Quintana-Ortí, Enrique S.; Rodríguez, Juan E.; Romano, Fabrizio; Rossi, Riccardo; Rybicki, Jedrzej; Kupczyk, Miroslaw; Selva, Jacopo; Talia, Domenico; Tonini, Roberto; Trunfio, Paolo; Volp, Manuela

Abstract

The evolution of High-Performance Computing (HPC) platforms enables the design and execution of progressively larger and more complex workflow applications in these systems. The complexity comes not only from the number of elements that compose the workflows but also from the type of computations they perform. While traditional HPC workflows target simulations and modelling of physical phenomena, current needs require in addition data analytics (DA) and artificial intelligence (AI) tasks. However, the development of these workflows is hampered by the lack of proper programming models and environments that support the integration of HPC, DA, and AI, as well as the lack of tools to easily deploy and execute the workflows in HPC systems. To progress in this direction, this paper presents use cases where complex workflows are required and investigates the main issues to be addressed for the HPC/DA/AI convergence. Based on this study, the paper identifies the challenges of a new workflow platform to manage complex workflows. Finally, it proposes a development approach for such a workflow platform addressing these challenges in two directions: first, by defining a software stack that provides the functionalities to manage these complex workflows; and second, by proposing the HPC Workflow as a Service (HPCWaaS) paradigm, which leverages the software stack to facilitate the reusability of complex workflows in federated HPC infrastructures. Proposals presented in this work are subject to study and development as part of the EuroHPC eFlows4HPC project.
