Paper title
AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance
Paper authors
Paper abstract
The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer number of calculations and volume of data to manage. Next-generation exascale supercomputers will intensify these challenges, making automated and scalable solutions crucial. In recent years, we have been developing AiiDA (http://www.aiida.net), a robust open-source high-throughput infrastructure that addresses the challenges of automated workflow management and data provenance recording. Here, we introduce the developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands of processes per hour while automatically preserving and storing the full data provenance in a relational database, making it queryable and traversable and thus enabling high-performance data analytics. AiiDA's workflow language provides advanced automation, error-handling features, and a flexible plugin model that allows interfacing with any simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly, and reproducible.