Paper Title

MLComp: A Methodology for Machine Learning-based Performance Estimation and Adaptive Selection of Pareto-Optimal Compiler Optimization Sequences

Authors

Alessio Colucci, Dávid Juhász, Martin Mosbeck, Alberto Marchisio, Semeen Rehman, Manfred Kreutzer, Guenther Nadbath, Axel Jantsch, Muhammad Shafique

Abstract

Embedded systems have proliferated in various consumer and industrial applications with the evolution of Cyber-Physical Systems and the Internet of Things. These systems are subjected to stringent constraints so that embedded software must be optimized for multiple objectives simultaneously, namely reduced energy consumption, execution time, and code size. Compilers offer optimization phases to improve these metrics. However, proper selection and ordering of them depends on multiple factors and typically requires expert knowledge. State-of-the-art optimizers facilitate different platforms and applications case by case, and they are limited by optimizing one metric at a time, as well as requiring a time-consuming adaptation for different targets through dynamic profiling. To address these problems, we propose the novel MLComp methodology, in which optimization phases are sequenced by a Reinforcement Learning-based policy. Training of the policy is supported by Machine Learning-based analytical models for quick performance estimation, thereby drastically reducing the time spent for dynamic profiling. In our framework, different Machine Learning models are automatically tested to choose the best-fitting one. The trained Performance Estimator model is leveraged to efficiently devise Reinforcement Learning-based multi-objective policies for creating quasi-optimal phase sequences. Compared to state-of-the-art estimation models, our Performance Estimator model achieves lower relative error (<2%) with up to 50x faster training time over multiple platforms and application domains. Our Phase Selection Policy improves execution time and energy consumption of a given code by up to 12% and 6%, respectively. The Performance Estimator and the Phase Selection Policy can be trained efficiently for any target platform and application domain.
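The abstract describes two interacting components: an ML-based Performance Estimator that stands in for most dynamic profiling, and an RL-based Phase Selection Policy that assembles optimization sequences against multiple objectives. The Python sketch below is only an illustration of that interaction, not the authors' implementation: the phase names, the sequence encoding, and the "profiling" data are synthetic placeholders, and a greedy weighted-sum search is used in place of the paper's Reinforcement Learning policy.

```python
# Illustrative sketch (not the MLComp code): a surrogate performance
# estimator plus an estimator-guided search for a phase sequence.
# Phase names, feature encoding, and training data are invented here.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical pool of compiler optimization phases (stand-ins).
PHASES = ["inline", "loop-unroll", "vectorize", "dce", "licm", "gvn"]
MAX_SEQ_LEN = 4

def encode(sequence):
    """Encode a phase sequence as a fixed-length numeric vector:
    one slot per position, holding the 1-based phase index (0 = empty)."""
    vec = np.zeros(MAX_SEQ_LEN)
    for i, phase in enumerate(sequence):
        vec[i] = PHASES.index(phase) + 1
    return vec

# --- Train a surrogate estimator on synthetic "profiling" data. -------------
# In MLComp this model replaces repeated dynamic profiling; here the
# measurements are random numbers, purely to keep the sketch runnable.
rng = np.random.default_rng(0)
X_train = rng.integers(0, len(PHASES) + 1, size=(200, MAX_SEQ_LEN))
y_train = np.column_stack([
    100 + X_train.sum(axis=1) * rng.uniform(0.5, 1.5, 200),  # exec. time (ms)
    50 + X_train.sum(axis=1) * rng.uniform(0.2, 0.8, 200),   # energy (mJ)
])
estimator = RandomForestRegressor(n_estimators=50, random_state=0)
estimator.fit(X_train, y_train)

# --- Build a phase sequence guided by the estimator. ------------------------
# A greedy weighted-sum search stands in for the paper's RL-based,
# multi-objective Phase Selection Policy.
def scalarize(pred, w_time=0.7, w_energy=0.3):
    time_ms, energy_mj = pred
    return w_time * time_ms + w_energy * energy_mj

sequence = []
for _ in range(MAX_SEQ_LEN):
    candidates = [sequence + [p] for p in PHASES]
    preds = estimator.predict(np.array([encode(c) for c in candidates]))
    best = min(range(len(candidates)), key=lambda i: scalarize(preds[i]))
    sequence = candidates[best]

print("Selected phase sequence:", " -> ".join(sequence))
print("Predicted (time, energy):",
      estimator.predict(encode(sequence).reshape(1, -1))[0])
```

The point the sketch captures is that every candidate sequence is scored by the trained surrogate rather than by compiling and profiling the program, which is what makes adaptation to a new target platform or application domain cheap.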
