Paper Title
Reinforcement learning based adaptive metaheuristics
Paper Authors
Paper Abstract
Parameter adaptation, that is, the capability to automatically adjust an algorithm's hyperparameters depending on the problem being faced, is one of the main trends in evolutionary computation applied to numerical optimization. While several handcrafted adaptation policies have been proposed over the years to address this problem, only a few attempts have been made so far to apply machine learning to learn such policies. Here, we introduce a general-purpose framework for performing parameter adaptation in continuous-domain metaheuristics based on state-of-the-art reinforcement learning algorithms. We demonstrate the applicability of this framework on two algorithms, namely Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and Differential Evolution (DE), for which we learn adaptation policies for the step-size (CMA-ES) and for the scale factor and crossover rate (DE). We train these policies on a set of 46 benchmark functions at different dimensionalities, with various inputs to the policies, in two settings: one policy per function, and one global policy for all functions. Compared, respectively, to the Cumulative Step-size Adaptation (CSA) policy and to two well-known adaptive DE variants (iDE and jDE), our policies produce competitive results in the majority of cases, especially in the case of DE.
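The abstract gives only a high-level view of the framework. As a rough illustration of the idea, the sketch below (Python/NumPy, chosen here for illustration; the paper does not specify an implementation) runs a standard DE/rand/1/bin loop in which a policy re-selects the scale factor F and crossover rate CR at every generation from an observation of the search state. The names `toy_policy` and `de_generation`, the single stagnation feature, and the sphere objective are all hypothetical stand-ins: in the paper the policy is learned with reinforcement learning, the observations are richer, and training covers 46 benchmark functions.

```python
import numpy as np

def sphere(x):
    # Toy objective; stands in for the paper's 46-function benchmark suite.
    return float(np.sum(x ** 2))

def de_generation(pop, fit, f, cr, fn, rng):
    # One synchronous generation of DE/rand/1/bin using the
    # policy-supplied scale factor F and crossover rate CR.
    n, d = pop.shape
    new_pop, new_fit = pop.copy(), fit.copy()
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        a, b, c = rng.choice(idx, size=3, replace=False)
        mutant = pop[a] + f * (pop[b] - pop[c])
        mask = rng.random(d) < cr
        mask[rng.integers(d)] = True           # at least one gene from the mutant
        trial = np.where(mask, mutant, pop[i])
        ft = fn(trial)
        if ft <= fit[i]:                       # greedy one-to-one selection
            new_pop[i], new_fit[i] = trial, ft
    return new_pop, new_fit

def toy_policy(stagnation):
    # Hypothetical stand-in for the learned RL policy: maps a (here,
    # one-dimensional) observation of the search state to (F, CR).
    f = float(np.clip(0.5 + 0.4 * stagnation, 0.1, 1.0))   # explore more when stuck
    cr = float(np.clip(0.9 - 0.5 * stagnation, 0.1, 1.0))
    return f, cr

rng = np.random.default_rng(42)
pop = rng.uniform(-5.0, 5.0, size=(20, 10))
fit = np.array([sphere(x) for x in pop])
stagnation = 0.0
for gen in range(200):
    best_before = fit.min()
    f, cr = toy_policy(stagnation)             # policy picks F and CR each generation
    pop, fit = de_generation(pop, fit, f, cr, sphere, rng)
    # Crude state feature: grows while the best fitness fails to improve.
    stagnation = 0.0 if fit.min() < best_before else min(1.0, stagnation + 0.1)

print("best fitness found:", fit.min())
```

The interface sketched here, where the metaheuristic exposes per-generation observations and consumes parameter values as actions, is the general pattern the abstract describes; replacing `toy_policy` with a trained agent (and, for CMA-ES, emitting the step-size instead of F and CR) would be the analogous setup for the paper's other experiments.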