Paper Title
Energy Optimization of Wind Turbines via a Neural Control Policy Based on Reinforcement Learning Markov Chain Monte Carlo Algorithm
Paper Authors
Paper Abstract
This study focuses on the numerical analysis and optimal control of vertical-axis wind turbines (VAWTs) using Bayesian reinforcement learning (RL). We specifically address small-scale wind turbines, which are well suited to local, compact electricity production in settings such as urban and rural infrastructure installations. The existing literature concentrates on large-scale wind turbines that operate in unobstructed, mostly constant wind profiles. Urban installations, however, generally must cope with rapidly changing wind patterns. To bridge this gap, we formulate and implement an RL strategy that uses a Markov chain Monte Carlo (MCMC) algorithm to optimize the long-term energy output of a wind turbine. Our MCMC-based RL algorithm is model-free and gradient-free: the designer does not need to know the precise dynamics of the plant or its uncertainties. The method addresses these uncertainties with a multiplicative reward structure, in contrast to the additive rewards used in conventional RL approaches. We show numerically that the method overcomes shortcomings typically associated with conventional solutions, including, but not limited to, component aging, modeling errors, and inaccuracies in the estimation of wind speed patterns. Our results show that the proposed method is especially successful in capturing power from wind transients: it modulates the generator load, and hence the rotor torque load, so that the rotor tip speed quickly reaches the optimum value for the anticipated wind speed. The ratio of rotor tip speed to wind speed is known to be critical in wind power applications. The wind-to-load energy efficiency of the proposed method is shown to be superior to that of two other methods: the classical maximum power point tracking method and a generator controlled by the deep deterministic policy gradient (DDPG) method.
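The combination of a gradient-free MCMC search with a multiplicative reward can be illustrated with a minimal toy sketch. It is not the authors' implementation: the abstract does not specify the sampler, the policy network, or the turbine model, so everything below is an assumption made for illustration. The sketch uses a plain Metropolis random walk over a single hypothetical load gain `theta`, a first-order toy rotor, an assumed optimal tip speed ratio `lam_opt = 2.0`, and a synthetic gusty wind trace. The tip speed ratio itself is the standard quantity lambda = omega * R / v (rotor angular speed times radius, divided by wind speed).

```python
import math
import random


def make_wind(n_steps, dt=0.1, seed=0):
    """Toy gusty wind speed trace in m/s (hypothetical profile, always > 0)."""
    rng = random.Random(seed)
    return [6.0 + 2.0 * math.sin(0.3 * k * dt) + rng.uniform(-1.0, 1.0)
            for k in range(n_steps)]


def episode_return(theta, wind, lam_opt=2.0, radius=1.0, dt=0.1):
    """Multiplicative return of a toy rotor whose load gain is `theta`."""
    omega = lam_opt * wind[0] / radius  # start at the assumed optimal ratio
    ret = 1.0
    for v in wind:
        # Toy first-order rotor (assumption): wind accelerates it,
        # the generator load theta * omega brakes it.
        omega = max(omega + dt * (v - theta * omega), 0.0)
        lam = omega * radius / v  # tip speed ratio
        # Per-step reward lies in (0, 1]; the return is the product of rewards.
        ret *= math.exp(-dt * (lam - lam_opt) ** 2)
    return ret


def mcmc_policy_search(wind, theta0=1.0, step=0.05, n_iter=2000, seed=1):
    """Metropolis random walk over theta, using the return as the target."""
    rng = random.Random(seed)
    theta, ret = theta0, episode_return(theta0, wind)
    best_theta, best_ret = theta, ret
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        if proposal <= 0.0:
            continue  # outside the assumed prior support, so reject
        new_ret = episode_return(proposal, wind)
        # Accept with probability min(1, new_ret / ret); no gradients needed.
        if rng.random() * ret < new_ret:
            theta, ret = proposal, new_ret
            if ret > best_ret:
                best_theta, best_ret = theta, ret
    return best_theta, best_ret


wind = make_wind(200)
theta, ret = mcmc_policy_search(wind)
print(f"best load gain: {theta:.3f}, multiplicative return: {ret:.3e}")
```

Because the return is a product of per-step rewards, it is non-negative and can serve directly as an unnormalized target density for the sampler, which only ever compares returns of whole episodes. In this toy model the steady-state tip speed ratio is `radius / theta`, so the search should settle near `theta = radius / lam_opt`.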