Paper Title

Knowing the Past to Predict the Future: Reinforcement Virtual Learning

Paper Authors

Peng Zhang, Yawen Huang, Bingzhang Hu, Shizheng Wang, Haoran Duan, Noura Al Moubayed, Yefeng Zheng, Yang Long

Paper Abstract

Reinforcement Learning (RL)-based control systems have received considerable attention in recent decades. However, in many real-world problems, such as Batch Process Control, the environment is uncertain, and acquiring state and reward values requires expensive interactions. In this paper, we present a cost-efficient framework in which the RL model can evolve by itself in a Virtual Space, using predictive models trained on historical data alone. The proposed framework enables a step-by-step RL model to predict future states and select optimal actions for long-term decisions. The main focuses are summarized as: 1) how to balance long-term and short-term rewards with an optimal strategy; and 2) how to make the virtual model interact with the real environment so that the learned policy converges. Under the experimental settings of the Fed-Batch Process, our method consistently outperforms existing state-of-the-art methods.
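
As a rough, non-authoritative illustration of the virtual-space idea described in the abstract, the sketch below fits a one-step predictive model on historical transitions and then scores candidate actions by their simulated long-term return inside that model, so no interaction with the real (costly) process is needed. The linear dynamics, the discrete action set, and every name in the code (`make_history`, `fit_model`, `virtual_return`) are assumptions made for illustration, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, N_ACTIONS = 4, 3

# --- 1) Historical data: logged (state, action, next_state, reward) tuples.
# The "true" dynamics here are synthetic; in the paper's setting these
# would be real batch-process records.
def make_history(n=2000):
    s = rng.normal(size=(n, STATE_DIM))
    a = rng.integers(N_ACTIONS, size=n)
    true_w = rng.normal(size=(N_ACTIONS, STATE_DIM, STATE_DIM)) * 0.1
    s_next = np.einsum('nij,nj->ni', true_w[a], s) + 0.01 * rng.normal(size=s.shape)
    r = -np.linalg.norm(s_next, axis=1)  # toy reward: stay near the origin
    return s, a, s_next, r

# --- 2) Fit a one-step predictive model per action via least squares,
# giving s_next ~ s @ models[a]. This stands in for the learned
# predictive model that replaces the real environment.
def fit_model(s, a, s_next):
    models = []
    for act in range(N_ACTIONS):
        mask = a == act
        w, *_ = np.linalg.lstsq(s[mask], s_next[mask], rcond=None)
        models.append(w)
    return models

# --- 3) Virtual rollout: evaluate a first action by the discounted
# return of a greedy rollout simulated entirely inside the model.
def virtual_return(models, state, first_action, depth=5, gamma=0.95):
    total, s, a = 0.0, state, first_action
    for t in range(depth):
        s = s @ models[a]                      # predicted next state
        total += (gamma ** t) * (-np.linalg.norm(s))
        a = min(range(N_ACTIONS),              # greedy step inside the model
                key=lambda k: np.linalg.norm(s @ models[k]))
    return total

states, actions, next_states, _rewards = make_history()
models = fit_model(states, actions, next_states)
state = rng.normal(size=STATE_DIM)
best = max(range(N_ACTIONS), key=lambda k: virtual_return(models, state, k))
print("chosen action:", best)
```

In this toy setup the `depth` parameter plays the role of the decision horizon: `depth=1` recovers a purely short-term (one-step greedy) choice, while larger depths trade the immediate reward off against predicted future rewards, mirroring the long-term/short-term balance the abstract highlights.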
