B2RL: An open-source Dataset for Building Batch Reinforcement Learning

Paper Title

B2RL: An open-source Dataset for Building Batch Reinforcement Learning

Authors

Hsin-Yu Liu, Xiaohan Fu, Bharathan Balaji, Rajesh Gupta, Dezhi Hong

Abstract

Batch reinforcement learning (BRL) is an emerging research area in the RL community. It learns exclusively from static datasets (i.e., replay buffers) without interacting with the environment. In the offline setting, existing replay experiences serve as prior knowledge from which BRL models find the optimal policy. Generating replay buffers is therefore crucial for benchmarking BRL models. Our B2RL (Building Batch RL) dataset contains real-world data collected from our building management systems, as well as buffers generated by several behavioral policies in simulation environments. We believe it could help building experts with BRL research. To the best of our knowledge, we are the first to open-source building datasets for the purpose of BRL.
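To make the offline setting concrete, the following is a minimal sketch of learning a policy from a fixed replay buffer with no environment interaction. It uses plain tabular Q-learning as a stand-in, which is an assumption for illustration and not the method or data format of the B2RL dataset. The state names, actions, rewards, and the functions `batch_q_learning` and `greedy_policy` are all hypothetical.

```python
import random
from collections import defaultdict


def batch_q_learning(buffer, n_actions, gamma=0.9, lr=0.1, epochs=200, seed=0):
    """Fit a tabular Q-function using only a fixed list of transitions.

    Each transition is (state, action, reward, next_state, done).
    No new transitions are ever collected: this is the offline setting.
    """
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(epochs):
        # Replay the static buffer in a shuffled order on every pass.
        for state, action, reward, next_state, done in rng.sample(buffer, len(buffer)):
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += lr * (target - q[state][action])
    return q


def greedy_policy(q, state):
    """Return the action with the highest estimated value in `state`."""
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical toy buffer (not B2RL data): action 0 = hold, action 1 = heat.
toy_buffer = [
    ("cold", 0, -1.0, "cold", False),
    ("cold", 1, 0.0, "comfortable", False),
    ("comfortable", 0, 1.0, "comfortable", True),
    ("comfortable", 1, -0.5, "hot", True),
]

q = batch_q_learning(toy_buffer, n_actions=2)
print(greedy_policy(q, "cold"), greedy_policy(q, "comfortable"))
```

Because the learner only ever sees what the behavioral policy recorded, the coverage of the buffer bounds what can be learned, which is why the abstract treats buffer generation as central to benchmarking.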
