Constrained episodic reinforcement learning in concave-convex and knapsack settings

Paper title

Constrained episodic reinforcement learning in concave-convex and knapsack settings

Authors

Brantley, Kianté; Dudik, Miroslav; Lykouris, Thodoris; Miryoosefi, Sobhan; Simchowitz, Max; Slivkins, Aleksandrs; Sun, Wen

Abstract

We propose an algorithm for tabular episodic reinforcement learning with constraints. We provide a modular analysis with strong theoretical guarantees for settings with concave rewards and convex constraints, and for settings with hard constraints (knapsacks). Most of the previous work in constrained reinforcement learning is limited to linear constraints, and the remaining work focuses on either the feasibility question or settings with a single episode. Our experiments demonstrate that the proposed algorithm significantly outperforms these approaches in existing constrained episodic environments.
