Paper Title

Neighboring state-based RL Exploration

Paper Authors

Jeffery Cheng, Kevin Li, Justin Lin, Pedro Pachuca

Paper Abstract

Reinforcement Learning is a powerful tool for modeling decision-making processes. However, it relies on an exploration-exploitation trade-off that remains an open challenge for many tasks. In this work, we study neighboring state-based, model-free exploration, guided by the intuition that, for an early-stage agent, considering actions derived from a bounded region of nearby states may lead to better actions when exploring. We propose two algorithms that choose exploratory actions based on a survey of nearby states, and find that one of our methods, $\rho$-explore, consistently outperforms the Double DQN baseline in a discrete environment by 49\% in terms of evaluation reward return.
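The abstract only describes the idea at a high level, so the following is a minimal, hypothetical sketch of what "choosing an exploratory action based on a survey of nearby states" could look like on top of a DQN-style agent. The function name, the perturbation radius `rho`, the number of sampled neighbors, and the `q_network(state)` interface are all assumptions made for illustration; they are not taken from the paper.

```python
# Hedged sketch (not the authors' published code): one plausible reading of
# neighboring state-based exploration for a discrete-action DQN agent.
# Assumptions: continuous state vectors, an L-infinity perturbation radius
# `rho`, and a callable `q_network(state) -> array of per-action Q-values`.
import numpy as np


def neighbor_explore_action(state, q_network, rho=0.1, n_neighbors=8, rng=None):
    """Pick an exploratory action by surveying states near `state`.

    Samples `n_neighbors` states uniformly within a box of radius `rho`
    around the current state, scores each action by its average Q-value
    over that neighborhood, and returns the highest-scoring action.
    """
    rng = rng or np.random.default_rng()
    state = np.asarray(state, dtype=np.float32)

    # Sample nearby states inside the bounded region around `state`.
    noise = rng.uniform(-rho, rho, size=(n_neighbors, state.shape[0]))
    neighbors = state[None, :] + noise

    # Score each action by its mean Q-value across the sampled neighbors.
    q_values = np.stack([np.asarray(q_network(s)) for s in neighbors])
    action_scores = q_values.mean(axis=0)

    return int(np.argmax(action_scores))
```

In such a scheme, this neighborhood-based choice would replace the uniform-random action of an epsilon-greedy policy on exploration steps; the paper's actual $\rho$-explore may define the neighborhood and scoring rule differently, and this sketch only mirrors the abstract's description of surveying a bounded region of nearby states.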
