Paper Title

Modeling Human Exploration Through Resource-Rational Reinforcement Learning

Paper Authors

Marcel Binz, Eric Schulz

Paper Abstract

Equipping artificial agents with useful exploration mechanisms remains a challenge to this day. Humans, on the other hand, seem to manage the trade-off between exploration and exploitation effortlessly. In the present article, we put forward the hypothesis that they accomplish this by making optimal use of limited computational resources. We study this hypothesis by meta-learning reinforcement learning algorithms that sacrifice performance for a shorter description length (defined as the number of bits required to implement the given algorithm). The emerging class of models captures human exploration behavior better than previously considered approaches, such as Boltzmann exploration, upper confidence bound algorithms, and Thompson sampling. We additionally demonstrate that changing the description length in our class of models produces the intended effects: reducing description length captures the behavior of brain-lesioned patients while increasing it mirrors cognitive development during adolescence.
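
As a point of reference for the baselines named in the abstract, the following is a minimal sketch (not taken from the paper; the bandit setup and all hyperparameters such as the temperature, confidence width, and posterior parameters are illustrative assumptions) of how Boltzmann exploration, an upper confidence bound (UCB) algorithm, and Thompson sampling each choose an arm in a simple Gaussian bandit:

```python
# Minimal sketch of the three baseline exploration strategies
# (all hyperparameters are assumptions, not values from the paper).
import numpy as np

rng = np.random.default_rng(0)

def boltzmann(values, temperature=0.5):
    """Softmax over value estimates; higher temperature = more exploration."""
    logits = np.asarray(values) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(len(values), p=probs))

def ucb(values, counts, t, c=1.0):
    """Upper confidence bound: optimism bonus for rarely tried arms."""
    bonus = c * np.sqrt(np.log(t + 1) / (np.asarray(counts) + 1e-8))
    return int(np.argmax(np.asarray(values) + bonus))

def thompson(post_means, post_vars):
    """Sample from each arm's Gaussian posterior and pick the best sample."""
    samples = rng.normal(post_means, np.sqrt(post_vars))
    return int(np.argmax(samples))

# Demo: 100 steps of UCB on a two-armed bandit with true means [0.0, 1.0].
true_means = np.array([0.0, 1.0])
values, counts = np.zeros(2), np.zeros(2)
for t in range(100):
    arm = ucb(values, counts, t)
    reward = rng.normal(true_means[arm], 1.0)
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # running-mean update
```

Each strategy resolves the exploration-exploitation trade-off differently: Boltzmann exploration randomizes over current value estimates, UCB adds an optimism bonus to rarely tried arms, and Thompson sampling explores through posterior uncertainty.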
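
The central quantity in the proposed model class is an algorithm's description length, the number of bits required to implement it, traded off against performance. Below is a minimal sketch of such a trade-off objective, under the assumption that description length is proxied by the KL divergence between an approximate posterior over the agent's parameters and a prior (a standard MDL-style bound; the paper's exact formulation may differ):

```python
# Minimal sketch of a resource-rational objective: expected reward minus a
# description-length penalty. The KL-based bit count below is an assumed
# proxy for description length, not necessarily the authors' formulation.
import numpy as np

def gaussian_kl_bits(mu_q, sigma_q, mu_p=0.0, sigma_p=1.0):
    """KL(q || p) between diagonal Gaussians, converted from nats to bits."""
    mu_q, sigma_q = np.asarray(mu_q), np.asarray(sigma_q)
    kl_nats = np.sum(np.log(sigma_p / sigma_q)
                     + (sigma_q**2 + (mu_q - mu_p)**2) / (2 * sigma_p**2)
                     - 0.5)
    return kl_nats / np.log(2.0)

def resource_rational_objective(expected_reward, kl_bits, lam=0.01):
    """Larger lam models scarcer resources: simpler strategies are preferred
    even at some cost in reward; smaller lam approaches pure maximization."""
    return expected_reward - lam * kl_bits
```

Sweeping the trade-off weight lam then produces the effects the abstract describes: a large penalty favors short, simple exploration strategies, while a small penalty admits more resource-intensive ones.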
