Modular Lifelong Reinforcement Learning via Neural Composition

Paper Title

Modular Lifelong Reinforcement Learning via Neural Composition

Authors

Jorge A. Mendez, Harm van Seijen, Eric Eaton

Abstract

Humans commonly solve complex problems by decomposing them into easier subproblems and then combining the subproblem solutions. This type of compositional reasoning permits reuse of the subproblem solutions when tackling future tasks that share part of the underlying compositional structure. In a continual or lifelong reinforcement learning (RL) setting, this ability to decompose knowledge into reusable components would enable agents to quickly learn new RL tasks by leveraging accumulated compositional structures. We explore a particular form of composition based on neural modules and present a set of RL problems that intuitively admit compositional solutions. Empirically, we demonstrate that neural composition indeed captures the underlying structure of this space of problems. We further propose a compositional lifelong RL method that leverages accumulated neural components to accelerate the learning of future tasks while retaining performance on previous tasks via off-line RL over replayed experiences.

Paper Title