Teacher-student curriculum learning for reinforcement learning

Paper title

Teacher-student curriculum learning for reinforcement learning

Author

Schraner, Yanick

Abstract

Reinforcement learning (RL) is a popular paradigm for sequential decision-making problems. The past decade's advances in RL have led to breakthroughs in many challenging domains such as video games, board games, robotics, and chip design. The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying RL to real-world problems. Transfer learning has been applied to reinforcement learning such that the knowledge gained in one task can be applied when training in a new task. Curriculum learning is concerned with sequencing tasks or data samples such that knowledge can be transferred between those tasks to learn a target task that would otherwise be too difficult to solve. Designing a curriculum that improves sample efficiency is a complex problem. In this thesis, we propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task. Our method is independent of human domain knowledge and manual curriculum design. We evaluated our methods on two reinforcement learning benchmarks: grid world and the challenging Google Football environment. With our method, we can improve the sample efficiency and generality of the student compared to tabula-rasa reinforcement learning.
