Paper Title
Learning and Retrieval from Prior Data for Skill-based Imitation Learning
Paper Authors
Paper Abstract
Imitation learning offers a promising path for robots to learn general-purpose behaviors, but traditionally has exhibited limited scalability due to high data supervision requirements and brittle generalization. Inspired by recent advances in multi-task imitation learning, we investigate the use of prior data from previous tasks to facilitate learning novel tasks in a robust, data-efficient manner. To make effective use of the prior data, the robot must internalize knowledge from past experiences and contextualize this knowledge in novel tasks. To that end, we develop a skill-based imitation learning framework that extracts temporally extended sensorimotor skills from prior data and subsequently learns a policy for the target task that invokes these learned skills. We identify several key design choices that significantly improve performance on novel tasks, namely representation learning objectives to enable more predictable skill representations and a retrieval-based data augmentation mechanism to increase the scope of supervision for policy training. On a collection of simulated and real-world manipulation domains, we demonstrate that our method significantly outperforms existing imitation learning and offline reinforcement learning approaches. Videos and code are available at https://ut-austin-rpl.github.io/sailor.