Paper Title


Cross-Domain Transfer via Semantic Skill Imitation

Authors

Karl Pertsch, Ruta Desai, Vikash Kumar, Franziska Meier, Joseph J. Lim, Dhruv Batra, Akshara Rai

Abstract

We propose an approach for semantic imitation, which uses demonstrations from a source domain, e.g. human videos, to accelerate reinforcement learning (RL) in a different target domain, e.g. a robotic manipulator in a simulated kitchen. Instead of imitating low-level actions like joint velocities, our approach imitates the sequence of demonstrated semantic skills like "opening the microwave" or "turning on the stove". This allows us to transfer demonstrations across environments (e.g. real-world to simulated kitchen) and agent embodiments (e.g. bimanual human demonstration to robotic arm). We evaluate on three challenging cross-domain learning problems and match the performance of demonstration-accelerated RL approaches that require in-domain demonstrations. In a simulated kitchen environment, our approach learns long-horizon robot manipulation tasks, using less than 3 minutes of human video demonstrations from a real-world kitchen. This enables scaling robot learning via the reuse of demonstrations, e.g. collected as human videos, for learning in any number of target domains.
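To make the core idea concrete, here is a minimal Python sketch of semantic imitation framed as reward shaping: instead of matching low-level actions, the agent is rewarded for completing the demonstrated semantic skills in order. This is a simplified illustration under stated assumptions, not the paper's actual algorithm; the names `make_semantic_imitation_reward`, `classify_skill`, and `match_bonus` are hypothetical, and we assume access to a model that labels target-domain observations with semantic skills.

```python
# Minimal sketch of semantic skill imitation as reward shaping.
# Assumptions (not from the paper's code): a pretrained classifier maps
# target-domain observations to semantic skill labels, and the source
# demonstration is available as an ordered list of such labels.
from typing import Callable, List


def make_semantic_imitation_reward(
    demo_skills: List[str],                    # e.g. ["open microwave", "turn on stove"]
    classify_skill: Callable[[object], str],   # hypothetical obs -> skill-label model
    match_bonus: float = 1.0,
) -> Callable[[object, float], float]:
    """Return a shaped-reward function that adds a bonus each time the
    agent's behavior matches the next demonstrated semantic skill."""
    progress = {"idx": 0}  # index of the next demo skill to match

    def shaped_reward(obs: object, env_reward: float) -> float:
        if progress["idx"] < len(demo_skills):
            if classify_skill(obs) == demo_skills[progress["idx"]]:
                progress["idx"] += 1  # advance along the demonstrated sequence
                return env_reward + match_bonus
        return env_reward

    return shaped_reward


# Usage: wrap the environment reward during RL training.
demo = ["open microwave", "turn on stove"]
reward_fn = make_semantic_imitation_reward(demo, classify_skill=lambda obs: "open microwave")
print(reward_fn(None, 0.0))  # 1.0: first demonstrated skill matched
print(reward_fn(None, 0.0))  # 0.0: next expected skill is "turn on stove"
```

Because the matching happens at the level of skill labels rather than joint velocities, the same demonstration sequence can, in principle, guide agents with different embodiments or in different environments, which is the cross-domain transfer the abstract describes.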
