Paper Title
State-Only Imitation Learning for Dexterous Manipulation
Paper Authors
Paper Abstract
Modern model-free reinforcement learning methods have recently demonstrated impressive results on a number of problems. However, complex domains like dexterous manipulation remain a challenge due to their high sample complexity. To address this, current approaches employ expert demonstrations in the form of state-action pairs, which are difficult to obtain for real-world settings such as learning from videos. In this paper, we move toward a more realistic setting and explore state-only imitation learning. To tackle this setting, we train an inverse dynamics model and use it to predict actions for the state-only demonstrations. The inverse dynamics model and the policy are trained jointly. Our method performs on par with state-action approaches and considerably outperforms RL alone. By not relying on expert actions, we are able to learn from demonstrations with different dynamics, morphologies, and objects. Videos are available at https://people.eecs.berkeley.edu/~ilija/soil.
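To make the core idea concrete, below is a minimal sketch of the two components the abstract describes: an inverse dynamics model (IDM) fit on the agent's own transitions, where actions are known, and an imitation term that regresses the policy toward actions the IDM infers for state-only demonstrations. This is an illustrative PyTorch sketch, not the paper's code; the names (InverseDynamicsModel, idm_loss, imitation_loss) and the architecture are assumptions, and in the actual method the imitation term augments the RL objective rather than replacing it.

```python
import torch
import torch.nn as nn


class InverseDynamicsModel(nn.Module):
    """Predicts the action a_t that caused the transition s_t -> s_{t+1}."""

    def __init__(self, state_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, action_dim),
        )

    def forward(self, s_t, s_next):
        return self.net(torch.cat([s_t, s_next], dim=-1))


def idm_loss(idm, states, actions, next_states):
    # Fit the IDM on (s_t, a_t, s_{t+1}) transitions collected by the
    # current policy, where the executed actions are known.
    return ((idm(states, next_states) - actions) ** 2).mean()


def imitation_loss(idm, policy, demo_states, demo_next_states):
    # Label the state-only demonstrations with inferred actions, then
    # regress the policy's output toward them (a behavioral-cloning-style
    # term; `policy` is assumed to map states to actions).
    with torch.no_grad():
        inferred_actions = idm(demo_states, demo_next_states)
    return ((policy(demo_states) - inferred_actions) ** 2).mean()
```

Training the IDM on the agent's own rollouts is what removes the need for expert actions: as the policy improves and visits states closer to the demonstrations, the IDM's action labels for those demonstrations become more accurate, which is why the two are trained jointly.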