论文标题
共同描述:通过模仿学习设计和行为
Co-Imitation: Learning Design and Behaviour by Imitation
论文作者
论文摘要
机器人的共同适应一直是一项长期的研究努力,其目的是将系统的身体和行为适应给定的任务,灵感来自动物的自然演变。共同适应有可能消除昂贵的手动硬件工程,并提高系统性能。共同适应的标准方法是使用奖励功能来优化行为和形态。但是,众所周知,定义和构建这种奖励功能是困难的,并且通常是一项重大的工程工作。本文介绍了关于共同适应问题的新观点,我们称之为共同构图:寻找形态和政策,使模仿者可以密切匹配演示者的行为。为此,我们提出了一种通过与演示者的状态分布相匹配来适应行为和形态的共同模拟方法。具体而言,我们专注于两种代理之间的状态和动作空间不匹配的具有挑战性的情况。我们发现,共同构图增加了各种任务和设置中的行为相似性,并通过将人的步行,慢跑和踢到模拟的人形生物转移来证明共同映射。
The co-adaptation of robots has been a long-standing research endeavour with the goal of adapting both body and behaviour of a system for a given task, inspired by the natural evolution of animals. Co-adaptation has the potential to eliminate costly manual hardware engineering as well as improve the performance of systems. The standard approach to co-adaptation is to use a reward function for optimizing behaviour and morphology. However, defining and constructing such reward functions is notoriously difficult and often a significant engineering effort. This paper introduces a new viewpoint on the co-adaptation problem, which we call co-imitation: finding a morphology and a policy that allow an imitator to closely match the behaviour of a demonstrator. To this end we propose a co-imitation methodology for adapting behaviour and morphology by matching state distributions of the demonstrator. Specifically, we focus on the challenging scenario with mismatched state- and action-spaces between both agents. We find that co-imitation increases behaviour similarity across a variety of tasks and settings, and demonstrate co-imitation by transferring human walking, jogging and kicking skills onto a simulated humanoid.