Learning from demonstration using products of experts: applications to manipulation and task prioritization

Paper title

Learning from demonstration using products of experts: applications to manipulation and task prioritization

Authors

Emmanuel Pignat, João Silvério, Sylvain Calinon

Abstract

Probability distributions are key components of many learning from demonstration (LfD) approaches. While the configuration of a manipulator is defined by its joint angles, poses are often best explained within several task spaces. In many approaches, distributions within relevant task spaces are learned independently and only combined at the control level. This simplification implies several problems that are addressed in this work. We show that the fusion of models in different task spaces can be expressed as a product of experts (PoE), where the probabilities of the models are multiplied and renormalized so that it becomes a proper distribution of joint angles. Multiple experiments are presented to show that learning the different models jointly in the PoE framework significantly improves the quality of the model. The proposed approach particularly stands out when the robot has to learn competitive or hierarchical objectives. Training the model jointly usually relies on contrastive divergence, which requires costly approximations that can affect performance. We propose an alternative strategy using variational inference and mixture model approximations. In particular, we show that the proposed approach can be extended to PoE with a nullspace structure (PoENS), where the model is able to recover tasks that are masked by the resolution of higher-level objectives.
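To illustrate the "multiplied and renormalized" step, here is a minimal sketch of a product of experts in the simplest closed-form case. It is not the authors' method: it assumes every expert is a Gaussian on a linear task map y = A x of the configuration x, whereas the abstract concerns general task spaces and training by variational inference. The function name, the maps A, and the numbers are all hypothetical.

```python
import numpy as np

def product_of_linear_gaussian_experts(experts):
    """Fuse experts N(A x | mu, Sigma) into a single Gaussian over x.

    experts: list of (A, mu, Sigma) tuples, where A maps the configuration x
    to a task space, and (mu, Sigma) is the Gaussian learned in that space.
    Assumption: the summed precision is invertible, i.e. the experts
    together constrain every direction of x.
    """
    dim = experts[0][0].shape[1]
    precision = np.zeros((dim, dim))
    info = np.zeros(dim)
    for A, mu, Sigma in experts:
        Lam = np.linalg.inv(Sigma)      # task-space precision
        precision += A.T @ Lam @ A      # precisions add under multiplication
        info += A.T @ Lam @ mu          # information vectors add as well
    cov = np.linalg.inv(precision)      # renormalized product is Gaussian
    mean = cov @ info
    return mean, cov

# Hypothetical example with a 2-D configuration x.
experts = [
    # Expert 1: tightly constrains the first coordinate only.
    (np.array([[1.0, 0.0]]), np.array([1.0]), np.array([[0.01]])),
    # Expert 2: loose preference on the full configuration.
    (np.eye(2), np.array([0.0, 2.0]), np.eye(2)),
]
mean, cov = product_of_linear_gaussian_experts(experts)
print(mean)
```

In this toy case the precise expert dominates the first coordinate, while the second coordinate, which it leaves unconstrained, is set entirely by the looser expert.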

Paper title