Leveraging Demonstrations with Latent Space Priors

Paper Title

Leveraging Demonstrations with Latent Space Priors

Authors

Jonas Gehring, Deepak Gopinath, Jungdam Won, Andreas Krause, Gabriel Synnaeve, Nicolas Usunier

Abstract

Demonstrations provide insight into relevant state or action space regions, bearing great potential to boost the efficiency and practicality of reinforcement learning agents. In this work, we propose to leverage demonstration datasets by combining skill learning and sequence modeling. Starting with a learned joint latent space, we separately train a generative model of demonstration sequences and an accompanying low-level policy. The sequence model forms a latent space prior over plausible demonstration behaviors to accelerate learning of high-level policies. We show how to acquire such priors from state-only motion capture demonstrations and explore several methods for integrating them into policy learning on transfer tasks. Our experimental results confirm that latent space priors provide significant gains in learning speed and final performance. We benchmark our approach on a set of challenging sparse-reward environments with a complex, simulated humanoid, and on offline RL benchmarks for navigation and object manipulation. Videos, source code and pre-trained models are available at the corresponding project website at https://facebookresearch.github.io/latent-space-priors .
