Online inverse reinforcement learning with unknown disturbances

Paper title

Online inverse reinforcement learning with unknown disturbances

Authors

Ryan Self, Moad Abudia, Rushikesh Kamalapurkar

Abstract

This paper addresses the problem of online inverse reinforcement learning for nonlinear systems with modeling uncertainties in the presence of unknown disturbances. The developed approach observes the state and input trajectories of an agent and identifies the unknown reward function online. Sub-optimality introduced in the observed trajectories by the unknown external disturbance is compensated for using a novel model-based inverse reinforcement learning approach. An observer estimates the external disturbances, and the resulting estimates are used to learn the dynamic model of the demonstrator. The learned demonstrator model, together with the observed suboptimal trajectories, is used to implement inverse reinforcement learning. Theoretical guarantees are provided using Lyapunov theory, and a simulation example demonstrates the effectiveness of the proposed technique.
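To make the idea of recovering a reward from observed state and input trajectories concrete, the following is a minimal sketch for a scalar linear-quadratic problem. It is not the method of the paper: it assumes a known linear model, no disturbance, and a batch least-squares fit instead of an online observer. The function names, the system parameters, and the normalization r = 1 are all illustrative assumptions (a reward can only be recovered up to scale, so one weight has to be fixed).

```python
# Scalar LQ sketch of model-based inverse reinforcement learning.
# Assumed setup (hypothetical): dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2.

def forward_gain(a, b, q, r):
    """Optimal feedback gain k (u = -k*x) of the demonstrator."""
    # Positive root of the scalar Riccati equation 2*a*p - (b**2 / r)*p**2 + q = 0.
    p = (a + (a * a + b * b * q / r) ** 0.5) * r / (b * b)
    return b * p / r


def simulate(a, b, k, x0=1.0, dt=1e-3, steps=2000):
    """Euler rollout under u = -k*x; returns the observed states and inputs."""
    xs, us, x = [], [], x0
    for _ in range(steps):
        u = -k * x
        xs.append(x)
        us.append(u)
        x += dt * (a * x + b * u)
    return xs, us


def recover_q(xs, us, a, b, r=1.0):
    """Estimate the state weight q from observed (x, u) pairs, with r fixed."""
    # Least-squares fit of the feedback gain from u = -k*x.
    k_hat = -sum(x * u for x, u in zip(xs, us)) / sum(x * x for x in xs)
    # Invert k = b*p/r, then solve the Riccati equation for q.
    p_hat = k_hat * r / b
    return (b * b / r) * p_hat ** 2 - 2.0 * a * p_hat


# Illustrative numbers only: the demonstrator uses q = 3, r = 1.
a_true, b_true, q_true, r_fixed = -1.0, 1.0, 3.0, 1.0
k_demo = forward_gain(a_true, b_true, q_true, r_fixed)
xs_obs, us_obs = simulate(a_true, b_true, k_demo)
q_hat = recover_q(xs_obs, us_obs, a_true, b_true, r_fixed)
print("recovered q:", q_hat)
```

In the setting described in the abstract, the model parameters would not be known in advance and the observed trajectories would be perturbed by an unknown disturbance, which is why the paper pairs the reward estimation with a disturbance observer and a learned demonstrator model.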
