Explainable Deep Reinforcement Learning for UAV Autonomous Navigation

Paper Title

Explainable Deep Reinforcement Learning for UAV Autonomous Navigation

Authors

Lei He, Nabil Aouf, Bifeng Song

Abstract

Autonomous navigation in unknown, complex environments remains a hard problem, especially for small unmanned aerial vehicles (UAVs) with limited computational resources. This paper proposes a neural-network-based reactive controller that lets a quadrotor fly autonomously in unknown outdoor environments. The navigation controller uses only current sensor data to generate the control signal, without any optimization or configuration-space search, which reduces both memory and computation requirements. The navigation problem is modelled as a Markov decision process (MDP) and solved with a deep reinforcement learning (DRL) method. To better understand the trained network, several model explanation methods are proposed. Based on feature attribution, each decision made during flight is explained with both visual and textual explanations. In addition, global analyses are provided for experts to evaluate and improve the trained neural network. The simulation results show that the proposed method produces useful and reasonable explanations of the trained model, which benefits both non-expert users and controller designers. Finally, real-world tests show that the proposed controller navigates the quadrotor to the goal position successfully, and that the reactive controller runs much faster than some conventional approaches given the same computational resources.
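To make the idea of feature attribution concrete, the following is a minimal sketch and not the authors' method: the abstract does not say which attribution technique is used, so gradient-times-input is substituted here as a common stand-in. The one-unit "policy", the feature names, and the weights are all hypothetical and chosen only for illustration.

```python
import math


def policy(weights, bias, features):
    """Toy reactive controller: one tanh unit mapping sensor features to a command."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return math.tanh(z)


def gradient_times_input(weights, bias, features):
    """Attribute the command to each input feature as x_i * d(command)/d(x_i)."""
    a = policy(weights, bias, features)
    slope = 1.0 - a * a  # derivative of tanh evaluated at the pre-activation
    return [x * slope * w for w, x in zip(weights, features)]


def explain(names, attributions):
    """Report the feature whose attribution has the largest magnitude as a short sentence."""
    i = max(range(len(attributions)), key=lambda k: abs(attributions[k]))
    direction = "raises" if attributions[i] > 0 else "lowers"
    return f"{names[i]} {direction} the command most"


# Hypothetical feature names and weights (assumptions, not taken from the paper).
NAMES = ["left_depth", "right_depth", "goal_bearing"]
WEIGHTS = [0.8, -0.8, 0.5]
BIAS = 0.0

if __name__ == "__main__":
    reading = [0.2, 1.0, 0.1]  # one made-up sensor snapshot
    scores = gradient_times_input(WEIGHTS, BIAS, reading)
    print(explain(NAMES, scores))
```

The per-feature scores play the role of the visual explanation (they could be drawn as a bar chart or heat map), and the sentence returned by `explain` plays the role of the textual explanation. A real DRL policy would need automatic differentiation in place of the hand-written tanh derivative.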
