Aerial View Localization with Reinforcement Learning: Towards Emulating Search-and-Rescue

Paper title

Aerial View Localization with Reinforcement Learning: Towards Emulating Search-and-Rescue

Authors

Aleksis Pirinen, Anton Samuelsson, John Backsund, Kalle Åström

Abstract

Climate-induced disasters are and will continue to be on the rise, and thus search-and-rescue (SAR) operations, where the task is to localize and assist one or several people who are missing, become increasingly relevant. In many cases the rough location may be known and a UAV can be deployed to explore a given, confined area to precisely localize the missing people. Due to time and battery constraints it is often critical that localization is performed as efficiently as possible. In this work we approach this type of problem by abstracting it as an aerial view goal localization task in a framework that emulates a SAR-like setup without requiring access to actual UAVs. In this framework, an agent operates on top of an aerial image (proxy for a search area) and is tasked with localizing a goal that is described in terms of visual cues. To further mimic the situation on an actual UAV, the agent is not able to observe the search area in its entirety, not even at low resolution, and thus it has to operate solely based on partial glimpses when navigating towards the goal. To tackle this task, we propose AiRLoc, a reinforcement learning (RL)-based model that decouples exploration (searching for distant goals) and exploitation (localizing nearby goals). Extensive evaluations show that AiRLoc outperforms heuristic search methods as well as alternative learnable approaches, and that it generalizes across datasets, e.g. to disaster-hit areas without seeing a single disaster scenario during training. We also conduct a proof-of-concept study which indicates that the learnable methods outperform humans on average. Code and models have been made publicly available at https://github.com/aleksispi/airloc.
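The task setup described in the abstract can be illustrated with a small toy environment. The following is a minimal sketch under stated assumptions, not the AiRLoc implementation: the grid size, step budget, integer "patches", and the names `GlimpseSearchEnv`, `random_search`, and `success_rate` are all hypothetical. It shows an agent that only ever sees the cell it currently occupies, plus a random-walk policy as an example of a heuristic baseline.

```python
import random

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


class GlimpseSearchEnv:
    """Toy search area: a grid laid over an aerial image (all values hypothetical)."""

    def __init__(self, size=5, max_steps=10, seed=0):
        self.size = size
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        # Each cell holds a unique integer standing in for an image patch.
        self.image = [[r * size + c for c in range(size)] for r in range(size)]
        self.reset()

    def glimpse(self):
        # The agent observes only the patch under its current position.
        r, c = self.agent
        return self.image[r][c]

    def reset(self):
        cells = [(r, c) for r in range(self.size) for c in range(self.size)]
        self.agent, self.goal = self.rng.sample(cells, 2)
        # The goal is described by its visual content, not by its coordinates.
        self.goal_patch = self.image[self.goal[0]][self.goal[1]]
        self.steps = 0
        return self.glimpse()

    def step(self, move):
        # Clamp the move so the agent stays inside the search area.
        r = min(max(self.agent[0] + move[0], 0), self.size - 1)
        c = min(max(self.agent[1] + move[1], 0), self.size - 1)
        self.agent = (r, c)
        self.steps += 1
        found = self.glimpse() == self.goal_patch
        done = found or self.steps >= self.max_steps
        return self.glimpse(), found, done


def random_search(env, rng):
    """Heuristic baseline: move at random until the goal is found or the budget ends."""
    env.reset()
    found, done = False, False
    while not done:
        _, found, done = env.step(rng.choice(MOVES))
    return found


def success_rate(episodes=200, seed=0):
    """Fraction of episodes in which the random baseline reaches the goal."""
    rng = random.Random(seed)
    env = GlimpseSearchEnv(seed=seed)
    return sum(random_search(env, rng) for _ in range(episodes)) / episodes
```

A learned policy would replace `random_search` by choosing each move from the sequence of glimpses seen so far and the goal patch; the step budget plays the role of the time and battery constraints mentioned above.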
