A Saliency-Guided Street View Image Inpainting Framework for Efficient Last-Meters Wayfinding

Paper Title

A Saliency-Guided Street View Image Inpainting Framework for Efficient Last-Meters Wayfinding

Authors

Chuanbo Hu, Shan Jia, Fan Zhang, Xin Li

Abstract

Global Positioning Systems (GPS) have played a crucial role in various navigation applications. Nevertheless, localizing the desired destination within the last few meters remains an important but unresolved problem. Limited by GPS positioning accuracy, navigation systems always show users the vicinity of a destination, but not its exact location. Street view images (SVI) in maps, as an immersive media technology, have served as an aid that provides the physical environment for human last-meters wayfinding. However, due to the large diversity of geographic context and acquisition conditions, the captured SVI always contains various distracting objects (e.g., pedestrians and vehicles), which draw human visual attention away from efficiently finding the destination in the last few meters. To address this problem, we highlight the importance of reducing visual distraction in image-based wayfinding by proposing a saliency-guided image inpainting framework. It aims to redirect human visual attention from distracting objects to destination-related objects for more efficient and accurate wayfinding in the last meters. Specifically, a context-aware distracting object detection method driven by deep salient object detection has been designed to extract distracting objects from three semantic levels in SVI. We then employ a large-mask inpainting method with fast Fourier convolutions to remove the detected distracting objects. Experimental results with both qualitative and quantitative analysis show that our saliency-guided inpainting method not only achieves high perceptual quality in street view images but also redirects human visual attention to focus more on static location-related objects than on distracting ones. The human-based evaluation also supports the effectiveness of our method in improving the efficiency of locating the target destination.
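The two-stage idea described in the abstract (detect distracting objects, then remove them by inpainting) can be illustrated with a toy sketch. This is not the authors' implementation: the deep salient object detector is replaced here by a hand-made saliency map, and the large-mask inpainting network with fast Fourier convolutions is replaced by a simple neighbour-averaging diffusion fill. The function names `distractor_mask` and `diffusion_fill`, the threshold, and the synthetic 16x16 grayscale scene are all assumptions made for illustration.

```python
import numpy as np

def distractor_mask(saliency, is_distractor, threshold=0.5):
    """Mask pixels that are both salient and labelled as a distracting class."""
    return (saliency >= threshold) & is_distractor

def diffusion_fill(image, mask, iterations=200):
    """Toy stand-in for inpainting (NOT the paper's method): repeatedly
    replace each masked pixel with the mean of its four neighbours."""
    filled = image.astype(float).copy()
    filled[mask] = filled[~mask].mean()  # neutral initial guess
    for _ in range(iterations):
        padded = np.pad(filled, 1, mode="edge")
        neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                      + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        filled[mask] = neighbours[mask]  # only masked pixels are updated
    return filled

# Synthetic grayscale "street scene": a smooth horizontal gradient.
H, W = 16, 16
scene = np.tile(np.linspace(0.0, 1.0, W), (H, 1))

# Paste a bright "pedestrian" blob on top of the scene.
street = scene.copy()
street[6:10, 6:10] = 1.0

# Hand-made saliency map: the pedestrian and a storefront sign are both salient.
saliency = np.zeros((H, W))
saliency[6:10, 6:10] = 0.9   # pedestrian (distracting)
saliency[2:5, 10:14] = 0.8   # storefront sign (destination-related)

# Hypothetical semantic labels: only the pedestrian counts as a distractor.
is_distractor = np.zeros((H, W), dtype=bool)
is_distractor[6:10, 6:10] = True

mask = distractor_mask(saliency, is_distractor)
restored = diffusion_fill(street, mask)
```

In this sketch the storefront sign stays untouched even though it is salient, because only pixels that are both salient and labelled as distracting enter the mask. The diffusion fill can only recover smooth backgrounds such as this gradient; filling large masks over structured street scenes is the harder case that the inpainting network in the abstract is meant to handle.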
