Paper Title
Semantic Signatures for Large-scale Visual Localization
Paper Authors
Abstract
Visual localization is a useful alternative to standard localization techniques: instead of dedicated positioning sensors, it relies on cameras. In a typical scenario, features are extracted from captured images and compared against a geo-referenced database, and location information is then inferred from the matching results. Conventional schemes mainly use low-level visual features; these approaches offer good accuracy but suffer from scalability issues. To support localization in large urban areas, this work explores a different path by exploiting high-level semantic information. We find that object information in a street view can facilitate localization. A novel descriptor, called the "semantic signature", is proposed to summarize this information. A semantic signature consists of the type and angle information of the objects visible from a spatial location. Several metrics and protocols are proposed for signature comparison and retrieval, illustrating different trade-offs between accuracy and complexity. Extensive simulation results confirm the potential of the proposed scheme in large-scale applications. This paper is an extended version of a conference paper presented at CBMI'18; it adds a more efficient retrieval protocol and additional experimental results.
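To make the idea concrete, the sketch below models a semantic signature as a sequence of (object type, viewing angle) pairs and compares two signatures with a simple positional distance. This is an illustrative assumption only: the function names, the sorting-by-angle convention, and the `signature_distance` metric are hypothetical and do not reproduce the metrics or protocols actually proposed in the paper.

```python
from math import pi

def make_signature(observations):
    """Build a toy semantic signature from (object_type, angle) pairs,
    sorted by viewing angle in radians. Illustrative only; not the
    paper's actual signature construction."""
    return sorted(observations, key=lambda obs: obs[1])

def signature_distance(sig_a, sig_b, angle_weight=1.0):
    """A naive comparison metric (hypothetical, for illustration):
    unmatched objects and type mismatches each cost 1, and matched
    positions add a weighted circular angle difference."""
    cost = float(abs(len(sig_a) - len(sig_b)))  # penalize length mismatch
    for (type_a, ang_a), (type_b, ang_b) in zip(sig_a, sig_b):
        if type_a != type_b:
            cost += 1.0
        diff = abs(ang_a - ang_b) % (2 * pi)
        cost += angle_weight * min(diff, 2 * pi - diff)  # circular distance
    return cost
```

In a retrieval setting, a query signature extracted at an unknown location would be compared against the signatures of all database locations, and the lowest-distance entries would be returned as location hypotheses.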