Paper Title
Self-Supervised Place Recognition by Refining Temporal and Featural Pseudo Labels from Panoramic Data
Paper Authors
Paper Abstract
Visual place recognition (VPR) using deep networks has achieved state-of-the-art performance. However, most of these methods require a training set with ground-truth sensor poses to obtain positive and negative samples of each observation's spatial neighborhood for supervised learning. When such information is unavailable, temporal neighborhoods from a sequentially collected data stream can be exploited for self-supervised training, although we find the resulting performance suboptimal. Inspired by noisy-label learning, we propose a novel self-supervised framework named TF-VPR that uses temporal neighborhoods and learnable feature neighborhoods to discover unknown spatial neighborhoods. Our method follows an iterative training paradigm that alternates between: (1) representation learning with data augmentation, (2) positive set expansion to include the current feature-space neighbors, and (3) positive set contraction via geometric verification. We conduct auto-labeling and generalization tests on both simulated and real datasets, with either RGB images or point clouds as input. The results show that our method outperforms self-supervised baselines in recall rate, robustness, and heading diversity, a novel metric we propose for VPR. Our code and datasets can be found at https://ai4ce.github.io/TF-VPR/.
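To make the iterative label-refinement paradigm in the abstract concrete, the snippet below gives a minimal sketch of one refinement round. It is not the authors' implementation: it assumes descriptors are plain NumPy arrays, omits the representation-learning step, and stands in for geometric verification with a user-supplied `geometric_check` callback (a hypothetical name introduced here for illustration).

```python
import numpy as np

def refine_pseudo_labels(embeddings, temporal_window, feat_k, geometric_check):
    """One refinement round of a TF-VPR-style loop (illustrative sketch only).

    embeddings      : (N, D) array of current descriptors, one per frame.
    temporal_window : frames within this index distance seed the positive set.
    feat_k          : number of feature-space nearest neighbors to add.
    geometric_check : callable (i, j) -> bool used to prune false positives.
    """
    n = embeddings.shape[0]
    positives = []
    for i in range(n):
        # (1) Seed the positive set with temporal neighbors from the data stream.
        pos = {j for j in range(max(0, i - temporal_window),
                                min(n, i + temporal_window + 1)) if j != i}

        # (2) Positive set expansion: add current feature-space neighbors.
        dists = np.linalg.norm(embeddings - embeddings[i], axis=1)
        dists[i] = np.inf
        pos |= set(int(j) for j in np.argsort(dists)[:feat_k])

        # (3) Positive set contraction: keep only pairs passing geometric verification.
        pos = {j for j in pos if geometric_check(i, j)}
        positives.append(sorted(pos))
    return positives


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(20, 8))        # toy descriptors
    always_pass = lambda i, j: True       # stand-in for real geometric verification
    labels = refine_pseudo_labels(emb, temporal_window=2, feat_k=3,
                                  geometric_check=always_pass)
    print(labels[0])
```

In the paper's full loop, such a refinement step would alternate with contrastive representation learning on the updated positive sets; that step is omitted here to keep the sketch focused on pseudo-label refinement.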