Paper Title
Self-Supervised Place Recognition by Refining Temporal and Featural Pseudo Labels from Panoramic Data
Paper Authors
Paper Abstract
Visual place recognition (VPR) using deep networks has achieved state-of-the-art performance. However, most of these methods require a training set with ground-truth sensor poses to obtain positive and negative samples of each observation's spatial neighborhood for supervised learning. When such information is unavailable, temporal neighborhoods from a sequentially collected data stream can be exploited for self-supervised training, although we find the resulting performance suboptimal. Inspired by noisy-label learning, we propose a novel self-supervised framework named TF-VPR that uses temporal neighborhoods and learnable feature neighborhoods to discover unknown spatial neighborhoods. Our method follows an iterative training paradigm that alternates between: (1) representation learning with data augmentation, (2) positive set expansion to include the current feature-space neighbors, and (3) positive set contraction via geometric verification. We conduct auto-labeling and generalization tests on both simulated and real datasets, with either RGB images or point clouds as input. The results show that our method outperforms self-supervised baselines in recall rate, robustness, and heading diversity, a novel metric we propose for VPR. Our code and datasets can be found at https://ai4ce.github.io/TF-VPR/.
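To make the iterative label-refinement paradigm in the abstract concrete, the snippet below gives a minimal sketch of one refinement round. It is not the authors' implementation: it assumes descriptors are plain NumPy arrays, omits the representation-learning step, and stands in for geometric verification with a user-supplied `geometric_check` callback (a hypothetical name introduced here for illustration).

```python
import numpy as np

def refine_pseudo_labels(embeddings, temporal_window, feat_k, geometric_check):
    """One refinement round of a TF-VPR-style loop (illustrative sketch only).

    embeddings      : (N, D) array of current descriptors, one per frame.
    temporal_window : frames within this index distance seed the positive set.
    feat_k          : number of feature-space nearest neighbors to add.
    geometric_check : callable (i, j) -> bool used to prune false positives.
    """
    n = embeddings.shape[0]
    positives = []
    for i in range(n):
        # (1) Seed the positive set with temporal neighbors from the data stream.
        pos = {j for j in range(max(0, i - temporal_window),
                                min(n, i + temporal_window + 1)) if j != i}

        # (2) Positive set expansion: add current feature-space neighbors.
        dists = np.linalg.norm(embeddings - embeddings[i], axis=1)
        dists[i] = np.inf
        pos |= set(int(j) for j in np.argsort(dists)[:feat_k])

        # (3) Positive set contraction: keep only pairs passing geometric verification.
        pos = {j for j in pos if geometric_check(i, j)}
        positives.append(sorted(pos))
    return positives


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(20, 8))        # toy descriptors
    always_pass = lambda i, j: True       # stand-in for real geometric verification
    labels = refine_pseudo_labels(emb, temporal_window=2, feat_k=3,
                                  geometric_check=always_pass)
    print(labels[0])
```

In the paper's full loop, such a refinement step would alternate with contrastive representation learning on the updated positive sets; that step is omitted here to keep the sketch focused on pseudo-label refinement.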