Paper title
Unsupervised Pre-training for Person Re-identification
Paper authors
Paper abstract
In this paper, we present a large-scale unlabeled person re-identification (Re-ID) dataset, "LUPerson", and make the first attempt at performing unsupervised pre-training to improve the generalization ability of the learned person Re-ID feature representation. This addresses the problem that all existing person Re-ID datasets are of limited scale due to the costly effort required for data annotation. Previous research tries to leverage models pre-trained on ImageNet to mitigate the shortage of person Re-ID data, but suffers from the large domain gap between ImageNet and person Re-ID data. LUPerson is an unlabeled dataset of 4M images of over 200K identities, which is 30X larger than the largest existing Re-ID dataset. It also covers a much more diverse range of capturing environments (e.g., camera settings and scenes). Based on this dataset, we systematically study the key factors for learning Re-ID features from two perspectives: data augmentation and contrastive loss. Unsupervised pre-training performed on this large-scale dataset effectively leads to a generic Re-ID feature that can benefit all existing person Re-ID methods. Using our pre-trained model in some basic frameworks, our methods achieve state-of-the-art results without bells and whistles on four widely used Re-ID datasets: CUHK03, Market1501, DukeMTMC, and MSMT17. Our results also show that the performance improvement is more significant on small-scale target datasets or under the few-shot setting.
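The abstract names contrastive loss as one of the two factors studied but does not give its form. As a rough illustration only, the sketch below computes a generic InfoNCE-style contrastive loss for a single query: two augmented views of the same image form the positive pair, and features of other images act as negatives. The function names, the temperature value of 0.07, and the choice of InfoNCE itself are assumptions made for this example; the abstract does not confirm that this is the exact loss used in the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss for one query (hypothetical helper, not the paper's code).

    query     : feature of one augmented view of an image
    positive  : feature of another augmented view of the same image
    negatives : list of features from other images
    temperature is an assumed value, not taken from the abstract.
    """
    q = l2_normalize(query)
    # The positive logit goes first, followed by one logit per negative.
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]


if __name__ == "__main__":
    easy = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
    hard = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
    print(f"positive aligned with query: {easy:.6f}")
    print(f"negative aligned with query: {hard:.6f}")
```

The loss is small when the query is closer to its positive than to any negative, and grows when a negative is the closer match, which is the behavior a contrastive pre-training objective relies on.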