Paper title
The Met Dataset: Instance-level Recognition for Artworks
Paper authors
Paper abstract
This work introduces a dataset for large-scale instance-level recognition in the domain of artworks. The proposed benchmark poses several challenges at once: large inter-class similarity, a long-tailed distribution, and a very large number of classes. We rely on the open access collection of The Met museum to form a large training set of about 224k classes, where each class corresponds to a museum exhibit with photos taken under studio conditions. Testing is primarily performed on photos of exhibits taken by museum guests, which introduces a distribution shift between training and testing. Testing is additionally performed on a set of images not related to Met exhibits, which makes the task resemble an out-of-distribution detection problem. The proposed benchmark follows the paradigm of other recent datasets for instance-level recognition in different domains, in order to encourage research on domain-independent approaches. A number of suitable approaches are evaluated to offer a testbed for future comparisons. Self-supervised and supervised contrastive learning are effectively combined to train the backbone, which is then used for non-parametric classification; this is shown to be a promising direction. Dataset webpage: http://cmp.felk.cvut.cz/met/
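The non-parametric classification and out-of-distribution setting described in the abstract can be illustrated with a minimal sketch. It assumes that embeddings have already been extracted by some backbone, and it is not the authors' implementation: the function names, the choice of cosine similarity, the value of `k`, and the `reject_below` threshold are all hypothetical, chosen only for illustration. A query is assigned the class of its most similar training embedding, and is rejected as unrelated to any exhibit when that similarity is too low.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def knn_classify(query, train_emb, train_labels, k=1, reject_below=0.5):
    """Nearest-neighbor classification with a rejection option.

    Returns (label, confidence). The label is None when the best
    similarity falls below `reject_below` (an assumed threshold),
    meaning the query is treated as out-of-distribution.
    """
    q = l2_normalize(np.asarray(query, dtype=float))
    db = l2_normalize(np.asarray(train_emb, dtype=float))
    sims = db @ q                      # cosine similarity to every training image
    top = np.argsort(-sims)[:k]        # indices of the k most similar images

    # Keep the best similarity seen for each class among the k neighbors.
    scores = {}
    for i in top:
        label = train_labels[i]
        scores[label] = max(scores.get(label, -1.0), float(sims[i]))

    label, conf = max(scores.items(), key=lambda kv: kv[1])
    if conf < reject_below:
        return None, conf
    return label, conf


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for backbone features (made-up values).
    train = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
    labels = ["vase", "painting", "vase"]
    print(knn_classify([1.0, 0.05], train, labels, k=2))
    print(knn_classify([-1.0, -1.0], train, labels, k=2))
```

Because no classifier weights are learned per class, a scheme like this scales to a very large number of classes with few images each by simply adding embeddings to the database; the quality of the result depends entirely on the backbone that produces the embeddings.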