Paper title
SeaTurtleID2022: A long-span dataset for reliable sea turtle re-identification
Paper authors
Paper abstract
This paper introduces the first public large-scale, long-span dataset of sea turtle photographs captured in the wild: \href{https://www.kaggle.com/datasets/wildlifedatasets/seaturtleid2022}{SeaTurtleID2022}. The dataset contains 8729 photographs of 438 unique individuals collected over 13 years, making it the longest-spanning dataset for animal re-identification. All photographs include various annotations, e.g., identity, encounter timestamp, and body-part segmentation masks. Instead of standard "random" splits, the dataset allows for two realistic and ecologically motivated splits: (i) a \textit{time-aware closed-set} with training, validation, and test data from different days/years, and (ii) a \textit{time-aware open-set} with new unknown individuals in test and validation sets. We show that time-aware splits are essential for benchmarking re-identification methods, as random splits lead to performance overestimation. Furthermore, baseline instance segmentation and re-identification performance over various body parts is provided. Finally, an end-to-end system for sea turtle re-identification is proposed and evaluated. The proposed system, based on Hybrid Task Cascade for head instance segmentation and an ArcFace-trained feature extractor, achieved an accuracy of 86.8\%.
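The difference between the two time-aware splits can be illustrated with a minimal sketch. This is not the dataset's official split procedure: the record layout `(image_id, identity, encounter_date)`, the function name `time_aware_split`, the single cutoff date, and the toy records are all assumptions made up for illustration, and the validation set is omitted for brevity.

```python
from datetime import date


def time_aware_split(records, cutoff, open_set=False):
    """Split photos by encounter date instead of at random (illustrative only).

    records: iterable of (image_id, identity, encounter_date) tuples
             (a hypothetical layout, not the dataset's actual schema).
    Photos taken before `cutoff` go to training; later photos go to test.
    Closed-set: test keeps only individuals already present in training.
    Open-set: test also keeps new individuals, relabelled as "unknown".
    """
    train = [r for r in records if r[2] < cutoff]
    known = {identity for _, identity, _ in train}

    test = []
    for image_id, identity, day in records:
        if day < cutoff:
            continue  # already assigned to training
        if identity in known:
            test.append((image_id, identity, day))
        elif open_set:
            test.append((image_id, "unknown", day))
    return train, test


# Toy records invented for this example; they are not taken from the dataset.
records = [
    ("img1", "t001", date(2012, 7, 1)),
    ("img2", "t001", date(2015, 6, 9)),
    ("img3", "t002", date(2013, 8, 3)),
    ("img4", "t002", date(2020, 7, 15)),
    ("img5", "t003", date(2021, 6, 2)),  # individual first seen after the cutoff
]

cutoff = date(2018, 1, 1)
train, closed_test = time_aware_split(records, cutoff)
_, open_test = time_aware_split(records, cutoff, open_set=True)
```

In this sketch every test photo is strictly later than every training photo, so a model cannot rely on near-duplicate images from the same encounter, which is the kind of leakage a random split allows. The open-set variant additionally requires the model to reject individuals it has never seen.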