TJU-DHD: A Diverse High-Resolution Dataset for Object Detection

Paper Title

TJU-DHD: A Diverse High-Resolution Dataset for Object Detection

Authors

Pang, Yanwei; Cao, Jiale; Li, Yazhao; Xie, Jin; Sun, Hanqing; Gong, Jinfeng

Abstract

Vehicles, pedestrians, and riders are the most important and interesting objects for the perception modules of self-driving vehicles and video surveillance. However, state-of-the-art performance in detecting these objects (especially small ones) still falls far short of what practical systems demand. Large-scale, richly diverse, high-resolution datasets play an important role in developing better object detection methods to meet this demand. Existing public large-scale datasets, such as MS COCO, which was collected from websites, do not focus on these specific scenarios. Moreover, the popular datasets collected from these scenarios (e.g., KITTI and CityPersons) are limited in the number of images and instances, in resolution, and in diversity. To address this problem, we build a diverse high-resolution dataset called TJU-DHD. The dataset contains 115,354 high-resolution images (52% of the images have a resolution of 1,624$\times$1,200 pixels and 48% have a resolution of at least 2,560$\times$1,440 pixels) and 709,330 labeled objects in total, with large variance in scale and appearance. The dataset is also richly diverse in season, illumination, and weather. In addition, a new diverse pedestrian dataset is built. Experiments on object detection and pedestrian detection are conducted with four different detectors (the one-stage RetinaNet, anchor-free FCOS, two-stage FPN, and Cascade R-CNN). We hope that the newly built dataset can help promote research on object detection and pedestrian detection in these two scenes. The dataset is available at https://github.com/tjubiit/TJU-DHD.
