Paper Title
Proposal Learning for Semi-Supervised Object Detection
Paper Authors
Paper Abstract
In this paper, we focus on semi-supervised object detection to boost the performance of proposal-based object detectors (a.k.a. two-stage object detectors) by training on both labeled and unlabeled data. However, it is non-trivial to train object detectors on unlabeled data due to the unavailability of ground truth labels. To address this problem, we present a proposal learning approach to learn proposal features and predictions from both labeled and unlabeled data. The approach consists of a self-supervised proposal learning module and a consistency-based proposal learning module. In the self-supervised proposal learning module, we present a proposal location loss and a contrastive loss to learn context-aware and noise-robust proposal features, respectively. In the consistency-based proposal learning module, we apply consistency losses to both the bounding box classification and regression predictions of proposals to learn noise-robust proposal features and predictions. Our approach enjoys the following benefits: 1) it encourages more context information to be delivered in the proposal learning procedure; 2) it injects noisy proposal features and enforces consistency to allow noise-robust object detection; 3) it builds a general and high-performance semi-supervised object detection framework that can be easily adapted to proposal-based object detectors with different backbone architectures. Experiments are conducted on the COCO dataset with all available labeled and unlabeled data. Results demonstrate that our approach consistently improves the performance of fully-supervised baselines. In particular, after combining with data distillation, our approach improves AP by about 2.0% and 0.9% on average compared to the fully-supervised and data distillation baselines, respectively.
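To make the two modules concrete, below is a minimal PyTorch-style sketch of the three loss families the abstract names: a proposal location loss and a contrastive loss (self-supervised module), and classification/regression consistency losses (consistency-based module). Everything here is an illustrative assumption rather than the paper's exact formulation: `loc_head` and `proj_head` are hypothetical small networks, and the specific loss forms (L2 for proposal location, an InfoNCE-style contrastive loss, KL divergence plus smooth-L1 for consistency) are plausible instantiations, not confirmed from the paper body.

```python
import torch
import torch.nn.functional as F

def proposal_location_loss(feat, boxes, loc_head):
    # Self-supervised pretext task: regress each proposal's own normalized
    # (x, y, w, h) coordinates (in [0, 1]) from its pooled feature.
    # An L2 loss on sigmoid outputs is an assumed choice.
    pred = torch.sigmoid(loc_head(feat))   # (N, 4)
    return F.mse_loss(pred, boxes)

def contrastive_loss(feat, feat_noisy, proj_head, tau=0.1):
    # InfoNCE-style loss (assumed form): a noisy view of a proposal should
    # match its own clean view against all other proposals in the batch.
    z = F.normalize(proj_head(feat), dim=1)         # (N, D)
    z_noisy = F.normalize(proj_head(feat_noisy), dim=1)
    logits = z_noisy @ z.t() / tau                  # (N, N) similarities
    targets = torch.arange(z.size(0), device=z.device)
    return F.cross_entropy(logits, targets)

def consistency_loss(cls_clean, cls_noisy, box_clean, box_noisy):
    # Consistency between clean and noisy branches: KL divergence for
    # classification scores and smooth-L1 for box regression outputs,
    # with no gradient flowing through the clean branch.
    kl = F.kl_div(F.log_softmax(cls_noisy, dim=1),
                  F.softmax(cls_clean.detach(), dim=1),
                  reduction="batchmean")
    reg = F.smooth_l1_loss(box_noisy, box_clean.detach())
    return kl + reg
```

Under this reading, none of the three terms needs ground-truth labels, so they can be applied to unlabeled images and simply added to the usual supervised detection losses on labeled images.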