Temporal Output Discrepancy for Loss Estimation-based Active Learning

Paper Title

Temporal Output Discrepancy for Loss Estimation-based Active Learning

Authors

Siyu Huang, Tianyang Wang, Haoyi Xiong, Bihan Wen, Jun Huan, Dejing Dou

Abstract

While deep learning succeeds in a wide range of tasks, it depends heavily on large collections of annotated data, which are expensive and time-consuming to produce. To lower the cost of data annotation, active learning has been proposed: it interactively queries an oracle to annotate a small proportion of informative samples in an unlabeled dataset. Inspired by the fact that samples with higher loss are usually more informative to the model than samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur high loss. The core of our approach is a measurement, Temporal Output Discrepancy (TOD), that estimates the sample loss by evaluating the discrepancy between the outputs given by the model at different optimization steps. Our theoretical investigation shows that TOD lower-bounds the accumulated sample loss, so it can be used to select informative unlabeled samples. Based on TOD, we further develop an effective unlabeled data sampling strategy as well as an unsupervised learning criterion for active learning. Owing to the simplicity of TOD, our methods are efficient, flexible, and task-agnostic. Extensive experimental results demonstrate that our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks. In addition, we show that TOD can be used to select, from a pool of candidate models, the model with potentially the highest testing accuracy.
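The sampling idea in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which distance is used, so the Euclidean norm is assumed here, and the function names `temporal_output_discrepancy` and `select_for_annotation`, the `budget` parameter, and the toy output vectors are all hypothetical.

```python
import math
from typing import List, Sequence


def temporal_output_discrepancy(out_earlier: Sequence[float],
                                out_later: Sequence[float]) -> float:
    """Distance between one sample's outputs at two optimization steps.

    Assumption: Euclidean (L2) distance; the abstract only says "discrepancy".
    """
    if len(out_earlier) != len(out_later):
        raise ValueError("outputs must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(out_earlier, out_later)))


def select_for_annotation(outputs_earlier: Sequence[Sequence[float]],
                          outputs_later: Sequence[Sequence[float]],
                          budget: int) -> List[int]:
    """Return indices of the `budget` unlabeled samples with the largest TOD.

    A larger discrepancy is treated as a proxy for a higher sample loss,
    so these samples are the ones sent to the oracle for labeling.
    """
    scores = [temporal_output_discrepancy(a, b)
              for a, b in zip(outputs_earlier, outputs_later)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy outputs for three unlabeled samples at an earlier and a later step.
earlier = [[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]
later = [[0.5, 0.5], [0.6, 0.4], [0.1, 0.9]]
chosen = select_for_annotation(earlier, later, budget=2)
```

In this toy case the first sample's output does not change between the two steps, so it is ranked last; the second sample changes the most and is selected first.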
