One Metric to Measure them All: Localisation Recall Precision (LRP) for Evaluating Visual Detection Tasks

Paper title

One Metric to Measure them All: Localisation Recall Precision (LRP) for Evaluating Visual Detection Tasks

Authors

Kemal Oksuz, Baris Can Cam, Sinan Kalkan, Emre Akbas

Abstract

Despite being widely used as a performance measure for visual detection tasks, Average Precision (AP) is limited in (i) reflecting localisation quality, (ii) interpretability and (iii) robustness to the design choices regarding its computation, and its applicability to outputs without confidence scores. Panoptic Quality (PQ), a measure proposed for evaluating panoptic segmentation (Kirillov et al., 2019), does not suffer from these limitations but is limited to panoptic segmentation. In this paper, we propose Localisation Recall Precision (LRP) Error as the average matching error of a visual detector computed based on both its localisation and classification qualities for a given confidence score threshold. LRP Error, initially proposed only for object detection by Oksuz et al. (2018), does not suffer from the aforementioned limitations and is applicable to all visual detection tasks. We also introduce Optimal LRP (oLRP) Error as the minimum LRP Error obtained over confidence scores to evaluate visual detectors and obtain optimal thresholds for deployment. We provide a detailed comparative analysis of LRP Error with AP and PQ, and use nearly 100 state-of-the-art visual detectors from seven visual detection tasks (i.e. object detection, keypoint detection, instance segmentation, panoptic segmentation, visual relationship detection, zero-shot detection and generalised zero-shot detection) using ten datasets to empirically show that LRP Error provides richer and more discriminative information than its counterparts. Code available at: https://github.com/kemaloksuz/LRP-Error
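The abstract describes LRP Error as an average matching error at a fixed confidence threshold, and oLRP Error as its minimum over thresholds, without giving formulas. The sketch below is one way to read that description. It assumes the form in which each true positive contributes its normalised localisation error (1 - quality) / (1 - tau), and each false positive or false negative contributes 1. The function names, the `tau` parameter, and the input layout are hypothetical, chosen for illustration; they are not taken from the authors' repository.

```python
from typing import Dict, Sequence, Tuple

# Per-threshold detector statistics (hypothetical layout):
# (localisation qualities of matched detections, number of FPs, number of FNs)
Stats = Tuple[Sequence[float], int, int]


def lrp_error(tp_qualities: Sequence[float], num_fp: int, num_fn: int,
              tau: float = 0.5) -> float:
    """Average matching error at one confidence threshold (assumed form).

    tp_qualities holds the localisation quality (e.g. IoU) of every true
    positive, each expected to lie in [tau, 1]. tau is the matching threshold.
    """
    total = len(tp_qualities) + num_fp + num_fn
    if total == 0:
        raise ValueError("LRP Error is undefined without detections or ground truths")
    # Each TP contributes its localisation error, rescaled to [0, 1].
    localisation_error = sum((1.0 - q) / (1.0 - tau) for q in tp_qualities)
    # Each FP and each FN contributes a full unit of error.
    return (localisation_error + num_fp + num_fn) / total


def optimal_lrp(per_threshold: Dict[float, Stats],
                tau: float = 0.5) -> Tuple[float, float]:
    """Return (best confidence threshold, minimum LRP Error over thresholds)."""
    errors = {s: lrp_error(q, fp, fn, tau) for s, (q, fp, fn) in per_threshold.items()}
    best = min(errors, key=errors.get)
    return best, errors[best]


# Toy statistics for three confidence thresholds (made-up numbers).
stats = {
    0.3: ([0.75, 0.75], 2, 0),  # low threshold: extra false positives
    0.5: ([0.75, 0.75], 0, 0),  # both objects found, no spurious detections
    0.7: ([0.75], 0, 1),        # high threshold: one object missed
}
best_threshold, best_error = optimal_lrp(stats)
```

In this toy setting the minimum falls at the middle threshold, which illustrates how the same search that yields the oLRP value also suggests a threshold for deployment.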
