Paper Title
Statistical Evaluation of Anomaly Detectors for Sequences
Paper Authors
Paper Abstract
Although precision and recall are standard performance measures for anomaly detection, their statistical properties in sequential detection settings are poorly understood. In this work, we formalize a notion of precision and recall with temporal tolerance for point-based anomaly detection in sequential data. These measures are based on time-tolerant confusion matrices that may be used to compute time-tolerant variants of many other standard measures. However, care must be taken to preserve interpretability. We perform a statistical simulation study to demonstrate that precision and recall may overestimate the performance of a detector when computed with temporal tolerance. To alleviate this problem, we show how to obtain null distributions for the two measures to assess the statistical significance of reported results.
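To make the idea of temporal tolerance and a null distribution concrete, the following is a minimal sketch and not the paper's formal definition. It assumes one common matching rule: a predicted point counts as correct if it lies within `tol` time steps of some true anomaly, and a true anomaly counts as detected if some prediction lies within `tol` steps of it. The null model is also an assumption: a detector that places the same number of predictions uniformly at random in a sequence of length `n`. The function names `tolerant_precision_recall` and `null_p_values` are hypothetical.

```python
import random


def tolerant_precision_recall(true_idx, pred_idx, tol):
    """Precision and recall with temporal tolerance (assumed matching rule).

    A prediction is a hit if it is within `tol` steps of any true anomaly;
    a true anomaly is detected if any prediction is within `tol` steps of it.
    """
    true_idx, pred_idx = set(true_idx), set(pred_idx)
    hit_preds = sum(any(abs(p - t) <= tol for t in true_idx) for p in pred_idx)
    hit_trues = sum(any(abs(t - p) <= tol for p in pred_idx) for t in true_idx)
    precision = hit_preds / len(pred_idx) if pred_idx else 0.0
    recall = hit_trues / len(true_idx) if true_idx else 0.0
    return precision, recall


def null_p_values(true_idx, pred_idx, n, tol, trials=2000, seed=0):
    """Monte Carlo p-values under an assumed random-detector null model.

    Each trial draws as many predictions as the detector made, uniformly
    at random without replacement from positions 0..n-1.
    """
    rng = random.Random(seed)
    obs_p, obs_r = tolerant_precision_recall(true_idx, pred_idx, tol)
    k = len(set(pred_idx))
    ge_p = ge_r = 0
    for _ in range(trials):
        fake = rng.sample(range(n), k)
        p, r = tolerant_precision_recall(true_idx, fake, tol)
        ge_p += p >= obs_p
        ge_r += r >= obs_r
    # Add-one correction keeps the estimated p-values strictly positive.
    return (ge_p + 1) / (trials + 1), (ge_r + 1) / (trials + 1)
```

Under this sketch, widening `tol` can only raise both scores, for a random detector as well, which is why comparing an observed score against a null distribution computed at the same tolerance is informative.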