Paper title
Local Evaluation of Time Series Anomaly Detection Algorithms
Paper authors
Paper abstract
In recent years, specific evaluation metrics for time series anomaly detection algorithms have been developed to address the limitations of classical precision and recall. However, such metrics are heuristically built as an aggregate of multiple desirable aspects, introduce parameters, and wipe out the interpretability of the output. In this article, we first highlight the limitations of classical precision/recall, as well as the main issues of recent event-based metrics; for instance, we show that an adversary algorithm can reach high precision and recall on almost any dataset under weak assumptions. To cope with these problems, we propose a theoretically grounded, robust, parameter-free and interpretable extension to precision/recall metrics, based on the concept of "affiliation" between the ground truth and the prediction sets. Our metrics leverage measures of duration between ground truth and predictions, and thus have an intuitive interpretation. By further comparison against random sampling, we obtain a normalized precision/recall, quantifying how much better a given set of results is than a random baseline prediction. By construction, our approach keeps the evaluation local with respect to ground truth events, enabling fine-grained visualization and interpretation of algorithmic results. We evaluate our proposal on various public time series anomaly detection datasets and algorithms, and compare it against existing metrics. We further derive theoretical properties of the affiliation metrics that give explicit expectations about their behavior and ensure robustness against adversary strategies.
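The limitation of classical precision/recall mentioned in the abstract can be illustrated with a small sketch. The code below is not the paper's affiliation metric; it is a simplified illustration under stated assumptions. The timestamps are made-up data, and `mean_distance_to_set` is a hypothetical helper that only hints at why a distance-based (duration-based) measure can separate a near miss from a far miss, while pointwise precision/recall scores both as zero.

```python
def pointwise_precision_recall(truth, pred):
    """Classical precision/recall over sets of anomalous timestamps."""
    truth, pred = set(truth), set(pred)
    tp = len(truth & pred)  # timestamps flagged by both
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


def mean_distance_to_set(points, targets):
    """Hypothetical helper: average distance from each point to its
    nearest target timestamp (a crude stand-in for a duration measure)."""
    return sum(min(abs(p - t) for t in targets) for p in points) / len(points)


# Made-up data for illustration only.
truth = [10, 11, 12]   # one ground truth event
near_miss = [13, 14]   # prediction right after the event
far_miss = [90, 91]    # prediction far away from the event

# Both predictions get identical pointwise scores.
print(pointwise_precision_recall(truth, near_miss))  # (0.0, 0.0)
print(pointwise_precision_recall(truth, far_miss))   # (0.0, 0.0)

# A distance-based view tells them apart.
print(mean_distance_to_set(near_miss, truth))  # 1.5
print(mean_distance_to_set(far_miss, truth))   # 78.5
```

In this toy case the pointwise scores cannot distinguish a detection that is two steps late from one that is completely unrelated to the event, whereas the average distance does. The affiliation metrics described in the abstract go further than this sketch, for example by normalizing against random sampling and evaluating each ground truth event locally.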