Paper Title
Data Augmentation is a Hyperparameter: Cherry-picked Self-Supervision for Unsupervised Anomaly Detection is Creating the Illusion of Success
Authors
Abstract
Self-supervised learning (SSL) has emerged as a promising alternative for creating supervisory signals for real-world problems, avoiding the extensive cost of manual labeling. SSL is particularly attractive for unsupervised tasks such as anomaly detection (AD), where labeled anomalies are rare or often nonexistent. A large catalog of augmentation functions has been used for SSL-based AD (SSAD) on image data, and recent works have reported that the type of augmentation has a significant impact on accuracy. Motivated by these findings, this work puts image-based SSAD under a larger lens and investigates the role of data augmentation in SSAD. Through extensive experiments on 3 different detector models and across 420 AD tasks, we provide comprehensive numerical and visual evidence that the alignment between data augmentation and the anomaly-generating mechanism is the key to the success of SSAD, and that in the absence of such alignment, SSL may even impair accuracy. To the best of our knowledge, this is the first meta-analysis on the role of data augmentation in SSAD.