Paper Title
Hyperparameter Optimization for Unsupervised Outlier Detection
Paper Authors
Paper Abstract
Given an unsupervised outlier detection (OD) algorithm, how can we optimize its hyperparameter(s) (HP) on a new dataset, without any labels? In this work, we address the challenging problem of hyperparameter optimization for unsupervised OD, and propose the first systematic, meta-learning-based approach, called HPOD. HPOD capitalizes on the prior performance of a large collection of HPs on existing OD benchmark datasets, and transfers this information to enable HP evaluation on a new dataset without labels. Moreover, HPOD adapts a prominent sampling paradigm to identify promising HPs efficiently. Extensive experiments show that HPOD works with both deep (e.g., Robust AutoEncoder) and shallow (e.g., Local Outlier Factor (LOF) and Isolation Forest (iForest)) OD algorithms on discrete and continuous HP spaces, and outperforms a wide range of baselines, with on average 58% and 66% performance improvement over the default HPs of LOF and iForest, respectively.
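To make the transfer idea concrete, below is a minimal Python sketch of a meta-learned HP evaluator in the spirit of the abstract, not the authors' implementation. Everything concrete here is a simplifying assumption: the surrogate model, the hypothetical helpers `meta_features` and `make_od_dataset`, the toy synthetic benchmarks, ROC-AUC as the performance measure, LOF's `n_neighbors` as the sole HP, and exhaustive grid scoring in place of HPOD's adaptive sampling.

```python
# A minimal sketch of meta-learning-based HP evaluation for unsupervised OD.
# Assumptions (not from the paper): a random-forest surrogate, simple moment
# meta-features, synthetic labeled benchmarks, and LOF's n_neighbors as the HP.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import LocalOutlierFactor
from sklearn.metrics import roc_auc_score

def make_od_dataset(n=300, d=5, contamination=0.1, seed=0):
    """Hypothetical toy benchmark: Gaussian inliers plus uniform outliers."""
    r = np.random.default_rng(seed)
    n_out = int(n * contamination)
    X = np.vstack([r.normal(size=(n - n_out, d)),
                   r.uniform(-6, 6, size=(n_out, d))])
    y = np.r_[np.zeros(n - n_out), np.ones(n_out)]  # 1 = outlier
    return X, y

def meta_features(X):
    """Hypothetical meta-features: size, dimensionality, feature moments."""
    return np.array([X.shape[0], X.shape[1],
                     X.var(axis=0).mean(), np.abs(X.mean(axis=0)).mean()])

def lof_scores(X, k):
    """LOF outlier scores; higher means more outlying."""
    lof = LocalOutlierFactor(n_neighbors=k).fit(X)
    return -lof.negative_outlier_factor_

hp_grid = [5, 10, 20, 35, 50]  # candidate n_neighbors values

# Meta-training: record (meta-features, HP) -> ROC-AUC on labeled benchmarks.
records, targets = [], []
for seed in range(20):  # 20 toy benchmark datasets of varying size/contamination
    X, y = make_od_dataset(n=200 + 20 * seed, seed=seed,
                           contamination=0.05 + 0.01 * (seed % 5))
    mf = meta_features(X)
    for k in hp_grid:
        records.append(np.r_[mf, k])
        targets.append(roc_auc_score(y, lof_scores(X, k)))
surrogate = RandomForestRegressor(random_state=0).fit(records, targets)

# New dataset WITHOUT labels: the surrogate acts as the HP evaluator.
X_new, y_new = make_od_dataset(seed=99, contamination=0.08)  # y_new only for the final check
mf_new = meta_features(X_new)
pred = surrogate.predict([np.r_[mf_new, k] for k in hp_grid])
best_k = hp_grid[int(np.argmax(pred))]
print("selected n_neighbors:", best_k)
print("actual ROC-AUC with selected HP:",
      round(roc_auc_score(y_new, lof_scores(X_new, best_k)), 3))
```

In HPOD proper, the surrogate is built from real OD benchmark performances rather than synthetic data, and the exhaustive grid scoring above would be replaced by the adaptive sampling strategy the abstract mentions, which evaluates only promising HPs.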