The Lifecycle of a Statistical Model: Model Failure Detection, Identification, and Refitting

Paper Title

The Lifecycle of a Statistical Model: Model Failure Detection, Identification, and Refitting

Authors

Alnur Ali, Maxime Cauchois, John C. Duchi

Abstract

The statistical machine learning community has demonstrated considerable resourcefulness over the years in developing highly expressive tools for estimation, prediction, and inference. The bedrock assumptions underlying these developments are that the data come from a fixed population and display little heterogeneity. But reality is significantly more complex: statistical models now routinely fail when released into real-world systems and scientific applications, where such assumptions rarely hold. Consequently, this paper pursues a different path from the well-worn trail of developing new methodology for estimation and prediction. We develop tools and theory for detecting and identifying regions of the covariate space (subpopulations) where model performance has begun to degrade, and we study interventions that fix these failures through refitting. We present empirical results on three real-world data sets, including a time series involving forecasting the incidence of COVID-19, showing that our methodology generates interpretable results, is useful for tracking model performance, and can boost model performance through refitting. We complement these empirical results with theory proving that our methodology is minimax optimal for recovering anomalous subpopulations as well as for refitting to improve accuracy in a structured normal means setting.
