Paper Title

Model updating after interventions paradoxically introduces bias

Authors

James Liley, Samuel R. Emerson, Bilal A. Mateen, Catalina A. Vallejos, Louis J. M. Aslett, Sebastian J. Vollmer

Abstract

Machine learning is increasingly being used to generate prediction models for use in a number of real-world settings, from credit risk assessment to clinical decision support. Recent discussions have highlighted potential problems in the updating of a predictive score for a binary outcome when an existing predictive score forms part of the standard workflow, driving interventions. In this setting, the existing score induces an additional causative pathway which leads to miscalibration when the original score is replaced. We propose a general causal framework to describe and address this problem, and demonstrate an equivalent formulation as a partially observed Markov decision process. We use this model to demonstrate the impact of such "naive updating" when performed repeatedly. Namely, we show that successive predictive scores may converge to a point where they predict their own effect, or may eventually tend toward a stable oscillation between two values, and we argue that neither outcome is desirable. Furthermore, we demonstrate that even if model-fitting procedures improve, actual performance may worsen. We complement these findings with a discussion of several potential routes to overcome these issues.
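The two failure modes described in the abstract (convergence to a score that predicts its own effect, and stable oscillation between two values) can be illustrated with a deliberately simplified simulation. The model below is a toy sketch of my own construction, not the paper's actual causal framework or POMDP formulation: it assumes a single fixed baseline risk `rho0`, and a `respond` function encoding how score-driven interventions change the observed risk.

```python
# Toy illustration (not the paper's model): "naive updating" of a predictive
# score when the currently deployed score itself drives interventions.

def naive_updates(rho0, respond, n_rounds=40):
    """Repeatedly refit a score to outcome data that was generated while
    the previous score was deployed and triggering interventions.

    rho0    : baseline risk with no intervention (assumed fixed and scalar)
    respond : maps the deployed score to the observed post-intervention risk
    """
    scores = [rho0]  # the first score is fit before any score is deployed
    for _ in range(n_rounds):
        observed_risk = respond(scores[-1])  # data collected under current score
        scores.append(observed_risk)         # naive refit: new score := observed risk
    return scores

rho0, effect = 0.6, 0.5

# (a) Intervention intensity proportional to the score:
#     observed risk = rho0 * (1 - effect * score).
# Successive scores converge to the fixed point rho0 / (1 + effect * rho0),
# where the score exactly predicts the risk *after* its own induced intervention.
prop = naive_updates(rho0, lambda s: rho0 * (1 - effect * s))

# (b) All-or-nothing intervention triggered when the score crosses a threshold:
#     intervene (risk halved) iff score > 0.4.
# Scores settle into a stable period-2 oscillation: a high score triggers the
# intervention, the next refit sees the lowered risk, the intervention stops,
# risk rebounds, and the cycle repeats.
thresh = naive_updates(rho0, lambda s: rho0 * (1 - effect) if s > 0.4 else rho0)

print(round(prop[-1], 4))  # near the fixed point 0.6 / 1.3
print(thresh[-4:])         # alternating pair of values
```

Note that in case (a) the limiting score is miscalibrated with respect to the counterfactual, no-intervention risk (`rho0`), even though each refit faithfully matched the data it observed; this is the self-referential effect the abstract warns about.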
