Paper Title

JAWS: Auditing Predictive Uncertainty Under Covariate Shift

Authors

Drew Prinster, Anqi Liu, Suchi Saria

Abstract

We propose \textbf{JAWS}, a series of wrapper methods for distribution-free uncertainty quantification tasks under covariate shift, centered on the core method \textbf{JAW}, the \textbf{JA}ckknife+ \textbf{W}eighted with data-dependent likelihood-ratio weights. JAWS also includes computationally efficient \textbf{A}pproximations of JAW using higher-order influence functions: \textbf{JAWA}. Theoretically, we show that JAW relaxes the jackknife+'s assumption of data exchangeability to achieve the same finite-sample coverage guarantee even under covariate shift. JAWA further approaches the JAW guarantee in the limit of the sample size or the influence function order under common regularity assumptions. Moreover, we propose a general approach to repurposing predictive interval-generating methods and their guarantees to the reverse task: estimating the probability that a prediction is erroneous, based on user-specified error criteria such as a safe or acceptable tolerance threshold around the true label. We then propose \textbf{JAW-E} and \textbf{JAWA-E} as the repurposed proposed methods for this \textbf{E}rror assessment task. Practically, JAWS outperform state-of-the-art predictive inference baselines in a variety of biased real world data sets for interval-generation and error-assessment predictive uncertainty auditing tasks.
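
The abstract describes JAW only at a high level, so the following is a minimal sketch of a likelihood-ratio-weighted jackknife+ interval, not the authors' implementation. Several pieces are assumptions made for illustration: the ridge regressor (`fit_ridge`, `predict`), the helper names `weighted_quantile` and `jaw_interval`, the weight normalization, and a likelihood ratio that is known exactly (here a Gaussian mean shift in the first feature). In practice the weights would typically have to be estimated.

```python
import numpy as np

def weighted_quantile(values, probs, level):
    """Smallest value whose weighted CDF reaches `level`."""
    order = np.argsort(values)
    cdf = np.cumsum(np.asarray(probs)[order])
    idx = np.searchsorted(cdf, level, side="left")
    return np.asarray(values)[order][min(idx, len(values) - 1)]

def fit_ridge(X, y, lam=1e-3):
    """Hypothetical base learner: ridge regression with an intercept column."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)

def predict(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

def jaw_interval(X, y, x_test, weight_fn, alpha=0.1):
    """Sketch of a weighted jackknife+ interval at a single test point."""
    n = len(y)
    w_train = weight_fn(X)
    w_test = weight_fn(x_test[None, :])[0]
    # Normalized weights over the n training points plus the test point.
    p = np.append(w_train, w_test) / (w_train.sum() + w_test)

    lo_vals, hi_vals = np.empty(n), np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta = fit_ridge(X[keep], y[keep])            # leave-one-out model
        resid = abs(y[i] - predict(beta, X[i:i + 1])[0])
        pred = predict(beta, x_test[None, :])[0]
        lo_vals[i], hi_vals[i] = pred - resid, pred + resid

    # The test point's own weight is placed at +inf (upper) and -inf (lower).
    upper = weighted_quantile(np.append(hi_vals, np.inf), p, 1 - alpha)
    lower = -weighted_quantile(np.append(-lo_vals, np.inf), p, 1 - alpha)
    return lower, upper

# Toy usage: training x0 ~ N(0, 1), test x0 ~ N(1, 1), so the exact
# likelihood ratio is w(x) = exp(x0 - 0.5). This setup is an assumption.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 2.0 * X[:, 0] + 0.5 * rng.normal(size=100)
weight_fn = lambda Z: np.exp(Z[:, 0] - 0.5)
x_test = np.array([1.0, 0.0])
lower, upper = jaw_interval(X, y, x_test, weight_fn, alpha=0.1)
print(lower, upper)
```

With constant weights this sketch reduces to the ordinary jackknife+ interval. When the test point's normalized weight exceeds `alpha`, the interval becomes unbounded, which reflects how little training data resembles that test point.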
