Paper title
Evaluating individualized treatment effect predictions: a model-based perspective on discrimination and calibration assessment
Paper authors
Paper abstract
In recent years, there has been growing interest in the prediction of individualized treatment effects. While the literature on the development of such models is growing rapidly, there is little literature on the evaluation of their performance. In this paper, we aim to facilitate the validation of prediction models for individualized treatment effects. The estimands of interest are defined based on the potential outcomes framework, which facilitates a comparison of existing and novel measures. In particular, we examine existing measures of discrimination for benefit (variations of the c-for-benefit), and propose model-based extensions to the treatment effect setting for discrimination and calibration metrics that have a strong basis in outcome risk prediction. The main focus is on randomized trial data with binary endpoints and on models that provide individualized treatment effect predictions and potential outcome predictions. We use simulated data to provide insight into the characteristics of the discrimination and calibration statistics under consideration, and further illustrate all methods in a trial of acute ischemic stroke treatment. The results show that the proposed model-based statistics had the best characteristics in terms of bias and accuracy. While resampling methods adjusted for the optimism of performance estimates in the development data, they had a high variance across replications that limited their accuracy. Therefore, individualized treatment effect models are best validated in independent data. To aid implementation, a software implementation of the proposed methods was made available in R.