Paper Title

Data-pooling Reinforcement Learning for Personalized Healthcare Intervention

Authors

Xinyun Chen, Pengyi Shi, Shanwen Pu

Abstract

Motivated by the emerging need for personalized preventative intervention in many healthcare applications, we consider a multi-stage, dynamic decision-making problem in the online setting with unknown model parameters. To deal with the pervasive issue of small sample sizes in personalized planning, we develop a novel data-pooling reinforcement learning (RL) algorithm based on a general perturbed value iteration framework. Our algorithm adaptively pools historical data, with three main innovations: (i) the pooling weight ties directly to decision performance (measured by regret), as opposed to estimation accuracy in conventional methods; (ii) no parametric assumptions are needed between historical and current data; and (iii) data sharing is required only via aggregate statistics, as opposed to patient-level data. Our data-pooling framework applies to a variety of popular RL algorithms, and we establish a theoretical performance guarantee showing that the pooling version achieves a regret bound strictly smaller than that of its no-pooling counterpart. We substantiate the theoretical development with the empirically better performance of our algorithm in a case study on post-discharge intervention to prevent unplanned readmissions, generating practical insights for healthcare management. In particular, our algorithm alleviates privacy concerns about sharing health data, which (i) opens the door for individual organizations to leverage public datasets or published studies to better manage their own patients; and (ii) provides a basis for public policy makers to encourage organizations to share aggregate data to improve population health outcomes for the broader community.
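To make the idea of pooling through aggregate statistics concrete, the sketch below combines a local empirical transition distribution with a historical aggregate estimate using a fixed convex weight. This is only an illustration under stated assumptions: the function name `pooled_estimate`, its parameters, and the fixed `weight` are hypothetical, and the abstract does not specify how the paper's regret-driven weight is actually computed.

```python
import numpy as np


def pooled_estimate(local_counts, historical_probs, weight):
    """Blend local transition counts with a historical aggregate estimate.

    Hypothetical helper, not the paper's algorithm: `weight` is a fixed
    number in [0, 1], whereas the paper ties the weight to regret.
    """
    local_counts = np.asarray(local_counts, dtype=float)
    historical_probs = np.asarray(historical_probs, dtype=float)
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    total = local_counts.sum()
    if total == 0:
        # No local observations yet: rely entirely on the aggregate.
        return historical_probs.copy()

    local_probs = local_counts / total
    # Convex combination of the historical and local estimates.
    return weight * historical_probs + (1.0 - weight) * local_probs
```

Only `historical_probs`, an aggregate statistic, needs to be shared in this sketch; no patient-level records from the historical source are required. Setting `weight` to 0 gives the no-pooling estimate.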
