Paper Title
Continuous-Time Probabilistic Models for Longitudinal Electronic Health Records
Authors
Abstract
Analysis of longitudinal Electronic Health Record (EHR) data is an important goal for precision medicine. Difficulty in applying Machine Learning (ML) methods, whether predictive or unsupervised, stems in part from the heterogeneity and irregular sampling of EHR data. We present an unsupervised probabilistic model that captures nonlinear relationships between variables in continuous time. The method works with arbitrary sampling patterns and captures the joint probability distribution of variable measurements and the time intervals between them. We derive inference algorithms that can be used to evaluate the likelihood of future observations under a trained model. As an example, we consider data from the United States Veterans Health Administration (VHA) in the areas of diabetes and depression. Likelihood ratio maps are produced showing the risk of moderate-severe versus minimal depression, as measured by the Patient Health Questionnaire-9 (PHQ-9).