Paper Title

Incorporating survival data into case-control studies with incident and prevalent cases

Authors

Mandal, Soutrik; Qin, Jing; Pfeiffer, Ruth M.

Abstract

Case-control studies that use logistic regression to estimate odds ratios associating risk factors with disease incidence typically include only cases with newly diagnosed disease. Recently proposed methods allow information on prevalent cases, individuals who survived from disease diagnosis to sampling, to be incorporated into cross-sectionally sampled case-control studies under parametric assumptions for the survival time after diagnosis. Here we propose and study methods that additionally use prospectively observed survival times from prevalent and incident cases to adjust logistic models for the backward time, the time between disease diagnosis and sampling, for prevalent cases. This adjustment yields unbiased odds-ratio estimates from case-control studies that include prevalent cases. We propose a computationally simple two-step generalized method-of-moments estimation procedure. First, we estimate the survival distribution based on a semi-parametric Cox model using an expectation-maximization algorithm that yields fully efficient estimates and accommodates left truncation for the prevalent cases and right censoring. Then, we use the estimated survival distribution in an extension of the logistic model to three groups (controls, incident cases, and prevalent cases) to accommodate the survival bias in prevalent cases. In simulations, when the amount of censoring was modest, odds ratios from the two-step procedure were as efficient as those estimated by jointly optimizing the logistic and survival data likelihoods under parametric assumptions. Even with 90% censoring, they were as efficient as estimates obtained using only cross-sectionally available information under parametric assumptions. This indicates that utilizing prospective survival data from the cases lessens model dependency and improves the precision of association estimates for case-control studies with prevalent cases.
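The survival bias in prevalent cases mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's two-step estimator. It assumes, purely for illustration, a binary exposure that is unrelated to incidence (50% of all cases exposed), exponential survival after diagnosis with a hypothetical hazard ratio of 3 for exposed cases, and backward times uniform on a 10-year window. Only cases still alive at sampling are observed, so exposed cases are under-represented. Under these assumptions the exposed fraction among prevalent cases is analytically about 0.33 rather than 0.5. Reweighting each sampled case by the inverse of its survival-to-sampling probability (known here by construction, whereas the paper estimates the survival distribution from prospective follow-up) recovers a fraction near 0.5. All names and parameter values are assumptions.

```python
import math
import random


def selection_prob(rate, xi):
    """P(T > A) for T ~ Exponential(rate) and A ~ Uniform(0, xi)."""
    return (1.0 - math.exp(-rate * xi)) / (rate * xi)


def simulate_prevalent_cases(n_cases, base_rate, hazard_ratio, xi, seed=0):
    """Return exposures of the cases that survive from diagnosis to sampling."""
    rng = random.Random(seed)
    sampled = []
    for _ in range(n_cases):
        x = 1 if rng.random() < 0.5 else 0          # exposure, unrelated to incidence
        rate = base_rate * (hazard_ratio if x else 1.0)
        t = rng.expovariate(rate)                   # survival time after diagnosis
        a = rng.uniform(0.0, xi)                    # backward time: diagnosis to sampling
        if t > a:                                   # left truncation: must be alive at sampling
            sampled.append(x)
    return sampled


# Hypothetical parameter values chosen only for illustration.
BASE_RATE, HAZARD_RATIO, XI = 0.1, 3.0, 10.0

exposures = simulate_prevalent_cases(20000, BASE_RATE, HAZARD_RATIO, XI)

# Naive exposed fraction among prevalent cases (biased by survival selection).
naive_fraction = sum(exposures) / len(exposures)

# Inverse-probability weights using the (here known) survival-to-sampling probability.
weights = [
    1.0 / selection_prob(BASE_RATE * (HAZARD_RATIO if x else 1.0), XI)
    for x in exposures
]
weighted_fraction = sum(w * x for w, x in zip(weights, exposures)) / sum(weights)

print(f"naive exposed fraction among prevalent cases: {naive_fraction:.3f}")
print(f"survival-adjusted exposed fraction:           {weighted_fraction:.3f}")
```

The reweighting step stands in for the general idea that knowledge of the survival distribution allows the selection of prevalent cases to be undone. The paper's actual adjustment enters through an extended three-group logistic model rather than through these weights.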
