Paper title

Understanding the effects of dichotomization of continuous outcomes on geostatistical inference

Authors

Irene Kyomuhangi, Tarekegn A. Abeku, Matthew J. Kirby, Gezahegn Tesfaye, Emanuele Giorgi

Abstract

Diagnosis is often based on whether a continuous health indicator exceeds a predefined cut-off value, so that patients can be classified as positive or negative for the disease under investigation. In this paper, we investigate the effects of dichotomizing spatially referenced continuous outcome variables on geostatistical inference. Although this issue has been extensively studied in other fields, dichotomization is still common practice in epidemiological studies. Furthermore, the effects of this practice in the context of prevalence mapping are not fully understood. Here, we demonstrate how spatial correlation affects the loss of information due to dichotomization, how linear geostatistical models can be used to map disease prevalence and thus avoid dichotomization, and finally, how dichotomization affects predictive inference on prevalence. To pursue these objectives, we develop a metric, based on the composite likelihood, that quantifies the potential loss of information after dichotomization without requiring the fitting of binomial geostatistical models. Through a simulation study and two applications to disease mapping in Africa, we show that, as the thresholds used for dichotomization move further away from the mean of the underlying process, the performance of binomial geostatistical models deteriorates substantially. We also find that dichotomization can lead to the loss of fine-scale features of disease prevalence and to increased uncertainty in the parameter estimates, especially in the presence of a large noise-to-signal ratio. These findings strongly support the conclusions of previous studies that dichotomization should always be avoided whenever feasible.
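
The claim that information loss grows as the threshold moves away from the mean can be illustrated in a much simpler, non-spatial setting. The sketch below is not the composite-likelihood metric developed in the paper; it assumes independent Gaussian outcomes with known standard deviation and compares the Fisher information about the mean carried by the binary indicator 1{Y > threshold} with that carried by the continuous outcome Y. The function name `relative_information` and its parameters are hypothetical, introduced only for this illustration.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def relative_information(threshold: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Fraction of Fisher information about `mean` kept after dichotomization.

    Assumption (not from the paper): Y ~ N(mean, sd^2), independent
    observations, sd known. The continuous outcome carries information
    1 / sd^2 about the mean; the indicator 1{Y > threshold} carries
    phi(z)^2 / (sd^2 * p * (1 - p)), with z the standardized threshold
    and p = Phi(z). The ratio therefore depends on z only.
    """
    z = (threshold - mean) / sd
    p = normal_cdf(z)
    return normal_pdf(z) ** 2 / (p * (1.0 - p))


if __name__ == "__main__":
    # Thresholds expressed in standard deviations away from the mean.
    for c in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"threshold = {c:.1f} sd, retained information = {relative_information(c):.3f}")
```

Under these assumptions, a threshold placed exactly at the mean retains 2/π (about 64%) of the information, and the retained fraction falls as the threshold moves into either tail. The spatially correlated case studied in the paper is more involved, so this is only a first-order intuition.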
