Title
Variable selection for varying multi-index coefficients models with applications to synergistic GxE interactions
Authors
Abstract
Epidemiological evidence suggests that simultaneous exposure to multiple environmental risk factors (Es) can increase disease risk by more than the additive effect of the individual exposures acting alone. The interaction between a gene and multiple Es on disease risk is termed synergistic gene-environment interaction (synG$\times$E). Varying multi-index coefficient models (VMICM) are a promising tool for modeling synergistic G$\times$E effects and for understanding how multiple Es jointly influence genetic risk for a disease outcome. In this work, we propose a three-step variable selection approach for VMICM that classifies the effect of each gene variable as varying, non-zero constant, or zero, corresponding respectively to nonlinear synG$\times$E, no synG$\times$E, and no genetic effect. For the multiple environmental exposure variables, we also estimate and select the important environmental variables that contribute to the synergistic interaction effect. We establish the oracle property of the proposed variable selection approach theoretically. Extensive simulation studies are conducted to evaluate the finite-sample performance of the method, considering both continuous and discrete gene variables. An application to a real dataset further demonstrates the utility of the method. Our method is broadly applicable in areas where the goal is to identify synergistic interaction effects.
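To make the three effect types concrete, one common way to write a varying multi-index coefficient model is sketched below. The notation is an assumption introduced here for illustration and is not taken from the abstract: $Y$ is the outcome, $\mathbf{X}$ the vector of environmental exposures, $G_l$ the $l$-th gene variable, $m_l$ an unknown smooth function, and $\boldsymbol{\beta}_l$ an index loading vector.

$$
Y = \sum_{l=1}^{p} m_l\!\left(\boldsymbol{\beta}_l^{\top}\mathbf{X}\right) G_l + \varepsilon
$$

$$
m_l(\cdot)\ \text{non-constant} \;\Rightarrow\; \text{nonlinear synG}\times\text{E}, \qquad
m_l(\cdot) \equiv c_l \neq 0 \;\Rightarrow\; \text{no synG}\times\text{E}, \qquad
m_l(\cdot) \equiv 0 \;\Rightarrow\; \text{no genetic effect}
$$

Under this assumed form, selecting important environmental variables would amount to identifying the non-zero components of each $\boldsymbol{\beta}_l$.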