Paper Title

Multivariate Sparse Group Lasso Joint Model for Radiogenomics Data

Authors

Tiantian Zeng, Md Selim, Jie Zhang, Arnold Stromberg, Jin Chen, Chi Wang

Abstract

Radiogenomics is an emerging field in cancer research that combines medical imaging data with genomic data to predict patients' clinical outcomes. In this paper, we propose a multivariate sparse group lasso joint model that integrates imaging and genomic data for building prediction models. Specifically, we jointly consider two models: one regresses imaging features on genomic features, and the other regresses patients' clinical outcomes on genomic features. The sparse group lasso regularization penalties allow the incorporation of intrinsic group information, e.g., biological pathways and imaging categories, so that the model selects both important groups and important features within a group. To integrate information from the two models, we introduce in each model a weight in the penalty term of each individual genomic feature, where the weight is inversely correlated with the model coefficient of that feature in the other model. This weight gives a feature a higher chance of being selected by one model if it is selected by the other. Our model is applicable to both continuous and time-to-event outcomes. It also allows two separate datasets to be used to fit the two models, addressing the practical challenge that many genomic datasets do not have imaging data available. Simulations and real data analyses demonstrate that our method outperforms existing methods in the literature.
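The cross-model weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact penalty or weight formula, so everything below is an assumption made for illustration: the function names `cross_model_weights` and `weighted_sgl_penalty`, the reciprocal weight form `1 / (|coef| + eps)`, the mixing parameter `alpha`, and the use of a single coefficient vector per model (the paper's imaging model is multivariate, which this sketch simplifies away).

```python
import numpy as np


def cross_model_weights(other_coef, eps=1e-3):
    """Hypothetical weight for each genomic feature.

    Assumed form: the weight shrinks as the magnitude of the feature's
    coefficient in the OTHER model grows, so features selected there are
    penalized less here. `eps` avoids division by zero.
    """
    return 1.0 / (np.abs(np.asarray(other_coef, dtype=float)) + eps)


def weighted_sgl_penalty(beta, groups, weights, lam=1.0, alpha=0.5):
    """Weighted sparse group lasso penalty (illustrative, not the paper's exact form).

    beta    : coefficient vector of one model
    groups  : list of index lists, e.g. genes grouped by biological pathway
    weights : per-feature weights applied to the L1 (within-group) term
    lam     : overall penalty strength
    alpha   : balance between the L1 term and the group term
    """
    beta = np.asarray(beta, dtype=float)
    # Feature-level sparsity, weighted by information from the other model.
    l1_part = np.sum(weights * np.abs(beta))
    # Group-level sparsity, scaled by the square root of the group size.
    group_part = sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)
    return lam * (alpha * l1_part + (1.0 - alpha) * group_part)


# Toy setup: four genomic features in two pathways.
groups = [[0, 1], [2, 3]]
# Coefficients from the other model: features 0 and 2 were selected there.
other_model_coef = np.array([2.0, 0.0, 0.5, 0.0])
weights = cross_model_weights(other_model_coef)

# Same coefficient size, placed on a feature the other model selected (0)
# versus one it did not select (1).
penalty_selected = weighted_sgl_penalty([1.0, 0.0, 0.0, 0.0], groups, weights)
penalty_unselected = weighted_sgl_penalty([0.0, 1.0, 0.0, 0.0], groups, weights)
print(penalty_selected < penalty_unselected)
```

Under these assumptions, a coefficient on a feature that the other model selected costs less than an equally sized coefficient on a feature it ignored, which is the mechanism the abstract describes for sharing information between the two models.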
