Paper Title
Recent Advances in Network-based Methods for Disease Gene Prediction
Authors
Abstract
Establishing disease-gene associations through genome-wide association studies (GWAS) is an arduous task for researchers. Identifying the single nucleotide polymorphisms (SNPs) that correlate with a specific disease requires statistical analysis of associations. Besides its high cost, GWAS analysis has another important drawback: given the huge number of possible mutations, it produces a large number of false positives. Researchers therefore look for additional evidence from other sources to cross-check their results. Computational approaches come into play here, providing alternative, low-cost evidence for disease-gene associations. Because molecular networks can capture the complex interplay among molecules in diseases, they have become one of the most extensively used data sources for disease-gene association prediction. In this survey, we aim to provide a comprehensive and up-to-date review of network-based methods for disease gene prediction. We also conduct an empirical analysis of 14 state-of-the-art methods. First, we elucidate the task definition for disease gene prediction. Second, we categorize existing network-based efforts into network diffusion methods, traditional machine learning methods with handcrafted graph features, and graph representation learning methods. Third, we conduct an empirical analysis to evaluate the performance of the selected methods across seven diseases, and we report distinguishing findings about the discussed methods based on this analysis. Finally, we highlight potential research directions for future studies on disease gene prediction.
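To make the first of the three categories concrete, the following is a minimal sketch of one widely used network diffusion technique, random walk with restart, on a toy network. It is an illustration under stated assumptions, not one of the 14 methods evaluated in the survey: the function name `random_walk_with_restart`, the restart probability of 0.3, and the five-node network with placeholder gene names are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its network proximity to the seed genes.

    adjacency: dict mapping each node to the set of its neighbours
               (undirected network; every node must appear as a key).
    seeds:     known disease genes; assumed to be a non-empty subset of the nodes.
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Each step restarts at a seed with probability `restart` ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... and otherwise moves to a uniformly chosen neighbour.
        for u in nodes:
            degree = len(adjacency[u])
            if degree == 0:
                continue  # isolated nodes pass nothing on in this sketch
            share = (1.0 - restart) * p[u] / degree
            for v in adjacency[u]:
                nxt[v] += share
        # Stop once the L1 change between iterations is negligible.
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain g1 - g2 - g3 - g4 - g5.
toy_network = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
}

# Treat g1 as the only known disease gene and rank the remaining candidates.
scores = random_walk_with_restart(toy_network, seeds={"g1"})
ranking = sorted((g for g in scores if g != "g1"), key=scores.get, reverse=True)
print(ranking)
```

In this sketch, candidates closer to the seed gene in the network receive higher scores, which is the intuition that diffusion-based prioritization relies on. A practical implementation would typically use sparse matrix operations and a curated interaction network rather than a Python dictionary.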