Paper title
Incorporating network based protein complex discovery into automated model construction
Paper authors
Paper abstract
We propose a method for gene-expression-based analysis of cancer phenotypes that incorporates network biology knowledge through the unsupervised construction of computational graphs. The structure of each computational graph is derived by applying topological clustering algorithms to protein-protein interaction networks; these algorithms carry inductive biases from network biology research on protein complex discovery. The resulting structure constrains the hypothesis space of possible computational graph factorisations, whose parameters can then be learned in supervised or unsupervised task settings. The sparse construction of the computational graph enables differential protein complex activity analysis and also allows the contributions of the individual genes/proteins in each protein complex to be interpreted. In experiments analysing a variety of cancer phenotypes, we show that the proposed methods outperform SVMs, fully connected MLPs, and randomly connected MLPs on all tasks. Our work introduces a scalable method for incorporating large interaction networks as prior knowledge to drive the construction of powerful computational models amenable to introspective study.
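The idea of a sparsely connected computational graph can be illustrated with a minimal sketch. It assumes the protein complexes have already been found by some clustering step (not shown) and are given as sets of gene names. The function names `build_mask` and `complex_activity`, the toy gene identifiers, the unit weights, and the tanh activation are all hypothetical choices for illustration; the abstract does not specify the actual architecture.

```python
import math

def build_mask(genes, complexes):
    """Return a len(complexes) x len(genes) 0/1 mask.

    Entry [c][g] is 1 when gene g is a member of complex c.
    """
    index = {g: i for i, g in enumerate(genes)}
    mask = [[0] * len(genes) for _ in complexes]
    for c, members in enumerate(complexes):
        for g in members:
            if g in index:  # skip complex members with no measured expression
                mask[c][index[g]] = 1
    return mask

def complex_activity(expression, weights, mask):
    """Masked linear layer followed by tanh.

    Each complex unit only receives input from its member genes,
    because non-member weights are multiplied by a zero mask entry.
    """
    activities = []
    for w_row, m_row in zip(weights, mask):
        total = sum(w * m * x for w, m, x in zip(w_row, m_row, expression))
        activities.append(math.tanh(total))
    return activities

# Toy data (assumed for illustration only).
genes = ["G1", "G2", "G3", "G4"]
complexes = [{"G1", "G2"}, {"G3", "G4", "G5"}]  # G5 has no expression value

mask = build_mask(genes, complexes)
weights = [[1.0] * len(genes) for _ in complexes]  # stand-in for learned parameters
expression = [0.5, -0.5, 1.0, 0.0]

activities = complex_activity(expression, weights, mask)
print(mask)
print(activities)
```

Because each hidden unit corresponds to one complex and connects only to its member genes, the unit's output can be read as a complex-level activity, and the surviving weights indicate how much each member gene contributes. In a trained model the weights would be learned rather than fixed at 1.0.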