Paper Title
MoReL: Multi-omics Relational Learning
Paper Authors
Paper Abstract
Multi-omics data analysis has the potential to discover hidden molecular interactions, revealing regulatory and/or signal transduction pathways for cellular processes of interest when studying life and disease systems. One of the critical challenges in dealing with real-world multi-omics data is that they may manifest heterogeneous structures and data quality, because existing data are often collected from different subjects under different conditions for each type of omics data. We propose a novel deep Bayesian generative model to efficiently infer a multi-partite graph encoding molecular interactions across such heterogeneous views, using a fused Gromov-Wasserstein (FGW) regularization between latent representations of corresponding views for integrative analysis. This optimal transport regularization not only allows the deep Bayesian generative model to incorporate view-specific side information, whether graph-structured or unstructured, in different views, but also increases model flexibility because the regularization acts on distributions. As a result, heterogeneous latent variable distributions can be aligned efficiently, yielding more reliable interaction predictions than existing point-based graph embedding methods. Our experiments on several real-world datasets demonstrate enhanced performance of MoReL in inferring meaningful interactions compared to existing baselines.
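For readers unfamiliar with the FGW term mentioned above, the following is the standard fused Gromov-Wasserstein objective from the optimal transport literature, given here only as background. It is not necessarily the exact regularizer used in MoReL, and every symbol is an assumption introduced for illustration: two views with n and m latent points x_i and y_j (taken to live in a common feature space), marginal weights p and q, intra-view structure matrices C^{(1)} and C^{(2)}, a feature distance d, and a trade-off weight \alpha in [0, 1].

```latex
% Standard FGW objective (squared-cost form); notation is illustrative,
% not taken from the MoReL paper.
\[
\mathrm{FGW}_{\alpha}(\mu,\nu)
  = \min_{\pi \in \Pi(p,q)}
    \sum_{i,k=1}^{n} \sum_{j,l=1}^{m}
    \Big[ (1-\alpha)\, d(x_i, y_j)^{2}
          + \alpha\, \big| C^{(1)}_{ik} - C^{(2)}_{jl} \big|^{2} \Big]
    \, \pi_{ij}\, \pi_{kl},
\]
\[
\Pi(p,q) = \big\{ \pi \in \mathbb{R}_{\ge 0}^{n \times m} :
  \pi \mathbf{1}_m = p,\ \pi^{\top} \mathbf{1}_n = q \big\}.
\]
```

Setting \alpha = 0 recovers a Wasserstein cost on features alone, while \alpha = 1 recovers the Gromov-Wasserstein cost, which compares only the internal structure of each view. The objective depends on the coupling \pi between whole distributions, which is the sense in which such a regularizer is distribution-based.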