Paper title
Disease State Prediction From Single-Cell Data Using Graph Attention Networks
Authors
Abstract
Single-cell RNA sequencing (scRNA-seq) has revolutionized biological discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into both healthy systems and diseases, it has not been used for disease prediction or diagnostics. Graph Attention Networks (GATs) have proven versatile across a wide range of tasks by learning from both the original features and the graph structure. Here we present a graph attention model for predicting disease state from single-cell data on a large dataset of Multiple Sclerosis (MS) patients. MS is a disease of the central nervous system that can be difficult to diagnose. We train our model on single-cell data obtained from blood and cerebrospinal fluid (CSF) for a cohort of seven MS patients and six healthy adults (HA), resulting in 66,667 individual cells. We achieve 92% accuracy in predicting MS, outperforming other state-of-the-art methods such as a graph convolutional network and a random forest classifier. Further, we use the learned graph attention model to gain insight into the features (cell types and genes) that are important for this prediction. The graph attention model also allows us to infer a new feature space for the cells that emphasizes the differences between the two conditions. Finally, we use the attention weights to learn a new low-dimensional embedding that can be visualized. To the best of our knowledge, this is the first effort to use graph attention, and deep learning in general, to predict disease state from single-cell data. We envision applying this method to single-cell data for other diseases.
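The abstract does not describe the architecture in detail. As a rough illustration of how a graph attention layer can learn from both cell features and graph structure, here is a minimal single-head sketch in NumPy that follows the commonly used GAT formulation. It is not the authors' implementation: the function name `gat_layer`, the toy graph, and all shapes and parameter values are assumptions made for this example.

```python
import numpy as np


def gat_layer(X, A, W, a, negative_slope=0.2):
    """Single-head graph attention layer (illustrative sketch only).

    X: (n_cells, n_features) input features, e.g. gene expression per cell
    A: (n_cells, n_cells) binary adjacency, assumed to include self-loops
    W: (n_features, n_hidden) shared linear projection
    a: (2 * n_hidden,) attention vector
    Returns the new features (n_cells, n_hidden) and the attention
    weights (n_cells, n_cells).
    """
    Z = X @ W                      # project every cell into the hidden space
    n_hidden = Z.shape[1]

    # e_ij = LeakyReLU(a^T [z_i || z_j]); the dot product with the
    # concatenation splits into a term for cell i and a term for cell j.
    src = Z @ a[:n_hidden]
    dst = Z @ a[n_hidden:]
    e = src[:, None] + dst[None, :]
    e = np.where(e > 0, e, negative_slope * e)

    # Mask non-edges so the softmax runs only over each cell's neighbours.
    e = np.where(A > 0, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)

    # Each cell's new feature vector is an attention-weighted average
    # of its neighbours' projected features.
    return alpha @ Z, alpha


# Toy example: 5 hypothetical cells, 4 features, a small neighbour graph.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
A = np.eye(5)                      # self-loops
for i, j in [(0, 1), (1, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
W = rng.normal(size=(4, 3))
a = rng.normal(size=6)

H, alpha = gat_layer(X, A, W, a)
```

In this sketch, `H` plays the role of the new feature space for the cells, and each row of `alpha` shows how strongly a cell attends to each of its neighbours. In a trained model, weights of this kind are what could be inspected for interpretation or reused to build an embedding.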