Paper Title

MolNet: A Chemically Intuitive Graph Neural Network for Prediction of Molecular Properties

Authors

Kim, Yeji; Jeong, Yoonho; Kim, Jihoo; Lee, Eok Kyun; Kim, Won June; Choi, Insung S.

Abstract

The graph neural network (GNN) has been a powerful deep-learning tool in the chemistry domain, owing to its close connection with molecular graphs. Most GNN models collect and update atom and molecule features from the input atom (and, in some cases, bond) features, which are based on the two-dimensional (2D) graph representation of 3D molecules. Correspondingly, the adjacency matrix, which contains the information on covalent bonds, or an equivalent data structure (e.g., a list) has been at the core of feature-updating processes such as graph convolution. However, 2D-based models do not faithfully represent 3D molecules and their physicochemical properties, as exemplified by the overlooked field effect, which is a "through-space" effect rather than a "through-bond" effect. The GNN model proposed herein, denoted MolNet, is chemically intuitive: it accommodates the 3D nonbonded information in a molecule through a noncovalent adjacency matrix $\mathbf{\bar{A}}$, and bond-strength information through a weighted bond matrix $\mathbf{B}$. For the construction of $\mathbf{\bar{A}}$, the noncovalent atoms, i.e., those not directly bonded to a given atom in the molecule, are identified within a cut-off range of 5 Å; $\mathbf{B}$ has edge weights of 1, 1.5, 2, and 3 for single, aromatic, double, and triple bonds, respectively. Comparative studies show that MolNet outperforms various baseline GNN models and gives state-of-the-art performance in the classification task on the BACE dataset and the regression task on the ESOL dataset. This work suggests a future direction for deep-learning chemistry: the construction of deep-learning models that are chemically intuitive and comparable with existing chemistry concepts and tools.
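
The two matrices described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the binary (0/1) entries of the noncovalent matrix, the inclusive treatment of the 5 Å cut-off, and the toy coordinates are all assumptions made for illustration. Only the cut-off value and the bond weights (1, 1.5, 2, 3) come from the abstract.

```python
import math

# Edge weights stated in the abstract for the weighted bond matrix B.
BOND_WEIGHTS = {"single": 1.0, "aromatic": 1.5, "double": 2.0, "triple": 3.0}


def weighted_bond_matrix(n_atoms, bonds):
    """Build a symmetric n x n matrix B from (i, j, bond_type) tuples."""
    B = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i, j, bond_type in bonds:
        w = BOND_WEIGHTS[bond_type]
        B[i][j] = w
        B[j][i] = w
    return B


def noncovalent_adjacency(coords, bonds, cutoff=5.0):
    """Mark atom pairs that are not directly bonded but lie within `cutoff` angstroms.

    Assumption: entries are binary and the cut-off is inclusive; the abstract
    does not specify either detail.
    """
    n = len(coords)
    bonded = {frozenset((i, j)) for i, j, _ in bonds}
    A_bar = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if frozenset((i, j)) in bonded:
                continue  # covalently bonded pairs belong to B, not A_bar
            if math.dist(coords[i], coords[j]) <= cutoff:
                A_bar[i][j] = A_bar[j][i] = 1
    return A_bar


# Hypothetical five-atom linear chain with 1.5 angstrom spacing (not a real molecule).
toy_coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0),
              (4.5, 0.0, 0.0), (6.0, 0.0, 0.0)]
toy_bonds = [(0, 1, "single"), (1, 2, "double"), (2, 3, "single"), (3, 4, "triple")]

B = weighted_bond_matrix(len(toy_coords), toy_bonds)
A_bar = noncovalent_adjacency(toy_coords, toy_bonds)
```

In this toy chain, atoms 0 and 2 are 3.0 Å apart and not bonded, so they are noncovalent neighbors, whereas atoms 0 and 4 are 6.0 Å apart and fall outside the cut-off. Bonded pairs such as (1, 2) appear only in `B`, with the double-bond weight of 2.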
