Paper Title
One Transformer Can Understand Both 2D & 3D Molecular Data
Paper Authors
Paper Abstract
Unlike vision and language data, which usually have a unique format, molecules can naturally be characterized using different chemical formulations. One can view a molecule as a 2D graph or define it as a collection of atoms located in 3D space. For molecular representation learning, most previous works designed neural networks for only a particular data format, making the learned models likely to fail on other data formats. We believe a general-purpose neural network model for chemistry should be able to handle molecular tasks across data modalities. To achieve this goal, in this work we develop a novel Transformer-based molecular model called Transformer-M, which can take molecular data in either 2D or 3D format as input and generate meaningful semantic representations. Using the standard Transformer as the backbone architecture, Transformer-M develops two separate channels to encode 2D and 3D structural information and incorporates them with the atom features in the network modules. When the input data is in a particular format, the corresponding channel is activated and the other is disabled. By training on 2D and 3D molecular data with properly designed supervision signals, Transformer-M automatically learns to leverage knowledge from different data modalities and correctly capture the representations. We conducted extensive experiments on Transformer-M. All empirical results show that Transformer-M can simultaneously achieve strong performance on 2D and 3D tasks, suggesting its broad applicability. The code and models will be made publicly available at https://github.com/lsj2408/Transformer-M.
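To make the two-channel idea concrete, below is a minimal PyTorch sketch of how each channel could turn structural information into an attention bias, with a channel simply disabled when its input is absent. Everything here is an illustrative assumption, not the paper's exact implementation: the class name DualChannelBias, the shortest-path-hop embedding for the 2D channel, and the Gaussian-basis distance encoding for the 3D channel are all hypothetical choices made for the sketch.

```python
import torch
import torch.nn as nn

class DualChannelBias(nn.Module):
    """Hypothetical sketch of a Transformer-M-style two-channel attention bias.

    Each channel maps structural information (2D graph hops or 3D distances)
    to a per-head pairwise bias added to attention scores before softmax.
    """

    def __init__(self, num_heads: int, max_hops: int = 32, num_kernels: int = 16):
        super().__init__()
        # 2D channel: embed shortest-path (hop) distances, one bias per head.
        self.hop_embed = nn.Embedding(max_hops, num_heads)
        # 3D channel: Gaussian basis over Euclidean distances, projected per head.
        self.means = nn.Parameter(torch.linspace(0.0, 10.0, num_kernels))
        self.stds = nn.Parameter(torch.ones(num_kernels))
        self.proj = nn.Linear(num_kernels, num_heads)

    def forward(self, hop_dist=None, coords=None):
        bias = 0.0
        if hop_dist is not None:  # 2D channel active: (N, N) int hops
            bias = bias + self.hop_embed(hop_dist).permute(2, 0, 1)
        if coords is not None:    # 3D channel active: (N, 3) coordinates
            d = torch.cdist(coords, coords)                        # (N, N)
            g = torch.exp(-((d.unsqueeze(-1) - self.means) / self.stds.abs()) ** 2)
            bias = bias + self.proj(g).permute(2, 0, 1)            # (heads, N, N)
        return bias  # add to raw attention scores before softmax

# Toy usage: only the channel matching the available input contributes.
layer = DualChannelBias(num_heads=8)
hops = torch.randint(0, 32, (5, 5))   # toy shortest-path distances (2D graph)
xyz = torch.randn(5, 3)               # toy 3D coordinates
bias_2d = layer(hop_dist=hops)        # 2D-only input: 3D channel stays disabled
bias_3d = layer(coords=xyz)           # 3D-only input: 2D channel stays disabled
```

Under these assumptions, routing is just presence or absence of an input: a molecule given only as a 2D graph never touches the 3D parameters, and vice versa, while the shared Transformer backbone consumes the summed bias either way.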