Paper Title

Learning Latent Space Energy-Based Prior Model for Molecule Generation

Authors

Bo Pang, Tian Han, Ying Nian Wu

Abstract

Deep generative models have recently been applied to molecule design. Encoding molecules as linear SMILES strings makes modeling convenient. However, models that rely on string representations tend to generate invalid samples and duplicates. Prior work addressed these issues by building models on chemically valid fragments or by explicitly enforcing chemical rules in the generation process. We argue that an expressive model is sufficient to implicitly and automatically learn the complicated chemical rules from the data, even if molecules are encoded as simple character-level SMILES strings. We propose to learn a latent space energy-based prior model with a SMILES representation for molecule modeling. Our experiments show that our method generates molecules with validity and uniqueness competitive with state-of-the-art models. Interestingly, the generated molecules have structural and chemical features whose distributions almost perfectly match those of the real molecules.
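
The abstract does not define validity and uniqueness. Below is a minimal sketch of one common way to compute them, assuming validity is the fraction of generated strings that pass a validity check and uniqueness is the fraction of distinct strings among the valid ones. The function names and the sample list are hypothetical, and `toy_is_valid` is only a syntactic stand-in (balanced parentheses, paired ring-closure digits), not a chemical validity check; in practice a cheminformatics toolkit would parse and canonicalize each string before counting.

```python
from typing import Callable, Iterable, Tuple


def toy_is_valid(smiles: str) -> bool:
    """Toy syntactic check, NOT real chemistry: the string is non-empty,
    parentheses are balanced, and each ring-closure digit occurs an even
    number of times."""
    if not smiles:
        return False
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing parenthesis without a matching opener
                return False
    if depth != 0:
        return False
    return all(smiles.count(d) % 2 == 0 for d in "123456789")


def validity_and_uniqueness(
    samples: Iterable[str], is_valid: Callable[[str], bool]
) -> Tuple[float, float]:
    """Return (validity, uniqueness) under the assumed definitions:
    validity = valid samples / all samples,
    uniqueness = distinct valid samples / valid samples."""
    samples = list(samples)
    valid = [s for s in samples if is_valid(s)]
    validity = len(valid) / len(samples) if samples else 0.0
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    return validity, uniqueness


# Hypothetical generated samples: one duplicate and two malformed strings.
samples = ["CCO", "CCO", "c1ccccc1", "C(C", "C1CC"]
validity, uniqueness = validity_and_uniqueness(samples, toy_is_valid)
print(f"validity={validity:.2f}, uniqueness={uniqueness:.2f}")
```

Comparing raw strings understates duplication, because one molecule can be written as several different SMILES strings; canonicalizing before building the set would address that.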
