Multitask machine learning of collective variables for enhanced sampling of rare events

Paper title

Multitask machine learning of collective variables for enhanced sampling of rare events

Authors

Sun, Lixin; Vandermause, Jonathan; Batzner, Simon; Xie, Yu; Clark, David; Chen, Wei; Kozinsky, Boris

Abstract

Computing accurate reaction rates is a central challenge in computational chemistry and biology because of the high cost of free energy estimation with unbiased molecular dynamics. In this work, a data-driven machine learning algorithm is devised to learn collective variables with a multitask neural network, where a common upstream part reduces the high dimensionality of atomic configurations to a low-dimensional latent space, and separate downstream parts map the latent space to predictions of basin class labels and potential energies. The resulting latent space is shown to be an effective low-dimensional representation, capturing the reaction progress and guiding effective umbrella sampling to obtain accurate free energy landscapes. This approach is successfully applied to model systems including a 5D Müller-Brown model, a 5D three-well model, and alanine dipeptide in vacuum. This approach enables automated dimensionality reduction for energy-controlled reactions in complex systems, offers a unified framework that can be trained with limited data, and outperforms single-task learning approaches, including autoencoders.
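
The architecture described in the abstract (a shared upstream encoder feeding separate basin-classification and potential-energy heads) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the class name `MultitaskCV`, the layer sizes, the tanh activation, the loss weighting `energy_weight`, and the harmonic form of `umbrella_bias` are all assumptions made for illustration, and training (gradient descent on the loss) is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (illustrative init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, weights, bias):
    """Affine map: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class MultitaskCV:
    """Hypothetical multitask network: shared encoder plus two task heads."""

    def __init__(self, n_features, n_hidden, n_latent, n_basins, seed=0):
        rng = random.Random(seed)
        # Shared upstream part: configuration -> low-dimensional latent space.
        self.enc1 = init_layer(n_features, n_hidden, rng)
        self.enc2 = init_layer(n_hidden, n_latent, rng)
        # Separate downstream parts, both reading only the latent space.
        self.class_head = init_layer(n_latent, n_basins, rng)
        self.energy_head = init_layer(n_latent, 1, rng)

    def encode(self, x):
        """Latent coordinates, used as the collective variable(s)."""
        hidden = [math.tanh(v) for v in linear(x, *self.enc1)]
        return linear(hidden, *self.enc2)

    def forward(self, x):
        z = self.encode(x)
        logits = linear(z, *self.class_head)      # basin class scores
        energy = linear(z, *self.energy_head)[0]  # predicted potential energy
        return z, logits, energy


def multitask_loss(model, x, basin_label, energy, energy_weight=1.0):
    """Cross-entropy on the basin label plus weighted squared energy error."""
    _, logits, e_pred = model.forward(x)
    cross_entropy = -math.log(softmax(logits)[basin_label])
    return cross_entropy + energy_weight * (e_pred - energy) ** 2


def umbrella_bias(z, center, k):
    """Harmonic umbrella restraint on the latent CV (assumed functional form)."""
    return 0.5 * k * sum((zi - ci) ** 2 for zi, ci in zip(z, center))


# Usage with made-up numbers: a 5D configuration mapped to a 1D latent CV.
model = MultitaskCV(n_features=5, n_hidden=8, n_latent=1, n_basins=2)
x = [0.1, -0.3, 0.5, 0.0, 0.2]
z, logits, e_pred = model.forward(x)
loss = multitask_loss(model, x, basin_label=0, energy=-1.2)
bias = umbrella_bias(z, center=[0.0], k=10.0)
```

Because both heads see the configuration only through the latent vector `z`, minimizing the combined loss pushes the latent space to retain whatever distinguishes the basins and determines the energy, which is the property the abstract relies on when using it to guide umbrella sampling.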
