Uncertainty-aware predictions of molecular X-ray absorption spectra using neural network ensembles

Paper title

Uncertainty-aware predictions of molecular X-ray absorption spectra using neural network ensembles

Authors

Ghose, Animesh; Segal, Mikhail; Meng, Fanchen; Liang, Zhu; Hybertsen, Mark S.; Qu, Xiaohui; Stavitski, Eli; Yoo, Shinjae; Lu, Deyu; Carbone, Matthew R.

Abstract

As machine learning (ML) methods continue to be applied to a broad scope of problems in the physical sciences, uncertainty quantification is becoming correspondingly more important for their robust application. Uncertainty-aware machine learning methods have been used in select applications, but largely for scalar properties. In this work, we showcase an exemplary study in which neural network ensembles are used to predict the X-ray absorption spectra of small molecules, as well as their point-wise uncertainty, from local atomic environments. The performance of the resulting surrogate clearly demonstrates quantitative correlation between errors relative to ground truth and the predicted uncertainty estimates. Significantly, the model provides an upper bound on the expected error. Specifically, an important quality of this uncertainty-aware model is that it can indicate when the model is predicting on out-of-sample data. This allows for its integration with large-scale sampling of structures together with active learning or other techniques for structure refinement. Additionally, our models can be generalized to larger molecules than those used for training, and also successfully track uncertainty due to random distortions in test molecules. While we demonstrate this workflow on a specific example, ensemble learning is completely general. We believe it could have significant impact on ML-enabled forward modeling of a broad array of molecular and materials properties.
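
The idea of an ensemble producing a spectrum together with a point-wise uncertainty can be illustrated with a small sketch. This is not the authors' implementation: it assumes, for illustration only, that the prediction is the mean over ensemble members and the uncertainty is the standard deviation across members at each energy point (the abstract does not state the estimator). The function names, the input layout, the threshold, and the toy numbers are all hypothetical.

```python
from statistics import mean, pstdev


def ensemble_summary(member_spectra):
    """Return the point-wise mean and spread across ensemble members.

    member_spectra: one predicted spectrum per ensemble member, each a list
    of intensities on a shared energy grid (hypothetical input layout).
    """
    if not member_spectra:
        raise ValueError("need at least one ensemble member")
    n_points = len(member_spectra[0])
    if any(len(s) != n_points for s in member_spectra):
        raise ValueError("all spectra must share one energy grid")
    # zip(*...) groups the members' values at each energy point.
    columns = list(zip(*member_spectra))
    mu = [mean(col) for col in columns]
    # Assumption: member disagreement (population std) is the uncertainty.
    sigma = [pstdev(col) for col in columns]
    return mu, sigma


def flag_uncertain(sigma, threshold):
    """True when the average point-wise spread exceeds a chosen threshold."""
    return mean(sigma) > threshold


# Toy numbers for illustration, not data from the paper.
members = [
    [0.10, 0.50, 0.90],
    [0.12, 0.48, 0.70],
    [0.08, 0.52, 1.10],
]
mu, sigma = ensemble_summary(members)
```

In this toy case the members agree closely at the first two energy points and disagree at the third, so the spread is largest there. A check such as `flag_uncertain` is one simple way a large spread could be used to mark a structure as likely out-of-sample, for example as a candidate for active learning.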
