Paper title
Pair distribution function analysis for oxide defect identification through feature extraction and supervised learning
Paper authors
Paper abstract
Feature extraction and a neural network model are applied to predict the defect types and concentrations in experimental TiO$_2$ samples. A dataset of TiO$_2$ structures with vacancies and interstitials of oxygen and titanium is built, and the structures are relaxed using energy minimization. The features of the calculated pair distribution functions (PDFs) of these defective structures are extracted using linear methods (principal component analysis, non-negative matrix factorization) and non-linear methods (autoencoder, convolutional neural network). The extracted features are used as the inputs to a neural network that maps the feature weights to the concentration of each defect type. The performance of this machine learning pipeline is validated by predicting the defect concentrations from experimentally measured TiO$_2$ PDFs and comparing the results to brute-force predictions. A physics-based initialization of the autoencoder gives the highest accuracy in predicting the defect concentrations. This model incorporates physical interpretability and predictability of material properties, enabling a more efficient material characterization process with scattering data.
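The two-stage idea in the abstract (extract features from PDFs, then map feature weights to defect concentrations) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes NumPy is available, uses a toy synthetic "PDF" in place of calculated TiO$_2$ PDFs, performs principal component analysis via a singular value decomposition, and substitutes a linear least-squares regressor for the neural network. All function names, peak positions, and parameter values are hypothetical.

```python
import numpy as np


def synthetic_pdf(r, conc):
    """Toy PDF (hypothetical model, not TiO2 physics): one peak shrinks and
    a second peak grows linearly with the defect concentration."""
    base = np.exp(-((r - 1.95) ** 2) / (2 * 0.05 ** 2))
    defect = np.exp(-((r - 2.60) ** 2) / (2 * 0.05 ** 2))
    return (1.0 - conc) * base + conc * defect


def fit_pca(X, n_components):
    """Return the mean curve and the leading principal axes of X (rows = PDFs)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def transform(X, mean, components):
    """Project PDFs onto the principal axes to obtain feature weights."""
    return (X - mean) @ components.T


# Build a synthetic training set of PDFs with known defect concentrations.
rng = np.random.default_rng(0)
r = np.linspace(1.0, 4.0, 300)
train_conc = rng.uniform(0.0, 0.1, size=200)
X_train = np.stack([synthetic_pdf(r, c) for c in train_conc])
X_train += rng.normal(scale=1e-3, size=X_train.shape)  # small measurement noise

# Stage 1: feature extraction with PCA.
mean, components = fit_pca(X_train, n_components=2)
Z_train = transform(X_train, mean, components)

# Stage 2: map feature weights to concentration.
# Assumption: a linear least-squares fit with a bias term stands in for
# the neural network described in the abstract.
A = np.column_stack([Z_train, np.ones(len(Z_train))])
coef, *_ = np.linalg.lstsq(A, train_conc, rcond=None)


def predict(pdfs):
    """Predict defect concentrations for one or more PDFs."""
    Z = transform(np.atleast_2d(pdfs), mean, components)
    return np.column_stack([Z, np.ones(len(Z))]) @ coef


# Evaluate on noise-free PDFs at held-out concentrations.
test_conc = np.array([0.02, 0.05, 0.08])
X_test = np.stack([synthetic_pdf(r, c) for c in test_conc])
pred = predict(X_test)
print(np.round(pred, 3))
```

Because the toy PDF varies linearly with concentration, a single principal component already carries almost all of the signal and a linear map suffices. Real defect PDFs include structural relaxation and several defect types, which is where the non-linear extractors and the neural network named in the abstract would come in.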