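Multiblock variable influence on orthogonal projections (MB-VIOP) for enhanced interpretation of total, global, local and unique variations in OnPLS models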

Paper title

Multiblock variable influence on orthogonal projections (MB-VIOP) for enhanced interpretation of total, global, local and unique variations in OnPLS models

Authors

Galindo-Prieto, B., Geladi, P., Trygg, J.

Abstract

For multivariate data analysis involving only two input matrices, the previously published methods for variable influence on projection (e.g., VIPOPLS or VIPO2PLS) are widely used for variable selection purposes, including (i) variable importance assessment, (ii) dimensionality reduction of big data and (iii) interpretation enhancement of PLS, OPLS and O2PLS models. For multiblock analysis, OnPLS models find relationships among multiple data matrices by calculating latent variables; however, a method for improving the interpretation of these latent variables by assessing the importance of the input variables has not been available until now. A method for variable selection in multiblock analysis, called multiblock variable influence on orthogonal projections (MB-VIOP), is explained in this paper. MB-VIOP is a model-based variable selection method that uses the data matrices, the scores and the normalized loadings of an OnPLS model to sort the input variables of more than two data matrices according to their importance for both simplification and interpretation of the total multiblock model, and also of the unique, local and global model components separately. MB-VIOP has been tested using three multiblock datasets. MB-VIOP assesses the variable importance in any type of data. MB-VIOP connects the input variables of different data matrices according to their relevance for the interpretation of each latent variable, yielding enhanced interpretability for each OnPLS model component. In addition, MB-VIOP can deal with strong overlapping of types of variation, as well as with many data blocks of very different dimensionality. The ability of MB-VIOP to generate dimensionality-reduced models with high interpretability makes this method ideal for big data mining, multi-omics data integration and any study that requires exploration and interpretation of large streams of data.
