Paper title
Extending compositional data analysis from a graph signal processing perspective
Paper authors
Abstract
Traditional methods for the analysis of compositional data consider the log-ratios between all different pairs of variables with equal weight, typically in the form of aggregated contributions. This is not meaningful in contexts where it is known that a relationship only exists between very specific variables (e.g., for metabolomic pathways), while for other pairs a relationship does not exist. The absence or presence of relationships is modeled in graph theory, where the vertices represent the variables and the connections refer to relations. This paper links compositional data analysis with graph signal processing, and it extends the Aitchison geometry to a setting where only selected log-ratios can be considered. The presented framework retains the desirable properties of scale invariance and compositional coherence. An additional extension to include absolute information is readily made. Examples from bioinformatics and geochemistry underline the usefulness of this approach in comparison to standard methods for compositional data analysis.
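The idea of restricting the analysis to selected log-ratios can be illustrated with a small sketch. This is not the paper's construction; it only assumes an unweighted, undirected graph whose edges mark the variable pairs of interest, and it uses the standard identity that the sum of squared log-ratios over the edges equals the Laplacian quadratic form of the log-transformed composition. The function names, the example composition, and the `path` edge list are hypothetical and chosen for illustration; the paper's weighting and normalization may differ.

```python
import numpy as np

def laplacian(n, edges):
    """Unweighted graph Laplacian L = D - A for an undirected graph on n vertices."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L

def edge_logratio_energy(x, edges):
    """Sum of squared log-ratios log(x_i / x_j)^2 over the selected edges,
    computed as the quadratic form log(x)^T L log(x)."""
    y = np.log(np.asarray(x, dtype=float))
    return float(y @ laplacian(len(y), edges) @ y)

def aitchison_norm_sq(x):
    """Squared Aitchison norm via the centered log-ratio (clr) transform."""
    y = np.log(np.asarray(x, dtype=float))
    clr = y - y.mean()
    return float(clr @ clr)

# Hypothetical 4-part composition.
x = np.array([0.1, 0.2, 0.3, 0.4])
D = len(x)

# Complete graph: every pair of variables is connected (the classical setting).
complete = [(i, j) for i in range(D) for j in range(i + 1, D)]

# Hypothetical sparse graph: only the pairs (0, 1) and (1, 2) are related.
path = [(0, 1), (1, 2)]

# On the complete graph, the edge energy divided by D is the squared Aitchison norm.
print(edge_logratio_energy(x, complete) / D, aitchison_norm_sq(x))

# Scale invariance: rescaling x shifts log(x) by a constant vector,
# which lies in the null space of any graph Laplacian.
print(edge_logratio_energy(x, path), edge_logratio_energy(7.5 * x, path))
```

On the complete graph the two printed values in the first line agree, which recovers the equal-weight treatment of all pairs described above. On the sparse graph only the chosen log-ratios contribute, and the result is unchanged when the composition is rescaled.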