Paper Title

Graph Spectral Feature Learning for Mixed Data of Categorical and Numerical Type

Authors

Saswata Sahoo, Souradip Chakraborty

Abstract

Feature learning in the presence of mixed variable types, numerical and categorical, is an important issue for related modeling problems. For simple neighborhood queries in a mixed data space, standard practice is to consider numerical and categorical variables separately and combine them through a suitable distance function. Alternatives such as kernel learning or principal component analysis do not explicitly consider the interdependence structure among the mixed variable types. In this work, we propose a novel strategy that explicitly models the probabilistic dependence structure among the mixed variable types with an undirected graph. Spectral decomposition of the graph Laplacian provides the desired feature transformation. The eigenspectrum of the transformed feature space shows increased separability and more prominent clusterability among the observations. The main novelty of our paper lies in capturing interactions among the mixed feature types in an unsupervised framework using a graphical model. We numerically validate the implications of the feature learning strategy.
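
The abstract's pipeline (an undirected graph over the variables, then a spectral decomposition of its Laplacian) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The one-hot encoding, the absolute Pearson correlation used as edge weight, the unnormalized Laplacian, the choice of which eigenvectors to keep, and the function name `spectral_features` are all assumptions made here for illustration; the paper estimates a probabilistic dependence structure that the abstract does not specify.

```python
import numpy as np

def spectral_features(numeric, categorical, n_components=2):
    """Hypothetical helper: project mixed data onto Laplacian eigenvectors.

    numeric:     (n, p) array of numerical variables
    categorical: (n, q) array of category labels
    """
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)

    # One-hot encode every categorical column (ASSUMPTION: encoding choice).
    blocks = [numeric]
    for j in range(categorical.shape[1]):
        col = categorical[:, j]
        levels = np.unique(col)
        blocks.append((col[:, None] == levels[None, :]).astype(float))
    X = np.hstack(blocks)

    # Standardize so that all variables share a common scale.
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

    # Undirected graph over variables. ASSUMPTION: absolute correlation
    # stands in for the paper's probabilistic dependence measure.
    W = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W and its eigendecomposition.
    L = np.diag(W.sum(axis=1)) - W
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order

    # ASSUMPTION: skip the first eigenvector (eigenvalue 0) and keep the next
    # n_components as the feature transformation.
    Z = X @ eigvecs[:, 1:n_components + 1]
    return Z, eigvals

rng = np.random.default_rng(0)
num = rng.normal(size=(50, 3))
cat = rng.integers(0, 3, size=(50, 2))
Z, eigvals = spectral_features(num, cat, n_components=2)
```

Here `Z` holds the transformed observations and `eigvals` the Laplacian spectrum. Swapping the correlation-based weights for an estimated graphical-model structure would bring the sketch closer to the approach the abstract describes.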
