Paper Title

Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction

Paper Authors

Dongjie Wang, Yanjie Fu, Kunpeng Liu, Xiaolin Li, Yan Solihin

Paper Abstract

Representation (feature) space is an environment where data points are vectorized, distances are computed, patterns are characterized, and geometric structures are embedded. Extracting a good representation space is critical for addressing the curse of dimensionality, improving model generalization, overcoming data sparsity, and increasing the applicability of classic models. Existing work, such as feature engineering and representation learning, is limited in achieving full automation (e.g., heavy reliance on intensive labor and empirical experience), explainable explicitness (e.g., a traceable reconstruction process and explainable new features), and flexible optimality (e.g., optimal feature space reconstruction is not embedded into downstream tasks). Can we simultaneously address the automation, explicitness, and optimality challenges in representation space reconstruction for a machine learning task? To answer this question, we propose a group-wise reinforcement generation perspective. We reformulate representation space reconstruction as an interactive process of nested feature generation and selection, where feature generation creates new meaningful and explicit features, and feature selection eliminates redundant features to control the size of the feature set. We develop a cascading reinforcement learning method that leverages three cascading Markov Decision Processes to learn optimal generation policies that automate the selection of features and operations and the feature crossing. We design a group-wise generation strategy that crosses a feature group, an operation, and another feature group to generate new features, and we find that this strategy enhances exploration efficiency and augments the reward signals of the cascading agents. Finally, we present extensive experiments to demonstrate the effectiveness, efficiency, traceability, and explicitness of our system.
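
The core mechanism the abstract describes, group-wise feature crossing nested with selection, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the operation set, the correlation-based selection criterion, and the names `cross_groups` and `select_features` are illustrative assumptions, standing in for the policies the cascading agents would learn.

```python
import numpy as np
import pandas as pd

# Hypothetical binary-operation pool; the paper's actual operator set may differ.
BINARY_OPS = {"add": np.add, "sub": np.subtract, "mul": np.multiply}

def cross_groups(df, group_a, group_b, op_name):
    """Group-wise generation: cross every feature in group_a with every
    feature in group_b under one operation. New column names record the
    derivation, which is what makes the reconstruction traceable."""
    op = BINARY_OPS[op_name]
    new_cols = {}
    for a in group_a:
        for b in group_b:
            new_cols[f"({a}_{op_name}_{b})"] = op(df[a], df[b])
    return pd.DataFrame(new_cols, index=df.index)

def select_features(df, target, k):
    """Selection step: keep the k features most correlated with the target,
    a simple stand-in for the paper's redundancy-controlling selection."""
    scores = df.apply(lambda col: abs(np.corrcoef(col, target)[0, 1]))
    return df[scores.nlargest(k).index]

# Toy usage: one generation-selection iteration on synthetic data.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("abcd"))
y = df["a"] * df["b"] + rng.normal(scale=0.1, size=100)

candidates = cross_groups(df, ["a", "b"], ["c", "d"], "mul")
pool = pd.concat([df, candidates], axis=1)
selected = select_features(pool, y, k=5)
print(selected.columns.tolist())
```

In the paper's framing, the three choices hard-coded above (the first feature group, the operation, and the second feature group) are each made by one of the three cascading Markov Decision Processes, and the downstream-task improvement serves as the reward that trains those agents.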
