多类可学习性的特征

论文标题

多类可学习性的特征

A Characterization of Multiclass Learnability

论文作者

Brukhim, Nataly, Carmon, Daniel, Dinur, Irit, Moran, Shay, Yehudayoff, Amir

论文摘要

学习理论的开创性结果是通过Vapnik-Chervonenkis维度来表征二元类的PAC可学习性。自从1980年代后期的多类PAC学习工作以来，将这种表征扩展到一般的多类环境已经开放。这项工作解决了这个问题：我们通过DS维度来表征多类PAC可学习性，DS维度是Daniely和Shalev-Shwartz（2014）定义的组合维度。二进制案例的经典表征归结为经验风险最小化。相反，我们对多类案例的特征涉及多种算法思想。其中包括一个自然设置，我们称列表PAC学习。在列表学习设置中，目标是提供简短的预测菜单，而不是预测给定的看不见的输入的单一结果。我们的第二个主要结果涉及Natarajan维度，该维度一直是表征多类可学习性的核心候选人。 Natarajan（1988）引入了这个维度，作为PAC学习的障碍。从那以后，Natarajan维度是否表征了PAC可学习性的特征。这项工作提供了一个负面答案：我们用Natarajan维度构建了一个非可学习的类。对于构造，我们确定概念类和拓扑之间的基本联系（即五颜六色的简单络合物）。我们至关重要的是依赖于Januszkiewicz和Swiatkowski的双曲线伪模型构造。有趣的是，双曲线与学习问题直接相关，尽管不存在明显的障碍，但难以解决的学习问题。这是机器学习与数学领域不同领域的富有成果的联系的另一个演示。

A seminal result in learning theory characterizes the PAC learnability of binary classes through the Vapnik-Chervonenkis dimension. Extending this characterization to the general multiclass setting has been open since the pioneering works on multiclass PAC learning in the late 1980s. This work resolves this problem: we characterize multiclass PAC learnability through the DS dimension, a combinatorial dimension defined by Daniely and Shalev-Shwartz (2014). The classical characterization of the binary case boils down to empirical risk minimization. In contrast, our characterization of the multiclass case involves a variety of algorithmic ideas; these include a natural setting we call list PAC learning. In the list learning setting, instead of predicting a single outcome for a given unseen input, the goal is to provide a short menu of predictions. Our second main result concerns the Natarajan dimension, which has been a central candidate for characterizing multiclass learnability. This dimension was introduced by Natarajan (1988) as a barrier for PAC learning. Whether the Natarajan dimension characterizes PAC learnability in general has been posed as an open question in several papers since. This work provides a negative answer: we construct a non-learnable class with Natarajan dimension one. For the construction, we identify a fundamental connection between concept classes and topology (i.e., colorful simplicial complexes). We crucially rely on a deep and involved construction of hyperbolic pseudo-manifolds by Januszkiewicz and Swiatkowski. It is interesting that hyperbolicity is directly related to learning problems that are difficult to solve although no obvious barriers exist. This is another demonstration of the fruitful links machine learning has with different areas in mathematics.

下载PDF全文

下载文献需遵守相关版权规定

论文标题