A Flexible Class of Dependence-aware Multi-Label Loss Functions

Paper title

A Flexible Class of Dependence-aware Multi-Label Loss Functions

Authors

Hüllermeier, Eyke; Wever, Marcel; Mencia, Eneldo Loza; Fürnkranz, Johannes; Rapp, Michael

Abstract

Multi-label classification is the task of assigning a subset of labels to a given query instance. For evaluating such predictions, the set of predicted labels needs to be compared to the ground-truth label set associated with that instance, and various loss functions have been proposed for this purpose. In addition to assessing predictive accuracy, a key concern in this regard is to foster and to analyze a learner's ability to capture label dependencies. In this paper, we introduce a new class of loss functions for multi-label classification, which overcome disadvantages of commonly used losses such as Hamming and subset 0/1. To this end, we leverage the mathematical framework of non-additive measures and integrals. Roughly speaking, a non-additive measure allows for modeling the importance of correct predictions of label subsets (instead of single labels), and thereby their impact on the overall evaluation, in a flexible way. By giving full importance to single labels and to the entire label set, respectively, Hamming and subset 0/1 are rather extreme in this regard. We present concrete instantiations of this class, which comprise Hamming and subset 0/1 as special cases, and which appear to be especially appealing from a modeling perspective. The assessment of multi-label classifiers in terms of these losses is illustrated in an empirical study.
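
To make the two extremes named in the abstract concrete, the following Python sketch computes the Hamming and subset 0/1 losses and shows how both can fall out of one family. It is an illustration under stated assumptions, not the paper's own instantiations. It assumes a symmetric non-additive measure of the form mu(A) = v(|A| / K), where K is the number of labels and v is a non-decreasing function with v(0) = 0 and v(1) = 1. For a binary correctness vector, the discrete Choquet integral with respect to mu reduces to mu of the set of correctly predicted labels. The names `measure_based_loss` and `v` are hypothetical.

```python
from typing import Callable, Sequence


def hamming_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted incorrectly."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def subset_zero_one_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """1.0 unless the whole label vector is predicted exactly."""
    return float(list(y_true) != list(y_pred))


def measure_based_loss(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    v: Callable[[float], float],
) -> float:
    """Hypothetical helper: 1 - mu(correct labels), with mu(A) = v(|A| / K).

    Assumption: v is non-decreasing with v(0) = 0 and v(1) = 1, so mu is a
    symmetric (cardinality-based) non-additive measure on the label set.
    """
    k = len(y_true)
    n_correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 1.0 - v(n_correct / k)


y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]  # one of four labels is wrong

# v(x) = x is additive: every single label counts equally (Hamming).
as_hamming = measure_based_loss(y_true, y_pred, lambda x: x)

# v(x) = 1 only at x = 1: only the full label set counts (subset 0/1).
as_subset = measure_based_loss(y_true, y_pred, lambda x: 1.0 if x == 1.0 else 0.0)

# v(x) = x**2 is one arbitrary choice lying between the two extremes.
in_between = measure_based_loss(y_true, y_pred, lambda x: x ** 2)

print(as_hamming, as_subset, in_between)
```

Choosing v between these two extremes shifts weight from single labels toward larger label subsets. Which shapes of the measure are appropriate is a modeling decision that the paper itself addresses.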
