Paper title
A Flexible Class of Dependence-aware Multi-Label Loss Functions
Paper authors
Paper abstract
Multi-label classification is the task of assigning a subset of labels to a given query instance. To evaluate such predictions, the set of predicted labels needs to be compared to the ground-truth label set associated with that instance, and various loss functions have been proposed for this purpose. In addition to assessing predictive accuracy, a key concern in this regard is to foster and to analyze a learner's ability to capture label dependencies. In this paper, we introduce a new class of loss functions for multi-label classification, which overcome disadvantages of commonly used losses such as Hamming and subset 0/1. To this end, we leverage the mathematical framework of non-additive measures and integrals. Roughly speaking, a non-additive measure allows for modeling the importance of correct predictions of label subsets (instead of single labels), and thereby their impact on the overall evaluation, in a flexible way. Hamming and subset 0/1 are rather extreme in this regard: the former gives full importance to single labels, the latter to the entire label set. We present concrete instantiations of this class, which comprise Hamming and subset 0/1 as special cases, and which appear to be especially appealing from a modeling perspective. The assessment of multi-label classifiers in terms of these losses is illustrated in an empirical study.
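To make the contrast between the two extreme losses concrete, the following Python sketch computes both on a single instance and places a hypothetical intermediate loss between them. It is an illustration under stated assumptions, not the construction from the paper: the abstract does not specify the measures used, so the sketch assumes a measure that depends only on the fraction of correctly predicted labels, expressed through a hypothetical monotone function `v` with `v(0) = 0` and `v(1) = 1`. The function names (`measure_based_loss`, `correct_fraction`) and the choice `v(x) = x**2` are likewise assumptions made for illustration.

```python
from typing import Callable, Sequence


def correct_fraction(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels whose predicted relevance matches the ground truth."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label vectors must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def measure_based_loss(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    v: Callable[[float], float],
) -> float:
    """Loss = 1 - v(fraction of correct labels).

    Assumption for illustration: the importance of a set of correctly
    predicted labels depends only on its relative size, via a monotone
    function v with v(0) = 0 and v(1) = 1.
    """
    return 1.0 - v(correct_fraction(y_true, y_pred))


def hamming_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    # v(x) = x: every single label contributes equally and independently.
    return measure_based_loss(y_true, y_pred, lambda x: x)


def subset_zero_one_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    # v(x) = 1 only if all labels are correct: only the full label set counts.
    return measure_based_loss(y_true, y_pred, lambda x: 1.0 if x == 1.0 else 0.0)


# One instance with four labels; the third label is predicted incorrectly.
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]

print(hamming_loss(y_true, y_pred))                         # 0.25
print(subset_zero_one_loss(y_true, y_pred))                 # 1.0
print(measure_based_loss(y_true, y_pred, lambda x: x * x))  # 0.4375
```

With three of four labels correct, Hamming charges only for the one mistake, subset 0/1 treats the whole prediction as wrong, and the hypothetical convex choice of `v` lands in between by rewarding larger sets of jointly correct labels more than proportionally.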