Paper Title

Binary Classification: Counterbalancing Class Imbalance by Applying Regression Models in Combination with One-Sided Label Shifts

Authors

Peter Bellmann, Heinke Hihn, Daniel A. Braun, Friedhelm Schwenker

Abstract

In many real-world pattern recognition scenarios, such as medical applications, the corresponding classification tasks can be imbalanced. In the current study, we focus on binary, imbalanced classification tasks, i.e., binary classification tasks in which one of the two classes is under-represented (minority class) in comparison to the other class (majority class). In the literature, many different approaches, such as under- or oversampling, have been proposed to counter class imbalance. In the current work, we introduce a novel method that addresses the issue of class imbalance. To this end, we first transfer the binary classification task to an equivalent regression task. Subsequently, we generate a set of negative and positive target labels such that the corresponding regression task becomes balanced with respect to the redefined target label set. We evaluate our approach on a number of publicly available data sets in combination with Support Vector Machines. Moreover, we compare our proposed method to one of the most popular oversampling techniques (SMOTE). Based on a detailed discussion of the outcomes of our experimental evaluation, we provide promising ideas for future research directions.
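The abstract does not state how the redefined target labels are chosen, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that "balanced" means the regression targets sum to zero, that majority samples keep the target -1, and that only the minority target is shifted (from +1 to n_majority / n_minority). A plain least-squares line stands in for the Support Vector regression used in the paper, and the names `shifted_targets`, `fit_line`, and `predict_class` are hypothetical.

```python
from statistics import mean


def shifted_targets(labels, minority=1):
    """Map binary class labels to regression targets that sum to zero.

    Assumption (not from the paper): majority samples keep the target -1,
    and only the minority target is shifted to n_majority / n_minority.
    """
    n_min = sum(1 for y in labels if y == minority)
    n_maj = len(labels) - n_min
    if n_min == 0 or n_maj == 0:
        raise ValueError("both classes must be present")
    minority_target = n_maj / n_min
    return [minority_target if y == minority else -1.0 for y in labels]


def fit_line(xs, ts):
    """Ordinary least squares for t ~ a * x + b (stand-in for an SVR)."""
    mx, mt = mean(xs), mean(ts)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (t - mt) for x, t in zip(xs, ts)) / sxx
    return a, mt - a * mx


def predict_class(a, b, x, minority=1, majority=0):
    """Classify by the sign of the regression output."""
    return minority if a * x + b > 0 else majority


# Toy 1-D data: 8 majority samples (label 0) and 2 minority samples (label 1).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 6, 9]
labels = [0] * 8 + [1] * 2

plain = [1.0 if y == 1 else -1.0 for y in labels]  # unshifted targets -1 / +1
shifted = shifted_targets(labels)                  # targets -1 / +4

a_plain, b_plain = fit_line(xs, plain)
a_shift, b_shift = fit_line(xs, shifted)

# The minority sample at x = 6 lies in the overlap region of the two classes.
print("plain targets   ->", predict_class(a_plain, b_plain, 6))
print("shifted targets ->", predict_class(a_shift, b_shift, 6))
```

On this toy data the shifted targets move the decision threshold toward the majority class, so the overlapping minority sample is recovered at the cost of some majority samples. This is the same trade-off that resampling methods such as SMOTE make by changing the data rather than the labels.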
