Paper Title

Convolutional Neural Networks based Focal Loss for Class Imbalance Problem: A Case Study of Canine Red Blood Cells Morphology Classification

Authors

Kitsuchart Pasupa, Supawit Vatathanavaro, Suchat Tungjitnob

Abstract

Red blood cell morphology is normally interpreted by a pathologist, a process that is time-consuming and laborious. Furthermore, a misclassified red blood cell morphology will lead to a false disease diagnosis and improper treatment, so the pathologist must be a genuine expert in classifying red blood cell morphology. In the past decade, many approaches have been proposed for classifying human red blood cell morphology. However, those approaches have not addressed the class imbalance problem in classification. Class imbalance, where the numbers of samples per class differ greatly, is one of the problems that can bias a model towards the majority class. Because every type of abnormal blood cell morphology is rare, the collected data are usually imbalanced. In this study, we aimed to solve this problem specifically for the classification of dog red blood cell morphology by using a Convolutional Neural Network (CNN), a well-known deep learning technique, in conjunction with a focal loss function, which is adept at handling the class imbalance problem. The proposed technique was evaluated within a well-designed framework: two different CNNs were used to verify the effectiveness of the focal loss function, and the optimal hyper-parameters were determined by 5-fold cross-validation. The experimental results show that both CNN models trained with the focal loss function achieved higher $F_{1}$-scores than the models trained with a conventional cross-entropy loss function, which does not address the class imbalance problem. In other words, the focal loss function enabled the CNN models to be less biased towards the majority class than the cross-entropy loss did in classifying imbalanced dog red blood cell data.
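
The contrast between focal loss and cross-entropy in the abstract can be illustrated with a minimal sketch. It assumes the commonly used multi-class form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t); the abstract does not state the exact variant or hyper-parameters the authors used, so `gamma=2.0`, `alpha=1.0`, the function names, and the example logits below are illustrative assumptions only.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample; gamma and alpha are assumed values."""
    p_t = softmax(logits)[target]  # probability assigned to the true class
    # (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Cross-entropy is the special case gamma = 0, alpha = 1."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)


# Hypothetical logits for a 3-class problem; the true class is 0 in both cases.
easy = [4.0, 0.0, 0.0]  # confidently correct prediction
hard = [0.0, 1.0, 0.0]  # true class receives low probability

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} ratio={fl / ce:.4f}")
```

In this sketch the easy sample keeps only a small fraction of its cross-entropy loss, while the hard sample keeps most of it. Down-weighting easy samples in this way is how focal loss limits the influence of the many well-classified majority-class examples during training.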
