Paper Title

Offline Clustering Approach to Self-supervised Learning for Class-imbalanced Image Data

Authors

Hye-min Chang, Sungkyun Chang

Abstract

Class-imbalanced datasets are known to bias models toward the majority classes. In this project, we pose two research questions: 1) when is the class-imbalance problem more prevalent in self-supervised pre-training? and 2) can offline clustering of feature representations help pre-training on class-imbalanced data? Our experiments investigate the former question by adjusting the degree of class imbalance when training the baseline models, namely SimCLR and SimSiam, on the CIFAR-10 dataset. To answer the latter question, we train a separate expert model on each subset of the feature clusters. We then distill the knowledge of the expert models into a single model, so that we can compare its performance with our baselines.
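The two data-preparation steps named in the abstract (varying the degree of class imbalance, and partitioning examples by offline clustering of their features) can be illustrated with a minimal sketch. The abstract does not say how the imbalance was induced or which clustering algorithm was used, so the exponential long-tailed profile, the plain k-means with farthest-point initialisation, and the function names `long_tailed_counts`, `subsample_indices`, and `kmeans_partition` are all assumptions made for illustration, not the authors' implementation.

```python
import random


def long_tailed_counts(n_max, num_classes, imbalance_ratio):
    """Per-class sample counts decaying exponentially from n_max (class 0)
    to about n_max / imbalance_ratio (last class). Assumed profile."""
    if num_classes < 2:
        return [n_max] * num_classes
    return [
        max(1, round(n_max * imbalance_ratio ** (-c / (num_classes - 1))))
        for c in range(num_classes)
    ]


def subsample_indices(labels, counts, seed=0):
    """Randomly keep up to counts[c] example indices for each class c."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    keep = []
    for c, n in enumerate(counts):
        pool = by_class.get(c, [])
        keep.extend(rng.sample(pool, min(n, len(pool))))
    return sorted(keep)


def _sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_partition(features, k, iters=20):
    """Plain k-means on precomputed feature vectors (assumed algorithm).
    Returns one list of example indices per cluster. Centers start from a
    greedy farthest-point initialisation, so the result is deterministic."""
    centers = [list(features[0])]
    while len(centers) < k:
        far = max(features, key=lambda f: min(_sq_dist(f, c) for c in centers))
        centers.append(list(far))
    assign = []
    for _ in range(iters):
        # Assignment step: nearest center for every feature vector.
        assign = [
            min(range(k), key=lambda j: _sq_dist(f, centers[j]))
            for f in features
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [features[i] for i, a in enumerate(assign) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return [[i for i, a in enumerate(assign) if a == j] for j in range(k)]
```

In such a setup, `subsample_indices` would be applied to the training labels before pre-training a baseline, and each index list returned by `kmeans_partition` (computed on features from a pre-trained encoder) would define the training subset for one expert. Expert training and distillation are not shown here.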
