Paper Title

Discriminability-enforcing loss to improve representation learning

Authors

Florinel-Alin Croitoru, Diana-Nicoleta Grigore, Radu Tudor Ionescu

Abstract

During the training process, deep neural networks implicitly learn to represent the input data samples through a hierarchy of features, where the size of the hierarchy is determined by the number of layers. In this paper, we focus on enforcing the discriminative power of the high-level representations, which are typically learned by the deeper layers (closer to the output). To this end, we introduce a new loss term inspired by the Gini impurity, which is aimed at minimizing the entropy (increasing the discriminative power) of individual high-level features with respect to the class labels. Although our Gini loss induces highly discriminative features, it does not ensure that the distribution of the high-level features matches the distribution of the classes. As such, we introduce another loss term to minimize the Kullback-Leibler divergence between the two distributions. We conduct experiments on two image classification data sets (CIFAR-100 and Caltech 101), considering multiple neural architectures ranging from convolutional networks (ResNet-17, ResNet-18, ResNet-50) to transformers (CvT). Our empirical results show that integrating our novel loss terms into the training objective consistently outperforms the models trained with cross-entropy alone, without increasing the inference time at all.
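The abstract does not give the exact formulas, but the sketch below shows one plausible reading of the two proposed terms in PyTorch: a Gini-impurity term that pushes each high-level feature to concentrate its activation mass on a single class, and a KL term that pulls the feature-to-class assignment distribution toward the empirical class distribution. The function name `discriminability_losses`, the per-feature class-mass normalization, and the particular pairing of distributions in the KL term are assumptions for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F


def discriminability_losses(features, labels, num_classes, eps=1e-8):
    """Illustrative Gini-impurity and KL loss terms (one possible reading).

    features: (batch, d) non-negative high-level activations (e.g. post-ReLU).
    labels:   (batch,) integer class labels.
    Returns the Gini term and the KL term as scalars.
    """
    one_hot = F.one_hot(labels, num_classes).float()        # (batch, C)

    # Activation mass of each high-level feature, accumulated per class.
    mass = features.t() @ one_hot                            # (d, C)
    # Normalize each feature's mass into a distribution over classes.
    p = mass / (mass.sum(dim=1, keepdim=True) + eps)         # (d, C)

    # Gini impurity per feature: 1 - sum_c p_c^2. A low value means the
    # feature responds almost exclusively to one class (more discriminative).
    gini_term = (1.0 - (p ** 2).sum(dim=1)).mean()

    # Average soft class assignment across features vs. the empirical class
    # distribution of the batch, compared via KL divergence.
    feature_class_dist = p.mean(dim=0)                       # (C,)
    class_dist = one_hot.mean(dim=0)                         # (C,)
    kl_term = F.kl_div((feature_class_dist + eps).log(),
                       class_dist + eps, reduction="sum")

    return gini_term, kl_term


# Hypothetical training objective: the two terms are added to the standard
# cross-entropy loss with assumed weighting hyperparameters lambda_gini and
# lambda_kl; they affect only training, so inference cost is unchanged.
# loss = F.cross_entropy(logits, targets) \
#        + lambda_gini * gini_term + lambda_kl * kl_term
```

Since both terms act only on the training objective, the network architecture and its forward pass are untouched, which is consistent with the abstract's claim that inference time does not increase.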
