Paper Title
Activation Map Adaptation for Effective Knowledge Distillation
Paper Authors
Paper Abstract
Model compression has become a recent trend due to the need to deploy neural networks on embedded and mobile devices. Hence, both accuracy and efficiency are of critical importance. To explore a balance between them, a knowledge distillation strategy is proposed for general visual representation learning. It utilizes our well-designed activation map adaptive module to replace some blocks of the teacher network, adaptively exploring the most appropriate supervisory features during the training process. The teacher's hidden-layer outputs guide the training of the student network, transferring effective semantic information. To verify the effectiveness of our strategy, we apply our method to the CIFAR-10 dataset. Results demonstrate that the method boosts the accuracy of the student network by 0.6% with a 6.5% loss reduction, and significantly improves its training speed.
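The abstract describes supervising the student with the teacher's hidden-layer activations. As a minimal sketch of this general idea (not the paper's exact adaptive module), one common formulation collapses each feature tensor into a normalized spatial activation map and penalizes the distance between teacher and student maps; all function names and shapes below are illustrative assumptions:

```python
import numpy as np

def activation_to_map(feats):
    """Collapse a feature tensor of shape (C, H, W) into a spatial
    activation map by summing squared channel activations, then
    L2-normalizing. This mirrors a common attention-transfer style
    formulation; the paper's adaptive module may differ (assumption)."""
    amap = (feats ** 2).sum(axis=0)   # (H, W) spatial map
    flat = amap.ravel()
    norm = np.linalg.norm(flat)
    return flat / norm if norm > 0 else flat

def distillation_map_loss(teacher_feats, student_feats):
    """Mean squared error between normalized teacher and student
    activation maps; added to the usual task loss during training."""
    t = activation_to_map(teacher_feats)
    s = activation_to_map(student_feats)
    return float(((t - s) ** 2).mean())

# Identical feature tensors yield zero distillation loss.
rng = np.random.default_rng(0)
teacher = rng.standard_normal((8, 4, 4))
student = rng.standard_normal((8, 4, 4))
print(distillation_map_loss(teacher, teacher))  # 0.0
print(distillation_map_loss(teacher, student) > 0)  # True
```

In a full training loop this term would be weighted and summed with the cross-entropy loss on the ground-truth labels; the normalization makes the loss insensitive to the absolute magnitude of the two networks' activations.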