Paper Title

CWP: Instance complexity weighted channel-wise soft masks for network pruning

Authors

Wang, Jiapeng; Ma, Ming; Yu, Zhenhua

Abstract

Existing differentiable channel pruning methods often attach scaling factors or masks behind channels to prune filters with less importance, and implicitly assume uniform contribution of input samples to filter importance. Specifically, the effects of instance complexity on pruning performance are not yet fully investigated in static network pruning. In this paper, we propose a simple yet effective differentiable network pruning method, CWP, based on instance complexity weighted filter importance scores. We define an instance complexity related weight for each instance by giving higher weights to hard instances, and measure the weighted sum of instance-specific soft masks to model the non-uniform contribution of different inputs, which encourages hard instances to dominate the pruning process so that model performance is well preserved. In addition, we introduce a regularizer to maximize the polarization of the masks, such that a sweet spot can be easily found to identify the filters to be pruned. Performance evaluations on various network architectures and datasets demonstrate that CWP has advantages over state-of-the-art methods in pruning large networks. For instance, CWP improves the accuracy of ResNet56 on the CIFAR-10 dataset by 0.32% after removing 64.11% of FLOPs, and prunes 87.75% of FLOPs of ResNet50 on the ImageNet dataset with only 0.93% Top-1 accuracy loss.
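The core idea described in the abstract can be illustrated with a minimal sketch. The paper does not specify its exact formulas here, so the choices below (softmax-over-loss instance weights, and an m·(1−m) polarization penalty) are illustrative assumptions, not the authors' actual definitions:

```python
import math

def instance_weights(losses, tau=1.0):
    """Assumed weighting scheme: softmax over per-instance losses,
    so hard instances (high loss) receive larger weights."""
    exps = [math.exp(l / tau) for l in losses]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_filter_scores(instance_masks, weights):
    """Weighted sum of instance-specific soft masks per channel,
    modeling the non-uniform contribution of different inputs."""
    n_channels = len(instance_masks[0])
    return [sum(w * masks[c] for w, masks in zip(weights, instance_masks))
            for c in range(n_channels)]

def polarization_penalty(scores):
    """Illustrative regularizer: m * (1 - m) is largest at m = 0.5,
    so minimizing it pushes mask values toward 0 or 1."""
    return sum(m * (1.0 - m) for m in scores)

# Two instances, three channels: instance 1 is "hard" (higher loss),
# so its soft mask dominates the aggregated filter scores.
losses = [0.0, 2.0]
masks = [[0.9, 0.1, 0.5],   # easy instance
         [0.2, 0.8, 0.5]]   # hard instance
w = instance_weights(losses)
scores = weighted_filter_scores(masks, w)
```

With these numbers the hard instance gets weight ≈ 0.88, so the aggregated score of channel 0 lands near 0.2 rather than 0.9; channels whose polarized scores end up near 0 would be the ones selected for pruning.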
