Paper Title

Dynamic Model Pruning with Feedback

Authors

Tao Lin, Sebastian U. Stich, Luis Barba, Daniil Dmitriev, Martin Jaggi

Abstract

Deep neural networks often have millions of parameters. This can hinder their deployment to low-end devices, not only due to high memory requirements but also because of increased latency at inference. We propose a novel model compression method that generates a sparse trained model without additional overhead: by allowing (i) dynamic allocation of the sparsity pattern and (ii) incorporating feedback signal to reactivate prematurely pruned weights we obtain a performant sparse model in one single training pass (retraining is not needed, but can further improve the performance). We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models. Moreover, their performance surpasses that of models generated by all previously proposed pruning schemes.
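
The abstract describes two ingredients: a sparsity mask that is dynamically re-computed during training, and a feedback signal that lets prematurely pruned weights recover. The PyTorch sketch below shows one way these pieces can fit together; it is a minimal illustration under assumptions (magnitude-based masking, the hypothetical `prune_step` helper and `sparsity` parameter, pruning only weight matrices), not the authors' reference implementation.

```python
# Sketch: evaluate the loss on a dynamically pruned copy of the weights,
# but apply the resulting gradient back to the dense weights, so that a
# weight pruned at one step can re-enter the mask at a later step.
# Illustrative only; names such as `magnitude_mask` and `prune_step` are
# assumptions, not the paper's code.

import torch


def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a 0/1 mask that zeroes out the `sparsity` fraction of
    smallest-magnitude entries."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()


def prune_step(model: torch.nn.Module, loss_fn, x, y, optimizer, sparsity: float):
    dense_weights = {}
    # Temporarily replace each weight matrix by its pruned version
    # (the mask is re-computed at every step, i.e. allocated dynamically).
    for name, p in model.named_parameters():
        if p.dim() > 1:  # prune weight matrices / conv kernels, not biases
            dense_weights[name] = p.data.clone()
            p.data.mul_(magnitude_mask(p.data, sparsity))

    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()  # gradient of the *pruned* model

    # Restore the dense weights, then apply the gradient to them (feedback):
    # pruned coordinates still receive updates and may be reactivated later.
    for name, p in model.named_parameters():
        if name in dense_weights:
            p.data.copy_(dense_weights[name])
    optimizer.step()
    return loss.item()
```

In this sketch the sparse model used at deployment would simply be the dense weights with the final mask applied once, while during training the dense weights keep receiving gradient updates, which is what allows an over-pruned weight to re-enter the mask.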
