Paper Title

The rise of the lottery heroes: why zero-shot pruning is hard

Paper Authors

Tartaglione, Enzo

Paper Abstract

Recent advances in deep learning optimization showed that just a subset of parameters is really necessary to successfully train a model. Potentially, such a discovery has broad impact from theory to application; however, it is known that finding these trainable sub-networks is typically a costly process. This inhibits practical applications: can the learned sub-graph structures in deep learning models be found at training time? In this work we explore such a possibility, observing and motivating why common approaches typically fail in the extreme scenarios of interest, and proposing an approach which potentially enables training with reduced computational effort. Experiments on challenging architectures and datasets suggest the algorithmic accessibility of such a computational gain, and in particular a trade-off emerges between the accuracy achieved and the training complexity deployed.
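The costly sub-network search the abstract refers to is usually driven by a pruning criterion such as weight magnitude: parameters below a threshold are masked out, and only the survivors are trained further. A minimal NumPy sketch of one such masking step is below; the function name, the sparsity level, and the use of plain arrays (rather than any specific framework) are illustrative assumptions, not details from the paper.

```python
import numpy as np

def magnitude_prune_mask(weights, sparsity):
    """Return a binary mask that keeps the largest-magnitude weights.

    `sparsity` is the fraction of weights to zero out (e.g. 0.75
    removes three quarters of the parameters).
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return np.ones_like(weights)
    # k-th smallest magnitude acts as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)

# Toy example: prune 75% of a random 4x4 weight matrix
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
mask = magnitude_prune_mask(w, 0.75)
print(int(mask.sum()))  # number of surviving weights
```

In lottery-ticket-style pipelines this mask is typically recomputed over several train-prune-rewind rounds, which is exactly the computational cost that pruning at (or near) initialization, as studied in this paper, tries to avoid.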
