Paper Title

Progressive Skeletonization: Trimming more fat from a network at initialization

Authors

Pau de Jorge, Amartya Sanyal, Harkirat S. Behl, Philip H. S. Torr, Gregory Rogez, Puneet K. Dokania

Abstract

Recent studies have shown that skeletonization (pruning parameters) of networks \textit{at initialization} provides all the practical benefits of sparsity both at inference and training time, while only marginally degrading their performance. However, we observe that beyond a certain level of sparsity (approximately $95\%$), these approaches fail to preserve the network performance and, to our surprise, in many cases perform even worse than trivial random pruning. To this end, we propose an objective to find a skeletonized network with maximum {\em foresight connection sensitivity} (FORCE), whereby the trainability, in terms of connection sensitivity, of a pruned network is taken into consideration. We then propose two approximate procedures to maximize our objective: (1) Iterative SNIP, which allows parameters that were unimportant at earlier stages of skeletonization to become important at later stages; and (2) FORCE, an iterative process that enables exploration by allowing already-pruned parameters to resurrect at later stages of skeletonization. Empirical analyses on a large suite of experiments show that our approach, while performing at least as well as other recent approaches at moderate pruning levels, provides remarkably improved performance at higher pruning levels (up to $99.5\%$ of parameters can be removed while keeping the networks trainable). Code can be found at https://github.com/naver/force.
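For readers who want a concrete picture of the procedure the abstract describes, below is a minimal, illustrative PyTorch sketch of progressive pruning at initialization. In our reading, the saliency of connection $i$ under a binary mask $\mathbf{c}$ is roughly $|\theta_i \, \partial L(\mathbf{c} \odot \boldsymbol{\theta}) / \partial \theta_i|$, and the mask is re-selected over several steps while sparsity is gradually increased. The function name `prune_at_init`, the exponential schedule, and the single-batch gradient estimate are assumptions made for brevity, not the authors' exact implementation; the real code is at https://github.com/naver/force.

```python
import torch

def prune_at_init(model, loss_fn, data_batch, final_sparsity=0.995, num_steps=10):
    """Illustrative sketch of FORCE-style progressive pruning at initialization.

    Repeatedly keeps the weights with the largest saliency |w0 * dL/dw|, where
    the gradient is taken on the currently masked network, following an
    exponential sparsity schedule. Names and defaults are assumptions for
    illustration, not the authors' exact implementation.
    """
    params = [p for p in model.parameters() if p.dim() > 1]  # prunable weights
    init_vals = [p.detach().clone() for p in params]         # original init values
    masks = [torch.ones_like(p) for p in params]
    total = sum(p.numel() for p in params)
    inputs, targets = data_batch

    for t in range(1, num_steps + 1):
        # Apply the current mask to the *original* weights, so that previously
        # pruned parameters can resurrect with their initialization values.
        with torch.no_grad():
            for p, m, w0 in zip(params, masks, init_vals):
                p.copy_(w0 * m)

        # Gradient of the loss of the pruned network w.r.t. all weights.
        loss = loss_fn(model(inputs), targets)
        grads = torch.autograd.grad(loss, params)

        # FORCE scores every weight (pruned ones included), which permits
        # resurrection; restricting scores to surviving weights instead would
        # approximate Iterative SNIP, where pruned weights stay pruned.
        scores = torch.cat(
            [(w0 * g).abs().flatten() for w0, g in zip(init_vals, grads)]
        )

        # Exponential schedule: density decays toward 1 - final_sparsity.
        keep = max(1, int(total * (1.0 - final_sparsity) ** (t / num_steps)))
        threshold = torch.topk(scores, keep).values.min()

        # Rebuild the per-tensor masks from the global top-k threshold.
        offset = 0
        for p, m in zip(params, masks):
            n = p.numel()
            m.copy_((scores[offset:offset + n] >= threshold).float().view_as(p))
            offset += n
    return masks
```

In practice the saliency would typically be averaged over several mini-batches rather than estimated from one; the point of the sketch is the structural difference between FORCE (scores computed over all weights, permitting resurrection) and Iterative SNIP (scores restricted to surviving weights).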
