Paper Title
Searching for Network Width with Bilaterally Coupled Network
Paper Authors
Paper Abstract
Searching for a more compact network width has recently emerged as an effective form of channel pruning for deploying convolutional neural networks (CNNs) under hardware constraints. To carry out the search, a one-shot supernet is usually leveraged to efficiently evaluate the performance \wrt~different network widths. However, current methods mainly follow a \textit{unilaterally augmented} (UA) principle for the evaluation of each width, which induces training unfairness across the channels of the supernet. In this paper, we introduce a new supernet called the Bilaterally Coupled Network (BCNet) to address this issue. In BCNet, each channel is trained fairly and is responsible for the same number of network widths, so each network width can be evaluated more accurately. Besides, we propose to reduce the redundant search space and present BCNetV2 as an enhanced supernet that ensures rigorous training fairness over channels. Furthermore, we leverage a stochastic complementary strategy for training the BCNet and propose a prior initial population sampling method to boost the performance of the evolutionary search. We also present Channel-Bench-Macro, the first open-source width benchmark on macro structures, for better comparison of width search algorithms. Extensive experiments on the benchmark CIFAR-10 and ImageNet datasets indicate that our method achieves state-of-the-art or competitive performance compared with other baseline methods. Moreover, our method proves able to further boost the performance of NAS models by refining their network widths. For example, with the same FLOPs budget, our obtained EfficientNet-B0 achieves 77.53\% Top-1 accuracy on ImageNet, surpassing the original setting by 0.65\%.
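To make the bilaterally coupled evaluation concrete: under the UA principle, a width k is evaluated with only the left-most k channels of each layer, so left channels appear in far more sub-networks than right ones, while BCNet pairs each width k with both the left-most and the right-most k channels. The snippet below is a minimal single-layer PyTorch sketch of this idea; the class name `BilaterallyCoupledConv`, the zero-masking scheme, and the plain averaging of the two branches are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class BilaterallyCoupledConv(nn.Module):
    """Toy single-layer sketch of BCNet-style width evaluation.

    UA evaluates width k with the left-most k channels only; here each
    width k is evaluated with BOTH the left-most and the right-most k
    channels, so every channel serves the same number of widths.
    """

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.out_ch = out_ch

    def forward(self, x: torch.Tensor, k: int) -> torch.Tensor:
        y = self.conv(x)
        left = y.clone()
        left[:, k:] = 0                    # keep the left-most k channels
        right = y.clone()
        right[:, : self.out_ch - k] = 0    # keep the right-most k channels
        # Couple the two complementary views of width k by averaging.
        return 0.5 * (left + right)

# Usage: evaluate width k = 8 of a 16-channel layer on dummy data.
layer = BilaterallyCoupledConv(3, 16)
out = layer(torch.randn(2, 3, 32, 32), k=8)
print(out.shape)  # torch.Size([2, 16, 32, 32])
```

During training, the stochastic complementary strategy described in the abstract would sample a width k for the left branch and the complementary width C - k (for a layer with C channels) for the right branch, so the two branches jointly cover all channels at every step; the sketch above only illustrates the evaluation side.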