Paper Title
Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks
Paper Authors
Paper Abstract
Neural architecture search has attracted wide attention in both academia and industry. To accelerate it, researchers have proposed weight-sharing methods that first train a super-network to reuse computation among different operators; from the super-network, exponentially many sub-networks can be sampled and efficiently evaluated. These methods enjoy great advantages in terms of computational cost, but the sampled sub-networks are not guaranteed to be estimated precisely unless an individual training process is performed. This paper attributes such inaccuracy to the inevitable mismatch between assembled network layers, so that a random error term is added to each estimation. We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks, so that the impact of random errors becomes minimal. With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates, which consequently leads to better performance of the final architecture. In addition, our approach also enjoys the flexibility of being used under different hardware constraints, since the graph convolutional network provides an efficient lookup table for the performance of architectures in the entire search space.
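To make the abstract's idea concrete, below is a minimal sketch (not the authors' implementation) of a GCN-based performance predictor: each sampled sub-network is encoded as a DAG with a dense adjacency matrix and one-hot operator features per node, and the GCN regresses the accuracy observed under the weight-sharing super-network. All names, layer sizes, and the training loop are illustrative assumptions rather than details taken from the paper.

```python
# Sketch of a GCN predictor for NAS architectures (assumed design, not the paper's code).
import torch
import torch.nn as nn


class GCNPredictor(nn.Module):
    def __init__(self, num_ops: int, hidden: int = 64, layers: int = 3):
        super().__init__()
        dims = [num_ops] + [hidden] * layers
        self.gcs = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(layers)
        )
        self.head = nn.Linear(hidden, 1)  # regress a scalar predicted accuracy

    def forward(self, adj: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # adj:   (B, N, N) adjacency of the architecture DAG
        # feats: (B, N, num_ops) one-hot operator encoding per node
        # Symmetrically normalize A + I (standard GCN propagation rule).
        n = adj.size(-1)
        a_hat = adj + torch.eye(n, device=adj.device)
        deg = a_hat.sum(-1).clamp(min=1).pow(-0.5)
        a_norm = deg.unsqueeze(-1) * a_hat * deg.unsqueeze(-2)
        x = feats
        for gc in self.gcs:
            x = torch.relu(gc(a_norm @ x))  # propagate along edges, then transform
        return self.head(x.mean(dim=1)).squeeze(-1)  # mean-pool nodes, predict accuracy


def fit(model, adjs, feats, accs, epochs: int = 100, lr: float = 1e-3):
    """Fit the predictor on (architecture, super-network accuracy) pairs.

    Once trained, the model serves as a cheap lookup over the whole search
    space, which is what enables ranking candidates under different hardware
    constraints without re-evaluating each sub-network.
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(adjs, feats), accs)
        loss.backward()
        opt.step()
    return model
```

Fitting a smooth predictor to many noisy super-network evaluations is what averages out the per-sample random error term described above; the predicted scores are then used only for ranking, which is why rank correlation is the relevant metric.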