Paper Title

DAS: Neural Architecture Search via Distinguishing Activation Score

Paper Authors

Yuqiao Liu, Haipeng Li, Yanan Sun, Shuaicheng Liu

Paper Abstract

Neural Architecture Search (NAS) is an automatic technique that searches for well-performing architectures for a specific task. Although NAS surpasses human-designed architectures in many fields, the high computational cost of architecture evaluation hinders its development. A feasible solution is to evaluate some metric directly on the architecture at initialization, without any training. The NAS-without-training (WOT) score is such a metric; it estimates the final trained accuracy of an architecture through its ability to distinguish different inputs in the activation layers. However, the WOT score is not an atomic metric, meaning that it does not represent a fundamental indicator of the architecture. The contributions of this paper are threefold. First, we decouple WOT into two atomic metrics, which represent the distinguishing ability of the network and the number of activation units, and explore a better combination rule named the Distinguishing Activation Score (DAS). We prove the correctness of the decoupling theoretically and confirm the effectiveness of the rule experimentally. Second, to improve the prediction accuracy of DAS so that it meets practical search requirements, we propose a fast training strategy. When DAS is combined with the fast training strategy, it yields further improvements. Third, we propose a dataset called Darts-training-bench (DTB), which fills the gap that existing datasets contain no training states of architectures. Our proposed method achieves 1.04$\times$-1.56$\times$ improvements on NAS-Bench-101, Network Design Spaces, and the proposed DTB.
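For reference, below is a minimal PyTorch sketch of the WOT score that the paper starts from (the NAS-without-training metric of Mellor et al.): binary ReLU activation codes are collected for a mini-batch, a Hamming-distance kernel is formed, and the score is the log-determinant of that kernel. The binary codes `c` and the unit count correspond roughly to the two atomic quantities the paper decouples. The function name `wot_score`, the toy network, and the batch size are illustrative assumptions; the DAS combination rule itself is defined in the paper and is not reproduced here.

```python
import torch
import torch.nn as nn


def wot_score(model: nn.Module, inputs: torch.Tensor) -> float:
    """Sketch of the WOT score for one mini-batch of N inputs.

    Collects the binary pattern of active ReLU units per input,
    forms the kernel K_H[i, j] = N_A - Hamming(c_i, c_j) (with N_A
    the total number of activation units), and returns log|det K_H|.
    """
    codes = []  # per-layer binary activation codes, one row per input

    def hook(_module, _inp, out):
        # 1 where the unit fires (output > 0), 0 otherwise.
        codes.append((out.detach() > 0).flatten(1).float())

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(inputs)
    for h in handles:
        h.remove()

    c = torch.cat(codes, dim=1)  # shape (N, N_A): codes over all units
    # K_H = N_A - Hamming distance = (# co-active) + (# co-inactive).
    k = c @ c.t() + (1.0 - c) @ (1.0 - c).t()
    _sign, logdet = torch.slogdet(k)
    return logdet.item()


# Toy usage on a random (untrained) network and a random mini-batch.
net = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(),
    nn.Conv2d(16, 32, 3), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
x = torch.randn(32, 3, 32, 32)
print(wot_score(net, x))
```

A higher score indicates that the untrained network maps the inputs of the batch to more distinguishable activation patterns, which is the property the paper's DAS metric refines by treating distinguishing ability and the number of activation units separately.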
