Paper Title
Hyperparameter Optimization with Neural Network Pruning
Paper Authors
Paper Abstract
Since deep learning models are highly dependent on hyperparameters, hyperparameter optimization is essential in developing deep learning model-based applications, even though it takes a long time. As service development using deep learning models has become increasingly competitive, developers strongly demand rapid hyperparameter optimization algorithms. To keep pace with this need, researchers are focusing on improving the speed of hyperparameter optimization algorithms. However, the huge time consumption of hyperparameter optimization caused by the high computational cost of the deep learning model itself has not been dealt with in depth. Just as a surrogate model is used in Bayesian optimization, to solve this problem it is necessary to consider a proxy model for the neural network (N_B) to be used for hyperparameter optimization. Inspired by the main goals of neural network pruning, i.e., large computational cost reduction and performance preservation, we presumed that the neural network (N_P) obtained through neural network pruning would be a good proxy model of N_B. To verify our idea, we performed extensive experiments using the CIFAR10, CIFAR100, and TinyImageNet datasets, three commonly used neural networks, and three representative hyperparameter optimization methods. Through these experiments, we verified that N_P can be a good proxy model of N_B for rapid hyperparameter optimization. The proposed hyperparameter optimization framework can reduce the optimization time by up to 37%.
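To make the described workflow concrete, a minimal sketch is given below, assuming PyTorch. The helper names (get_pruned_proxy, random_search, evaluate_on_np), the choice of global L1 magnitude pruning, and the use of random search are illustrative assumptions, not the authors' actual implementation; any hyperparameter optimization method could be substituted for the search loop.

# Minimal sketch of the pruning-as-proxy HPO workflow described in the abstract.
# NOTE: illustrative outline only; PyTorch's pruning utilities mask weights
# (they do not by themselves reduce FLOPs), so this shows the control flow,
# not the speedup mechanism.
import random
import torch.nn as nn
import torch.nn.utils.prune as prune

def get_pruned_proxy(model_nb: nn.Module, amount: float = 0.5) -> nn.Module:
    """Prune N_B with global L1 magnitude pruning to obtain the proxy N_P."""
    params = [(m, "weight") for m in model_nb.modules()
              if isinstance(m, (nn.Conv2d, nn.Linear))]
    prune.global_unstructured(params, pruning_method=prune.L1Unstructured, amount=amount)
    return model_nb

def random_search(evaluate_on_np, search_space, n_trials=20):
    """Random search stands in for any HPO method (e.g. Bayesian optimization).
    `evaluate_on_np(config)` is a user-supplied function that trains the proxy
    N_P with `config` and returns a validation score."""
    best_score, best_config = float("-inf"), None
    for _ in range(n_trials):
        config = {name: random.choice(choices) for name, choices in search_space.items()}
        score = evaluate_on_np(config)   # cheap: evaluated on the pruned proxy N_P
        if score > best_score:
            best_score, best_config = score, config
    return best_config

# The best configuration found on N_P is then used to train the original
# (unpruned) network N_B, which remains the final model.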