Paper Title

DC and SA: Robust and Efficient Hyperparameter Optimization of Multi-subnetwork Deep Learning Models

Paper Authors

Alex H. Treacher, Albert Montillo

Paper Abstract

We present two novel hyperparameter optimization strategies for optimizing deep learning models with a modular architecture constructed from multiple subnetworks. As complex networks with multiple subnetworks are applied in machine learning more frequently, hyperparameter optimization methods are required to optimize their hyperparameters efficiently. Existing hyperparameter searches are general and can be used to optimize such networks; however, by exploiting the multi-subnetwork architecture, these searches can be sped up substantially. The proposed methods offer faster convergence to a better-performing final model. To demonstrate this, we propose two independent approaches to enhance these prior algorithms: 1) a divide-and-conquer (DC) approach, in which the best subnetworks of top-performing models are combined, allowing for more rapid sampling of the hyperparameter search space; and 2) a subnetwork-adaptive (SA) approach that distributes computational resources based on the importance of each subnetwork, allowing more intelligent resource allocation. These approaches can be flexibly applied to many hyperparameter optimization algorithms. To illustrate this, we combine our approaches with the commonly used Bayesian optimization (BO) method. Our approaches are then tested on both synthetic and real-world examples and applied to multiple network types, including convolutional neural networks and dense feed-forward neural networks. Compared to a comparable BO approach, our approaches show an optimization efficiency increase of up to 23.62x and a final performance improvement of up to 3.5% accuracy for classification and 4.4 MSE for regression.
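To make the two strategies concrete, below is a minimal, hypothetical Python sketch of how a DC recombination step and an SA budget-allocation step could look; it is not the authors' implementation. The search space, the per-subnetwork validation scores, and the variance-based importance estimate are all illustrative assumptions, and evaluate() is a random stand-in for actual model training.

```python
import random
import statistics

# Hypothetical search space: each subnetwork owns its own hyperparameter block.
SEARCH_SPACE = {
    "subnet_a": {"lr": [1e-4, 1e-3, 1e-2], "width": [32, 64, 128]},
    "subnet_b": {"lr": [1e-4, 1e-3, 1e-2], "depth": [2, 4, 8]},
}

def sample_config():
    """Randomly sample one hyperparameter setting per subnetwork."""
    return {net: {hp: random.choice(vals) for hp, vals in hps.items()}
            for net, hps in SEARCH_SPACE.items()}

def evaluate(config):
    """Stand-in for model training; returns assumed per-subnetwork validation
    scores plus an overall score. Replace with real training in practice."""
    per_subnet = {net: random.random() for net in config}
    return per_subnet, sum(per_subnet.values()) / len(per_subnet)

def divide_and_conquer(evaluated, top_k=3):
    """DC step: among the top-k models, recombine, for every subnetwork, the
    hyperparameters of whichever model scored best on that subnetwork."""
    top = sorted(evaluated, key=lambda e: e["overall"], reverse=True)[:top_k]
    return {net: max(top, key=lambda e: e["per_subnet"][net])["config"][net]
            for net in SEARCH_SPACE}

def adaptive_budget(evaluated, total_trials=100):
    """SA step: split the remaining trial budget across subnetworks in
    proportion to an (assumed) importance estimate, here the variance of
    each subnetwork's scores across the evaluated models."""
    importance = {net: statistics.pvariance([e["per_subnet"][net]
                                             for e in evaluated])
                  for net in SEARCH_SPACE}
    z = sum(importance.values()) or 1.0
    return {net: round(total_trials * imp / z)
            for net, imp in importance.items()}

# Tiny driver: sample, evaluate, then apply both strategies.
evaluated = []
for _ in range(10):
    cfg = sample_config()
    per_subnet, overall = evaluate(cfg)
    evaluated.append({"config": cfg, "per_subnet": per_subnet,
                      "overall": overall})

print("DC recombined config:", divide_and_conquer(evaluated))
print("SA trial allocation:", adaptive_budget(evaluated))
```

In the paper's setting these strategies wrap an existing optimizer such as Bayesian optimization; the random sampling above merely keeps the sketch self-contained.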
