Paper title
Using a thousand optimization tasks to learn hyperparameter search strategies
Paper authors
Paper abstract
We present TaskSet, a dataset of tasks for use in training and evaluating optimizers. TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional neural networks, to variational autoencoders, to non-volume preserving flows on a variety of datasets. As an example application of such a dataset, we explore meta-learning an ordered list of hyperparameters to try sequentially. By learning this hyperparameter list from data generated using TaskSet, we achieve large speedups in sample efficiency over random search. Next, we use the diversity of TaskSet and our method for learning hyperparameter lists to empirically explore how these lists generalize to new optimization tasks in a variety of settings, including ImageNet classification with Resnet50 and LM1B language modeling with transformers. As part of this work we have open-sourced code for all tasks, as well as ~29 million training curves for these problems and the corresponding hyperparameters.
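
The core idea in the abstract, replacing random search with a learned, ordered list of hyperparameter configurations that is tried sequentially on a new task, can be illustrated with a minimal sketch. This is not the released TaskSet API; apply_hparam_list, train_and_evaluate, and the toy usage below are hypothetical names introduced only for illustration.

# Minimal sketch (assumed names, not the TaskSet codebase): try configurations
# from a learned, ordered hyperparameter list and keep the best one found.

def apply_hparam_list(hparam_list, train_and_evaluate, budget):
    """Try hyperparameter configurations in their learned order.

    hparam_list: configurations sorted by expected usefulness,
        e.g. [{"learning_rate": 1e-3, "beta1": 0.9}, ...].
    train_and_evaluate: callable mapping a configuration to a validation loss.
    budget: maximum number of configurations to try.
    """
    best_config, best_loss = None, float("inf")
    for config in hparam_list[:budget]:
        loss = train_and_evaluate(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


if __name__ == "__main__":
    # Toy usage: a quadratic "loss" over the learning rate stands in for
    # training a real model on a new task.
    candidates = [{"learning_rate": lr} for lr in (1e-3, 3e-4, 1e-2, 1e-1)]
    toy_loss = lambda cfg: (cfg["learning_rate"] - 3e-4) ** 2
    print(apply_hparam_list(candidates, toy_loss, budget=4))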