Paper Title


HAMLET -- A Learning Curve-Enabled Multi-Armed Bandit for Algorithm Selection

Authors

Schmidt, Mischa, Gastinger, Julia, Nicolas, Sébastien, Schülke, Anett

Abstract


Automated algorithm selection and hyperparameter tuning facilitate the application of machine learning. Traditional multi-armed bandit strategies look to the history of observed rewards to identify the most promising arms for optimizing expected total reward in the long run. When considering limited time budgets and computational resources, this backward view of rewards is inappropriate, as the bandit should look into the future to anticipate the highest final reward at the end of a specified time budget. This work addresses that insight by introducing HAMLET, which extends the bandit approach with learning curve extrapolation and computation time-awareness for selecting among a set of machine learning algorithms. Results show that HAMLET Variants 1-3 exhibit equal or better performance than other bandit-based algorithm selection strategies in experiments with recorded hyperparameter tuning traces for the majority of considered time budgets. The best-performing HAMLET Variant 3 combines learning curve extrapolation with the well-known upper confidence bound exploration bonus. That variant performs better than all non-HAMLET policies with statistical significance at the 95% level over 1,485 runs.
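The core idea of Variant 3 described above (scoring each arm by its extrapolated final reward plus a UCB exploration bonus, rather than by average past reward) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the saturating curve model `acc(t) = a - b/t`, the least-squares fit, and the exploration constant `c` are all illustrative assumptions.

```python
import math

def extrapolate(history, t_target):
    """Predict accuracy at time t_target from (time, accuracy) pairs,
    fitting a simple saturating model acc(t) = a - b/t by least squares.
    The model is a stand-in, not the paper's exact learning-curve family."""
    if len(history) < 2:
        return history[-1][1] if history else 0.0
    xs = [1.0 / t for t, _ in history]       # regress accuracy on 1/t
    ys = [acc for _, acc in history]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    denom = sum((x - mx) ** 2 for x in xs) or 1e-12
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / denom
    a = my - b * mx
    return a + b / t_target                  # predicted accuracy at t_target

def select_arm(histories, remaining_budget, c=0.1):
    """Pick the algorithm (arm) maximizing extrapolated accuracy at the end
    of the remaining time budget plus a UCB-style exploration bonus."""
    total_pulls = sum(len(h) for h in histories.values()) or 1
    best_arm, best_score = None, -math.inf
    for arm, hist in histories.items():
        pulls = max(len(hist), 1)
        t_now = hist[-1][0] if hist else 0.0
        # Forward-looking value: where would this arm's curve be if it
        # received all of the remaining budget?
        predicted = extrapolate(hist, t_now + remaining_budget)
        bonus = c * math.sqrt(math.log(total_pulls + 1) / pulls)
        if predicted + bonus > best_score:
            best_arm, best_score = arm, predicted + bonus
    return best_arm
```

A purely reward-history bandit would favor the arm with the best accuracy so far; the forward-looking score instead favors the arm whose curve is still climbing toward the higher plateau.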
