Zero-shot Transfer Learning for Gray-box Hyper-parameter Optimization

user-5ed732bc4c775e09d87b4c18 (2021)

Abstract
Zero-shot hyper-parameter optimization refers to the process of selecting hyper-parameter configurations that are expected to perform well for a given dataset upfront, without access to any observations of the losses of the target response. Existing zero-shot approaches are posed as initialization strategies for Bayesian Optimization and often rely on engineered meta-features to measure dataset similarity, operating under the assumption that the responses of similar datasets behave similarly with respect to the same hyper-parameters. Solutions for zero-shot HPO are embarrassingly parallelizable and can thus vastly reduce the wall-clock time required to learn a single model. We propose a very simple HPO model called Gray-box Zero(0)-Shot Initialization (GROSI): a conditional parametric surrogate that learns a universal response model by directly exploiting the relationship between the hyper-parameters and the dataset meta-features. In contrast to existing HPO solutions, we achieve transfer of knowledge without engineered meta-features, but rather through a shared model that is trained simultaneously across all datasets. We design and optimize a novel loss function that allows us to regress from the dataset/hyper-parameter pair onto the response. Experiments on 120 datasets demonstrate the strong performance of GROSI compared to conventional initialization strategies. We also show that by fine-tuning GROSI to the target dataset, we can outperform state-of-the-art sequential HPO algorithms.
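The abstract does not specify the model details, but the core idea (a single surrogate, shared across datasets, that regresses from a dataset/hyper-parameter pair onto the response) can be sketched as follows. This is a minimal illustration only: the MLP architecture, the MSE loss (the paper uses its own novel loss), and the dimensions META_DIM and HP_DIM are placeholder assumptions, and the dataset representation is stubbed as a fixed feature vector even though the paper emphasizes that its meta-features need not be hand-engineered.

```python
# Minimal sketch of a conditional parametric surrogate (NOT the paper's exact
# model): an MLP maps (dataset representation, hyper-parameter configuration)
# pairs to a predicted response, trained jointly over data pooled from all
# training datasets.
import torch
import torch.nn as nn

META_DIM, HP_DIM = 8, 4  # hypothetical dimensionalities


class Surrogate(nn.Module):
    def __init__(self, meta_dim: int, hp_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(meta_dim + hp_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, meta: torch.Tensor, hp: torch.Tensor) -> torch.Tensor:
        # Condition on the dataset by concatenating its representation
        # with the hyper-parameter vector.
        return self.net(torch.cat([meta, hp], dim=-1)).squeeze(-1)


model = Surrogate(META_DIM, HP_DIM)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy training loop on synthetic (representation, hyper-parameters, response)
# triples; in the paper, the triples come from evaluated configurations across
# all training datasets, and the loss function differs.
meta = torch.randn(256, META_DIM)
hp = torch.rand(256, HP_DIM)
response = torch.rand(256)  # e.g. validation loss of each configuration
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(meta, hp), response)
    loss.backward()
    opt.step()

# Zero-shot use: rank candidate configurations for an unseen dataset by
# predicted response (ascending, since lower loss is better), without any
# observed losses from that dataset.
cand_hp = torch.rand(32, HP_DIM)
new_meta = torch.randn(1, META_DIM).expand(32, -1)
ranking = torch.argsort(model(new_meta, cand_hp))
```

The fine-tuning variant mentioned at the end of the abstract would then amount to continuing the same optimization loop on the few response observations gathered from the target dataset.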
Keywords
Initialization, Bayesian optimization, Parametric statistics, Hyperparameter, Gray box testing, Algorithm, Transfer of learning, Parallelizable manifold, Computer science, Response model