
Rethinking the Role of Hyperparameter Tuning in Optimizer Benchmarking

semanticscholar(2021)

Abstract
Many optimizers have been proposed for training deep neural networks, and they often have multiple hyperparameters, which makes it tricky to benchmark their performance. In this work, we propose a new benchmarking protocol to evaluate both end-to-end efficiency (training a model from scratch without knowing the best hyperparameter configuration) and data-addition training efficiency (the previously selected hyperparameters are used for periodically re-training the model with newly collected data). For end-to-end efficiency, unlike previous work that assumes random hyperparameter tuning, which may over-emphasize the tuning time, we propose to evaluate with a bandit hyperparameter tuning strategy. For data-addition training, we design a new protocol for assessing the hyperparameter sensitivity to data shift. We then apply the proposed benchmarking framework to 7 optimizers on various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining. Our results show that there is no clear winner across all the tasks.
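The abstract's bandit hyperparameter tuning strategy (in contrast to random search) can be illustrated with a successive-halving loop: train every candidate configuration briefly, discard the worst performers, and reinvest the budget in the survivors. The sketch below is a minimal illustration of this idea, not the paper's exact protocol; the `mock_eval` loss function and the specific budget/`eta` values are assumptions for demonstration.

```python
def successive_halving(configs, evaluate, max_budget=81, eta=3):
    """Bandit-style hyperparameter tuning (successive halving):
    evaluate all configs at a small budget, keep the best 1/eta
    fraction, grow the per-config budget, and repeat until one
    configuration survives."""
    budget = 1
    while len(configs) > 1:
        # Score every surviving config at the current training budget.
        scores = sorted((evaluate(c, budget), c) for c in configs)
        # Keep the top 1/eta fraction (at least one config).
        configs = [c for _, c in scores[: max(1, len(configs) // eta)]]
        budget = min(budget * eta, max_budget)
    return configs[0]

# Toy example: configs are learning rates; the mock validation loss
# improves with budget and is minimized at lr = 0.01 (illustrative only).
def mock_eval(lr, budget):
    return abs(lr - 0.01) + 1.0 / budget

best = successive_halving([0.1, 0.03, 0.01, 0.003, 0.001], mock_eval)
```

Compared with pure random search, this spends far less total training time on clearly bad configurations, which is why the paper argues random tuning can over-emphasize tuning cost in optimizer comparisons.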