
Differential Description Length for Hyperparameter Selection in Supervised Learning

ISITA (2020)

Abstract
Minimum description length (MDL) is an established method for model selection. For supervised learning problems, cross-validation is often used instead in practice, because 1) MDL is difficult to apply directly to data; 2) MDL may make restrictive statistical assumptions that hurt performance; and 3) MDL does not directly aim to minimize generalization error. In this paper, we introduce a modification of MDL, which we call differential description length (DDL). DDL partitions the data so that the codelength it computes reflects the conditional probability of seeing ‘new’ data given ‘old’ data. This differential codelength is what allows DDL to estimate generalization error, as cross-validation does. DDL also improves on cross-validation by letting the learning algorithm use the entire dataset, without withholding subsets for validation and testing. Compared with MDL, DDL both finds models with smaller generalization error and is easier to compute. Experiments with linear regression and deep neural networks show that DDL also outperforms cross-validation.
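The core idea in the abstract — code each ‘new’ block of data with a model fit on the ‘old’ prefix, and use the summed codelength to rank hyperparameters — can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the paper's algorithm: it assumes ridge regression with Gaussian residuals, an equal-size block partition, and a plug-in noise variance estimated from the old data.

```python
import numpy as np

def ddl_score(X, y, alpha, n_blocks=5, start_block=1):
    """Hypothetical DDL-style score: sum, over data blocks, the Gaussian
    codelength (negative log-likelihood) of each 'new' block under a ridge
    model fit only on the preceding 'old' data."""
    n = len(y)
    edges = np.linspace(0, n, n_blocks + 1).astype(int)
    total = 0.0
    for b in range(start_block, n_blocks):
        Xo, yo = X[:edges[b]], y[:edges[b]]                      # old data
        Xn, yn = X[edges[b]:edges[b + 1]], y[edges[b]:edges[b + 1]]  # new block
        d = X.shape[1]
        # ridge fit on the old prefix only
        w = np.linalg.solve(Xo.T @ Xo + alpha * np.eye(d), Xo.T @ yo)
        sigma2 = np.mean((yo - Xo @ w) ** 2) + 1e-12             # plug-in noise variance
        resid = yn - Xn @ w
        # codelength of the new block given the old-data model (nats)
        total += 0.5 * np.sum(resid ** 2 / sigma2 + np.log(2 * np.pi * sigma2))
    return total

# select the ridge strength with the smallest differential codelength
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)
alphas = [0.01, 0.1, 1.0, 10.0]
best = min(alphas, key=lambda a: ddl_score(X, y, a))
```

Note that, unlike cross-validation, every data point eventually contributes to both fitting (as part of an old prefix) and scoring (as part of a new block), which mirrors the abstract's claim that no subset must be permanently withheld.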
Key words
hyperparameter selection, learning, description length