# Model Selection

## Definitions

• Model selection is the task of choosing, from a set of potential models, the model with the best inductive bias; in practice this means selecting parameters so as to obtain a model of optimal complexity given (finite) training data.
Sewell (2006)

• "Thus learning is not possible without inductive bias, and now the question is how to choose the right bias. This is called model selection."
Alpaydin (2004) p33

• "Model selection is the task of choosing a model of optimal complexity for the given (finite) data."
Cherkassky and Mulier (1998) p73

• "Model selection: estimating the performance of different models in order to choose the (approximate) best one."
Hastie, Tibshirani and Friedman (2001)

• "The term model selection (see e.g. [Forster, 2000]) refers to the problem of selecting good learning parameters from a small set of choices based on training data."
Joachims (2002) p105

• "...it is important to find the optimal model complexity h to avoid overfitting. This process is called model selection and can be done using different criteria. In model selection quantities like the kernel width for radial basis functions, the number of neurons in a neural network or regularization parameters are chosen."
Rychetsky

• "Model selection is the task of selecting a mathematical model from a set of potential models, given evidence."
Wikipedia (2006)

## Bibliography

• "Model selection (variable selection in regression is a special case) is a bias versus variance trade-off and this is the statistical principle of parsimony. Inference under models with too few parameters (variables) can be biased, while with models having too many parameters (variables) there may be poor precision or identification of effects that are, in fact, spurious. These considerations call for a balance between under- and over-fitted models -- the so-called 'model selection problem' (see Forster 2000, 2001)."
Burnham and Anderson (2004)

## Empirical

### Cross-validation (Stone (1974), Geisser (1975))

• Generalized cross-validation (GCV) (Craven and Wahba, 1979)
• k-fold cross-validation
• leave-one-out cross-validation
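
As an illustration of the empirical approach, the following is a minimal sketch of k-fold cross-validation used to choose model complexity. It assumes NumPy is available, and the polynomial-regression setting, the synthetic data, and the helper names `k_fold_cv_mse` and `select_degree` are all hypothetical choices made for this example, not taken from any of the cited works. Setting `k` equal to the number of data points gives leave-one-out cross-validation.

```python
import numpy as np


def k_fold_cv_mse(x, y, degree, k=5, seed=0):
    """Average validation MSE of a polynomial fit of the given degree over k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))      # shuffle once, then split into k folds
    folds = np.array_split(indices, k)
    errors = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the k-1 training folds only.
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        # Score on the held-out fold.
        pred = np.polyval(coeffs, x[val_idx])
        errors.append(np.mean((pred - y[val_idx]) ** 2))
    return float(np.mean(errors))


def select_degree(x, y, degrees, k=5, seed=0):
    """Return the degree with the lowest cross-validated MSE, plus all scores."""
    scores = {d: k_fold_cv_mse(x, y, d, k=k, seed=seed) for d in degrees}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic example (assumed data): a noisy quadratic.
rng = np.random.default_rng(42)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.1, size=x.shape)

best, scores = select_degree(x, y, degrees=range(0, 7), k=5)
print("selected degree:", best)

# Leave-one-out cross-validation is the special case k = n.
loo_mse = k_fold_cv_mse(x, y, degree=2, k=len(x))
print("leave-one-out MSE for degree 2:", loo_mse)
```

Every candidate degree is scored on the same folds (same `seed`), so the comparison between models is not affected by differences in how the data were split.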

## Theoretical

### Akaike information criterion (AIC)

• AIC (Akaike, 1974)
• AICc (Hurvich and Tsai, 1989)
• QAIC
• QAICc
• AICW (Wilks, 1995)
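
For orientation, the first four criteria above can be computed from a fitted model's maximised log-likelihood as in the sketch below. The formulas used are the commonly stated ones: AIC = 2k - 2 ln L, AICc = AIC + 2k(k+1)/(n-k-1), and the quasi-likelihood variants obtained by dividing the log-likelihood term by a variance inflation factor. The function and parameter names (`log_likelihood`, `k`, `n`, `c_hat`) are assumptions made for this example. Conventions differ on whether `k` should count the estimated `c_hat` as an extra parameter; this sketch leaves that to the caller.

```python
def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def small_sample_correction(k, n):
    """Second-order correction term 2k(k+1)/(n-k-1) used by AICc and QAICc."""
    if n - k - 1 <= 0:
        raise ValueError("correction requires n > k + 1")
    return 2 * k * (k + 1) / (n - k - 1)


def aicc(log_likelihood, k, n):
    """AIC with the small-sample correction for n observations."""
    return aic(log_likelihood, k) + small_sample_correction(k, n)


def qaic(log_likelihood, k, c_hat):
    """Quasi-likelihood AIC: the log-likelihood term is scaled by 1 / c_hat."""
    return 2 * k - 2 * log_likelihood / c_hat


def qaicc(log_likelihood, k, n, c_hat):
    """QAIC with the small-sample correction."""
    return qaic(log_likelihood, k, c_hat) + small_sample_correction(k, n)


# Hypothetical numbers: log-likelihood -100, 3 parameters, 20 observations.
print(aic(-100.0, 3))
print(aicc(-100.0, 3, 20))
print(qaic(-100.0, 3, 2.0))
print(qaicc(-100.0, 3, 20, 2.0))
```

Among candidate models fitted to the same data, the one with the smallest criterion value is preferred; only differences between values are meaningful, not the values themselves.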