Generalization

Generalization: the ability to apply what was learned during training to new (test) data

Reasons for poor generalization

  • Overfitting/Overtraining (trained too long)
  • Too little training data
  • Too many parameters (weights) or an inappropriate network architecture

Prevent Overfitting

  • The obvious best approach: collect more data! 💪
  • If data is limited:
    • Simplest method for best generalization: early stopping (see the sketch below)
    • Optimize parameters/architecture
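
A minimal sketch of early stopping: train while monitoring error on a held-out validation set and roll back to the best checkpoint once it stops improving. The `train_epoch` and `val_loss` methods are hypothetical placeholders for whatever training/evaluation routines the model actually provides:

```python
import copy

def fit_with_early_stopping(model, train_data, val_data, max_epochs=200, patience=10):
    """Train until the validation error stops improving."""
    best_loss, best_weights, wait = float("inf"), None, 0
    for epoch in range(max_epochs):
        model.train_epoch(train_data)        # one pass of weight updates
        loss = model.val_loss(val_data)      # monitor held-out data, not training error
        if loss < best_loss:                 # improvement: remember this checkpoint
            best_loss = loss
            best_weights = copy.deepcopy(model.weights)
            wait = 0
        else:
            wait += 1                        # no improvement this epoch
            if wait >= patience:             # stop after `patience` bad epochs in a row
                break
    model.weights = best_weights             # roll back to the best weights seen
    return model
```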

Destructive Methods

Reduce the complexity of the network through regularization, e.g. by penalizing large weights (see the sketch below)
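
As one concrete example, L2 regularization adds a penalty on the squared weights to the data error, pushing unneeded weights toward zero. A minimal sketch; `lam` is an illustrative hyperparameter, not a value from the notes:

```python
import numpy as np

def l2_penalized_error(data_error, weights, lam=1e-4):
    """Total error = data error + lam * sum of squared weights.

    The penalty shrinks the effective complexity of the network.
    """
    return data_error + lam * sum(np.sum(w ** 2) for w in weights)

# In gradient descent the penalty appears as "weight decay":
#     w <- w - lr * (dE_data/dw + 2 * lam * w)
```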

Optimal Brain Damage

  • 💡 Idea: certain connections are removed from the network to reduce complexity and to avoid overfitting
  • Remove those connections that have the least effect on the error (MSE, …), i.e. those that are least important
    • But determining this is time-consuming (difficult) 🤪; a simplified sketch follows below
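
A deliberately simplified sketch of the pruning step. OBD proper ranks connections by the saliency s_k ≈ 0.5 · H_kk · w_k², estimated from the diagonal of the Hessian of the error; plain weight magnitude is used below as a cheap stand-in, which is an assumption of this sketch rather than the method from the notes:

```python
import numpy as np

def prune_least_salient(weights, frac=0.2):
    """Remove the `frac` of connections judged least important.

    Weight magnitude is used here as a crude proxy for OBD's
    Hessian-based saliency estimate.
    """
    flat = np.concatenate([w.ravel() for w in weights])
    cutoff = np.quantile(np.abs(flat), frac)          # magnitude threshold
    # Zero out the weakest connections; the pruned net is then retrained.
    return [np.where(np.abs(w) > cutoff, w, 0.0) for w in weights]
```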

Constructive Methods

Iteratively growing a network (constructive), starting from a very small one

Cascade Correlation

  • Adding a hidden unit

    • Input connections from all input units and from all already existing hidden units
    • First only these connections are adapted
    • Maximize the correlation between the activation of the candidate units and the residual error of the net (see the formula after this list)
  • Not necessary to determine the number of hidden units empirically

  • Can produce deep networks without dramatic slowdown (bottom up, constructive learning)

  • At each point only one layer of connections is trained

  • Learning is fast

  • Learning is incremental
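
For reference, the quantity each candidate unit is trained to maximize in Fahlman and Lebiere's formulation is, as best I recall from the original paper, the summed magnitude of the covariance between the candidate's activation and the residual output error:

$$
S = \sum_{o} \left| \sum_{p} \left( V_p - \bar{V} \right) \left( E_{p,o} - \bar{E}_o \right) \right|
$$

Here $p$ runs over training patterns, $o$ over output units, $V_p$ is the candidate's activation, $E_{p,o}$ the residual error at output $o$, and bars denote averages over patterns. The candidate's input weights are adjusted by gradient ascent on $S$ and frozen once the unit is installed.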

Dropout

  • A popular and very effective method for improving generalization

  • 💡 Idea

    • Randomly drop out (zero) hidden units and input features during training
    • Prevents feature co-adaptation
  • Dropout at training vs. test time

    • Training: each unit is kept with probability p and zeroed otherwise
    • Test: no units are dropped; activations or weights are rescaled so the expected input to each unit matches training (see the sketch below)
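
A minimal sketch of inverted dropout on a single layer's activations. With the inverted variant the rescaling happens during training, so nothing needs to change at test time; the original paper instead kept all units and scaled the weights at test time:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p_drop=0.5, training=True):
    """Inverted dropout on a layer's activations `h` (a NumPy array)."""
    if not training:
        return h                               # test time: no units are dropped
    mask = rng.random(h.shape) >= p_drop       # keep each unit with prob 1 - p_drop
    return h * mask / (1.0 - p_drop)           # rescale survivors to keep the
                                               # expected activation unchanged
```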