Weight Initialisation
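The values given to weights before training begins. Bad initialisation (e.g. all zeros) makes every neuron in a layer compute the same output and receive the same gradient update, so they all learn the same thing and the network gains no diversity. The 1986 paper used small random weights to break this symmetry. Modern initialisation schemes (Xavier, He) scale the random weights according to the number of connections into and out of each layer, so that gradients neither vanish nor explode at the start of training.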
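As a rough illustration, the sketch below contrasts all-zero weights with the two schemes named above, using only the Python standard library. The function names (xavier_uniform, he_normal, layer_outputs), the layer sizes and the input vector are hypothetical choices for this example. The formulas are the commonly quoted ones: Xavier uniform draws from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), and He normal draws from a zero-mean Gaussian with standard deviation sqrt(2 / fan_in).

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier (Glorot) uniform: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    # One row per output neuron, one column per input connection.
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


def he_normal(fan_in, fan_out, rng):
    """He normal: N(0, std^2) with std = sqrt(2 / fan_in), aimed at ReLU layers."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]


def layer_outputs(weights, inputs):
    """Pre-activation outputs of one dense layer (bias omitted for brevity)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
inputs = [0.5, -1.0, 2.0, 0.25]  # arbitrary example input

# All-zero weights: every neuron produces the same output (0.0).
zero_weights = [[0.0] * 4 for _ in range(3)]
zero_out = layer_outputs(zero_weights, inputs)

# Xavier weights: each neuron starts from different weights, so outputs differ.
xavier_weights = xavier_uniform(4, 3, rng)
xavier_out = layer_outputs(xavier_weights, inputs)

# He weights for a wider layer; the spread shrinks as fan_in grows.
he_weights = he_normal(256, 128, rng)

print("zero init outputs:  ", zero_out)
print("xavier init outputs:", xavier_out)
```

With zero weights the three neurons are indistinguishable, whereas the randomly initialised neurons start from different points and can go on to learn different features. Deep learning libraries ship their own versions of these schemes, so a hand-written version like this is mainly useful for seeing what the scaling does.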