Tag: glorot initialization
-
Xavier Initialization
Training deep networks requires careful initialization. If weights are too small, activations and gradients can shrink toward zero; if too large, they can grow out of control. Also known as Glorot initialization, Xavier initialization was proposed by Xavier Glorot and Yoshua Bengio in 2010. It is designed to combat both vanishing and exploding gradients by keeping the variance of activations and gradients roughly constant from layer to layer.
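As a minimal sketch of the idea, the uniform variant below draws weights from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), which gives the weights a variance of 2 / (fan_in + fan_out). The function name and layer sizes here are illustrative, not from the original paper.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=None):
    """Sample a (fan_in, fan_out) weight matrix with Xavier/Glorot uniform init.

    Weights are drawn from U(-limit, limit) with
    limit = sqrt(6 / (fan_in + fan_out)), so the weight variance is
    2 / (fan_in + fan_out), balancing forward activations and
    backward gradients.
    """
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Illustrative usage: a 256 -> 128 fully connected layer.
W = xavier_uniform(256, 128)
print(W.std())  # roughly sqrt(2 / (256 + 128)) ≈ 0.072
```

A normal-distribution variant is also common: instead of the uniform range, weights are sampled from N(0, 2 / (fan_in + fan_out)), which targets the same variance.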