He Initialization
Also known as Kaiming Initialization, this technique was developed for neural networks that use the ReLU activation function. ReLU is a non-linear function that clips all negative inputs to zero. This gives the activations a non-zero, positive mean and reduces the signal’s overall magnitude, since roughly half of it is zeroed out. When we apply Xavier initialization to deep networks using ReLU,…
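A minimal NumPy sketch of the effect described above (the helper name, depth, and layer width are illustrative assumptions, not from the post): pushing a batch through a deep stack of ReLU layers, the signal collapses toward zero under Xavier-style scaling but holds its scale under He’s extra factor of 2.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_stack_std(depth, width, std_fn):
    """Push a random batch through `depth` ReLU layers whose weights are
    drawn with standard deviation std_fn(fan_in); return the output std."""
    x = rng.normal(size=(512, width))
    for _ in range(depth):
        W = rng.normal(0.0, std_fn(width), size=(width, width))
        x = np.maximum(0.0, x @ W)  # ReLU clips negative pre-activations to zero
    return x.std()

xavier = lambda fan_in: np.sqrt(1.0 / fan_in)  # Glorot scaling (fan-in form)
he = lambda fan_in: np.sqrt(2.0 / fan_in)      # Kaiming: extra factor of 2 for ReLU

print("Xavier, 30 ReLU layers:", relu_stack_std(30, 256, xavier))  # collapses toward 0
print("He,     30 ReLU layers:", relu_stack_std(30, 256, he))      # stays roughly constant
```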
-
Solving Gradient Problems in Neural Networks
Back-propagation calculates the gradient of the loss function with respect to the weights and updates the weights to reduce the error. The main mathematical principle used here is the chain rule. However, the repeated multiplication inherent in this process can lead to either the vanishing or the exploding gradient problem, as the sketch after this excerpt illustrates.
Solutions For Both Issues
Intelligent…
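To make the repeated multiplication concrete, here is a minimal NumPy sketch (the purely linear layers and the specific depth, width, and weight scales are illustrative assumptions, not from the post): back-propagating a gradient multiplies one Jacobian factor per layer, so per-layer factors consistently below 1 shrink it toward zero while factors above 1 blow it up.

```python
import numpy as np

rng = np.random.default_rng(0)

def backprop_norm(depth, width, weight_std):
    """Back-propagate a unit gradient through `depth` linear layers.
    The chain rule contributes one Jacobian (here, W.T) per layer."""
    grad = np.ones(width)
    for _ in range(depth):
        W = rng.normal(0.0, weight_std, size=(width, width))
        grad = W.T @ grad  # one chain-rule factor per layer
    return np.linalg.norm(grad)

# Per-layer factors below 1 compound into a vanishing gradient;
# factors above 1 compound into an exploding one.
print("small weights:", backprop_norm(50, 64, 0.05))  # near zero: vanishing
print("large weights:", backprop_norm(50, 64, 0.50))  # astronomically large: exploding
```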