Optimization techniques in Deep Learning
In deep learning, a neural network is built from connected layers: an input layer, one or more hidden layers, and an output layer. During forward propagation we compute the prediction ŷ and evaluate the error function, also called the loss function or cost function. Optimizers are used to reduce this loss: they update the weights during backpropagation.

Gradient Descent: The earliest and most fundamental optimizer is Gradient Descent. It works as follows:

1. Calculate what a small change in each individual weight would do to the loss function (the gradient).
2. Adjust each individual weight based on its gradient.
3. Keep repeating steps #1 and #2 until the loss function gets as low as possible.

During optimization there is a risk of getting stuck in a local minimum. To help avoid this, a learning rate is used: the gradients are multiplied by the learning rate to scale them, ensuring the weights change at the right pace, not m...
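The three steps above can be sketched in a few lines of Python. This is a minimal illustration on a one-dimensional toy loss, not a full training loop: the quadratic loss function, the starting weight, and the learning rate value of 0.1 are all illustrative assumptions.

```python
def loss(w):
    # Toy loss with its minimum at w = 3.0 (illustrative assumption)
    return (w - 3.0) ** 2

def gradient(w):
    # Analytic derivative of the toy loss: d/dw (w - 3)^2 = 2 * (w - 3)
    return 2.0 * (w - 3.0)

def gradient_descent(w0, learning_rate=0.1, steps=100):
    w = w0
    for _ in range(steps):
        # Step 1: measure how a small change in the weight affects the loss
        g = gradient(w)
        # Step 2: adjust the weight against the gradient; the learning
        # rate scales the update so the weight changes at the right pace
        w -= learning_rate * g
        # Step 3: the loop repeats until the loss is as low as possible
    return w

w_final = gradient_descent(w0=0.0)
print(round(w_final, 4))  # converges close to the minimum at 3.0
```

A learning rate that is too large makes the updates overshoot the minimum, while one that is too small makes convergence very slow; trying a few values (e.g. 0.5 or 0.001) in the sketch above makes that trade-off visible.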