Extending Stochastic Gradient Optimization with ADAM
Briefly

"Gradient descent is like hiking downhill with your eyes closed, following the slope until you hit the bottom, minimizing an objective function by updating model parameters in the opposite direction of the gradient."
"An objective function measures how far off your model's predictions are from the desired outcomes, like a fitness tracker for your model. It identifies the numerical value that signifies predictive performance."
Read at hackernoon.com