TutORial: Stochastic Gradient Descent: Recent Trends

By Raghu Pasupathy, Farzad Yousefian, and David Newton.

Stochastic Gradient Descent (SGD), also known as stochastic approximation, refers to a family of iterative methods for solving stochastic optimization and root-finding problems. Owing to several factors, SGD has become the leading method for solving optimization problems arising in large-scale machine learning and “big data” contexts such as classification and regression. This tutorial covers the basics of SGD with an emphasis on modern developments. It begins with examples and problem variations where SGD is applicable, and then details the important flavors of SGD currently in use. The presentation includes numerical examples.
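To make the iterative structure concrete, here is a minimal sketch of the basic SGD recursion on a toy problem: minimizing f(x) = E[(x − Z)²/2], whose minimizer is the mean of Z. The function name, data, and step-size rule below are illustrative choices, not taken from the tutorial; each step samples one observation and moves along the negative stochastic gradient with a diminishing (Robbins–Monro) step size.

```python
import random

def sgd_mean(data, iters=5000, seed=0):
    """Estimate the minimizer of f(x) = E[(x - Z)^2 / 2] via SGD.

    The stochastic gradient at x, given a sampled observation z,
    is (x - z), an unbiased estimate of the true gradient f'(x).
    """
    rng = random.Random(seed)
    x = 0.0  # arbitrary starting point
    for k in range(1, iters + 1):
        z = rng.choice(data)      # sample one observation
        grad = x - z              # stochastic gradient estimate
        x -= (1.0 / k) * grad     # diminishing step size 1/k
    return x

data = [1.0, 2.0, 3.0, 4.0]
estimate = sgd_mean(data)
# estimate converges toward the sample mean, 2.5
```

With step size 1/k this recursion reduces to a running average of the sampled observations, which is the classical illustration of why diminishing step sizes yield convergence despite gradient noise.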