Some Mathematical Aspects of Deep Learning and Stochastic Gradient Descent

Lexing Ying, Stanford University
Fine Hall 214

In-Person Talk 

This talk concerns several mathematical aspects of deep learning and stochastic gradient descent. The first is why deep neural networks trained with stochastic gradient descent often generalize well; we make a connection between generalization and the stochastic stability of the stochastic gradient descent dynamics. The second is understanding the training process of stochastic gradient descent; here, we use simple mathematical examples to explain several key empirical observations, including the edge of stability, the exploration of flat minima, and learning rate decay.
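As background for the edge-of-stability phenomenon mentioned above, the short sketch below is a toy illustration, not the analysis presented in the talk: the function name sgd_on_quadratic and the numerical values are assumptions chosen for the example. It runs plain gradient descent on a one-dimensional quadratic, where the iterates contract only when the learning rate is below the classical stability threshold of 2 divided by the curvature, and diverge just beyond it.

```python
import numpy as np

def sgd_on_quadratic(curvature, lr, steps=50, x0=1.0):
    """Run plain gradient descent on f(x) = 0.5 * curvature * x**2.

    The iterate contracts when lr < 2 / curvature and grows beyond that
    classical stability threshold -- a toy stand-in for the regime
    discussed under "edge of stability."
    """
    x = x0
    for _ in range(steps):
        x -= lr * curvature * x  # gradient of f(x) is curvature * x
    return x

curvature = 4.0  # stability threshold here is 2 / 4 = 0.5
for lr in (0.3, 0.49, 0.51):
    print(f"lr={lr}: final iterate = {sgd_on_quadratic(curvature, lr):.3e}")
```

Running this prints iterates that shrink for the two learning rates below 0.5 and grow for the one just above it, mimicking in one dimension the sharp transition that the abstract refers to.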

Based on joint work with Chao Ma.