Course Description:
Deep learning is a relatively recent field, and mathematical work on deep learning, which
is the focus of this class, is even more recent. Key mathematical results will be presented in
class and will often be assigned as readings. The class material will be based on notes
provided to the participants.
This course is a rigorous introduction to the mathematical foundations of deep learning.
In particular, the course will introduce students to the theory of universal approximation,
stochastic-gradient-type optimizers and their convergence properties, statistical learning
bounds, approximation theory, depth separation results, the neural tangent kernel, the mean-field
overparametrized regime, implicit regularization and noisy dynamics, different deep neural
network architectures and their properties, reinforcement learning and Q-learning, several
computational aspects such as backpropagation, batch normalization and dropout, and deep
learning for dynamical systems and variational methods.
The course material will be based on theory, methods (both theoretical and computational) and examples from various scientific disciplines. The class will focus on supervised
learning.
Course Prerequisites: Introduction to probability and/or stochastic processes (MA581 or MA583 or equivalent), differential equations (MA226 or MA231 or equivalent), linear algebra (MA242 or equivalent), basic statistical theory, and basic programming skills. Some exposure to real analysis at the undergraduate level will also be useful. PDEs and graduate-level probability and statistics will be helpful but are not necessary. Students are expected to have undergraduate-level knowledge of (a) probability and/or stochastic processes, (b) basic statistical theory, (c) linear algebra, and (d) differential equations, together with (e) basic coding skills (ideally in Python).