Last Updated:
29/11/2019 - 15:23

IAM771 - Special Topics: Optimization Methods for Machine Learning

Credit: 3(3-0); ECTS: 8.0
Instructor(s): Hamdullah Yücel
Prerequisites: Consent of the Instructor

Course Catalogue Description

Convexity; Gradient Descent; Stochastic Gradient Methods; Noise Reduction Methods; Second-Order Methods; Adaptive Methods; Methods for Regularized Models; Introduction to Machine Learning: support vector machines, neural networks.

Course Objectives

The interplay between optimization and machine learning is one of the most important developments in modern computational science. Optimization formulations and methods are proving to be vital in designing algorithms to extract essential knowledge from huge volumes of data. Machine learning, however, is not simply a consumer of optimization technology but a rapidly evolving field that is itself generating new optimization ideas. Optimization approaches have enjoyed prominence in machine learning because of their wide applicability and attractive theoretical properties. The increasing complexity, size, and variety of today's machine learning models call for the reassessment of existing assumptions. This course is designed for graduate students majoring in mathematics as well as mathematically inclined graduate engineering students. At the end of this course, the student will:

  • grasp the state of the art in the interaction between optimization and machine learning.
  • understand the optimization methods underlying the machine learning methods that are now widely used in real-world applications.
  • use the computational tools available for solving optimization problems on computers once a mathematical formulation has been found.

Course Learning Outcomes

Upon successful completion of this course, the student will be able to

  • assess/evaluate the most important algorithms, function classes, and algorithm convergence guarantees.
  • compose existing theoretical analysis with new aspects and algorithm variants.
  • formulate scalable and accurate implementations of the most important optimization algorithms for machine learning applications.

Tentative (Weekly) Outline

  1. Background of machine learning
  2. Convexity and nonsmooth calculus tools
  3. Gradient descent
  4. Projected gradient descent
  5. Stochastic gradient methods
  6. Variance reduction methods: SAG, SAGA, SVRG
  7. Accelerated gradient descent
  8. Mirror descent
  9. Second-order methods
  10. Second-order methods: stochastic quasi-Newton
  11. Dual methods: stochastic dual coordinate ascent methods
  12. Conditional gradient method
  13. Adaptive methods
  14. Methods for regularized models
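
As a purely illustrative sketch, and not part of the official course material, two of the outlined topics (gradient descent and stochastic gradient methods) can be written in a few lines of Python. The toy least-squares data, the step sizes, the iteration counts, and all function names below are assumptions chosen only for illustration.

```python
import random

def gradient_descent(grad, x0, step, iters):
    """Plain gradient descent: x <- x - step * grad(x)."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x

def sgd(grad_i, n, x0, step, iters, seed=0):
    """Stochastic gradient descent: each step uses the gradient
    of a single, uniformly sampled term of the finite sum."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        i = rng.randrange(n)
        g = grad_i(x, i)
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x

# Hypothetical toy problem: f(x) = (1/2n) * sum_i (a_i . x - b_i)^2,
# with noise-free targets b generated from a known x_true.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
x_true = [2.0, -1.0]
b = [sum(aj * xj for aj, xj in zip(a, x_true)) for a in A]
n = len(A)

def grad_i(x, i):
    """Gradient of the i-th term 0.5 * (a_i . x - b_i)^2."""
    r = sum(aj * xj for aj, xj in zip(A[i], x)) - b[i]
    return [r * aj for aj in A[i]]

def full_grad(x):
    """Gradient of f: the average of the per-term gradients."""
    gs = [grad_i(x, i) for i in range(n)]
    return [sum(g[j] for g in gs) / n for j in range(len(x))]

x_gd = gradient_descent(full_grad, [0.0, 0.0], step=0.5, iters=200)
x_sgd = sgd(grad_i, n, [0.0, 0.0], step=0.1, iters=2000)
print(x_gd, x_sgd)
```

Both iterates approach x_true here only because the toy data are noise-free; with noisy data and a constant step size, stochastic gradient iterates typically fluctuate around the minimizer, which is one motivation for the variance reduction methods listed in the outline.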

Course Textbook(s)

There will be no required textbook for the course; instead, the instructor will provide hand-written lecture notes as the course progresses. However, the following monographs on this topic are recommended:

  • S. Sra, S. Nowozin, and S. J. Wright (eds.), Optimization for Machine Learning, MIT Press, 2012.
  • J. Nocedal and S. J. Wright, Numerical Optimization, Second Edition, Springer, 2006.
  • S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
  • L. Bottou, F. E. Curtis, and J. Nocedal, Optimization Methods for Large-Scale Machine Learning, arXiv:1606.04838v3
  • W. Hu, Nonlinear Optimization in Machine Learning, Lecture Notes
  • E. Hazan, Optimization for Machine Learning, arXiv:1909.03550v1
  • S. Bubeck, Convex Optimization: Algorithms and Complexity, arXiv:1405.4980v2
  • M. Jaggi and B. Gärtner, Optimization for Machine Learning, Lecture Notes

Supplementary Materials and Resources

  • MATLAB Student Version is available for download from the MathWorks website or from the METU FTP servers (licensed)
  • Python:
