Optimization For Machine Learning (IE 1187/IE 2187) Spring 2025


Modern machine learning involves fitting predictive models to huge data sets using optimization methods. The choice of optimization method is critical in these problems. For example, using traditional (factorization-based) methods to solve a regression problem with ten thousand data points and features will fail, even though this is a tiny dataset by modern standards. Moreover, modern machine learning methods such as stochastic gradient descent are not plug-and-play: they require user expertise to select tuning parameters and interpret results. The goal of this course is to teach students how to use modern first-order methods to solve large-scale machine learning problems. Coding will be done in Python using PyTorch.
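To give a flavor of the kind of code written in this course, here is a minimal sketch of fitting a linear regression model with stochastic gradient descent in PyTorch. The dataset sizes, learning rate, batch size, and epoch count are illustrative choices, not values prescribed by the course.

```python
import torch

torch.manual_seed(0)
n, d = 1000, 20                       # data points, features (illustrative sizes)
X = torch.randn(n, d)
w_true = torch.randn(d)
y = X @ w_true + 0.1 * torch.randn(n) # noisy linear targets

model = torch.nn.Linear(d, 1, bias=False)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

for epoch in range(50):
    perm = torch.randperm(n)          # reshuffle the data each epoch
    for i in range(0, n, 32):         # mini-batches of 32
        idx = perm[i:i + 32]
        opt.zero_grad()
        loss = loss_fn(model(X[idx]).squeeze(-1), y[idx])
        loss.backward()               # compute gradients via autograd
        opt.step()                    # one stochastic gradient step

final_loss = loss_fn(model(X).squeeze(-1), y).item()
print(final_loss)                     # should approach the noise level
```

Note that even this toy example already exposes the tuning parameters (learning rate, batch size, number of epochs) that the course description warns about: poor choices can make the same code diverge or stall.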


Topics covered: Convexity, nonconvexity, critical points and saddle points. Gradient descent. First-order methods vs. second-order methods. Training vs. test error. Stochastic gradient descent. Hyperparameter tuning. Explicit and implicit regularization. Batch sizes, parallelization, and GPUs. Fine-tuning. Large language models.
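Several of the topics above (gradient descent, critical points, convexity) can be previewed with a few lines of code. The following is an illustrative sketch of plain (full-batch) gradient descent on the simple convex function f(w) = ||w - w*||^2, whose unique critical point is its minimizer w*; the step size and iteration count are arbitrary illustrative choices.

```python
import torch

w_star = torch.tensor([3.0, -2.0])      # minimizer of f
w = torch.zeros(2, requires_grad=True)  # starting point
lr = 0.1                                # step size (illustrative)

for _ in range(100):
    loss = torch.sum((w - w_star) ** 2) # f(w) = ||w - w*||^2
    loss.backward()                     # gradient of f at w
    with torch.no_grad():
        w -= lr * w.grad                # gradient descent step
    w.grad.zero_()                      # reset accumulated gradient

print(w)                                # converges to w_star
```

Because f is convex, gradient descent with a suitable step size converges to the unique critical point; the nonconvex case, where critical points can be saddle points, is one of the themes of the course.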


Requirements: Multivariate calculus (e.g., MATH 240), linear algebra (e.g., MATH 0280), probability (e.g., IE 1070), and programming experience (e.g., IE 0015). 

Learning objectives

ABET outcomes

(1) Identify, formulate, and solve complex engineering problems by applying principles of engineering, science, and mathematics 

(2) Apply engineering design to produce solutions that meet specified needs with consideration of public health, safety, and welfare, as well as global, cultural, social, environmental, and economic factors 

(5) Function effectively on a team whose members together provide leadership, create a collaborative and inclusive environment, establish goals, plan tasks, and meet objectives 

(6) Develop and conduct appropriate experimentation, analyze and interpret data, and use engineering judgment to draw conclusions  

Assessment:


Final exam & midterm will test: 

Late HW policy

Late penalties: less than 1 hour late, 2% penalty; less than two days late, 5% penalty. Any later receives no points, except in extraordinary circumstances.

Supplementary material

There is no textbook for this course (the slides and Colab notebooks contain all material that needs to be known), but useful supplementary references include:

HW Collaboration and ChatGPT Policies

Students may collaborate on homeworks, but each student should understand their answers and write them up themselves. The use of large language models is not prohibited, but I recommend that students use them sparingly. The most important thing is that students use the homeworks to learn: overreliance on tools like ChatGPT or on friends may lead to poor performance on the midterm or final exam.

Learning tools

Tentative lecture schedule