Optimization For Machine Learning (IE 1187/IE 2187) Spring 2024
Modern machine learning involves fitting predictive models to huge data sets using optimization methods. The choice of optimization method is critical in these problems. For example, traditional (factorization-based) methods will fail to solve a regression problem with ten thousand data points and features, even though this is a tiny dataset by modern standards. Moreover, modern machine learning methods such as stochastic gradient descent are not plug-and-play: they require user expertise to select tuning parameters and interpret results. The goal of this course is to teach students how to use modern first-order methods to solve large-scale machine learning problems. Coding will be done in Python using PyTorch.
Topics covered: Convexity, nonconvexity, critical points and saddle points. Gradient descent. First-order methods vs second-order methods. Training vs test error. Stochastic gradient descent. Hyperparameter tuning. Explicit and implicit regularization. Batch sizes, parallelization, and GPUs. Fine-tuning.
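As a taste of the topics above, here is a minimal sketch of minibatch stochastic gradient descent fitting a one-feature linear regression and then comparing training and test error. It is written in plain Python (not PyTorch) so that it runs without dependencies. The synthetic data, the function names (make_data, mse, sgd), and the step size, batch size, and epoch count are all illustrative assumptions, not course material.

```python
import random

def make_data(n, true_w, true_b, noise, rng):
    """Generate n points from y = true_w * x + true_b + Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = true_w * x + true_b + rng.gauss(0.0, noise)
        data.append((x, y))
    return data

def mse(data, w, b):
    """Mean squared error of the model y = w * x + b on a data set."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def sgd(data, step_size, batch_size, epochs, rng):
    """Minibatch SGD on the mean squared error, starting from w = b = 0."""
    w, b = 0.0, 0.0
    data = list(data)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(data)  # new random pass over the data each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the batch's mean squared error with respect to w and b.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= step_size * grad_w
            b -= step_size * grad_b
    return w, b

rng = random.Random(0)  # fixed seed so the run is reproducible
train = make_data(200, true_w=3.0, true_b=-1.0, noise=0.1, rng=rng)
test = make_data(200, true_w=3.0, true_b=-1.0, noise=0.1, rng=rng)

w, b = sgd(train, step_size=0.1, batch_size=10, epochs=50, rng=rng)
train_error = mse(train, w, b)
test_error = mse(test, w, b)
print(f"w = {w:.3f}, b = {b:.3f}")
print(f"train error = {train_error:.4f}, test error = {test_error:.4f}")
```

Each update uses only ten data points, so the cost per step does not grow with the size of the data set. This is one reason such methods scale to large problems.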
Requirements: Multivariate calculus (e.g., MATH 240), linear algebra (e.g., MATH 0280), probability (e.g., IE 1070), and programming experience (e.g., IE 0015).
Learning objectives
Students should be able to explain how optimization is used in machine learning
Students should be able to explain the difference between train and test error. They should be able to avoid overfitting.
Students should be able to explain at a high level how stochastic gradient descent works and why it is popular for machine learning
Students should be able to train machine learning models, including how to:
Choose appropriate loss functions
Understand and debug possible failure cases
Tune hyperparameters, including step size routines and batch sizes
Reduce training times
Fine-tune models
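To illustrate why step size tuning matters, the sketch below runs gradient descent on the one-dimensional quadratic f(x) = 0.5 * c * x^2. Each iteration multiplies x by (1 - step_size * c), so the iterates shrink only when the step size is below 2 / c. The function name gradient_descent, the curvature c = 4, and the step sizes tried are assumptions chosen for illustration.

```python
def gradient_descent(step_size, curvature=4.0, x0=1.0, iterations=50):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2; return the final x."""
    x = x0
    for _ in range(iterations):
        grad = curvature * x  # derivative of 0.5 * curvature * x**2
        x -= step_size * grad
    return x

# With curvature 4 the stability threshold is 2 / 4 = 0.5.
for step_size in (0.01, 0.25, 0.45, 0.55):
    final_x = gradient_descent(step_size)
    print(f"step size {step_size}: |x| after 50 iterations = {abs(final_x):.3e}")
```

The smallest step size converges slowly, the step sizes just below the threshold converge quickly, and the step size just above it diverges. This is a common failure case to look for when debugging training.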
ABET outcomes
(1) Identify, formulate, and solve complex engineering problems by applying principles of engineering, science, and mathematics
(2) Apply engineering design to produce solutions that meet specified needs with consideration of public health, safety, and welfare, as well as global, cultural, social, environmental, and economic factors
(5) Function effectively on a team whose members together provide leadership, create a collaborative and inclusive environment, establish goals, plan tasks, and meet objectives
(6) Develop and conduct appropriate experimentation, analyze and interpret data, and use engineering judgment to draw conclusions
Assessment for IE 1187
In-class group exercises (5% of grade). Roughly one in three lectures will be devoted to solving exercises in teams with guidance from the professor. These are performed in groups of 2-3 and are due at the end of lecture. Full points are given for good-faith effort (i.e., no need to complete all questions, but try your best). No credit for team members who do not show up to class. The lowest-scoring in-class group exercise can be dropped.
Live in-class questions using Top Hat (5% of grade). The ten lowest-scoring questions can be dropped.
Five HWs (20% of grade). Lowest scoring HW will be dropped (only the best four scoring HWs will be counted). HWs are equally weighted.
Midterm exam (30% of grade).
Final exam (40% of grade).
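The IE 1187 weights above can be combined as in the following sketch. It assumes every score is expressed as a fraction between 0 and 1, and that the group exercise and live question scores passed in are averages computed after the allowed drops. The function name course_grade_ie1187 and the sample scores are hypothetical.

```python
def course_grade_ie1187(group, tophat, homeworks, midterm, final):
    """Weighted IE 1187 grade as a fraction in [0, 1].

    homeworks is a list of five scores; the lowest is dropped and the
    remaining four are equally weighted.
    """
    best_four = sorted(homeworks, reverse=True)[:4]
    hw_average = sum(best_four) / len(best_four)
    return (0.05 * group
            + 0.05 * tophat
            + 0.20 * hw_average
            + 0.30 * midterm
            + 0.40 * final)

# Hypothetical student: one missed homework (score 0.0) is dropped.
grade = course_grade_ie1187(
    group=1.0,
    tophat=0.9,
    homeworks=[0.8, 0.9, 1.0, 0.0, 0.7],
    midterm=0.75,
    final=0.85,
)
print(f"course grade: {100 * grade:.1f}%")
```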
Assessment for IE 2187
In-class group exercises (5% of grade). Roughly one in three lectures will be devoted to solving exercises in teams with guidance from the professor. These are performed in groups of 2-3 and are due at the end of lecture. Full points are given for good-faith effort (i.e., no need to complete all questions, but try your best). No credit for team members who do not show up to class. The lowest-scoring in-class group exercise can be dropped.
Live in-class questions using Top Hat (5% of grade). The ten lowest-scoring questions can be dropped.
Five HWs (15% of grade). Lowest scoring HW will be dropped (only the best four scoring HWs will be counted). HWs are equally weighted.
Midterm exam (25% of grade).
Final exam (35% of grade).
Project (10% of grade). The project will involve finding an interesting dataset and fine-tuning a pretrained model on it using the techniques you have learnt in class. The code must be original and well-documented.
Final exam & midterm will test:
Conceptual understanding of the material and how to use it in practice
Coding, through comprehension questions about sample code or requests to write short pseudocode
Late HW policy
Late penalties: 2% penalty if less than one hour late; 5% penalty if less than two days late. Anything later receives no points, except in extraordinary circumstances.
Supplementary material
There is no textbook for this course (slides and colabs will contain all material that needs to be known) but useful supplementary references include:
https://www.cambridge.org/core/books/optimization-for-data-analysis/C02C3708905D236AA354D1CE1739A6A2
HW Collaboration Policies
Students may collaborate on HW but should understand their answers and write them up themselves. The most important thing is that students use HWs to learn.
Project Collaboration Policies
Students can only collaborate on projects with others if they are given prior approval from the instructor.
Learning tools
Canvas for posting course content and submitting HW
Top Hat for in-class questions
Google Colab for coding
Tentative lecture schedule
Standard University Policies
Academic integrity
Students in this course will be expected to comply with the University of Pittsburgh’s Policy on Academic Integrity. Any student suspected of violating this obligation for any reason during the semester will be required to participate in the procedural process, initiated at the instructor level, as outlined in the University Guidelines on Academic Integrity. This may include, but is not limited to, the confiscation of the examination of any individual suspected of violating University Policy. Furthermore, no student may bring any unauthorized materials to an exam, including dictionaries and programmable calculators.
To learn more about Academic Integrity, visit the Academic Integrity Guide for an overview of the topic. For hands-on practice, complete the Academic Integrity Modules.
The Swanson School’s Academic Integrity Guide can be found here: SSOE_AI_Policy.pdf
Disability services
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services (DRS), 140 William Pitt Union, (412) 648-7890, drsrecep@pitt.edu, (412) 228-5347 for P3 ASL users, as early as possible in the term. DRS will verify your disability and determine reasonable accommodation for this course.
Students must contact DRS each term to initiate their accommodations.