Instructor: Daniel Brown
Description: This course addresses the problem of how to get AI systems to do what we, as humans, actually want them to do. We will explore a range of topics, including active learning, human-in-the-loop reinforcement learning, human intent and preference learning, algorithmic teaching, and AI safety. Classes will be a mix of lectures covering foundational material and hands-on analysis and exploration of both seminal and recent research readings. Students will also carry out a novel research project, culminating in a final presentation and a written technical report. By taking this course, students will develop a broad understanding of the common techniques and unique research challenges involved in building AI systems that learn from, interact with, and assist humans. Additionally, students will learn and practice fundamental research skills, including how to read, write, and review research papers, how to quickly prototype and test research ideas, and how to give technical presentations.
Format: This course combines lectures with paper presentations and analyses by the students, encouraging both fundamental knowledge acquisition and open-ended discussion. There will be a series of short homework assignments that test concepts learned in class and give students hands-on experience with these ideas. Each student will also carry out an individual or group research project. Weekly paper analyses and presentations will follow a role-playing model, in which students take turns playing different roles.
Syllabus: Available on Canvas.
Schedule: Subject to change
| # | Date | Topic | Reading | Supplemental |
| 1 | Mon Aug 21 | Class intro and logistics | | |
| 2 | Wed Aug 23 | Sequential Decision Making | | |
| 3 | Mon Aug 28 | Imitation Learning via Behavioral Cloning | | |
| 4 | Wed Aug 30 | Interactive Imitation Learning | | |
| 5 | Mon Sep 4 | Labor Day | NA | |
| 6 | Wed Sep 6 | Interactive Reinforcement Learning 1 | | |
| | | Homework 1 Released | Homework link: BC and BCO | |
| 7 | Mon Sep 11 | Interactive Reinforcement Learning 2 | | |
| 8 | Wed Sep 13 | Inverse RL 1 | | |
| 9 | Mon Sep 18 | Inverse RL 2 | | |
| | | Homework 2 Released | Homework link: Bayesian IRL | |
| 10 | Wed Sep 20 | Adversarial Imitation Learning | | |
| 11 | Mon Sep 25 | RL from Human Preferences 1 | | |
| 12 | Wed Sep 27 | RL from Human Preferences 2 | | |
| | | Homework 3 Released | RLHF | |
| 13 | Mon Oct 2 | Alignment 1 | | |
| 14 | Wed Oct 4 | Alignment 2 | | |
| | Oct 9-13 | Fall Break | No Class | |
| 15 | Mon Oct 16 | Shared Autonomy and Assistance 1 | | |
| 16 | Wed Oct 18 | Shared Autonomy and Assistance 2 | | |
| 17 | Mon Oct 23 | Self-Calibrating Interfaces 1 | | |
| 18 | Wed Oct 25 | Self-Calibrating Interfaces 2 | | |
| 19 | Mon Oct 30 | Optimal Teaching 1 | | |
| 20 | Wed Nov 1 | Optimal Teaching 2 | | |
| 21 | Mon Nov 6 | Multiple Forms of Feedback | | |
| 22 | Wed Nov 8 | Alignment Verification | | |
| 23 | Mon Nov 13 | Reward Specification Issues | | |
| 24 | Wed Nov 15 | Ethics | | |
| 25 | Mon Nov 20 | Existential AI Risk | | |
| | Wed Nov 22 | Day Before Thanksgiving | No Class | |
| 26 | Mon Nov 27 | Guest Lecture (virtual) | Yuchen Cui | |
| 27 | Wed Nov 29 | Guest Lecture (virtual) | Dylan Hadfield-Menell | |
| 28 | Mon Dec 4 | Project presentations (virtual) | | |
| 29 | Wed Dec 6 | Project presentations | | |
| 30 | Fri Dec 15 | Final Project Report Due | Overleaf template. Just go to Menu and select copy project. | |