CS 6960 Human-AI Alignment

Instructor: Daniel Brown 

Description: This course covers the problem of how to get AI systems to do what we, as humans, actually want them to do. We will explore topics including active learning, human-in-the-loop reinforcement learning, human intent and preference learning, algorithmic teaching, and AI safety. Classes will be a mix of lectures covering foundational material and hands-on analysis and exploration of both seminal and recent research papers. Students will also carry out a novel research project, culminating in a final presentation and written technical report. By taking this course, students will develop a broad understanding of the common techniques and unique research challenges involved in building AI systems that learn from, interact with, and assist humans. Additionally, students will learn and practice fundamental research skills, including how to read, write, and review research papers, how to quickly prototype and test research ideas, and how to give technical presentations.

Format: This course combines lectures with paper presentations and analyses by the students, encouraging both fundamental knowledge acquisition and open-ended discussion. There will be a series of short homework assignments that test concepts learned in class and give students an opportunity to gain hands-on experience with these ideas. Each student will also carry out an individual or group research project. Weekly paper analyses/presentations will follow a role-playing model, where students take turns participating in different roles.

Syllabus: Available on Canvas.

Schedule: Tentative schedule below. Note this may change.

# Date Topic Reading Supplemental
1 Mon Aug 22 Class intro and logistics MDP and RL primers:
  • Russell and Norvig, MDP chapter
  • Sutton and Barto book, Sections 1.1, 1.3, 2.4, 2.6
2 Wed Aug 24 Intro/Surveys
  • Scalable agent alignment via reward modeling: a research direction
  • Recent advances in robot learning from demonstrations
3 Mon Aug 29 Imitation Learning via Behavioral Cloning
  • Behavioral Cloning from Observation
  • Implicit Behavioral Cloning
  • ALVINN
4 Wed Aug 31 Interactive Imitation Learning
  • DAgger
  • ThriftyDAgger
  • SafeDAgger
  • HG-DAgger
  • Homework 1 Released Homework link: BC and BCO
  • OpenAI Gym
  • PyTorch
5 Wed Sept 7 Interactive Reinforcement Learning 1
  • Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
  • Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning
6 Mon Sept 12 Interactive Reinforcement Learning 2
  • TAMER
  • Deep TAMER
  • COACH
  • Deep COACH
7 Wed Sept 14 Goal Inference
  • Goal Inference as Inverse Planning
  • End-to-End Robotic Reinforcement Learning without Reward Engineering
8 Mon Sept 19 Inverse Reinforcement Learning 1
  • Apprenticeship learning via inverse reinforcement learning
  • Algorithms for inverse reinforcement learning
9 Wed Sept 21 Inverse Reinforcement Learning 2
  • Bayesian inverse reinforcement learning
  • Maximum entropy inverse reinforcement learning
  • Homework 1 Due by end of day (11:59 MST)
  • Homework 2 Released Homework link: Bayesian IRL
10 Mon Sept 26 Adversarial Imitation Learning
  • Generative Adversarial Imitation Learning
  • f-IRL: Inverse Reinforcement Learning via State Marginal Matching
11 Wed Sept 28 Shared Autonomy and Assistance 1
  • Formalizing Assistive Teleoperation
  • Paragraph Pitch of Final Project Due (11:59 MST)
  • Shared Autonomy via Hindsight Optimization
12 Mon Oct 3 Shared Autonomy and Assistance 2
  • Human-in-the-Loop Optimization of Shared Autonomy in Assistive Robotics
  • X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
13 Wed Oct 5 Shared Autonomy and Assistance 3
  • Learning to share autonomy from repeated HRI
  • ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
Fall Break
  • Homework 2 Due by end of day on Wednesday October 12th (11:59 MST).
14 Mon Oct 17 Optimal Teaching 1
  • Algorithmic and Human Teaching of Sequential Decision Tasks
  • Machine teaching for IRL
15 Wed Oct 19 Optimal Teaching 2
  • Cooperative IRL
  • Pragmatic Pedagogic Value Alignment
16 Mon Oct 24 Preferences and Active Learning 1
  • Deep Reinforcement Learning from Human Preferences
  • Final Project Proposal and Lit Review Due (11:59 MST)
  • Learning to summarize from human feedback
  • PEBBLE
17 Wed Oct 26 Preferences and Active Learning 2
  • Asking Easy Questions
  • Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
  • Asking questions that reveal reward learning
18 Mon Oct 31 AI Safety, Alignment, and Ethics 1
  • Concrete problems in AI Safety
  • The off-switch game
  • User Tampering in Reinforcement Learning Recommender Systems
  • Current and Near-Term AI as a Potential Existential Risk Factor
19 Wed Nov 2 AI Safety, Alignment, and Ethics 2
  • Computational Ethics
  • AI, Values, and Alignment
20 Mon Nov 7 Guest lecture: David Krueger
21 Wed Nov 9 Verification, Trust, and Transparency 1
  • A Quality Diversity Approach to Automatically Generating Human-Robot Interaction Scenarios in Shared Autonomy
  • Authoring and verifying human-robot interactions
22 Mon Nov 14 Verification, Trust, and Transparency 2
  • Trust calibration within a human-robot team: Comparing automatically generated explanations
  • Value alignment verification
23 Wed Nov 16 Multiple forms of human feedback
  • Reward rational implicit choice
  • Unified Learning from Demonstrations, Corrections, and Preferences during Physical Human-Robot Interaction
24 Mon Nov 21 Human-AI Collaboration and HRI 1
  • Human-Robot Cross-Training
  • Decision-Making for Bidirectional Communication in Sequential Human-Robot Collaborative Tasks
  • Evaluating Fluency in human-robot collaboration
  • Learning Multi-Modal Grounded Linguistic Semantics by Playing "I Spy"
25 Wed Nov 23 Guest lecture: Mariah Schrum
26 Mon Nov 28 Guest lecture: Andreea Bobu
27 Wed Nov 30 Project presentations
  • Akansha
  • Adithya
  • Anurag
  • Bao, Yixuan, Zohre
28 Mon Dec 5 Project presentations
  • Brian
  • Connor
  • Iain
  • Jordan
29 Wed Dec 7 Project presentations
  • Monika
  • Mike, Nancy
  • Siyeon
30 Fri Dec 16 Final Project Report Due. Overleaf template: just go to Menu and select copy project.