EmpiricalRL Syllabus (subject to change)



CMPUT 607: Empirical Reinforcement Learning
Term: Winter, 2021
Lecture Date and Time: Mon, Wed 12:30 - 2:00 p.m. (Mountain Time)
Lecture Location: Remote

Prerequisites

An undergraduate or graduate level course on Reinforcement Learning (e.g., CMPUT 397, CMPUT 366 from 2018), or successful completion of the 4-course UofA RL MOOC. This is an advanced class on Reinforcement Learning. We will not cover the basics of RL; it will be assumed that students already know the material.

Description

This course will focus on doing good experiments in reinforcement learning (RL). Reinforcement Learning is a fast-growing field. Learning systems are becoming more complex and are routinely applied to complex games, 3D simulators, and robots. It is challenging to evaluate these systems because performance depends on carefully setting numerous hyper-parameters, and each experiment may consume vast amounts of data and compute, sometimes running for days or even weeks on super clusters. It is no secret that many of the empirical results published in the RL literature are suspect or flat-out misleading. This course will focus on designing and conducting good experiments in RL. We will survey best practices and criticisms of popular methodologies used in the field. The objective of the course is to train each student to be a good RL empiricist, which will be demonstrated with a final project focused on conducting a good experiment. The class will be a mix of lectures, student presentations on papers from the literature, and the final project.

Course Topics

This course is all about evaluation and methodological concerns in RL. It can be seen as a follow-up to CMPUT 603, but here we focus on the issues specific to RL, with a more significant focus on the final project and scientific communication.

The first half of the course will be lecture based, and the second half will consist almost entirely of student presentations and discussions. The end goal is to teach you how to conduct an excellent empirical study in RL.

We will cover:

Course Work and Evaluation

Your grade will be based on three components: class participation (30%), an in-class presentation (20%), and the project, which consists of a project draft and a final project report (50%).

Participation: To measure student participation, you will be expected to ask and answer questions during lecture time (I will keep notes over the term). In addition, during class two students will be assigned as moderators to monitor the video and chat and interject with questions. Think of this like a session chair at a conference. We want to get through the lecture material or student talk, but we also want to clarify misunderstandings and allow interesting discussion. That is the role of the discussion moderator. This will account for 15% of your mark.

The second 15% (for a total participation mark of 30%) will be based on your reviews of your fellow students’ project drafts. Every student will review, comment on, and provide advice on other students’ project drafts. Every draft will receive 3 reviews. It is important to learn how to give fair and constructive reviews: the objective is to make everyone’s project better.

Due to COVID and remote lectures, some students may have difficulty attending lectures and earning their participation marks. There are two options in this case: recognize that this is a discussion- and participation-oriented class and that another class might be a better fit for you, or contact me and we can make alternative arrangements.

Project: Projects will be done in groups: minimum 2 students, maximum 4 students. Each group will pick a research question, conduct excellent and meaningful experiments, and write the report. There will be a project draft due in March that will be reviewed by your classmates. It is very important that you take their advice and improve your project. The final projects will be subject to desk rejection, just like at a real conference. If your final project meets any of the desk-reject criteria, it will automatically lose 50 marks. This will be discussed in class.

Presentation: Finally, you will be required to complete one in-class presentation. The exact length will be determined by the final number of students enrolled in the class (but expect a presentation of at least 15 minutes). The topic of the presentation can be either: (1) your proposed project (research question, motivation, related work), or (2) a published paper, where you either summarize a paper that discusses methodology or present your own analysis of the experiments in a paper.

In summary, the relative weighting on each component will be approximately as follows (small adjustments may be made during the term):

  1. Project 50% (10% for draft, 40% for final)
  2. Participation 30% (in-class questions and session moderation 15%; project draft peer review 15%)
  3. Class presentation 20%

Course Materials

All course reading material will be available online for free: (1) videos from the UofA RL MOOC; (2) Sutton and Barto, Reinforcement Learning: An Introduction: richsutton.com

Academic Integrity

The University of Alberta is committed to the highest standards of academic integrity and honesty. Students are expected to be familiar with these standards regarding academic honesty and to uphold the policies of the University in this respect. Students are particularly urged to familiarize themselves with the provisions of the Code of Student Behaviour and avoid any behaviour which could potentially result in suspicions of cheating, plagiarism, misrepresentation of facts and/or participation in an offence. Academic dishonesty is a serious offence and can result in suspension or expulsion from the University. (GFC 29 SEP 2003)