525.637.81 - Foundations of Reinforcement Learning

Electrical and Computer Engineering
Fall 2023

Description

The course will provide a rigorous treatment of reinforcement learning by building on the mathematical foundations laid by optimal control, dynamic programming, and machine learning. Topics include model-based methods such as deterministic and stochastic dynamic programming, LQR and LQG control, as well as model-free methods that are broadly identified as Reinforcement Learning. In particular, we will cover on- and off-policy tabular methods such as Monte Carlo, Temporal Differences, and n-step bootstrapping, as well as approximate solution methods, including on- and off-policy approximation, policy gradient methods, and Deep Q-Learning. The course has a final project where students are expected to formulate and solve a problem based on the techniques learned in class.

Expanded Course Description

Prerequisites: 525.614 Probability and Stochastic Processes for Engineers (or equivalent), 625.615 Introduction to Optimization (or equivalent)

Instructor

Enrique Mallada Garcia

mallada@jhu.edu

Course Structure

The course materials are divided into modules, which can be accessed by clicking Modules on the left menu. A module will have several sections including the overview, content, readings, discussions, and assignments. You are encouraged to preview all sections of the module before starting. Most modules run for a period of seven (7) days; exceptions are noted in the Course Outline section of Modules/Course Introduction. You should regularly check the Calendar and Announcements for assignment due dates.

Course Topics

This course covers two broad methodologies used to solve Reinforcement Learning problems.

Course Goals

The course will provide a rigorous treatment of reinforcement learning by building on the mathematical foundations laid by optimal control, dynamic programming, and machine learning.

Course Learning Outcomes (CLOs)

Textbooks

Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction (2nd Edition)
ISBN-13: 978-0262039246.
Online: http://www.incompleteideas.net/book/the-book-2nd.html

Other Materials & Online Resources

Other relevant books

 

Required Software

Lab exercises require the use of Python and Jupyter Notebooks. You are welcome to use your own machine to run the labs.
Instructions on how to install Jupyter Notebook can be found at: https://jupyter.org/install

However, you will also be given access to our Jupyter Hub: https://rlhub.ece.jhu.edu


Student Coursework Requirements

It is expected that each module will take approximately 7–10 hours per week to complete. Here is an approximate breakdown: reading the assigned sections of the texts (approximately 3–4 hours per week) as well as some outside reading, watching the lecture videos (approximately 2–3 hours per week), and writing assignments (approximately 2–3 hours per week).

This course will consist of the following basic student requirements:

Preparation and Participation (10% of Final Grade Calculation)

You are responsible for carefully reading all assigned material and being prepared for discussion. The majority of readings are from the course text. Additional reading may be assigned to supplement text readings.

Post your initial response to the discussion questions by the evening of day 3 for that module week. Posting a response to the discussion question is part one of your grade for module discussions (i.e., Timeliness).

Part two of your grade for module discussion is your interaction (i.e., responding to classmate postings with thoughtful responses) with at least two classmates (i.e., Critical Thinking). Just posting your response to a discussion question is not sufficient; we want you to interact with your classmates. Be detailed in your postings and in your responses to your classmates' postings. Feel free to agree or disagree with your classmates. Please ensure that your postings are civil and constructive.

We will monitor module discussions and will respond to some of them as they are posted. In some instances, we will summarize the overall discussions and post the summary for the module.

Evaluation of preparation and participation is based on contribution to discussions.

Preparation and participation is evaluated by the following grading elements:

Preparation and participation is graded as follows:

Assignments (30% of Final Grade Calculation)

Several modules will be accompanied by problem sets. The problems will vary in difficulty across weeks; some weeks there will be several simple exercises, while other weeks have one or two more difficult ones. Problem sets will be a mixture of exercises from the course text and problems provided by the instructor.

Assignments are evaluated by the following grading elements:

  1. Each part of the question is answered (20%)
  2. Assumptions are clearly stated (20%)
  3. Intermediate derivations and calculations are provided (25%)
  4. Answer is technically correct and is clearly indicated (25%)
  5. Answer precision and units are appropriate (10%)

Assignments are graded as follows:

Labs (30% of Final Grade Calculation) 

Problem sets are complemented with programming exercises (Labs). Labs are aimed at giving you practical experience with several algorithms used in Reinforcement Learning problems. Labs are provided as Jupyter Notebooks. You can either run them on your computer or use the Jupyter Hub that is dedicated to this class: http://rlhub.ece.jhu.edu

Labs are graded as follows:

Course Project (30% of Final Grade Calculation)

The course also has a final group project aimed at applying reinforcement learning algorithms to solve a practical problem. The project can be done in groups of 1 to 3 participants. Deliverables include a project proposal, a final presentation (to be recorded and submitted), and a project report.

The course project is evaluated by the following grading elements:

  1. Project Proposal providing a description of the proposed problem to be solved using reinforcement learning (15%)
  2. Project Presentation video describing the proposed solution and illustrating the outcomes (35%)
  3. Project Report providing a detailed report of the outcomes of the project (50%)

Project Proposal is graded as follows:

Presentation is graded as follows:

Final Report is evaluated as follows:

Grading Policy

Score Range   Letter Grade
100-97        A+
96-94         A
93-91         A−
90-88         B+
87-85         B
84-78         B−
77-74         C+
73-69         C
68-64         C−
63-59         D+
58-55         D
<55           F
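
As an illustration only, the component weights listed above (10% participation, 30% assignments, 30% labs, 30% course project) and the score ranges in the table can be combined as in the following minimal Python sketch. The function names and the example scores are hypothetical. The sketch also assumes that each component is scored on a 0-100 scale and that a fractional score falls into the band whose lower bound it meets; the syllabus does not state how fractional scores are rounded, so the instructor's actual calculation may differ.

```python
# Component weights as stated in this syllabus (fractions of the final grade).
WEIGHTS = {
    "participation": 0.10,
    "assignments": 0.30,
    "labs": 0.30,
    "project": 0.30,
}

# Lower bound of each band in the grading table, highest first.
# ASCII "-" is used in place of the typographic minus sign.
CUTOFFS = [
    (97, "A+"), (94, "A"), (91, "A-"),
    (88, "B+"), (85, "B"), (78, "B-"),
    (74, "C+"), (69, "C"), (64, "C-"),
    (59, "D+"), (55, "D"),
]


def final_score(scores):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a score to a letter grade using the table's lower bounds.

    Assumption: no rounding is applied, so 96.5 maps to "A", not "A+".
    """
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical component scores, for illustration only.
example = {"participation": 90, "assignments": 85, "labs": 92, "project": 88}
total = final_score(example)
print(round(total, 1), letter_grade(total))
```

With these hypothetical scores, the weighted total is 0.10·90 + 0.30·85 + 0.30·92 + 0.30·88 = 88.5, which falls in the 90-88 band.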
 

Course Policies

Plagiarism
Plagiarism is defined as taking the words, ideas or thoughts of another and representing them as one's own. If you use the ideas of another, provide a complete citation in the source work; if you use the words of another, present the words in the correct quotation notation (indentation or enclosed in quotation marks, as appropriate) and include a complete citation to the source.

Solutions
Solutions to homework assignments and labs will be posted on the first Monday after each module ends.

Late Submissions
 
You are allowed two late submissions for your homework assignments or labs without penalty. After those two, a penalty will be applied depending on how late the submission is received. No submissions will be accepted after the solutions are released.



Academic Policies

Deadlines for Adding, Dropping and Withdrawing from Courses

Students may add a course up to one week after the start of the term for that particular course. Students may drop courses according to the drop deadlines outlined in the EP academic calendar (https://ep.jhu.edu/student-services/academic-calendar/). Between the 6th week of the class and the final withdrawal deadline, a student may withdraw from a course with a W on their academic record. A record of the course will remain on the academic record with a W appearing in the grade column to indicate that the student registered and withdrew from the course.

Academic Misconduct Policy

All students are required to read, know, and comply with the Johns Hopkins University Krieger School of Arts and Sciences (KSAS) / Whiting School of Engineering (WSE) Procedures for Handling Allegations of Misconduct by Full-Time and Part-Time Graduate Students.

This policy prohibits academic misconduct, including but not limited to the following: cheating or facilitating cheating; plagiarism; reuse of assignments; unauthorized collaboration; alteration of graded assignments; and unfair competition. Course materials (old assignments, texts, or examinations, etc.) should not be shared unless authorized by the course instructor. Any questions related to this policy should be directed to EP’s academic integrity officer at ep-academic-integrity@jhu.edu.

Students with Disabilities - Accommodations and Accessibility

Johns Hopkins University values diversity and inclusion. We are committed to providing welcoming, equitable, and accessible educational experiences for all students. Students with disabilities (including those with psychological conditions, medical conditions and temporary disabilities) can request accommodations for this course by providing an Accommodation Letter issued by Student Disability Services (SDS). Please request accommodations for this course as early as possible to provide time for effective communication and arrangements.

For further information or to start the process of requesting accommodations, please contact Student Disability Services at Engineering for Professionals, ep-disability-svcs@jhu.edu.

Student Conduct Code

The fundamental purpose of the JHU regulation of student conduct is to promote and to protect the health, safety, welfare, property, and rights of all members of the University community as well as to promote the orderly operation of the University and to safeguard its property and facilities. As members of the University community, students accept certain responsibilities which support the educational mission and create an environment in which all students are afforded the same opportunity to succeed academically. 

For a full description of the code please visit the following website: https://studentaffairs.jhu.edu/policies-guidelines/student-code/

Classroom Climate

JHU is committed to creating a classroom environment that values the diversity of experiences and perspectives that all students bring. Everyone has the right to be treated with dignity and respect. Fostering an inclusive climate is important. Research and experience show that students who interact with peers who are different from themselves learn new things and experience tangible educational outcomes. At no time in this learning process should someone be singled out or treated unequally on the basis of any seen or unseen part of their identity. 
 
If you have concerns in this course about harassment, discrimination, or any unequal treatment, or if you seek accommodations or resources, please reach out to the course instructor directly. Reporting will never impact your course grade. You may also share concerns with your program chair, the Assistant Dean for Diversity and Inclusion, or the Office of Institutional Equity. In handling reports, people will protect your privacy as much as possible, but faculty and staff are required to officially report information for some cases (e.g. sexual harassment).

Course Auditing

When a student enrolls in an EP course with “audit” status, the student must reach an understanding with the instructor as to what is required to earn the “audit.” If the student does not meet those expectations, the instructor must notify the EP Registration Team [EP-Registration@exchange.johnshopkins.edu] in order for the student to be retroactively dropped or withdrawn from the course (depending on when the "audit" was requested and in accordance with EP registration deadlines). All lecture content will remain accessible to auditing students, but access to all other course material is left to the discretion of the instructor.