705.741.8VL - Reinforcement Learning

Artificial Intelligence
Fall 2024

Description

This course will focus on both the theoretical and the practical aspects of designing, training, and testing reinforcement learning systems. The course begins with an examination of Markov decision processes (MDPs), which provide a sound mathematical basis for modeling and solving complex sequential decision problems. The more traditional analytical method for solving MDPs, dynamic programming, will be reviewed. We will then examine the major reinforcement learning approaches, such as Monte Carlo methods, temporal difference methods, policy gradient methods, and deep learning methods, comparing them as appropriate to dynamic programming techniques. Fundamental issues and limitations on the performance of reinforcement learning algorithms (e.g., the credit assignment problem, the exploration / exploitation tradeoff, on-policy learning versus off-policy learning, partial observability, and algorithm convergence properties) will be examined for each approach. Weekly exercises and discussion topics will reinforce and expand on the classroom material. In addition, students will gain practical experience during a semester-long project by programming, training, and testing various reinforcement learning algorithms.

Instructor

Course Structure

The course materials are divided into modules which can be accessed by clicking Modules on the course menu. A module will have several sections including the lecture content, readings, and assignments. Most modules run for a period of seven (7) days, with exceptions noted in the Course Outline. You should regularly check the Calendar and Announcements for assignment due dates.

Course Topics

Course Goals

To develop a broad understanding of the issues in developing, implementing, and analyzing reinforcement learning algorithms and systems.

Course Learning Outcomes (CLOs)

Textbooks

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.

ISBN: 9780262039246

Required Software

The projects require a number of the algorithms covered in the course to be implemented and tested.  The use of Python is strongly encouraged, but if necessary the student may choose from among other high-level computer languages (for example, C++ or Java).  MATLAB and R may not be used except for data analyses. 

Student Coursework Requirements

It is expected that each module will take approximately 9–10 hours per week to complete. Here is an approximate breakdown: reading the assigned sections of the texts (approximately 2 hours per week), solving problem sets (approximately 2 hours per week), and completing project assignments (approximately 5–6 hours per week).

This course will consist of the following basic student requirements:

Computer Projects (65% of Final Grade Calculation)

A semester-long computer project, as well as several smaller projects, will be assigned over the course of the semester. Each project will have a separate written description that includes the specific grading criteria for that project.

As noted above, each project will require the student to develop and test the machine learning algorithm(s) as specified in the project description.  It is expected that each student is proficient in at least one higher-level language. The programs must be your own work! 

The use of Python is strongly encouraged, but if necessary the student may choose from among other high-level computer languages. For machine learning, popular languages (in addition to Python) include Java and C/C++/C#. For the programming assignments, only basic libraries may be used unless otherwise specified in the project assignment. Libraries such as scikit-learn, Weka, RapidMiner, mlpack, Keras, Theano, TensorFlow, PyTorch, or similar are not permitted. The use of MATLAB or R for algorithm implementation is not permitted, but they may be used to support analyses of the results of experiments or to generate graphics. If there are any questions about the use of a given language, library, or tool, please ask your instructor.

The submission requirements, along with the grading criteria, will vary by project and will therefore be provided explicitly in each project description.

Problem Sets (30% of Final Grade Calculation)

Homework assignments will be made during selected modules in support of the material covered by the course.  The homework will be due as shown in the course schedule.  The homework solutions must be your own work.  All answers on the homework must be justified by showing the reasoning and/or the calculations used to obtain the answers.


Preparation and Participation (5% of Final Grade Calculation)

You are responsible for carefully reading all assigned material and being prepared for the class lectures and discussions. The majority of readings are from the course text. Additional reading may be assigned to supplement text readings.

Grading Policy

Assignments are due according to the dates posted on the Canvas course site. You may check these due dates in the Course Calendar.  Submissions up to one week late will be penalized by the reduction of one letter grade; submissions later than one week beyond the due date will not be accepted.  Requests for extensions that are received in advance of the corresponding deadlines will be evaluated on a case-by-case basis.

The ability to clearly and concisely communicate the hypotheses, procedures, and results of scientific research is expected of all science and engineering professionals. Egregious violations of the rules of the English language will be taken as an indication of poor written communication ability and should be expected to detract from your grade.

I will strive to post grades within one week after the assignment due dates.

A grade of A indicates achievement of consistent excellence and distinction throughout the course—that is, conspicuous excellence in all aspects of assignments and discussion in every week.

A grade of B indicates work that meets all course requirements on a level appropriate for graduate academic work.

EP uses a +/- grading system (see “Grading System”, Graduate Programs catalog, p. 10).

Score Range = Letter Grade
100-97 = A+
96-93 = A
92-90 = A−
89-87 = B+
86-83 = B
82-80 = B−
79-77 = C+
76-73 = C
72-70 = C−
69-67 = D+
66-63 = D
<63 = F

Course Policies

This course is delivered in a 'virtual live' format.  The lectures are not pre-recorded as in the 'online' format, but rather are delivered live during the scheduled class time and simultaneously recorded for use by students to supplement their learning between lectures.  Students are expected to attend the class lectures during the scheduled class time and to participate by asking and answering questions as appropriate.  If you need to miss a scheduled class, please let the instructor know ahead of time.

Software developed for this course may never be posted in a public code repository or on a public website, even following the completion of the course. To do otherwise will be considered academic misconduct. 

Students are expected to do all of their own work.  The use of generative AI tools such as OpenAI’s ChatGPT or Microsoft’s Copilot is prohibited for this course.  Failure to provide citations and references to outside technical or programming resources used in the completion of assignments will be regarded as plagiarism.

Academic Policies

Deadlines for Adding, Dropping and Withdrawing from Courses

Students may add a course up to one week after the start of the term for that particular course. Students may drop courses according to the drop deadlines outlined in the EP academic calendar (https://ep.jhu.edu/student-services/academic-calendar/). From the 6th week of the class until the final withdrawal deadline, a student may withdraw from a course with a W on their academic record. A record of the course will remain on the academic record with a W appearing in the grade column to indicate that the student registered and withdrew from the course.

Academic Misconduct Policy

All students are required to read, know, and comply with the Johns Hopkins University Krieger School of Arts and Sciences (KSAS) / Whiting School of Engineering (WSE) Procedures for Handling Allegations of Misconduct by Full-Time and Part-Time Graduate Students.

This policy prohibits academic misconduct, including but not limited to the following: cheating or facilitating cheating; plagiarism; reuse of assignments; unauthorized collaboration; alteration of graded assignments; and unfair competition. Course materials (old assignments, texts, or examinations, etc.) should not be shared unless authorized by the course instructor. Any questions related to this policy should be directed to EP’s academic integrity officer at ep-academic-integrity@jhu.edu.

Students with Disabilities - Accommodations and Accessibility

Johns Hopkins University values diversity and inclusion. We are committed to providing welcoming, equitable, and accessible educational experiences for all students. Students with disabilities (including those with psychological conditions, medical conditions and temporary disabilities) can request accommodations for this course by providing an Accommodation Letter issued by Student Disability Services (SDS). Please request accommodations for this course as early as possible to provide time for effective communication and arrangements.

For further information or to start the process of requesting accommodations, please contact Student Disability Services at Engineering for Professionals, ep-disability-svcs@jhu.edu.

Student Conduct Code

The fundamental purpose of the JHU regulation of student conduct is to promote and to protect the health, safety, welfare, property, and rights of all members of the University community as well as to promote the orderly operation of the University and to safeguard its property and facilities. As members of the University community, students accept certain responsibilities which support the educational mission and create an environment in which all students are afforded the same opportunity to succeed academically. 

For a full description of the code please visit the following website: https://studentaffairs.jhu.edu/policies-guidelines/student-code/

Classroom Climate

JHU is committed to creating a classroom environment that values the diversity of experiences and perspectives that all students bring. Everyone has the right to be treated with dignity and respect. Fostering an inclusive climate is important. Research and experience show that students who interact with peers who are different from themselves learn new things and experience tangible educational outcomes. At no time in this learning process should someone be singled out or treated unequally on the basis of any seen or unseen part of their identity. 
 
If you have concerns in this course about harassment, discrimination, or any unequal treatment, or if you seek accommodations or resources, please reach out to the course instructor directly. Reporting will never impact your course grade. You may also share concerns with your program chair, the Assistant Dean for Diversity and Inclusion, or the Office of Institutional Equity. In handling reports, people will protect your privacy as much as possible, but faculty and staff are required to officially report information for some cases (e.g. sexual harassment).

Course Auditing

When a student enrolls in an EP course with “audit” status, the student must reach an understanding with the instructor as to what is required to earn the “audit.” If the student does not meet those expectations, the instructor must notify the EP Registration Team [EP-Registration@exchange.johnshopkins.edu] in order for the student to be retroactively dropped or withdrawn from the course (depending on when the "audit" was requested and in accordance with EP registration deadlines). All lecture content will remain accessible to auditing students, but access to all other course material is left to the discretion of the instructor.