605.649.82 - Introduction to Machine Learning

Computer Science
Spring 2024


Analyzing large data sets (“Big Data”) is an increasingly important skill set. One of the disciplines being relied upon for such analysis is machine learning. In this course, we will approach machine learning from a practitioner’s perspective. We will examine the issues that impact our ability to learn good models (e.g., inductive bias, the curse of dimensionality, the bias-variance dilemma, and no free lunch). We will then examine a variety of approaches to learning models, covering the spectrum from unsupervised to supervised learning, as well as parametric versus non-parametric methods. Students will explore and implement several learning algorithms, including logistic regression, nearest neighbor, decision trees, and feed-forward neural networks, and will incorporate strategies for addressing the issues impacting performance (e.g., regularization, clustering, and dimensionality reduction). In addition, students will engage in online discussions, focusing on the key questions in developing learning systems. At the end of this course, students will be able to implement and apply a variety of machine learning methods to real-world problems and to assess the performance of these algorithms on different types of data sets. Prerequisite(s): EN.605.202 – Data Structures or equivalent.


Shane Strasser


Course Structure

Details on the course structure can be found in the Course Outline. Each course module runs for a period of seven (7) days, i.e., one week. Due dates for readings and other assignments are referred to by the day of the module week in which they are due. For example, if a reading assignment is to be completed by Day 3 and the module started on Monday, then the reading assignment should be completed by Wednesday or the 3rd day of the module. Please refer to the Course Outline for the specific start and end dates for each module in this course.
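The day-of-module convention above amounts to simple date arithmetic. The following Python sketch illustrates it under the assumption that Day 1 is the module's start date; the function name `due_date` and the sample start date are hypothetical, and the Course Outline remains the authoritative source for actual dates.

```python
from datetime import date, timedelta


def due_date(module_start: date, day_of_module: int) -> date:
    """Return the calendar date for a given day of a seven-day module."""
    if not 1 <= day_of_module <= 7:
        raise ValueError("day_of_module must be between 1 and 7")
    # Day 1 is the module start date itself, so subtract one before adding.
    return module_start + timedelta(days=day_of_module - 1)


if __name__ == "__main__":
    # Hypothetical module that starts on a Monday.
    start = date(2024, 1, 22)
    print(due_date(start, 3).isoformat())
```

With a Monday start, Day 3 falls on the Wednesday of the same week, which matches the example in the paragraph above.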


All students must complete a quiz on the Syllabus and Course Information and receive a score of at least 90% before the content modules will be released. Questions about the syllabus and course information should be asked of the instructor prior to taking the quiz. Once the quiz is passed, the instructor will assume all students understand the course expectations.

Course Topics

  1. Non-parametric Learning

  2. Clustering

  3. Bayesian Learning

  4. Decision Trees

  5. Ensembles

  6. Dimensionality Reduction

  7. Rule Learning

  8. Linear Models

  9. Linear Networks

  10. Multi-Layer Networks

  11. Deep Learning

  12. Reinforcement Learning

  13. Temporal Difference Learning

  14. Deep Reinforcement Learning

Course Goals

To develop a broad understanding of the issues in developing and implementing machine learning algorithms and systems, especially as they relate to modern, data-intensive problems.

Course Learning Outcomes (CLOs)



Alpaydin, E. (2020). Introduction to machine learning (4th ed.). Cambridge, MA: The MIT Press. ISBN 9780262043793.

Prior editions of this book are strongly discouraged and may be used only at your own risk.


Mitchell, T. M. (1997). Machine learning. New York, NY: McGraw-Hill. ISBN-10: 0070428077; ISBN-13: 9780070428072.

Bishop, C. M. (2006). Pattern recognition and machine learning. New York, NY: Springer. ISBN-10: 0387310738; ISBN-13: 9780387310732.

Other Materials & Online Resources

Speakers/audio output (or headsets), webcam, and microphones are required for this course.

Student Coursework Requirements

Course Expectations

This is a very high workload course and is designed as a graduate-level computer science course. It has also been designed to prepare students interested in a more technical/research-based experience in the design of their degree programs. All students, but especially those in programs other than the computer science MS or postgraduate certificate program, should therefore be aware of the following expectations.

Programming Assignments Guidelines

The programming assignments are designed to give you experience implementing key machine learning algorithms from scratch. You will implement the algorithms to test their performance on several data sets from the UCI Machine Learning Repository. For these assignments, you are required to submit source code, short videos that demonstrate proper functioning of the code, and a brief paper describing the results of your experiments.

You may use any higher-order programming language you wish (e.g., Java, Python, C#); however, you are not permitted to use available machine learning libraries such as MATLAB toolboxes, WEKA, RapidMiner, scikit-learn, or TensorFlow. You are not permitted to use the following languages for implementing the algorithms, but you may use them to support analysis of the results: SQL, MATLAB, Maple, Mathematica, and R. All algorithms must be implemented from scratch by you. Basic libraries for managing data structures and fundamental math operations (e.g., NumPy or Pandas) are permitted. You are not permitted to use Stack Overflow or any of the Stack Exchange forums under any circumstances.

To facilitate the grading of programming assignments, please adhere to the following:

The report you provide should be prepared with a word processor or LaTeX. Use the equation and pseudo-code editing capabilities of whatever tool you use. Make sure you submit a PDF of your report. MS Word, Open Office/Libre Office, and LaTeX files will not be accepted. Submit your report separately, i.e., do NOT include it in your zip file.

Your report must include the following:

A sample report is provided in Blackboard. Note that the sample report specifies requirements for a more formal report than what is specified here. You are only required to satisfy the requirements above.

Participation Grading Criteria

Active student participation is an essential part of any online course. Therefore, part of the student's grade (30%) will be based on class participation. There are two components to the class participation grade – muddy posts/responses and small group discussions.

CAUTION: Be advised that you will be assigned to two different groups – one for muddy point exercises and one for small group assignments. Members of these groups are not the same and either or both may change as the semester progresses.

Muddy Point/Response

During the last module of every module group, students will be required to post a "Muddiest Point" message to the open discussion forum associated with the topic. Specifically, the student shall post a comment that identifies a part of the module that was particularly confusing, thus needing clarification. This must be done by Day 3 of the week when the module is presented. Students will be paired with one or two other students, and one of the partners will post a clarifying response in the same open discussion forum within two days of the initial posting (i.e., by Day 5). This response must constitute a serious, substantive attempt to answer the question posed in the muddy point and will be graded accordingly. Simply referring to an external website (e.g., Wikipedia) is not sufficient. The responder must demonstrate that they have attempted to gain a solid understanding of the answer. Thus students will be evaluated based on timeliness in posting the Muddiest Point as well as their ability to provide clarifying responses to their partners' Muddiest Points. Later, the instructor will add additional clarifying information if necessary.

Types of questions that are not accepted for muddy points include the following:

For grading, each muddy response will be scored based on timeliness, completeness, and correctness (1: on time, complete, and correct; 0.6–0.9: late or on time but with some deficiencies; 0.5: late, incorrect, or answers a question not asked; 0: unacceptable). Each muddy post will also receive one of three scores (1: on time and substantive; 0.5: late or not substantive; 0: no post or unacceptable). The response is weighted 50% more than the initial post. Each week’s muddy score is calculated as the post score plus 1.5 times the response score, with this total divided by 2.5 to obtain a percentage. This score (out of 100) is what will be posted in Canvas. All of the muddy scores are averaged for the final muddy point grade.
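As a rough illustration of the muddy score arithmetic described above, the following Python sketch computes one week's percentage from a post score and a response score, then averages several weeks. The function names are hypothetical and chosen only for this example; this is a sketch of the stated formula, not the instructor's actual grading tool.

```python
def weekly_muddy_score(post: float, response: float) -> float:
    """Combine a post score and a response score (each in [0, 1]) into a percentage."""
    for name, value in (("post", post), ("response", response)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score must be between 0 and 1")
    # The response counts 1.5 times as much as the post; 2.5 is the maximum total.
    return 100.0 * (post + 1.5 * response) / 2.5


def final_muddy_grade(weekly_scores: list[float]) -> float:
    """Average the weekly percentages to obtain the final muddy point grade."""
    if not weekly_scores:
        raise ValueError("at least one weekly score is required")
    return sum(weekly_scores) / len(weekly_scores)


if __name__ == "__main__":
    # An on-time, substantive post (1) paired with a late response (0.5).
    print(weekly_muddy_score(1.0, 0.5))
```

For instance, a post score of 1 and a response score of 0.5 give (1 + 0.75) / 2.5 = 0.70, which would be posted as 70 out of 100.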

A muddy point may contain one, and only one, target question to be addressed. If multiple questions are posted, a penalty will be applied to the muddy point part of the grade. Furthermore, the muddy buddy responding may then choose which question to answer and is not required to address all of them.

When there are groups of three people, a “round robin” response policy is enforced. This means that everyone needs to be the primary responder to one other person in the group. Suppose a group is made up of Alice, John, and Bob. Two scenarios are possible: 1) Alice responds to Bob who responds to John who responds to Alice, or 2) Alice responds to John who responds to Bob who responds to Alice. The order chosen is entirely up to the group and may change from week to week if so desired. But anyone who violates round robin will automatically receive a zero on the response part of the assignment.
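The round robin rule can be read as a simple check on who responds to whom: nobody responds to themselves, and every member receives exactly one primary response. The Python sketch below encodes that reading; the function name `is_round_robin` and the dictionary representation are assumptions made for illustration only.

```python
def is_round_robin(responds_to: dict[str, str]) -> bool:
    """Check that each member responds to exactly one other member
    and that each member receives exactly one response."""
    members = set(responds_to)
    targets = list(responds_to.values())
    # Nobody may be the primary responder to their own muddy point.
    no_self_response = all(a != b for a, b in responds_to.items())
    # Every member must be answered, and by only one primary responder.
    everyone_answered_once = set(targets) == members and len(set(targets)) == len(targets)
    return no_self_response and everyone_answered_once


if __name__ == "__main__":
    scenario_1 = {"Alice": "Bob", "Bob": "John", "John": "Alice"}
    print(is_round_robin(scenario_1))
```

Both scenarios listed above pass this check, whereas an arrangement in which Alice and John both respond to Bob does not, because Alice and John would each be left without a primary responder.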

The muddy point/response part of the course is worth 15% of the final grade.

Small Group Discussions

In addition to the muddy point exercises, during the first module in each module group (except for the first module pair), an open discussion question will be posted in the main discussion forums. Each student will be grouped with two or three other students and assigned to a “Group” within Canvas. During the week, the group is to engage in an ongoing discussion on the question posted. You are asked to keep the discussion in one thread per assignment. Each student is required to post at least five times on at least three different days. Non-substantive posts (e.g., “I agree,” or “I need to think about that more”) will not be counted.

During each discussion, each substantive post will receive 1 point (up to 5 points total), and each day posted will receive 1 point (up to 3 points total). Thus full credit will be 8 points. The score posted in Canvas will be a percentage (a score out of 100) based on this 8-point total. Up to two additional points may be assigned at the instructor’s discretion for particularly lively discussions. These additional points will not be “extra credit” but can be used to “make up” points lost elsewhere.
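The 8-point discussion score described above can be sketched in Python as follows. The function name is hypothetical, and the discretionary additional points are deliberately left out because how they are applied is up to the instructor.

```python
def discussion_score(substantive_posts: int, days_posted: int) -> float:
    """Return the small group discussion score as a percentage of 8 points."""
    if substantive_posts < 0 or days_posted < 0:
        raise ValueError("counts must be non-negative")
    post_points = min(substantive_posts, 5)  # 1 point per substantive post, capped at 5
    day_points = min(days_posted, 3)         # 1 point per distinct day, capped at 3
    return 100.0 * (post_points + day_points) / 8


if __name__ == "__main__":
    # Four substantive posts spread over three days: 7 of 8 points.
    print(discussion_score(4, 3))
```

Under this reading, four substantive posts on three different days earn 7 of 8 points, or 87.5 out of 100, and posting more than five times or on more than three days does not raise the score further.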

The small group discussion part of the course is worth 15% of the final grade.

Short Quiz Guidelines

During the second module in each module pair (01-02, 03-04, etc.), students will be required to complete a short, 10-question objective-style quiz. The point of the quiz is to provide a “formative assessment” where both the student and the instructor can gain a sense for how well students are learning the material. Because the quizzes are formative, they only account for 10% of the final grade.

Each student will have 30 minutes to complete each quiz, and two attempts will be permitted. All of the questions will be objective-style (multiple choice or true-false), and there will only be 10 questions. No mechanism will be employed to determine if the student is using outside resources to take the quiz (e.g., the book, notes, the web, etc.); however, students are asked to take the quiz with no such resources. This is the best way for the instructor to gain a sense of the level of knowledge of the student.

Remember to answer all questions on both attempts. The system is set up to take your best attempt; it does not average attempts.

The quizzes are designed to show all 10 questions at once. Answers will be provided once the student submits their answers. While the time for the quiz is set for 30 minutes, it is possible to take longer. Even so, the student should strive to complete the quiz within the 30-minute timeframe.

Quiz feedback will be released automatically after the due date passes.


Grading Policy

Grading will be based on biweekly programming assignments, muddy point exercises, small group discussions, and short quizzes. Final grades will be determined by the following weighting:

Item: % of Grade
Muddy Post Discussions (6): 15%
Small Group Discussions (5): 15%
Short Quizzes (6): 10%
Programming Projects (5): 60%

Each programming assignment will outline the specific requirements and steps to be taken to complete the assignment with associated weights.

In terms of assigning a final letter grade, the following is provided as the default scheme. If deemed appropriate, the instructor may adjust these grades downward, for example, to achieve a target of 20% A’s.

Score Range

Students at risk of receiving a C or lower in the class are identified at midterm as part of the course roster verification. If you do not receive notice from the university or the professor following midterm, you are on track to receive at least a B- in the class.

Course Policies

Late Submission Policy

Because we are all working professionals and time management is of critical importance, this section lays out the course policy on completing course assignments.

The policy of this course is that no late submissions will be accepted.

Note that I recognize exceptional circumstances may arise, and I am willing to work with students when they do.

Therefore, the following additional requirements are put in place:
  1. If you must travel for business and will have limited Internet connectivity, then you must notify the instructor at least one week prior to travel to make arrangements for handling assignments. Failure to provide this advance notice will result in all due dates being enforced.
  2. If you are traveling on vacation, then all due dates remain enforced. Personal travel is not an excuse to relax the due dates.
  3. If there is a family emergency (e.g., a death in the family or a serious illness), then you must notify the instructor as soon as possible to make arrangements for handling the assignments. 
  4. If you become personally ill, then it is important for you to take care of your health; however, since we are not meeting in person in a classroom, meaning that spread of disease is not an issue, only illnesses or injuries that require professional medical attention will receive special handling. Otherwise, all due dates will be enforced.
  5. Under no circumstances will time management issues result in a relaxation of due dates. Poor time management is never an acceptable excuse.
  6. Special accommodations are available for students who register a disability with the university. Those accommodations will be worked out with SDS.

Policy on Using AI Large Language Models

This class will strive to create an environment that fosters learning, critical thinking, effective communication, and technology development. To achieve these goals, the use of AI-based tools such as ChatGPT, Copilot, or similar is prohibited in this course.

While ChatGPT and other large language models can be powerful and useful tools in certain contexts, relying on them for this course undermines the learning objectives. You are being trained to understand the fundamentals of machine learning, rather than to become a user of AI or ML tools. This approach involves developing skills in independent thinking, problem-solving, and engagement with the subject matter. By restricting the use of large language models, your knowledge, creativity, and critical analysis will be used to complete all assignments and actively participate in class discussions.

It is important to note that this requirement applies to all aspects of the course, including programming assignments, writing assignments, discussion assignments, quizzes, and any form of communication related to the course content. Any use of AI large language models, including ChatGPT, during these activities will be considered a violation of the student code of conduct and will be reported as academic misconduct.

Questions about these policies should be directed to the instructor.

Policy on the Use of Code Repositories

For purposes of project management and version control, code repositories such as GitHub and GitLab have demonstrated tremendous benefit. Because of this, students are strongly encouraged to use such repositories for their projects. Be that as it may, there is evidence that such code repositories can also be abused with respect to making previous assignments available to the public, thus fostering academic misconduct. For these reasons the following policies with respect to code repositories are put in place.

1. Should a student decide to use a code repository to manage their programming projects, those repositories must be private and must remain private beyond the end of the course.

2. Re-emphasizing the above, under no circumstances may code from this course be made public, either through a public repository or through posting to another site that collects assignments or code.

3. Under no circumstances may the assignments themselves be posted to a public repository or public site.

4. Should a student discover the assignments have been posted somewhere on the web, the student is asked to report the site to the instructor as soon as possible.

5. Should a student discover the assignments in a public place, under no circumstances shall the student view, download, or otherwise use the information contained on that site.

Questions about these policies should be directed to the instructor.

Academic Policies

Deadlines for Adding, Dropping and Withdrawing from Courses

Students may add a course up to one week after the start of the term for that particular course. Students may drop courses according to the drop deadlines outlined in the EP academic calendar (https://ep.jhu.edu/student-services/academic-calendar/). Between the 6th week of the class and prior to the final withdrawal deadline, a student may withdraw from a course with a W on their academic record. A record of the course will remain on the academic record with a W appearing in the grade column to indicate that the student registered and withdrew from the course.

Academic Misconduct Policy

All students are required to read, know, and comply with the Johns Hopkins University Krieger School of Arts and Sciences (KSAS) / Whiting School of Engineering (WSE) Procedures for Handling Allegations of Misconduct by Full-Time and Part-Time Graduate Students.

This policy prohibits academic misconduct, including but not limited to the following: cheating or facilitating cheating; plagiarism; reuse of assignments; unauthorized collaboration; alteration of graded assignments; and unfair competition. Course materials (old assignments, texts, or examinations, etc.) should not be shared unless authorized by the course instructor. Any questions related to this policy should be directed to EP’s academic integrity officer at ep-academic-integrity@jhu.edu.

Students with Disabilities - Accommodations and Accessibility

Johns Hopkins University values diversity and inclusion. We are committed to providing welcoming, equitable, and accessible educational experiences for all students. Students with disabilities (including those with psychological conditions, medical conditions and temporary disabilities) can request accommodations for this course by providing an Accommodation Letter issued by Student Disability Services (SDS). Please request accommodations for this course as early as possible to provide time for effective communication and arrangements.

For further information or to start the process of requesting accommodations, please contact Student Disability Services at Engineering for Professionals, ep-disability-svcs@jhu.edu.

Student Conduct Code

The fundamental purpose of the JHU regulation of student conduct is to promote and to protect the health, safety, welfare, property, and rights of all members of the University community as well as to promote the orderly operation of the University and to safeguard its property and facilities. As members of the University community, students accept certain responsibilities which support the educational mission and create an environment in which all students are afforded the same opportunity to succeed academically. 

For a full description of the code please visit the following website: https://studentaffairs.jhu.edu/policies-guidelines/student-code/

Classroom Climate

JHU is committed to creating a classroom environment that values the diversity of experiences and perspectives that all students bring. Everyone has the right to be treated with dignity and respect. Fostering an inclusive climate is important. Research and experience show that students who interact with peers who are different from themselves learn new things and experience tangible educational outcomes. At no time in this learning process should someone be singled out or treated unequally on the basis of any seen or unseen part of their identity. 
If you have concerns in this course about harassment, discrimination, or any unequal treatment, or if you seek accommodations or resources, please reach out to the course instructor directly. Reporting will never impact your course grade. You may also share concerns with your program chair, the Assistant Dean for Diversity and Inclusion, or the Office of Institutional Equity. In handling reports, people will protect your privacy as much as possible, but faculty and staff are required to officially report information for some cases (e.g., sexual harassment).

Course Auditing

When a student enrolls in an EP course with “audit” status, the student must reach an understanding with the instructor as to what is required to earn the “audit.” If the student does not meet those expectations, the instructor must notify the EP Registration Team [EP-Registration@exchange.johnshopkins.edu] in order for the student to be retroactively dropped or withdrawn from the course (depending on when the "audit" was requested and in accordance with EP registration deadlines). All lecture content will remain accessible to auditing students, but access to all other course material is left to the discretion of the instructor.