685.701.81 - Data Science: Modeling and Analytics
Data Science
Fall 2025
Description
This course advances the design of data modeling as it applies to data science, leveraging key concepts from AI, machine learning, and statistics. Data modeling combines several fields to process a variety of data types and to represent the data expressively, showing the relationships between data points and their intrinsic patterns. The course will show how to identify, design, and implement the modeling process by outlining the framework, determining the appropriate model type, evaluating the model, and representing the outputs in an explainable way. The models used will be based on intelligent algorithms (reasoning, optimization, and pattern recognition), machine learning algorithms (supervised and unsupervised), and statistical methods (descriptive statistics, inferential statistics, multivariate analysis, and regression). The focus will be on developing models with Python-based frameworks and applying them to datasets from online resources such as Kaggle, Data.gov, and open-source repositories.
Course Structure
The course materials are divided into 14 modules, which can be accessed by clicking Modules on the course menu. Each module has several sections, including the overview, content, readings, discussions, and assignments. You are encouraged to preview all sections of a module before starting it. Each module runs for seven (7) days. A wrap-up of each module is contained in the Course Outline, listed under Course Information, which you can reach by clicking Home on the course menu. You should regularly check the Calendar and Announcements for assignment due dates.
If you are taking the course in the Summer, please note that Modules 13 and 14 are released but optional.
Course Topics
- Introduction to Models for Data Science and Analytics
- Data Processing I
- Data Processing II
- Supervised Models for Data Science
- Unsupervised Models for Data Science
- Reinforcement Learning Models for Data Science
- Deep Learning Models
- Computer Vision for Data Science
- Natural Language Processing
- Evolutionary Algorithms and Models
- Decision-Making Algorithms and Models
- Hybrid Learning Models
- Explainable AI Models
- Methods of Analytics
- Advanced Analytics
Course Goals
The goal of this course is to equip students with the foundational and advanced concepts necessary to design, implement, and evaluate data models in the field of data science. By integrating principles from artificial intelligence, machine learning, and statistical methods, students will develop a strong understanding of various modeling techniques and their applications to real-world datasets. The course emphasizes a hands-on approach, utilizing Python-based frameworks to explore supervised, unsupervised, and reinforcement learning models, as well as advanced analytical and visualization tools. Additionally, students will gain experience in feature engineering, dimensionality reduction, decision-making algorithms, and explainable AI techniques to ensure transparency and interpretability in data-driven decision-making. Through a combination of theoretical learning, programming assignments, and applied projects, students will be prepared to tackle complex data science challenges across multiple domains.
Course Learning Outcomes (CLOs)
- Select appropriate models based on the problem's nature and data characteristics, ensuring suitability for real-world applications.
- Develop models, including supervised, unsupervised, and reinforcement learning approaches, using appropriate metrics to assess performance and reliability.
- Interpret model outputs to understand their impact on decision-making and ensure transparency and interpretability through explainable AI techniques.
- Apply AI methods to specific domains, such as computer vision and natural language processing, and construct practical solutions for diverse challenges.
- Employ advanced AI techniques, including evolutionary algorithms, decision-making systems, and hybrid models, to improve effectiveness and adaptability.
- Utilize analytics and visualization tools to communicate insights effectively and solve complex, real-world problems.
Textbooks
Optional (not required):
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2019). Mathematics for Machine Learning. Cambridge University Press.
Other Materials & Online Resources
- Bishop, C. (2006). Pattern Recognition and Machine Learning. Springer. https://www.microsoft.com/en-us/research/wp-content/uploads/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
- Poole, D., & Mackworth, A. (2017). Artificial Intelligence: Foundations of Computational Agents. Cambridge University Press. https://artint.info/
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. https://www.deeplearningbook.org/
- Kochenderfer, M. J., & Wheeler, T. A. (2019). Algorithms for Optimization. MIT Press. https://algorithmsbook.com/optimization/files/optimization.pdf
Required Software
Examples in this class may be given in Python; treat the example code provided as a guide. Students in the Data Science and Artificial Intelligence programs mainly use Python.
Some computer literacy skills you are expected to have include creating and submitting files in a word processing program, downloading and installing software, using spreadsheets, using presentation software, and using web conferencing tools and software. Additionally, you will be expected to use online search tools for academic purposes, properly cite information sources, and prepare a presentation of such findings.
Student Coursework Requirements
Grading consists of:
- Research Discussions (20%) – Over the semester, students will post research/resources discussions from module topics
- Homework Assignments (40%) - 4 assignments related to the module learning objectives
- Programming Assignments (40%) - 2 assignments covering the module learning objectives
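As a rough illustration of how the three weights above combine, the sketch below computes a weighted course score. The scores are made up, and the assumption that every assignment within a category counts equally is ours (the syllabus states this only for the research discussions); the Canvas gradebook remains the authoritative calculation.

```python
# Category weights taken from the grading breakdown above.
WEIGHTS = {"discussions": 0.20, "homework": 0.40, "programming": 0.40}


def final_score(scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps each category to a list of assignment scores (0-100).
    Assumption: assignments within a category count equally.
    """
    total = 0.0
    for category, weight in WEIGHTS.items():
        marks = scores[category]
        total += weight * sum(marks) / len(marks)
    return total


# Hypothetical scores, for illustration only.
example = {
    "discussions": [90, 85, 100, 95],  # four research discussions
    "homework": [88, 92, 79, 85],      # four homework assignments
    "programming": [91, 87],           # two programming assignments
}
print(round(final_score(example), 2))  # 0.2*92.5 + 0.4*86 + 0.4*89 = 88.5
```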
____________________________________________
Research Discussions (20% of Final Grade Calculation)
Each of the four research discussion assignments is worth 5% of the final grade. These assignments require students to:
- Select a research topic relevant to the module.
- Post findings and resources that demonstrate depth of research.
- Engage with peers through meaningful replies and discussion.
General Submission Requirements
- Topic Selection & Justification
  - Choose a research topic that is clearly connected to module content.
  - Provide a short justification for its importance in the context of Data Science Modeling & Analytics.
  - Clearly state research objectives or guiding questions.
- Research Post
  - Present a well-researched, self-written post that synthesizes credible resources.
  - Use at least two high-quality sources (academic papers, textbooks, industry reports, reputable websites).
  - Summarize findings, explain methodologies, and discuss implications.
  - Provide proper citations and references.
- Engagement with Peers
  - Reply to at least two classmates’ posts with meaningful engagement.
  - Responses should extend the discussion (critical analysis, insights, questions, or real-world connections).
  - Encourage dialogue by asking follow-up questions or suggesting directions for further research.
- Clarity, Organization & Professionalism
  - Posts and replies should be clear, concise, and well-structured.
  - Writing must be professional and free of grammar/spelling issues.
  - Follow proper formatting and academic citation standards.
- Timeliness & Consistency
  - Initial post submitted on or before the deadline.
  - Peer responses posted before the discussion closes.
  - Demonstrate consistent participation across discussions (not last-minute only).
Grading Criteria (100 Points per Assignment)
- Completeness & Topic Coverage — 20 pts
  - Topic clearly tied to module.
  - Justification for importance provided.
  - Research objectives/questions well-articulated.
- Research Depth, Resources & Critical Thinking — 30 pts
  - Post demonstrates depth of research.
  - At least two high-quality resources used and cited.
  - Key findings, methods, and implications explained thoughtfully.
- Engagement & Peer Responses — 25 pts
  - At least two substantive replies to peers.
  - Adds analysis, insights, or real-world applications.
  - Encourages ongoing discussion (e.g., follow-up questions).
- Clarity, Organization & Professionalism — 15 pts
  - Writing is clear, professional, and logically structured.
  - Proper grammar, spelling, and citation formatting.
- Timeliness & Consistency — 10 pts
  - Initial post on time; peer replies before close of discussion.
  - Evidence of consistent participation.
Total: 100 pts
____________________________________________________________________
Homework Assignment Guidelines & Grading Criteria (40% of Final Grade Calculation)
General Submission Requirements
- Assignment Types
  - Qualitative: literature reviews, model summaries, design rationales.
  - Quantitative: computations, modeling exercises, numerical analysis.
- Submission Format
  - Submit one Jupyter Notebook for the entire assignment or one Jupyter Notebook for each problem (.ipynb) — no external README/PDF/Word.
  - Notebook order:
    - Cover Page — name, course, HW #, date.
    - README (Execution & Setup) — Python version; required libraries + install steps; dataset source + download instructions; end-to-end run steps; hardware notes (GPU/CPU) and a CPU-friendly demo when relevant.
    - Grading Checklist (Embedded) — do not remove; self-audit before submitting.
    - Problem Sections — each problem (and sub-parts) clearly labeled.
    - References — cite datasets, external code/snippets, articles, and any AI tools you used (briefly state how).
- Inside Each Problem
  - Problem statement (in your own words)
  - Assumptions (preprocessing, metrics/threshold choices, modeling assumptions)
  - Code (fully executable)
  - Figures/Tables (titles, captions, axis labels/units)
  - Conclusions/Discussion (interpretation, trade-offs, limitations)
  - Re-runability (runs top-to-bottom without errors; avoid leakage; set seeds where appropriate)
- Deadlines & Late Submissions
  - Due per course calendar; one letter grade per week late unless pre-approved.
- Resubmissions
  - Case-by-case; include a brief changelog cell of fixes.
- Academic Integrity & AI-Use Disclosure
  - You may use resources/tools (including LLMs) for ideation, editing, or debugging; cite them and state how they helped.
  - You are responsible for understanding all submitted work. Be prepared to explain any part.
  - Do not post or share solutions publicly.
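The README cell described above asks for the Python version and the required libraries. One possible way to generate that information from inside the notebook is sketched below. It uses only the standard library; the function name `describe_environment` and the package list are illustrative assumptions, not a required template.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Return README-style lines with the Python version and package versions."""
    lines = [f"Python {platform.python_version()} ({sys.platform})"]
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing, so the cell always runs.
            lines.append(f"{name}: not installed")
    return lines


# Hypothetical package list; replace it with what the notebook actually imports.
for line in describe_environment(["numpy", "pandas", "scikit-learn"]):
    print(line)
```

The printed `name==version` lines can be pasted into the README cell next to the install steps, so a grader can recreate the same environment.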
Grading Criteria
- Completeness & Problem Coverage — default 20 pts
  - All parts/sub-questions answered; qualitative & quantitative addressed where applicable; coding tasks implemented.
- Writing Quality, Technical Accuracy & Justification — default 20 pts
  - Clear, concise graduate-level writing; technically correct; reasoning/justification for design choices and conclusions.
- Quantitative Work: Assumptions, Derivations & Calculations — default 20 pts
  - State assumptions before solving; show derivations/calculations in Markdown or code; present final results with units/precision.
- Code Quality, Documentation & Execution — default 20 pts
  - Runs top-to-bottom without errors (no “Traceback” in outputs).
  - Good naming, formatting, and meaningful comments; organized and efficient.
- Examples, Test Cases & Visuals — default 10 pts
  - Realistic examples/test cases; well-labeled outputs; figures/tables with titles, captions, labeled axes.
  - For imbalanced classification (e.g., Credit Card Fraud, NSL-KDD), include precision/recall/F1, ROC/PR (not accuracy alone).
- Notebook README & Reproducibility — default 10 pts
  - Python version; package list + install steps; dataset details + download instructions; complete run steps.
  - Reproducible elsewhere; seeds set; relative paths used.
Total: 100 pts
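To see why the criteria above ask for more than accuracy on imbalanced data, consider the minimal sketch below. It computes accuracy, precision, recall, and F1 in plain Python on a made-up label set with 5% positives. The data and the helper name `binary_metrics` are illustrative assumptions; in practice a library such as scikit-learn (`sklearn.metrics`) would normally be used.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data: 95 negatives and 5 positives; the model finds only one positive.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 95 + [1] + [0] * 4

for name, value in binary_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
```

Here accuracy is 0.96 even though four of the five positives are missed (recall 0.20, F1 about 0.33), which is the failure mode that accuracy alone hides.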
_________________________________________________________________________
Programming Assignment Requirements & Grading Criteria (40% of Final Grade Calculation)
General Submission Requirements
- Assignment Types
  - Two programming assignments will be assigned during the semester:
    - Programming Assignment 1 – Module 1
    - Programming Assignment 2 – Module 8
  - Focus: algorithm design, implementation, performance evaluation, and reproducibility.
- Submission Format
  - Submit one Jupyter Notebook for the entire assignment or one Jupyter Notebook for each problem (.ipynb) — no external README/PDF/Word.
  - Notebook must contain, in order:
    - Cover Page — name, course, assignment #, date.
    - README (Execution & Setup) — Python version; required libraries + install steps; dataset source + download instructions; run instructions; hardware notes (GPU/CPU) and CPU-friendly demo when relevant.
    - Grading Checklist (Embedded) — do not remove; self-audit before submitting.
    - Problem Sections — each programming task clearly labeled.
    - References — cite datasets, external code/snippets, articles, and any AI tools used (briefly state how).
- Inside Each Problem
  - Problem Statement — restate in your own words.
  - Assumptions — preprocessing choices, metrics/thresholds, modeling assumptions.
  - Pseudocode — step-by-step outline of your algorithm.
  - Code Implementation — fully executable Python code blocks.
  - Performance Analysis — discuss runtime complexity (time/space) and efficiency trade-offs.
  - Figures/Tables — plots/tables labeled with titles, captions, and axes.
  - Conclusions/Discussion — interpret results, trade-offs, and limitations.
  - Re-runability — notebook must run top-to-bottom without errors; set seeds where appropriate.
- Coding Standards & Documentation
  - Use clear, descriptive variable/function names.
  - Include meaningful comments for key logic and functions.
  - Organize code logically with best practices.
  - External code may be used if cited and explained.
- Deadlines & Late Submissions
  - Due per course calendar.
  - Late = one letter grade per week unless pre-approved.
- Resubmissions
  - Allowed on a case-by-case basis.
  - Include a brief changelog cell summarizing revisions.
- Academic Integrity & AI-Use Disclosure
  - Tools/resources (including LLMs) may be used for ideation, editing, or debugging.
  - Must cite them and describe how they were used.
  - You are responsible for understanding all submitted work — be prepared to explain it.
  - Do not share or post solutions publicly.
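The Performance Analysis item above asks for a discussion of time/space complexity and efficiency trade-offs. A minimal sketch of how such an analysis might be backed by measurements is shown below. It uses the standard-library `timeit` module; the duplicate-detection task and both function names are our own illustrative choices, not part of any assignment.

```python
import random
import timeit


def has_duplicates_quadratic(values):
    """O(n^2) time, O(1) extra space: compare every pair."""
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                return True
    return False


def has_duplicates_hashed(values):
    """O(n) expected time, O(n) extra space: remember values seen so far."""
    seen = set()
    for value in values:
        if value in seen:
            return True
        seen.add(value)
    return False


random.seed(0)  # fixed seed so the generated inputs are the same on every run
for n in (200, 400, 800):
    # Distinct values force both functions to scan the whole input (worst case).
    data = random.sample(range(10 * n), n)
    t_quad = timeit.timeit(lambda: has_duplicates_quadratic(data), number=3)
    t_hash = timeit.timeit(lambda: has_duplicates_hashed(data), number=3)
    print(f"n={n:4d}  quadratic={t_quad:.4f}s  hashed={t_hash:.4f}s")
```

Doubling `n` should roughly quadruple the quadratic timing while the hashed timing grows about linearly, at the cost of the extra set. Exact numbers depend on the machine, so report the trend rather than the raw values.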
Grading Criteria (100 Points Total)
- Completeness & Problem Coverage — default 20 pts
  - All parts of the assignment implemented: pseudocode, code, performance analysis, outputs.
  - Nothing missing or left vague.
- Writing Quality, Technical Accuracy & Justification — default 20 pts
  - Problem statements and explanations are clear, concise, technically correct, and well-justified.
  - Design choices and conclusions are explained logically.
- Quantitative Work: Assumptions, Derivations & Calculations — default 20 pts
  - Assumptions clearly listed before coding.
  - Any derivations or intermediate steps (e.g., formulas, metrics) shown in Markdown or code.
  - Final results correct, precise, and well-presented.
- Code Quality, Documentation & Execution — default 20 pts
  - Notebook runs without errors (no Tracebacks).
  - Clear names, clean formatting, meaningful comments.
  - Organized, efficient, and reproducible code.
- Examples, Test Cases & Visuals — default 10 pts
  - Includes labeled test cases with realistic inputs/outputs.
  - Figures/tables labeled with titles, captions, and axes.
  - For ML tasks: include appropriate metrics (precision/recall/F1, ROC/PR, etc.) — not accuracy alone.
- Notebook README & Reproducibility — default 10 pts
  - Python version, package list + install steps.
  - Dataset details + download instructions.
  - Step-by-step run instructions.
  - Notebook reproducible on another system; relative paths used; seeds set.
Total: 100 pts
Important Notebook Guidelines
- Notebook must be self-contained — no external README files.
- All documentation, problem statements, results, and analysis must be inside the notebook.
- Execution order matters — run all cells in order before submission.
- Reproducibility required — dataset links + non-standard packages must be clearly documented.
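The reproducibility points in these guidelines (seeds set, relative paths used) can be illustrated with a short standard-library sketch. The seed value, folder name, and file name below are hypothetical. Libraries such as NumPy keep their own random generators and would need to be seeded separately.

```python
import random
from pathlib import Path

SEED = 42  # arbitrary choice; any fixed integer gives repeatable results


def sample_rows(n_rows, k, seed=SEED):
    """Pick k distinct row indices reproducibly with a locally seeded generator."""
    rng = random.Random(seed)  # local generator: does not disturb global state
    return sorted(rng.sample(range(n_rows), k))


# A relative path is resolved against the notebook's working directory,
# so it keeps working when the notebook folder is copied to another machine.
DATA_DIR = Path("data")             # hypothetical folder name
csv_path = DATA_DIR / "train.csv"   # hypothetical file name

print(sample_rows(1000, 5))         # same five indices on every run
print(csv_path, "| absolute path?", csv_path.is_absolute())
```

Using a local `random.Random(seed)` instead of the module-level `random.seed` is a design choice: results stay the same even if other cells consume random numbers in between.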
Grading Policy
This course uses a +/- grading system. Individual assignments are graded on a curve, but the final grade is neither curved nor rounded up.
Score Range | Letter Grade |
---|---|
100-98 | A+ |
97-94 | A |
93-90 | A− |
89-87 | B+ |
86-83 | B |
82-80 | B− |
79-77 | C+ |
76-73 | C |
72-70 | C− |
69-67 | D+ |
66-63 | D |
<63 | F |
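The table above can be read as a lookup on the lower bound of each band. The sketch below encodes it that way. How fractional scores are handled is our assumption: since the final grade is not rounded up, a score such as 89.9 is treated as falling in the 87-89 band. The instructor's gradebook is authoritative.

```python
# (lower bound, letter) pairs copied from the table above, highest band first.
GRADE_BOUNDS = [
    (98, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"),
]


def letter_grade(score):
    """Map a 0-100 score to a letter grade without rounding the score up."""
    for lower_bound, letter in GRADE_BOUNDS:
        if score >= lower_bound:
            return letter
    return "F"  # anything below 63


for score in (98, 89.9, 63, 62.9):
    print(score, "->", letter_grade(score))
```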
Academic Policies
Deadlines for Adding, Dropping and Withdrawing from Courses
Students may add a course up to one week after the start of the term for that particular course. Students may drop courses according to the drop deadlines outlined in the EP academic calendar (https://ep.jhu.edu/student-services/academic-calendar/). From the 6th week of the class until the final withdrawal deadline, a student may withdraw from a course with a W on their academic record. The course will remain on the academic record, with a W in the grade column to indicate that the student registered for and then withdrew from the course.
Academic Misconduct Policy
All students are required to read, know, and comply with the Johns Hopkins University Krieger School of Arts and Sciences (KSAS) / Whiting School of Engineering (WSE) Procedures for Handling Allegations of Misconduct by Full-Time and Part-Time Graduate Students. This policy prohibits academic misconduct, including but not limited to the following: cheating or facilitating cheating; plagiarism; reuse of assignments; unauthorized collaboration; alteration of graded assignments; and unfair competition. Course materials (old assignments, texts, or examinations, etc.) should not be shared unless authorized by the course instructor. Any questions related to this policy should be directed to EP’s academic integrity officer at ep-academic-integrity@jhu.edu.
Students with Disabilities - Accommodations and Accessibility
Johns Hopkins University is committed to providing welcoming, equitable, and accessible educational experiences for all students. If disability accommodations are needed for this course, students should request accommodations through Student Disability Services (SDS) as early as possible to provide time for effective communication and arrangements. For further information about this process, please refer to the SDS Website.
Student Conduct Code
The fundamental purpose of the JHU regulation of student conduct is to promote and to protect the health, safety, welfare, property, and rights of all members of the University community as well as to promote the orderly operation of the University and to safeguard its property and facilities. As members of the University community, students accept certain responsibilities which support the educational mission and create an environment in which all students are afforded the same opportunity to succeed academically. For a full description of the code please visit the following website: https://studentaffairs.jhu.edu/policies-guidelines/student-code/
Classroom Climate
JHU is committed to creating a classroom environment that values the diversity of experiences and perspectives that all students bring. Everyone has the right to be treated with dignity and respect. Fostering an inclusive climate is important. Research and experience show that students who interact with peers who are different from themselves learn new things and experience tangible educational outcomes. At no time in this learning process should someone be singled out or treated unequally on the basis of any seen or unseen part of their identity. If you have concerns in this course about harassment, discrimination, or any unequal treatment, or if you seek accommodations or resources, please reach out to the course instructor directly. Reporting will never impact your course grade. You may also share concerns with your program chair, the Assistant Dean for Diversity and Inclusion, or the Office of Institutional Equity. In handling reports, people will protect your privacy as much as possible, but faculty and staff are required to officially report information for some cases (e.g. sexual harassment).
Course Auditing
When a student enrolls in an EP course with “audit” status, the student must reach an understanding with the instructor as to what is required to earn the “audit.” If the student does not meet those expectations, the instructor must notify the EP Registration Team [EP-Registration@exchange.johnshopkins.edu] in order for the student to be retroactively dropped or withdrawn from the course (depending on when the "audit" was requested and in accordance with EP registration deadlines). All lecture content will remain accessible to auditing students, but access to all other course material is left to the discretion of the instructor.