This course provides a rigorous mathematical foundation for the statistical and algorithmic reasoning involved in modern data science. It is designed to prepare students to approach data modeling, simulation, and evaluation with mathematical precision and clarity. Students will explore logic, set theory, combinatorics, linear models from first principles, and essential probability theory with a computational focus. Emphasis is placed on the conceptual structure behind methods such as regression, classification, and clustering, enabling students to understand not only how to use them but also why they work.
This course provides a rigorous mathematical foundation for statistical and algorithmic reasoning essential to modern data science. Students will delve into logic, set theory, combinatorics, linear algebra, probability theory, regression, classification, and clustering methods from first principles, complemented by computational exercises.
Course content is divided into weekly modules, accessible via Canvas. Each module includes lectures, readings, discussions, and assignments. Check the Course Calendar for specific deadlines.
Foundations of Mathematical Reasoning |
Data Types and Mathematical Structures |
Counting and Combinatorics |
Probability Foundations |
Random Variables and Distributions |
Joint Distributions and Independence |
Mathematical Statistics for Modeling |
Bayesian Thinking and Simulation |
Linear Algebra for Data Models |
Linear Models and Least Squares |
Logistic Regression and Optimization |
Information Theory and Model Complexity |
Mathematical Principles of Clustering |
By course end, students will:
· Formulate problems using mathematical structures such as sets, functions, and relations.
· Analyze regression models through matrix algebra and optimization methods.
· Apply combinatorial and probabilistic reasoning to data modeling.
· Understand and utilize frequentist and Bayesian statistical inference.
· Articulate assumptions and limitations of statistical models clearly and mathematically.
· Translate mathematical concepts into reproducible computational experiments.
Mathematical Methods in Data Science: Bridging Theory and Applications with Python by Sebastien Roch
Free Copy from the author: https://mmids-textbook.github.io/index.html
Python and Jupyter Notebooks with necessary libraries.
Item | % of Grade |
Weekly Problem Sets and Computational Exercises | 50% |
Midterm Exam | 20% |
Final Project | 30% |
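As a rough illustration of how the weights above combine, the following Python sketch computes a weighted course percentage. The function name `weighted_course_score` and the sample scores are hypothetical. Only the three weights come from the table. How scores are averaged within a category is not specified here, so the sketch assumes one percentage per category.

```python
# Weights taken from the grading table above (fractions of the final grade).
WEIGHTS = {
    "problem_sets": 0.50,   # Weekly Problem Sets and Computational Exercises
    "midterm": 0.20,        # Midterm Exam
    "final_project": 0.30,  # Final Project
}


def weighted_course_score(scores):
    """Combine per-category percentages (0-100) into one course percentage.

    Assumption: each category has already been reduced to a single percentage.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"problem_sets": 90.0, "midterm": 80.0, "final_project": 85.0}
print(weighted_course_score(example))
```

With these made-up scores the three contributions are 45, 16, and 25.5 points, which shows that the problem sets carry half of the final grade.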
Assignments are due according to the dates posted in your Canvas course site. You may check these due dates in the Course Calendar or the Assignments in the corresponding modules. I will post grades one week after assignment due dates.
Generally, I do not directly grade spelling and grammar. However, egregious violations of the rules of the English language will be noted without comment. Consistently poor performance in either spelling or grammar is taken as an indication of poor written communication ability, which may detract from your grade.
Late Work Policy: I highly recommend turning assignments in on time. Assignments submitted within 24 hours after the due date lose 10%. Each additional 24 hours costs another 10%, up to one week past the due date, after which the assignment will no longer be accepted. Late work is not accepted for the midterm or the final project unless documentation of exceptional circumstances is provided.
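The late penalty for regular assignments can be sketched in Python as follows. This is one reading of the policy, not an official calculator. The function name `late_penalty_fraction` is hypothetical. The sketch assumes that every started 24-hour period costs 10%, and that work more than exactly seven days late is no longer accepted. It does not cover the midterm or the final project.

```python
import math


def late_penalty_fraction(hours_late):
    """Return the fraction of credit lost, or None if the work is not accepted.

    Assumptions (not stated explicitly in the policy):
    - every started 24-hour period after the due date costs 10%;
    - submissions more than 7 * 24 hours late are rejected.
    """
    if hours_late <= 0:
        return 0.0  # submitted on time
    if hours_late > 7 * 24:
        return None  # past the assumed one-week cutoff
    periods = math.ceil(hours_late / 24)  # number of started 24-hour periods
    return periods / 10


# Hypothetical submission delays, in hours.
for hours in (0, 5, 30, 100, 200):
    print(hours, late_penalty_fraction(hours))
```

Under these assumptions, an assignment submitted 30 hours late falls in the second 24-hour period and loses 20%.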
Please see the video on AI use in the module.
Deadlines for Adding, Dropping, and Withdrawing from Courses
Students may add a course up to one week after the start of the term for that particular course. Students may drop courses according to the drop deadlines outlined in the EP academic calendar. From the 6th week of the class until the final withdrawal deadline, a student may withdraw from a course with a W on their academic record. A record of the course will remain on the academic record with a W appearing in the grade column to indicate that the student registered and withdrew from the course.
Academic Misconduct Policy
Students with Disabilities - Accommodations and Accessibility
Student Conduct Code
Classroom Climate
JHU is committed to creating a classroom environment that values the diversity of experiences and perspectives that all students bring. Everyone has the right to be treated with dignity and respect. Fostering an inclusive climate is important. Research and experience show that students who interact with peers who are different from themselves learn new things and experience tangible educational outcomes. At no time in this learning process should someone be singled out or treated unequally on the basis of any seen or unseen part of their identity. If you have concerns in this course about harassment, discrimination, or any unequal treatment, or if you seek accommodations or resources, please reach out to the course instructor directly. Reporting will never impact your course grade. You may also share concerns with your program chair, the Assistant Dean for Diversity and Inclusion, or the Office of Institutional Equity. In handling reports, people will protect your privacy as much as possible, but faculty and staff are required to officially report information for some cases (e.g. sexual harassment).
Course Auditing
When a student enrolls in an EP course with “audit” status, the student must reach an understanding with the instructor as to what is required to earn the “audit.” If the student does not meet those expectations, the instructor must notify the EP Registration Team (EP-Registration@exchange.johnshopkins.edu) in order for the student to be retroactively dropped or withdrawn from the course (depending on when the “audit” was requested and in accordance with EP registration deadlines). All lecture content will remain accessible to auditing students, but access to all other course material is left to the discretion of the instructor.