Data analytics is a field of study involving computational statistics, data mining and machine learning, to explore data sets, explain phenomena and build models for inference and prediction. The course begins with an overview of some traditional analysis approaches including ordinary least squares regression and related topics, notably diagnostic testing, detection of outliers and methods to impute missing data. Next comes nonlinear regression, and regularization models including ridge regression. Generalized linear models follow, emphasizing logistic regression and including models for polytomous data. Variable subsetting is addressed through stepwise procedures and the LASSO. Supervised machine learning topics include the basic concepts of resampling, boosting and bagging and several techniques: Decision Trees, Classification and Regression Trees, Random Forests, Conditional Random Forests, Adaptive Boosting, Support Vector Machines and Neural Networks. Unsupervised approaches are addressed through applications using principal component analysis, k-means Clustering, Partitioning Around Medoids and Association Rule Mining. Methods for assessing model predictive performance are introduced including Confusion Matrices, k-fold Cross-Validation and Receiver Operating Characteristic Curves. Environmental and public health applications are emphasized, with modeling techniques and analysis tools implemented in R.
The course materials are divided into modules which can be accessed by clicking Course Modules on the course menu. A module will have several sections including the overview, readings, video lectures and presentations, discussions, and assignments. You are encouraged to preview all sections of the module before starting. Most modules run for a period of seven (7) days; exceptions are noted in the Course Outline. You should regularly check the Calendar and Announcements for assignment due dates.
The goals of the course are: (1) to provide a rigorous and comprehensive foundation for important statistical and machine learning concepts and tools in Data Analytics; and (2) to provide students with experience in the R programming environment.
Required
You will not have to purchase textbooks. The assigned readings for each week will be posted in the Blackboard course site at the eReserves link. In addition to readings from the books listed below, chapters and sections from other books will also be assigned. These readings will also be posted as eReserves.
Within each module in the Canvas course site you will find a list of linked "additional" resources. You may find these resources helpful; however, you are not required to read them.
This course uses the R Statistical Programming Environment. R is a free, widely available software system. Students may choose to use the R console or the RStudio development environment.
R can be downloaded at www.r-project.org. RStudio is available at https://www.rstudio.com/products/rstudio/download/ (choose the free RStudio Desktop).
It is expected that each module will take approximately 7–10 hours per week to complete. Here is an approximate breakdown: assigned readings (approximately 3–4 hours per week), listening to video lectures (approximately 2–3 hours per week), and assignments (approximately 2–3 hours per week).
This course consists of the following requirements:
Weekly Assignments (80% of Final Grade Calculation)
EP uses a +/- grading system (see “Grading System,” Graduate Programs catalog, p. 10).
Score Range | Letter Grade |
---|---|
100-98 | A+ |
97-94 | A |
93-90 | A− |
89-87 | B+ |
86-83 | B |
82-80 | B− |
79-77 | C+ |
76-73 | C |
72-70 | C− |
69-67 | D+ |
66-63 | D |
<63 | F |
Deadlines for Adding, Dropping and Withdrawing from Courses
Students may add a course up to one week after the start of the term for that particular course. Students may drop courses according to the drop deadlines outlined in the EP academic calendar (https://ep.jhu.edu/student-services/academic-calendar/). From the 6th week of the class until the final withdrawal deadline, a student may withdraw from a course with a W on their academic record. A record of the course will remain on the academic record with a W appearing in the grade column to indicate that the student registered and withdrew from the course.
Academic Misconduct Policy
All students are required to read, know, and comply with the Johns Hopkins University Krieger School of Arts and Sciences (KSAS) / Whiting School of Engineering (WSE) Procedures for Handling Allegations of Misconduct by Full-Time and Part-Time Graduate Students.
This policy prohibits academic misconduct, including but not limited to the following: cheating or facilitating cheating; plagiarism; reuse of assignments; unauthorized collaboration; alteration of graded assignments; and unfair competition. Course materials (old assignments, texts, examinations, etc.) should not be shared unless authorized by the course instructor. Any questions related to this policy should be directed to EP’s academic integrity officer at ep-academic-integrity@jhu.edu.
Students with Disabilities - Accommodations and Accessibility
Johns Hopkins University values diversity and inclusion. We are committed to providing welcoming, equitable, and accessible educational experiences for all students. Students with disabilities (including those with psychological conditions, medical conditions and temporary disabilities) can request accommodations for this course by providing an Accommodation Letter issued by Student Disability Services (SDS). Please request accommodations for this course as early as possible to provide time for effective communication and arrangements.
For further information or to start the process of requesting accommodations, please contact Student Disability Services at Engineering for Professionals, ep-disability-svcs@jhu.edu.
Student Conduct Code
The fundamental purpose of the JHU regulation of student conduct is to promote and to protect the health, safety, welfare, property, and rights of all members of the University community as well as to promote the orderly operation of the University and to safeguard its property and facilities. As members of the University community, students accept certain responsibilities which support the educational mission and create an environment in which all students are afforded the same opportunity to succeed academically.
For a full description of the code please visit the following website: https://studentaffairs.jhu.edu/policies-guidelines/student-code/
Classroom Climate
JHU is committed to creating a classroom environment that values the diversity of experiences and perspectives that all students bring. Everyone has the right to be treated with dignity and respect. Fostering an inclusive climate is important. Research and experience show that students who interact with peers who are different from themselves learn new things and experience tangible educational outcomes. At no time in this learning process should someone be singled out or treated unequally on the basis of any seen or unseen part of their identity.
If you have concerns in this course about harassment, discrimination, or any unequal treatment, or if you seek accommodations or resources, please reach out to the course instructor directly. Reporting will never impact your course grade. You may also share concerns with your program chair, the Assistant Dean for Diversity and Inclusion, or the Office of Institutional Equity. In handling reports, people will protect your privacy as much as possible, but faculty and staff are required to officially report information for some cases (e.g. sexual harassment).
Course Auditing
When a student enrolls in an EP course with “audit” status, the student must reach an understanding with the instructor as to what is required to earn the “audit.” If the student does not meet those expectations, the instructor must notify the EP Registration Team [EP-Registration@exchange.johnshopkins.edu] in order for the student to be retroactively dropped or withdrawn from the course (depending on when the "audit" was requested and in accordance with EP registration deadlines). All lecture content will remain accessible to auditing students, but access to all other course material is left to the discretion of the instructor.