575.704.81 - Applied Statistical Analysis and Design of Experiments for Environmental Applications

Environmental Engineering and Science
Summer 2023

Description

This course introduces statistical analyses and techniques of experimental design appropriate for use in environmental applications. The methods taught in this course allow the experimenter to discriminate between real effects and experimental error in systems that are inherently noisy. Statistically designed experimental programs typically test many variables simultaneously and are very efficient tools for developing empirical mathematical models that accurately describe physical and chemical processes. They are readily applied to production plant, pilot plant, and laboratory systems. Topics covered include fundamental statistics; the statistical basis for recognizing real effects in noisy data; statistical tests and reference distributions; analysis of variance; construction, application, and analysis of factorial and fractional-factorial designs; screening designs; response surface and optimization methods; and applications to pilot plant and waste treatment operations. Particular emphasis is placed on analysis of variance, prediction intervals, and control charting for determining statistical significance as currently required by federal regulations for environmental monitoring.

Prerequisite: Undergraduate statistics is strongly recommended.

Instructor

Barry Bodt

babodt@gmail.com

Course Structure

The course content materials are divided into modules, which can be accessed by clicking Course Modules on the left menu. A module will have several sections, including the overview, content, readings, discussions, and assignments. You are encouraged to preview all sections of the module before starting. Most modules run for a period of seven (7) days; exceptions are noted on the Course Outline page. You should regularly check the Calendar and Announcements for assignment due dates.

Module components vary according to the topics covered. All modules include course notes and most include outside reading in the text or other publications. Others capitalize on a range of options that better deliver certain concepts, for example, tablet and screen capture for showing hand calculation or software application. Optional, self-check quizzes help solidify concepts after reading. Discussions provide an opportunity to learn from each other as we increase understanding of course concepts. Practice problems are available in most modules and problem sets for grading are common to all. Two exams formally assess understanding and a project given at the end of the course provides an opportunity to apply modeling skills.

Course Topics

EN. 575.704 Bulleted Topics
• Differentiate between environmental populations and samples from those populations.
• Compute and display descriptive statistical information, using a few representative tools.
• Calculate probabilities of events using basic probability rules.
• Compute probabilities of more general events expressed in terms of random variables.
• Discuss the roles of estimation, hypothesis testing, and modeling in statistical inference.
• Construct expressions for the first two central moments of a random variable.
• Develop the joint distributions and their moments.
• Develop marginal and conditional distributions.
• Calculate probabilities and random variable values for probability distributions using software or tables.
• Apply common probability distributions.
• Verify if the common normal assumption is appropriate for a given data set.
• Construct and interpret interval estimates for the mean, quantile, and proportion.
• Establish an appropriate sample design to achieve a stated precision for an interval estimate.
• Conduct a hypothesis test of an assertion about the population mean.
• Quantify the sensitivity of a decision rule in hypothesis testing in terms of the power of the test.
• Apply the t-distribution to inferential tasks involving one population mean.
• Apply the t-distribution to inferential tasks involving dependent and independent samples.
• Generate a prediction interval for a new observation or mean.
• Conduct inferential tasks involving spread for one or two populations.
• Employ a nonparametric approach in comparing dependent populations.
• Apply nonparametric approaches for one-sample inference with quantiles and proportions.
• Apply nonparametric approaches for comparing populations based on one or more proportions.
• Generate and interpret control charts for the process mean and fraction defective.
• Recognize sample considerations for control chart data.
• Weigh a proposed sampling plan against practical limitations.
• Identify repeat measures and spatial and temporal aspects of a sample.
• Distinguish among approximately ten unique sampling strategies.
• Adapt estimation strategies to sampling environments with finite populations.
• Account for spatial and temporal correlation among samples.
• Translate a research question into experimental concepts such as factors, levels, treatments, and effects.
• Evaluate an experimental design for consistency with the three principles of design.
• Recognize the agreement between design principles and the EPA’s data quality approach.
• Understand the steps required to establish a good study plan from data collection to analysis.
• Express a response variable as a one-factor linear model.
• Partition the variation of the response into effect and error.
• Build and interpret an ANOVA table by hand.
• Test the validity of model assumptions for ANOVA.
• Execute comparisons post-hoc for significant effects.
• Develop solutions to one-way ANOVA using software.
• Recognize a random effect and interpret it correctly.
• Express a response in terms of a linear model with a block effect.
• Build and interpret an ANOVA for randomized block design by hand.
• Recognize the advantage and disadvantage of blocking with respect to ANOVA results and develop a run sequence.
• Run model diagnostics and complete all computations for ANOVA using software for the one-way random effect model and randomized block design.
• Construct general factorial models and partition the total variation by hand.
• Interpret the model and explain its efficiency advantage.
• Conduct general factorial analysis using software.
• Perform model diagnostics using software.
• Develop a run sequence for data collection supporting a general factorial model.
• Conduct a simple linear regression analysis by hand and with software.
• Perform model diagnostics for the simple linear regression using software.
• Conduct a multiple regression with model diagnostics using software.
• Develop a response surface model using software.
• Devise a staged-sampling strategy to support response surface estimation.
• Construct a 2^k design.
• Implement a 2^k design as a screening experiment.
• Analyze a 2^k design by hand in two ways: the standard method and Yates' method.
• Analyze a 2^k design using software.
• Recognize the utility of fractional factorial designs.
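
As a rough illustration of the last group of topics, the sketch below builds a two-level full factorial design in standard order and estimates main effects as the difference between the mean response at the high and low levels. It is not taken from the course materials: the function names, the factor names, and the response values are made up for illustration, and Python is used only as a neutral choice.

```python
from itertools import product


def full_factorial_2k(factor_names):
    """Return the runs of a two-level full factorial design, coded -1/+1.

    Runs are listed in standard order: the first factor alternates fastest.
    """
    k = len(factor_names)
    runs = []
    # product() varies the last position fastest, so reverse each tuple
    # to make the first factor the fastest-changing one.
    for levels in product((-1, 1), repeat=k):
        runs.append(dict(zip(factor_names, reversed(levels))))
    return runs


def main_effect(runs, responses, factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(runs, responses) if run[factor] == 1]
    low = [y for run, y in zip(runs, responses) if run[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


# Hypothetical 2^2 experiment; factor names and responses are invented.
factors = ["temperature", "pH"]
runs = full_factorial_2k(factors)
responses = [20, 30, 40, 52]  # one made-up response per run, in standard order

for name in factors:
    print(name, main_effect(runs, responses, name))
```

With k factors the design has 2^k runs, which is why screening experiments with many factors quickly motivate fractional designs.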

Course Goals

My goal in this course is for you to become a confident practitioner of Statistics in Environmental Science. To achieve that, I will help you gain a foundational understanding of applied probability and statistics, ranging from basic notions of chance and the collection of noisy data to descriptive summary of data to advanced concepts in statistical inference. Next, I want to broaden your approach to problems so that you naturally include statistical considerations as an integral part of study planning and execution. And last, I want to develop your skills and comfort with statistical software so that you will be able to correctly conduct statistical analysis to achieve study goals.


Course Learning Outcomes (CLOs)

Textbooks

Ayyub, B. M., & McCuen, R. H. (2011). Probability, Statistics, and Reliability for Engineers and Scientists (3rd ed.). Boca Raton, FL: Chapman & Hall/CRC Press.

ISBN-10: 1439809518
ISBN-13: 9781439809518

Required Software

Statistical software will be required to complete some material, especially in the second half of the course, where hand calculation is prohibitive for meaningful data sets. I do not require that you have a specific package; even the Analysis ToolPak for Excel will do in many instances. I do, however, recommend three that you might investigate.

R is a free package that has grown up on the Internet much like the Linux OS. It is a tremendously flexible package. I would recommend you have this one. Access R through http://www.r-project.org/. RStudio is R with a better user interface. Over the past several years, I have grown to really enjoy R and RStudio, especially for teaching.

Minitab (version 17 or later) is the one I use most for examples in the notes, but I will use the other two for variety. Minitab is available on Windows and Mac platforms. I am still conversant in Minitab to help you when needed, but I have moved on to R and JMP. Minitab is now accessible through a Minitab Workspace, which I imagine uses the current version, 21. This is available for Windows & Mac for a six-month download through http://www.onthehub.com/minitab/. The current Minitab is still very good and quite helpful for experimental design related analyses. The cost is $32.99 for six months.

SAS JMP (version 14) is what I used to generate some course materials. The current version is 17 and is available for Windows and Mac. It is another great piece of software. I found navigation through it a little different at first but have grown to make it my first choice for analyses. It is not clear that you can get JMP through onthehub as we have done previously. JMP should be available to you at a student price or perhaps as a university download. You can visit https://www.jmp.com/en_us/academic/licensing-for-students.html.

Assignments requiring software will not be package specific. The only course tie to software is that I tend to illustrate more using Minitab than the other two in the lectures. R and RStudio have steeper learning curves, but if you are fluent with various software applications, either might suffice. If you prefer a more menu-driven interface, pick up JMP or Minitab in addition to R or RStudio.

Student Coursework Requirements

It is expected that each module will take approximately 7–10 hours per week to complete. Here is an approximate breakdown: reading the assigned sections of the text (approximately 3–4 hours per week) as well as some outside reading, listening to the audio annotated presentations (< 1 hour per week), Office Hours (optional, 1 hour per week), writing assignments (< 1 hour per week), and problem set completion (approximately 2–3 hours per week).

This course will consist of four basic student requirements:
1. Preparation and Participation (Module Discussions) (15% of Final Grade Calculation)
You are responsible for carefully reading all assigned material and being prepared for discussion. Readings come from the course notes, our text, and government publications. Additional reading may be assigned to supplement text readings. Post your initial response to the discussion questions by the evening of day 3 (Wednesday) for that module week. Post a second time, building on the posts of others, by Monday evening. The goal of discussion is a free exchange of ideas in which we all learn from one another and, at the same time, foster a class community. I want the discussion posts to be substantive, and grading will reflect that. At least two posts per week are required. If you are doing what is required and are posting substantively, the baseline for a grade is a 90. If you do less than required or your posts are not substantive, you will earn somewhat less. If you do more than is required or your posts are truly thought-provoking, you will earn more. This is a subjective assessment on my part. The easiest way to hurt your discussion score is to not post at all. The first time you fail to post in each half of the summer term, you will receive a gift of 50/100. Subsequent missed posts in the same half of the term will result in a zero. Please ensure that your postings are civil and constructive. I will monitor module discussions and will respond to some of them as they are posted.


2. Assignments (40% of Final Grade Calculation)
Assignments will include a mix of qualitative assignments (e.g., model critiques, case study reviews) and quantitative problem sets. All assignments should include your name and assignment identifier. Qualitative assignments should be well written, with supporting figures and tables captioned and labeled. Quantitative assignments take the form of problem sets and are a focus of this course. The naming convention for these is Module #_Lastname_Firstname.pdf. All problem sets should be prepared as a single PDF file that mixes hand calculations and software output and lists problems in the assignment order. Software output needs to be in close proximity to hand calculations for the same problem. Consider reducing the image size of graphs to save space. Resolution of the entire assignment need not be any more than 150 dpi. I write this to emphasize that file sizes should not be exceptionally large. In most cases, I would expect you to be at approximately 1 MB or less. Be mindful that Canvas sometimes has trouble rendering very large files, which in turn limits my ability to grade, annotate, and return them. Keep in mind that only one file is to be submitted. Do not submit a separate file for each problem.
Late submissions will be reduced by one letter grade for each week late (no exceptions without prior coordination with the instructor).

Qualitative problems are evaluated by the following grading elements:


Quantitative assignments are evaluated by the following grading elements:

3. Course (Team) Project (15% of Final Grade Calculation)
A course project will be assigned over the last few weeks of the course. The next-to-last week is when most of the work will be performed.
The course project is evaluated by the following grading elements:

Note: Team scores (same for all) will be awarded to each team member. Student scores are individually assessed. The weighted composite determines the project grade.

4. Exams (30% of Final Grade Calculation, combined from 15% for Midterm and 15% for Final)
The midterm exam will be available in Module 7, and the final exam will be available in the next-to-last module. You will have one week to complete each exam; each is due by 11:59 PM exactly one week from its release. You may use course resources to complete the exams.
The exams are evaluated by the following grading elements:

Grading Policy

Assignments are due according to the dates for assignment items in the corresponding modules. I will try to post grades one week after assignment due dates. I generally do not directly grade spelling and grammar. However, egregious violations of the rules of the English language will be noted without comment. Consistently poor performance in either spelling or grammar is taken as an indication of poor written communication ability that may detract from your grade.

A grade of A indicates achievement of consistent excellence and distinction throughout the course, that is, conspicuous excellence in all aspects of assignments and discussion in every week.
A grade of B indicates work that meets all course requirements on a level appropriate for graduate academic work. These criteria apply to both undergraduate and graduate students taking the course.

The following grading scale will be used.


Score Range = Letter Grade
100-97 = A+
96-93 = A
92-90 = A−
89-87 = B+
86-83 = B
82-80 = B−
79-77 = C+
76-73 = C
72-70 = C−
69-67 = D+
66-63 = D
<63 = F
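
To see how the four coursework weights and the scale above combine, here is a minimal sketch of the arithmetic. The weights and band boundaries come from this syllabus; the function names, the example scores, and the rule of rounding the composite to the nearest whole number before looking up the letter are assumptions made for illustration, not a statement of how grades are actually computed.

```python
import math

# Percent weights from the Student Coursework Requirements section.
WEIGHTS = {
    "discussions": 15,
    "assignments": 40,
    "project": 15,
    "midterm": 15,
    "final": 15,
}

# Lower bound of each band in the grading scale; anything below 63 is an F.
SCALE = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"),
]


def composite_score(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a composite score to a letter.

    Assumption: the score is rounded half-up to a whole number first,
    because the scale lists whole-number bands only.
    """
    rounded = math.floor(score + 0.5)
    for lower_bound, letter in SCALE:
        if rounded >= lower_bound:
            return letter
    return "F"


# Hypothetical component scores for one student.
example = {
    "discussions": 90,
    "assignments": 85,
    "project": 88,
    "midterm": 80,
    "final": 90,
}
total = composite_score(example)
print(total, letter_grade(total))
```

Because assignments carry 40% of the total, a change of ten points there moves the composite by four points, more than any other single component.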

Academic Policies

Deadlines for Adding, Dropping and Withdrawing from Courses

Students may add a course up to one week after the start of the term for that particular course. Students may drop courses according to the drop deadlines outlined in the EP academic calendar (https://ep.jhu.edu/student-services/academic-calendar/). Between the 6th week of the class and prior to the final withdrawal deadline, a student may withdraw from a course with a W on their academic record. A record of the course will remain on the academic record with a W appearing in the grade column to indicate that the student registered and withdrew from the course.

Academic Misconduct Policy

All students are required to read, know, and comply with the Johns Hopkins University Krieger School of Arts and Sciences (KSAS) / Whiting School of Engineering (WSE) Procedures for Handling Allegations of Misconduct by Full-Time and Part-Time Graduate Students.

This policy prohibits academic misconduct, including but not limited to the following: cheating or facilitating cheating; plagiarism; reuse of assignments; unauthorized collaboration; alteration of graded assignments; and unfair competition. Course materials (old assignments, texts, or examinations, etc.) should not be shared unless authorized by the course instructor. Any questions related to this policy should be directed to EP’s academic integrity officer at ep-academic-integrity@jhu.edu.

Students with Disabilities - Accommodations and Accessibility

Johns Hopkins University values diversity and inclusion. We are committed to providing welcoming, equitable, and accessible educational experiences for all students. Students with disabilities (including those with psychological conditions, medical conditions and temporary disabilities) can request accommodations for this course by providing an Accommodation Letter issued by Student Disability Services (SDS). Please request accommodations for this course as early as possible to provide time for effective communication and arrangements.

For further information or to start the process of requesting accommodations, please contact Student Disability Services at Engineering for Professionals, ep-disability-svcs@jhu.edu.

Student Conduct Code

The fundamental purpose of the JHU regulation of student conduct is to promote and to protect the health, safety, welfare, property, and rights of all members of the University community as well as to promote the orderly operation of the University and to safeguard its property and facilities. As members of the University community, students accept certain responsibilities which support the educational mission and create an environment in which all students are afforded the same opportunity to succeed academically. 

For a full description of the code please visit the following website: https://studentaffairs.jhu.edu/policies-guidelines/student-code/

Classroom Climate

JHU is committed to creating a classroom environment that values the diversity of experiences and perspectives that all students bring. Everyone has the right to be treated with dignity and respect. Fostering an inclusive climate is important. Research and experience show that students who interact with peers who are different from themselves learn new things and experience tangible educational outcomes. At no time in this learning process should someone be singled out or treated unequally on the basis of any seen or unseen part of their identity. 
 
If you have concerns in this course about harassment, discrimination, or any unequal treatment, or if you seek accommodations or resources, please reach out to the course instructor directly. Reporting will never impact your course grade. You may also share concerns with your program chair, the Assistant Dean for Diversity and Inclusion, or the Office of Institutional Equity. In handling reports, people will protect your privacy as much as possible, but faculty and staff are required to officially report information for some cases (e.g. sexual harassment).

Course Auditing

When a student enrolls in an EP course with “audit” status, the student must reach an understanding with the instructor as to what is required to earn the “audit.” If the student does not meet those expectations, the instructor must notify the EP Registration Team [EP-Registration@exchange.johnshopkins.edu] in order for the student to be retroactively dropped or withdrawn from the course (depending on when the "audit" was requested and in accordance with EP registration deadlines). All lecture content will remain accessible to auditing students, but access to all other course material is left to the discretion of the instructor.