The Course Calendar
This page provides the proposed calendar for the course. If changes need to be made (which is very likely in the Spring Term), then I will make them here and announce the changes during class. If you would like, you may download the calendar as an Excel file. That may allow you to see the course in a different light. Remember that one of your jobs is to see the connections among the topics of this course — and among the topics of all your courses.
This calendar was last updated on:
February 25, 2025, at 3:05 pm.
March 25 (Tuesday)
Initial Thoughts on the Course
There are no classes today; it is the last day of Spring Break. Today is a great day to make your New-Term Resolutions.
Thinking back over the past several terms, how will you change to make this term a success for you? This is such an important question that I will ask it on the first day of the term (Wednesday) in the form of a quiz. When you start a professional job somewhere, you will need to convey how you will help the company grow and how the company will help you grow. Good jobs will provide opportunities for growth to you. Make sure you take advantage of them.
Module I: Review (and extension) of the past
There are prerequisites for this course. However, it would be unfortunate if I were to assume you remembered everything from those courses. On the other hand, it would take too much time to reteach everything important. So, this first module reviews the past, but with applications to the future.
March 26 (Wednesday)
Review, Plus Ultra!
Today, we emphasize that we are continuing our study of statistics. The discussion will focus on reminding you of some of the many things you learned back in introductory statistics and in integral calculus.
March 27 (Thursday)
Review, Plus Ultra!
This lecture will continue yesterday’s discussion. This review is designed to put you in the mindset of your previous courses with the intention of making these courses merely an extension… leading to a single course exploring statistics… that happens to take a few years.
March 28 (Friday)
Review, Plus Ultra!
We finish this first week continuing the review. We should get through the continuous distributions, the moments, and (most importantly) the “transformations of random variables” sections to finish lectures on this Learning Module. Note how probability and statistics are just applications of calculus (and of discrete mathematics). This course just uses what you have learned before to answer important questions about data and how we can learn from data.
- Read for today:
Section 13.1–3
Section 13.1–3
Section 13.1–3
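To preview the “transformations of random variables” idea, here is a tiny simulation of one classic result: if U is Uniform(0, 1), then −ln(U) is Exponential(1). This is only a sketch, written in Python/NumPy as a stand-in for the R we will use later in the term.

```python
import numpy as np

# If U ~ Uniform(0, 1), then Y = -ln(U) ~ Exponential(1).
# A quick simulation check of this transformation result:
rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)
y = -np.log(u)

# Exponential(1) has mean 1 and variance 1; the sample values
# should land close to both.
print(y.mean(), y.var())
```

The same five lines of logic, rewritten in R, would use `runif()` and `-log()`.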
March 31 (Monday)
An Introduction to LaTeX
This session shows you how to use LaTeX, since it is required for all homework assignments. LaTeX is also the typesetting program of choice for mathematicians and most statisticians (the exceptions come from the areas of applied statistics).
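If you have never seen a LaTeX file, a minimal homework skeleton looks something like the following (the document class and packages here are common defaults, not a course requirement):

```latex
\documentclass{article}
\usepackage{amsmath, amssymb}

\begin{document}

\section*{Assignment 1, Problem 1}

Let $X_1, \dots, X_n \overset{iid}{\sim} \mathrm{Exp}(\lambda)$. Then
\[
  \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
  \qquad \text{and} \qquad
  \operatorname{E}\!\left[\bar{X}\right] = \frac{1}{\lambda} .
\]

\end{document}
```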
April 2 (Wednesday)
SCA 1
These Statistical Computing Activities (SCA) are designed to give you the opportunity to apply the computer to questions raised in the class. Treat them as a chance to better understand the probabilities underlying the statistics.
This SCA bridges the gap between the Mathematica of the past and the R of the future. The reason for the change in programming language is two-fold. First, those who work in this field do not tend to use Mathematica. Second, noticing the similarities between Mathematica and R really helps to better understand the underlying probability and statistics that your courses are examining.
- Activity link: SCA 1
Module II: Understanding Estimators
In your introductory statistics course (STAT200), you used estimators throughout. For instance, you used the sample mean estimator to estimate the population mean. You used the sample proportion estimator to estimate the population proportion. This module will examine estimators — and their behavior — in greater detail.
April 3 (Thursday)
Method of Moments Estimators
The main goal of statistics is to estimate a population parameter. This section looks at one method for creating a reasonable statistic. In the future, we will look at how we can measure the quality of an estimator.
- Read for today:
Section 5.1, 5.2
Section 5.1, 5.2.1
Section 5.1, 5.2.1
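As a concrete (made-up) example of the Method of Moments: for Exponential data with rate λ, the first population moment is E[X] = 1/λ, so setting it equal to the sample mean gives the estimator 1/x̄. A quick sketch (in Python/NumPy rather than the R we use in class):

```python
import numpy as np

# Method of Moments for Exponential(rate):  E[X] = 1/rate,
# so equating E[X] to the sample mean gives  rate_hat = 1 / xbar.
rng = np.random.default_rng(1)
x = rng.exponential(scale=1 / 2.5, size=50_000)  # true rate = 2.5 (made up)

rate_mom = 1 / x.mean()
print(rate_mom)  # should land close to 2.5
```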
April 4 (Friday)
Maximum Likelihood Estimators
Today, we will look at a second method for creating an estimator. Remember that an estimator is a function of the data designed to estimate a parameter. Last time, we looked at the Method of Moments technique for creating an estimator. Today, it is Maximum Likelihood. These are not the only methods, nor are they perfect. You will discover that estimators have strengths and weaknesses, which we will investigate in the next Statistical Computing Activity.
Again, note that this is applied differential calculus — applications of the mathematics, not new mathematics.
- Read for today:
Section 5.3
Section 5.2.2
Section 5.2.2
- Due today:
Assignment 1
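To see that the two recipes really are different, consider Uniform(0, θ), a standard textbook contrast. The Method of Moments equates E[X] = θ/2 to x̄, while the likelihood (1/θ)ⁿ is maximized by the smallest θ consistent with the data. A sketch under made-up numbers (Python/NumPy standing in for R):

```python
import numpy as np

# For Uniform(0, theta), the two recipes give different estimators:
#   Method of Moments:  E[X] = theta/2   =>  theta_mom = 2 * xbar
#   Maximum Likelihood: L(theta) = (1/theta)^n on [max(x), inf),
#                       maximized at       theta_mle = max(x)
rng = np.random.default_rng(2)
theta = 4.0                               # true value (made up)
x = rng.uniform(0, theta, size=1000)

theta_mom = 2 * x.mean()
theta_mle = x.max()
print(theta_mom, theta_mle)               # both near 4, by different routes
```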
April 7 (Monday)
Desirable Properties
The previous two classes dealt with methods for creating estimators (of parameters). Today, we continue to think about what we really want from our estimators in a random world. Being unbiased is nice, but is there more?
- Read for today:
Section 5.4, 5.5
Section 5.3.1
Section 5.3.1
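The classic illustration of bias is the sample variance: dividing by n gives a biased estimator of σ², while dividing by n − 1 does not. A simulation sketch of that fact (Python/NumPy standing in for R; the sample size is made up):

```python
import numpy as np

# The "divide by n" variance estimator is biased; "divide by n-1" is not.
# Estimate each estimator's long-run average over many repeated samples.
rng = np.random.default_rng(3)
n = 5                                      # small n makes the bias visible
samples = rng.normal(0, 1, size=(200_000, n))   # true sigma^2 = 1

v_n  = samples.var(axis=1, ddof=0)         # divide by n
v_n1 = samples.var(axis=1, ddof=1)         # divide by n - 1

# E[v_n] = (n-1)/n * sigma^2 = 0.8 here, while E[v_n1] = 1.
print(v_n.mean(), v_n1.mean())
```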
April 9 (Wednesday)
Sufficiency
Another desirable property of estimators (functions of the data that are designed to estimate a population parameter) is “sufficiency.” You will see the echo of sufficiency when we talk about pivotal quantities in a few days.
- Read for today:
Section 5.4.2
Section 5.3.2
Section 5.3.2
- Due today:
SCA 1
April 10 (Thursday)
Monte Carlo Simulation
Once again, we rely on simulation to investigate distributions that are not Normal. In most cases, working with non-Normal distributions leaves us with mathematically intractable problems, and Monte Carlo simulation is often the only practical solution. Thus, we focus on simulation in this course because it offers a general technique to better understand the behavior of estimators (and test statistics and confidence intervals).
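The entire Monte Carlo recipe fits in a few lines: redraw the whole experiment many times and look at the resulting pile of estimates. Here is a sketch (Python/NumPy standing in for R, with made-up settings) for the sample median of Exponential data:

```python
import numpy as np

# Monte Carlo: approximate the sampling distribution of an estimator
# by redrawing the entire experiment many times.
rng = np.random.default_rng(4)
reps, n = 100_000, 25                       # made-up settings

# Estimator: the sample median of an Exponential(1) sample.
medians = np.median(rng.exponential(size=(reps, n)), axis=1)

# The population median of Exponential(1) is ln 2 ~ 0.693; the simulated
# average sits close to it (slightly above, for finite n).
print(medians.mean(), medians.std())
```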
April 11 (Friday)
SCA 2
These Statistical Computing Activities (SCA) are designed to give you the opportunity to apply the computer to questions raised in the class. Treat them as a chance to better understand the probabilities underlying the statistics.
This one focuses on estimators with an eye to comparing them using bias, variance, and the mean square error. Remember that bias only speaks to the “large-scale” properties of the estimator (the average over many repeated experiments). It is the MSE that allows us to compare estimators.
For preparation, please do as many of the problems as you can before class. Also, you may want to refresh your memory of Taylor Series Approximation.
- Activity link: SCA 2
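As a warm-up for the comparisons in this SCA, here is one way to pit two estimators of a Normal mean against each other using MSE = bias² + variance (a sketch in Python/NumPy rather than R; the numbers are made up):

```python
import numpy as np

# Compare two estimators of a Normal mean by bias, variance, and MSE.
rng = np.random.default_rng(5)
mu, n, reps = 3.0, 20, 100_000              # made-up settings
data = rng.normal(mu, 1.0, size=(reps, n))

means = data.mean(axis=1)
medians = np.median(data, axis=1)

# MSE = bias^2 + variance, estimated over the repeated experiments.
mse_mean = (means.mean() - mu) ** 2 + means.var()
mse_median = (medians.mean() - mu) ** 2 + medians.var()
print(mse_mean, mse_median)                 # the sample mean wins for Normal data
```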
Module III: Confidence Intervals and Coverage
Confidence intervals served as one-half of the foundation for the second half of your introductory statistics course. But, what are they, and what do they really tell us? In this module, we arrive at answers to these questions by introducing something called the “pivotal quantity.”
April 14 (Monday)
Pivotal Method
A pivotal quantity is a function of the data and the unobservable parameter whose distribution does not depend on any unknown parameters. We will use such quantities to determine the endpoints of confidence intervals. It turns out, however, that pivotal quantities are also useful for hypothesis testing. At some level, this should not surprise us because confidence intervals and p-values are strongly related.
- Read for today:
Section 6.1–6.3
Section 5.4, 5.5
Section 5.4
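The canonical pivotal quantity is Z = (x̄ − μ)/(σ/√n), which is N(0, 1) no matter what μ is (when σ is known). Inverting P(−z* < Z < z*) = 0.95 yields the familiar interval x̄ ± z*σ/√n. A sketch with made-up numbers (Python/SciPy standing in for R):

```python
import numpy as np
from scipy import stats

# Pivotal quantity with sigma known:  Z = (xbar - mu) / (sigma / sqrt(n))
# is N(0, 1) regardless of mu.  Inverting P(-z* < Z < z*) = 0.95 gives
# the interval  xbar +/- z* sigma / sqrt(n).
rng = np.random.default_rng(6)
mu, sigma, n = 10.0, 2.0, 40                # made-up settings
x = rng.normal(mu, sigma, size=n)

z_star = stats.norm.ppf(0.975)              # about 1.96
half = z_star * sigma / np.sqrt(n)
ci = (x.mean() - half, x.mean() + half)
print(ci)
```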
April 16 (Wednesday)
One and Two-Sample Confidence Intervals
Now that we know how confidence intervals can be formulated, let us spend the day looking at the results of those formulations. Today, we will look at the confidence interval formulas that are in frequent use in statistics. For some of these, the pivotal quantity will be obvious (μ and σ²). For others, there may be no pivotal quantity except as an approximation (π). In all cases, these are the default calculations a computer program goes through to calculate the confidence interval.
Note, however, that there is an additional concept taught today: coverage. If the confidence interval contains the parameter, then we say “the parameter is covered by the interval.” The rate at which the covering happens is the coverage (or the coverage rate). What should the value of the coverage equal? That’s right! For the confidence interval to be “correct”, it needs to have a coverage rate close to our claimed confidence level, γ (gamma). If the coverage is far from γ, then we say that the confidence intervals are inaccurate.
The other recurring aspect is precision. The width of the confidence interval is a measure of its precision (smaller is more precise). Which is more important, accuracy or precision? That’s a great question. What do you think?
- Read for today:
Section 6.2–6.5
Section 5.6, 5.7
Section 5.5, 5.6
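Coverage is easy to estimate by simulation: build the interval over and over and record how often it actually contains the parameter. A sketch under made-up settings (Python/SciPy standing in for R):

```python
import numpy as np
from scipy import stats

# Coverage: repeat the experiment many times and record how often the
# interval actually contains the parameter.  It should be close to the
# claimed confidence level, here gamma = 0.95.
rng = np.random.default_rng(7)
mu, sigma, n, reps = 0.0, 1.0, 30, 50_000   # made-up settings
z_star = stats.norm.ppf(0.975)

xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
half = z_star * sigma / np.sqrt(n)

coverage = np.mean((xbar - half < mu) & (mu < xbar + half))
print(coverage)                             # close to 0.95
```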
April 17 (Thursday)
One and Two-Sample Confidence Intervals
- Read for today:
Section 6.2–6.5
Section 5.6, 5.7
Section 5.7
April 18 (Friday)
SCA 3: MLEs and Coverage
Today’s SCA examines coverage. It has you go through the steps of creating a confidence interval for a specific instance. Then, it has you estimate the coverage. Thankfully, if you do everything correctly, the coverage will be sufficiently close to 95%. Thus, under the assumptions of the procedure, the coverage is correct. Phew!
- Activity link: SCA 3
- Due today:
Assignment 2
SCA 2
Module IV: This thing called the p-value
The other foundation of introductory statistics was hypothesis testing and the p-value. This module examines how the p-value is calculated and how to create the tests that produce it.
April 21 (Monday)
Hypothesis Testing
Welcome to the fourth module of the course: Hypothesis Testing. During this module, you will learn how to create hypotheses, determine which is the null and which is the alternative, calculate an appropriate test statistic, calculate the corresponding p-value, and draw an appropriate conclusion about the hypothesis given to us by the researcher.
Today is just the introduction, where we learn the basic theory behind hypothesis testing. As we move forward in this module, we will learn a few methods for creating test statistics.
- Read for today:
Section 7.1
Section 6.1
Section 6.1
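To make the p-value concrete before the lectures dig deeper, here is a one-sample z-test computed by hand under made-up numbers (Python/SciPy standing in for R; σ is assumed known purely for illustration):

```python
import numpy as np
from scipy import stats

# A p-value by hand: one-sample z-test of H0: mu = 5 vs HA: mu != 5,
# with sigma = 2 assumed known.
rng = np.random.default_rng(8)
x = rng.normal(5.4, 2.0, size=50)          # data secretly drawn with mu = 5.4

z = (x.mean() - 5) / (2.0 / np.sqrt(50))
p_value = 2 * stats.norm.sf(abs(z))        # two-sided tail probability
print(z, p_value)
```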
April 23 (Wednesday)
Type I, Type II, and Power
- Read for today:
Section 7.1
Section 6.1
Section 6.1
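Both error rates and power can be estimated the same way we estimated coverage: simulate under a chosen "true" mean and count rejections. A sketch with made-up settings (Python/SciPy standing in for R):

```python
import numpy as np
from scipy import stats

# Estimate the Type I error rate and the power of the two-sided z-test
# by simulation.
rng = np.random.default_rng(9)
n, sigma, reps = 25, 1.0, 50_000            # made-up settings
z_star = stats.norm.ppf(0.975)

def reject_rate(true_mu, null_mu=0.0):
    """Fraction of simulated samples in which H0: mu = null_mu is rejected."""
    xbar = rng.normal(true_mu, sigma, size=(reps, n)).mean(axis=1)
    z = (xbar - null_mu) / (sigma / np.sqrt(n))
    return np.mean(np.abs(z) > z_star)

alpha_hat = reject_rate(0.0)                # Type I error rate: near 0.05
power_hat = reject_rate(0.5)                # power against mu = 0.5
print(alpha_hat, power_hat)
```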
April 24 (Thursday)
Neyman-Pearson Lemma
The Neyman-Pearson Lemma proves that tests with a certain form are “most powerful.” That is all. We should look at the form of the test and not worry about calculating the value of k.
- Read for today:
Section 7.2
Section 6.2
Section 6.2
April 25 (Friday)
Likelihood Ratio Tests
Last time, we looked at the Neyman-Pearson Lemma. According to the Lemma, tests of certain forms are most powerful. Well, those forms are ratios of likelihoods. Thus, today, we examine methods for creating tests based on the likelihood ratio. Be aware, again, that we are looking for the “form” of the test. The actual test we use will still be based on our knowledge of probability distributions. Thus, “A test where we reject for a sample mean that is ‘too large’” could be a likelihood ratio test (and, hence, most powerful). However, unless we know the distribution of the sample mean, it is not useful.
Again, our knowledge of probability distributions saves the day.
- Read for today:
Section 7.3
Section 6.3
Section 6.3
- Due today:
Assignment 3
SCA 3
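As one worked instance of the likelihood ratio idea, here is the statistic W = 2[ℓ(λ̂) − ℓ(λ₀)] for Poisson data, which (by Wilks' theorem) is approximately chi-square with 1 df under H0. A sketch with made-up numbers (Python/SciPy standing in for R):

```python
import numpy as np
from scipy import stats

# Likelihood ratio test of H0: lambda = 3 with Poisson data.
# The MLE is xbar, and minus twice the log-likelihood ratio,
#   W = 2 * [ loglik(lambda_hat) - loglik(lambda_0) ],
# is approximately chi-square(1) under H0 (Wilks' theorem).
rng = np.random.default_rng(10)
lam0 = 3.0
x = rng.poisson(lam=3.0, size=200)          # made-up data drawn under H0

def loglik(lam):
    return np.sum(stats.poisson.logpmf(x, lam))

W = 2 * (loglik(x.mean()) - loglik(lam0))
p_value = stats.chi2.sf(W, df=1)
print(W, p_value)
```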
April 28 (Monday)
The Usual Hypothesis Tests
This is the last lecture day of the module. As we did with the previous module, we go through several of the tests commonly used in statistics. The key for each is to realize that it is an appropriate test statistic because we know its distribution. Without knowing the distributions, the test statistic is worthless… much as was the case with the Neyman-Pearson Lemma and the Likelihood Ratio tests. If we do not know the distribution of the test statistic, we have nothing.
- Read for today:
Section 7.4, 7.5
Section 6.4, 6.5
Section 6.4, 6.5
April 30 (Wednesday)
SCA 4
If we need more time on the “Usual Hypothesis Tests,” we will take this class period. Otherwise, today is the in-class time devoted to the infamous SCA 4. In this SCA, you use your understanding of the test-creation process to create a hypothesis test for skew. Will you follow in the footsteps of Student and use simulation to find the critical values? Or will you be able to live up to the expectations of Ronald Fisher, who was able to find the distribution of the test statistic, allowing us to use any value of α, not just 0.05? Both are definitely options.
- Activity link: SCA 4
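The "Student route" mentioned above (simulated critical values) can be sketched in a few lines. This is not the SCA itself, just an illustration under made-up settings (Python/SciPy standing in for R):

```python
import numpy as np
from scipy import stats

# Build a skewness test the "Student" way: simulate the sampling
# distribution of the sample skewness under H0 (Normal data) and
# read off critical values from its quantiles.
rng = np.random.default_rng(11)
n, reps = 30, 100_000                        # made-up settings

null_skews = stats.skew(rng.normal(size=(reps, n)), axis=1)
lo, hi = np.quantile(null_skews, [0.025, 0.975])   # alpha = 0.05, two-sided
print(lo, hi)

# Apply to one observed sample: reject H0 if its skewness falls outside (lo, hi).
obs = stats.skew(rng.exponential(size=n))
print(obs, not (lo < obs < hi))
```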
May 1 (Thursday)
Alpha-Testing and Power-Testing
Today’s work focuses on checking if the given hypothesis test is appropriate and if there is a better test available. The first requires us to check that the rejection rate is as claimed (or close to it). The second requires us to think about power. Both of these checks are important for the course project (more on that next module).
Make sure you can check that the real Type I error rate is close to the claimed alpha-level. Also, make sure you can estimate the power of a test.
May 5 (Monday)
SCA 4
This marks the last day to work on SCA 4 in class. At the end of class, you will receive the Midterm Examination.
- Activity link: SCA 4
May 7 (Wednesday)
No Class
No class today. Work on your Midterm.
May 8 (Thursday)
No Class
No class today. Work on your Midterm.
May 9 (Friday)
No Class
No class today. Work on your Midterm.
Module V: Goodness of Fit tests
Now that the theory part of the course is finished, it is time to turn towards applications. The usual first application is goodness of fit. Here, we will use everything we have talked about and apply it to determining the distribution that generated a sample.
May 12 (Monday)
Goodness-of-Fit for Count Data
The Midterm Examination is due at the start of class today.
Today starts Module Five: Goodness of Fit. This module moves from the “theory” of the first half to the “application” of the second. It looks at how we can determine the distribution of a random variable based on the observed data… as opposed to basing it on the understood data-generating process. There are two main types of goodness-of-fit tests. The first is based on the empirical pdf. The second is based on the empirical CDF.
Today we look at the first type. Note that we will critique the weaknesses of this approach, noting that such tests rely far too much on “rules of thumb.”
- Read for today:
Section 7.6
Section 7.1, 7.2
Section 11.1–4
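A minimal chi-square goodness-of-fit calculation looks like this; the counts below are made up, and Python/SciPy stands in for the R used in class:

```python
import numpy as np
from scipy import stats

# Chi-square goodness of fit for count data: compare observed category
# counts with the counts expected under the hypothesized distribution.
observed = np.array([18, 30, 28, 24])                 # made-up counts
expected = np.array([0.25, 0.25, 0.25, 0.25]) * observed.sum()

# statistic = sum over categories of (obs - exp)^2 / exp
chi2_stat, p_value = stats.chisquare(observed, expected)
print(chi2_stat, p_value)                             # chi2 = 3.36 here
```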
May 14 (Wednesday)
Kolmogorov-Smirnov test (and others)
We will explore the Kolmogorov-Smirnov test. This test is used to determine if data follow a fully specified distribution. That is, it cannot be used to test if “The data are from a Normal.” It can, however, be used to determine if “The data are from a N(0,3) distribution.” We have already seen this test (ks.test) in determining if the p-values come from a standard Uniform, Unif(0,1).
Among other things, we will see what happens if we use the data to estimate the parameters of the distribution.
However, we also need to look at Normality tests, such as the Shapiro-Wilk test. This test (and its kin) are interesting in that they are “composite” tests. All you need to do is specify that the hypothesized distribution is “Normal” in order to use them. This is a great advancement over the Kolmogorov-Smirnov test, which requires fully stating the distribution. Because Normality is so important in Statistics, this is the focus of the course project.
- Read for today:
Section 7.6.3
Section 7.3.2–7.3.4
Section 11.5
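The fully-specified-versus-composite contrast is visible directly in code. The course uses R's `ks.test`; below, Python/SciPy stands in, with a made-up sample (here "N(0, 3)" is read as standard deviation 3):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
x = rng.normal(loc=0.0, scale=3.0, size=200)   # made-up sample

# Kolmogorov-Smirnov needs the null fully specified: here, N(0, 3).
ks_stat, ks_p = stats.kstest(x, "norm", args=(0, 3))

# Shapiro-Wilk is composite: it tests "Normal" with no parameters given.
sw_stat, sw_p = stats.shapiro(x)
print(ks_p, sw_p)
```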
May 15 (Thursday)
Brief Presentations on SCA 4 (5 minutes or less)
SCA 5: Goodness of Fit
Module Five ended quickly. It was short and sweet. It focused on being able to estimate the distribution of the data-generating process without being able to specify it from first principles. This SCA focuses on the famous Chi-Square Goodness-of-Fit test and some things it can test.
Note that the Chi-Square test actually takes an n-dimensional problem and makes it a one-dimensional problem. In other words, this seems like a geometry problem, where we are projecting a higher-dimensional observation onto a lower-dimensional one. What does this mean in terms of lost information?
- Activity link: SCA 5
Module VII: Bayesian Analysis
All of the statistics you have (probably) done at Knox is called frequentist (or Fisherian) statistics. This choice arises from the need to serve our client disciplines (Biology, Political Science, etc.) as well as from the need to ensure the calculations are easily done (mostly the first). In making this choice, however, we are left with confidence intervals and p-values, which really do not measure what we want them to measure. In this module, we discover another way of formulating statistics — one that leads to natural interpretations of the estimators.
May 16 (Friday)
Bayesian Point Estimation
Today, we start our final module: Bayesian Analysis, Module 7. Because Module 6 takes the space normally dedicated to two modules, and because we spent a significant amount of time on review, we are skipping it and moving directly to Bayesian Analysis.
So far in your statistical career at Knox College, you have formulated a research question, collected data, and used that data to estimate a confidence interval or test a hypothesis. The confidence interval is correctly interpreted as “We are 95% ‘confident’ that the population parameter is in this interval” … whatever that means. The p-value is correctly interpreted as “The probability of observing data this extreme — or more so — given the null hypothesis is true” … whatever that means.
Using Bayesian statistics, we can correctly make claims like “The probability that θ is in this interval is 95%” or “The probability that the null hypothesis is correct is 95%.”
This is the strength of the Bayesian paradigm: We can directly address what the researcher cares about — where the parameter is.
Finally, Bayesian analysis rocks because we are able to include the results of previous experiments (data-collections) in our new estimates.
- Read for today:
Section 11.1, 11.2
Section 11.1, 11.2
Section 10.1, 10.2
- Due today:
Assignment 4
SCA 4
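The simplest worked Bayesian example is the conjugate Beta-Binomial update, sketched here with made-up prior and data (Python/SciPy standing in for R):

```python
from scipy import stats

# Beta-Binomial conjugate update: prior Beta(a, b) plus k successes in
# n trials gives posterior Beta(a + k, b + n - k).
a, b = 2, 2                  # a mild prior centered at 0.5 (made-up choice)
k, n = 14, 20                # made-up data

post = stats.beta(a + k, b + n - k)
post_mean = post.mean()      # a Bayesian point estimate of theta
print(post_mean)             # (a + k) / (a + b + n) = 16/24
```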
May 19 (Monday)
Bayesian Credible Intervals
Last time, we were introduced to the Bayesian paradigm. We were able to estimate the value of the parameter. Today, we are able to calculate “credible intervals,” which are probability intervals for the parameter. This makes so much more sense than confidence intervals. No longer do we have to state that the intervals “contain the population parameter” a proportion γ of the time under repeated sampling. Now, we can provide a probability measure on a single experiment.
- Read for today:
Section 11.3
Section 11.3
Section 10.3
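A (central) credible interval is just a pair of posterior quantiles. Continuing the made-up Beta-Binomial posterior from last class (Python/SciPy standing in for R):

```python
from scipy import stats

# A 95% credible interval: the 2.5% and 97.5% quantiles of the posterior.
# Unlike a confidence interval, it supports the direct statement
#   P(lo < theta < hi | data) = 0.95.
post = stats.beta(16, 8)             # posterior from a made-up Beta-Binomial update
lo, hi = post.ppf([0.025, 0.975])
print(lo, hi)
```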
May 21 (Wednesday)
Bayesian Hypothesis Testing
Last time, we provided the understanding of confidence intervals in the Bayesian perspective. We learned that Bayes allowed us to make the statements we hoped confidence intervals could make about θ. Today, we do the same with hypothesis tests.
One thing we can do quite easily with Bayes is to test hypotheses that θ is in a given interval. You cannot do that with frequentist statistics.
For instance, Bayes allows us to easily determine the probability that θ is between 0.3 and 0.9. There is no easy way of doing that with frequentist statistics. This is yet another strength of Bayes.
So, why don’t we teach Bayes in STAT 200?
- Read for today:
Section 11.4
Section 11.4
Section 10.4
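That interval probability is literally one line of posterior arithmetic. Using the same made-up Beta posterior as before (Python/SciPy standing in for R):

```python
from scipy import stats

# Bayesian interval hypothesis: P(0.3 < theta < 0.9 | data) is just
# posterior probability mass, read straight off the posterior CDF.
post = stats.beta(16, 8)             # posterior from a made-up Beta-Binomial update
prob = post.cdf(0.9) - post.cdf(0.3)
print(prob)
```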
May 22 (Thursday)
Bayesian Activity
Now that we have learned a bit about Bayesian analysis, let us perform an in-class activity to help us better understand the difference between frequentist and Bayesian analyses.
⟼ Today’s Activity
May 23 (Friday)
End of Term Activity 1
This is a second look at Bayesian Analysis. Please make sure you have access to your books and have skimmed the section before class.
⟼ Today’s Activity
- Read for today:
Section 11.5
Section 11.5
Section 10.5
- Due today:
Assignment 5
SCA 5
May 26 (Monday)
No formal class
There is no formal class held today. I will be in the classroom to answer questions.
May 28 (Wednesday)
No formal class
There is no formal class held today. I will be in the classroom to answer questions.
May 29 (Thursday)
Presentations
The project presentations are at the start of the class today. Be prepared. The presentations should be about 5–10 minutes. That should be enough time for you to properly present your results. Remember that you will need to focus on your Normality test. Be able to explain how it works, why it is able to test Normality, and how well it works in relation to the Shapiro-Wilk test. This is what we care about.
You will receive the Final Examination in class today. It is due at the end of our scheduled Final Examination period.
Module VIII: It’s the End of the Term as We Know It (and I Feel Fine)
Well, that is the end of the term. We are looking at two reading days and three final examination days. Your final examination and written paper are due on the last day of finals.
May 30 (Friday)
Reading Day
Spend today reviewing your notes carefully so that you are dealing with the material one more time. Research shows that the more you interact with the material, the better you learn (and understand) it.
May 31 (Saturday)
Reading Day
Spend today working on your examination and your paper.
June 1 (Sunday)
Final Examinations
June 2 (Monday)
Final Examinations
June 3 (Tuesday)
Final Examinations
Your final examination is due at the end of the final exam period today (10:00 pm).
Also, your paper is due at the same time.
June 4 (Wednesday)
Summer Break
Yay! This is the start of Summer “Vacation.” What will you do to help your future this summer?
And that is all there is to this course! Again, the key is to be proactive in your learning. Learning how to learn will serve you well into the future. In many ways, your grade in this course has less to do with your statistical abilities than it does with your abilities as a student. Since we all have things to learn about learning, spend this time polishing your skills.