Mathematical Statistics, II

Assignment 5

This is the first homework assignment that deals with applications of the theory we have learned in this course, not just the theory itself. In the application part of these assignments, I expect you to make an argument and to support that argument with properly cited facts from your analysis. This means you report the values of the appropriate statistics alongside your interpretation of them.

Graphics are essential to telling the story of your analysis, as is a professional-looking submission.

 

The Part: Kolmogorov and Smirnov

The Kolmogorov-Smirnov test is a true expression of statistics. First, Kolmogorov proposed a particular method for testing the difference between an observed distribution and a hypothesized distribution. The distribution of his test statistic under the null hypothesis takes the form of an infinite series. In other words, Kolmogorov provided the exact distribution, even though it was difficult to use.

Second, Smirnov provided a simplification of Kolmogorov's mathematically exact distribution. This simplification helped make the Kolmogorov-Smirnov test usable. In other words, the mathematics got us to the exact, yet impractical, result, and statistics got us something useful (yet only approximate).

In this assignment, the first problem has you calculate by hand the Kolmogorov-Smirnov test statistic, \(D\), for a few hypothesized distributions. The second problem has you check your work using R.

 

Problem 1: The Probability Function

Calculate the value of \(D\) between the empirical (observed) distribution of the net profit and each of the two distributions below.

To do this, you will need to first calculate the empirical (observed) cumulative distribution of the data. The data you use will be the daily net profit earned by the Lamplighter Restaurant. The variable is a part of the lamplighterSales.csv dataset located at

https://courses.kvasaheim.com/math322/assignments/lamplighterSales.csv

Then, for each hypothesized distribution, you will need to use its theoretical CDF to calculate the maximum absolute difference between the empirical CDF and the theoretical CDF.
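For reference, the standard definition of the statistic can be written as below. The notation is an assumption of this sketch, not something fixed by the assignment: \(F_n\) is the empirical CDF of a sample of size \(n\), \(F\) is the hypothesized CDF, and \(x_{(1)} \le \dots \le x_{(n)}\) are the ordered observations.

\[
F_n(x) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{x_i \le x\},
\qquad
D = \sup_x \left| F_n(x) - F(x) \right|
\]

When \(F\) is continuous, the supremum is attained at one of the jumps of \(F_n\), so it is enough to check both sides of each jump:

\[
D = \max_{1 \le i \le n} \max\left\{ \frac{i}{n} - F\!\left(x_{(i)}\right),\; F\!\left(x_{(i)}\right) - \frac{i-1}{n} \right\}
\]

This second form relies on the continuity of \(F\); it should not be assumed to carry over unchanged to a discrete distribution.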

The two distributions you will use are

  1. Normal\((\mu=2000;\ \sigma=1000)\)
  2. Poisson\((\lambda=2000)\)
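The by-hand calculation can be sketched in R as follows. This is only a minimal sketch on a made-up sample, not the Lamplighter data: the vector name `profit` and its values are hypothetical, and the formula used assumes a continuous hypothesized CDF, so only the Normal case is shown.

```r
# Toy sample: hypothetical values, NOT the Lamplighter data
profit <- c(1250, 2900, 1800, 2100, 3400, 900, 2300, 1600)

x <- sort(profit)   # ordered observations x_(1) <= ... <= x_(n)
n <- length(x)
i <- seq_len(n)

# Hypothesized CDF, Normal(mu = 2000, sigma = 1000), at each ordered observation
Fx <- pnorm(x, mean = 2000, sd = 1000)

# The empirical CDF jumps from (i - 1)/n to i/n at x_(i),
# so the gap is checked on both sides of each jump
d_plus  <- max(i / n - Fx)        # empirical CDF above the hypothesized CDF
d_minus <- max(Fx - (i - 1) / n)  # empirical CDF below the hypothesized CDF
D_hand  <- max(d_plus, d_minus)

D_hand
```

Writing out `d_plus` and `d_minus` separately mirrors the columns one would build in a hand-calculation table, which makes it easier to see where the largest gap occurs.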

 

Problem 2: The R Function

Check your work using the ks.test function in R. Explain any differences.
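A call to `ks.test` might look like the sketch below. It again uses a made-up sample (the name `profit` and its values are hypothetical), because the column name in the CSV file is not stated here. The parameters of the hypothesized distribution are passed by name after the name of the CDF function.

```r
# Toy sample: hypothetical values, NOT the Lamplighter data
profit <- c(1250, 2900, 1800, 2100, 3400, 900, 2300, 1600)

# One-sample test against Normal(mu = 2000, sigma = 1000)
res_norm <- ks.test(profit, "pnorm", mean = 2000, sd = 1000)
res_norm$statistic   # the value of D
res_norm$p.value

# One-sample test against Poisson(lambda = 2000).
# The ks.test help page notes that only continuous CDFs are valid,
# so this result should be read with care.
res_pois <- ks.test(profit, "ppois", lambda = 2000)
res_pois$statistic
```

The object returned by `ks.test` is a list, so the statistic can be extracted with `$statistic` and compared directly with a hand-calculated value.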

 

 

The Extra Credit Problem: The Gamma to the Rescue

For extra credit, do the above with this distribution.

  1. Gamma\((\alpha=10;\ \beta=200)\)

In other words, calculate \(D\), compare it to the value calculated by R, and explain any differences. In R, \(\alpha\) is the shape parameter and \(\beta\) is the scale parameter.
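The parameterization can be sketched in R as below. The sample is made up (the name `profit` and its values are hypothetical), and the variable names `alpha` and `beta` are introduced only for illustration. Naming `scale` explicitly matters because `pgamma` takes `rate` as the argument that follows `shape`.

```r
# Toy sample: hypothetical values, NOT the Lamplighter data
profit <- c(1250, 2900, 1800, 2100, 3400, 900, 2300, 1600)

alpha <- 10    # shape
beta  <- 200   # scale

# Sanity check: the mean of this Gamma distribution is alpha * beta
gamma_mean <- alpha * beta

# The scale and rate parameterizations agree when rate = 1 / scale
p_scale <- pgamma(2000, shape = alpha, scale = beta)
p_rate  <- pgamma(2000, shape = alpha, rate = 1 / beta)

# One-sample test against Gamma(alpha = 10, beta = 200)
res_gamma <- ks.test(profit, "pgamma", shape = alpha, scale = beta)
res_gamma$statistic   # the value of D
```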

This page was last modified on 6 April 2023.
All rights reserved by Ole J. Forsberg, PhD, ©2008–2023. No reproduction of any of this material is allowed without explicit written permission of the copyright holder.