Mathematical Statistics, II

 

Assignment 6

This assignment covers linear regression and the estimation of effect sizes. Recall that the effect size measures how much the dependent variable changes in response to a change in the independent variable; in a simple linear model, it is the slope. It is the quantity that applied researchers typically care about most.

 

Part I: Ordinary Least Squares Regression

In this part of the homework, we work entirely within the classical linear model (CLM), applied to simple linear regression. Thus, the model is

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

and the assumption is

\[ \varepsilon \stackrel{\text{iid}}{\sim} \text{Normal}(0;\ \sigma^2) \]

In this part, you will show that MSE is an unbiased estimator of \(\sigma^2\). The extra credit problem at the end will have you do the same, but in a different way.
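
The problems below use the sums \(S_{xy}\) and \(S_{yy}\) without defining them. Assuming the usual textbook notation (an assumption here, since the assignment does not spell it out), the relevant quantities would be:

\[ S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2, \qquad S_{xy} = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}), \qquad S_{yy} = \sum_{i=1}^n (y_i - \bar{y})^2 \]

\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\ \bar{x}, \qquad \text{SSE} = \sum_{i=1}^n \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2 \]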

 

Problem 1: MSE Unbiasedness, the Start

Show that the sum of squared errors, SSE, can be written as

\[ \text{SSE} = S_{yy} - \hat{\beta}_1\ S_{xy} \]
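
Before proving the identity, it can help to see it hold numerically. The following is a minimal sketch, not a proof: it assumes the usual definitions \(S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})\) and \(S_{yy} = \sum (y_i - \bar{y})^2\), and it uses made-up synthetic data. The function names are hypothetical.

```python
import random

def ols_sums(x, y):
    """Return (Sxx, Sxy, Syy) for paired samples x and y."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxx, sxy, syy

def sse_direct(x, y):
    """SSE computed from the residuals of the least-squares line."""
    n = len(x)
    sxx, sxy, _ = ols_sums(x, y)
    b1 = sxy / sxx
    b0 = sum(y) / n - b1 * sum(x) / n
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def sse_shortcut(x, y):
    """SSE computed as Syy - b1 * Sxy, the form Problem 1 asks for."""
    sxx, sxy, syy = ols_sums(x, y)
    return syy - (sxy / sxx) * sxy

# Synthetic data (an assumption for illustration, not the crime dataset)
rng = random.Random(2023)
x = [rng.uniform(0, 10) for _ in range(25)]
y = [1.5 + 0.8 * xi + rng.gauss(0, 2) for xi in x]

# The two values should agree up to floating-point rounding
print(sse_direct(x, y), sse_shortcut(x, y))
```

Agreement on one dataset is only a sanity check; the problem still asks for the algebraic argument.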

 

Problem 2: MSE Unbiasedness, the Sequel

Show that the expected value of the sum of squared errors, SSE, is

\[ \mathbb{E}\left[\text{SSE}\right] = (n-2) \sigma^2 \]
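
The claim can be checked by simulation before it is proved. The sketch below is one way to do so, under stated assumptions: the design points, the true coefficients, and the value of \(\sigma\) are all made up, and SSE is computed with the shortcut form from Problem 1. The function names are hypothetical.

```python
import random

def sse(x, y):
    """SSE of the least-squares line, via Syy - Sxy^2 / Sxx."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return syy - sxy * sxy / sxx

def mean_sse(n, sigma, reps, seed=0):
    """Average SSE over `reps` simulated datasets with iid Normal(0, sigma^2) errors."""
    rng = random.Random(seed)
    x = [float(i) for i in range(n)]  # fixed design points (assumed)
    total = 0.0
    for _ in range(reps):
        # True line 2.0 + 0.5 x is an arbitrary choice for illustration
        y = [2.0 + 0.5 * xi + rng.gauss(0.0, sigma) for xi in x]
        total += sse(x, y)
    return total / reps

n, sigma = 10, 2.0
estimate = mean_sse(n, sigma, reps=20000)
target = (n - 2) * sigma ** 2  # the value claimed in Problem 2

# The simulated average should be close to the target, not exactly equal
print(estimate, target)
```

Dividing by \(n\) or \(n-1\) instead of \(n-2\) would leave a visible gap between the two printed values, which is the point of the degrees-of-freedom correction.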

 

Problem 3: MSE Unbiasedness, the Conclusion

Finally, in one line, show that

\[ \text{MSE} := \frac{ \text{SSE} }{n-2} \]

is an unbiased estimator of \(\sigma^2\).

 

 

Part II: Applied OLS Regression

In this part, you need to use the crime dataset to answer these questions. The crime data file can be found in the expected place:

https://rfs.kvasaheim.com/data/crime.csv

Problem 4: Modeling

Fit a model for the average school enrollment in 2000 (enroll00) using the gross state product (GSP) per capita in 1990 (gspcap90). Check the assumptions of the regression and report the p-value for the Shapiro-Wilk test of the residuals. Even if a requirement is violated, proceed as though it is not.
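
The fitting step can be sketched with nothing but the Python standard library. This is a hedged illustration, not the course's prescribed workflow: it assumes the data file has been saved locally, that missing values appear as non-numeric text, and the tiny in-memory table below is made up to stand in for crime.csv (its `state` column and all of its values are hypothetical). The function names are hypothetical as well.

```python
import csv
import io

def read_columns(handle, x_name, y_name):
    """Read two numeric columns from a CSV file handle, skipping unusable rows."""
    xs, ys = [], []
    for row in csv.DictReader(handle):
        try:
            xv, yv = float(row[x_name]), float(row[y_name])
        except (TypeError, ValueError):
            continue  # missing or non-numeric entry (assumed encoding of NA)
        xs.append(xv)
        ys.append(yv)
    return xs, ys

def fit_line(xs, ys):
    """Closed-form least-squares fit; returns (b0, b1, residuals)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    residuals = [y - b0 - b1 * x for x, y in zip(xs, ys)]
    return b0, b1, residuals

# Made-up stand-in for the real file; with a local copy one would instead use
# open("crime.csv", newline="") as the handle.
demo = io.StringIO(
    "state,gspcap90,enroll00\n"
    "A,10,1.0\n"
    "B,20,NA\n"
    "C,30,5.0\n"
    "D,40,6.0\n"
)
xs, ys = read_columns(demo, "gspcap90", "enroll00")
b0, b1, residuals = fit_line(xs, ys)
print(b0, b1, residuals)
```

The residuals are what the assumption checks operate on. The Shapiro-Wilk test is not in the standard library; if SciPy is available, `scipy.stats.shapiro` applied to the residuals is one implementation.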

 

Problem 5: Effects

Report a point estimate and a 95% confidence interval for the effect of GSP per capita in 1990 on the enrollment rate.
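
As a reminder of the form such an interval usually takes, and assuming the CLM holds and \(S_{xx} = \sum (x_i - \bar{x})^2\), the point estimate is the fitted slope \(\hat{\beta}_1\) and a 95% confidence interval would be

\[ \hat{\beta}_1 \ \pm\ t_{0.975,\ n-2}\ \sqrt{ \frac{\text{MSE}}{S_{xx}} } \]

where \(t_{0.975,\ n-2}\) denotes the 0.975 quantile of the t distribution with \(n-2\) degrees of freedom. Statistical software reports this interval directly; the formula is shown only to make clear where MSE enters.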

 

Problem 6: Estimation

Estimate the expected enrollment rate for a state with a GSP per capita of $60,000. Also, provide a 95% confidence interval.

 

Problem 7: Predictions

Predict the enrollment rate for a state with a GSP per capita of $60,000. Also, provide a 95% prediction interval.
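
Problems 6 and 7 share a point estimate but not an interval: the prediction interval adds the variance of a new observation to the variance of the estimated mean. The sketch below illustrates the difference on made-up data. The function name is hypothetical, and the critical value `t_crit` is passed in by hand (2.776 is the 0.975 quantile of the t distribution with 4 degrees of freedom, matching the six made-up points).

```python
import math

def intervals_at(x, y, x0, t_crit):
    """Return (fit, confidence interval, prediction interval) at x0."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    mse = (syy - b1 * sxy) / (n - 2)
    fit = b0 + b1 * x0
    # Variance factor for the estimated mean response at x0
    mean_factor = 1.0 / n + (x0 - xbar) ** 2 / sxx
    half_ci = t_crit * math.sqrt(mse * mean_factor)
    # A new observation adds one extra sigma^2, hence the "1 +"
    half_pi = t_crit * math.sqrt(mse * (1.0 + mean_factor))
    return fit, (fit - half_ci, fit + half_ci), (fit - half_pi, fit + half_pi)

# Made-up data for illustration only (not the crime dataset)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
fit, conf_int, pred_int = intervals_at(x, y, x0=4.5, t_crit=2.776)

print(fit, conf_int, pred_int)
```

The prediction interval always contains the confidence interval. With the real data, `x0` must be expressed in the same units as gspcap90 in the file, which should be checked before plugging in $60,000.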

 

 

The Extra Credit Problem: There is Another…

Above, you proved that the MSE was an unbiased estimator of \(\sigma^2\). Here, you will do it a different way.

Under the CLM assumptions, prove

\[ \frac{(n-2)\ \text{MSE}}{\sigma^2} \sim \chi^2_{n-2} \]

Then use the fact that the expected value of a Chi-square random variable is its degrees of freedom.
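
Once the distributional result is established, the final step is just linearity of expectation. A sketch of that last step:

\[ \mathbb{E}\left[ \frac{(n-2)\ \text{MSE}}{\sigma^2} \right] = n-2 \quad \Longrightarrow \quad \mathbb{E}\left[\text{MSE}\right] = \frac{\sigma^2}{n-2}\ (n-2) = \sigma^2 \]

The substance of the extra credit problem is the Chi-square result itself, which this sketch takes as given.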

 

This page was last modified on 6 April 2023.
All rights reserved by Ole J. Forsberg, PhD, ©2008–2023. No reproduction of any of this material is allowed without explicit written permission of the copyright holder.