##### Demonstration Script #1a #####
#####         MATH322          #####
#####   Estimator Evaluation   #####

### Packages extend functionality
### Or, load several functions without creating a package
source("http://rfs.kvasaheim.com/math322.R")


### Binomial
# The MOM and MLE estimator for pi is X/n. Let us compare
# that estimator to the Agresti-Coull estimator, (X+1)/(n+2).
#
# See your notes for the calculations.
#

n = 10
p = seq(0,1, length=1e4)

mseP = p*(1-p)/n
mseT = n*p*(1-p)/(n+2)^2 + ((n*p+1)/(n+2)-p)^2

plot(p,mseP)
lines(p,mseT, col="purple",lwd=2)

## Change the value of n to see the effect of sample size on
#  which estimator has the lowest MSE. As n -> Inf, the difference
#  between the estimators vanishes.

#  In all cases, if we suspect p is middling, then we should use
#  the Agresti-Coull estimator. If we are estimating the probability
#  of rare or common events, then the MOM estimator (Wald estimator)
#  should be used.


### Normal
# The MOM and MLE estimator for the mean is x-bar. Let us
# compare that estimator with the median as an estimator.
# While calculating the MSE for the MOM estimator is easily
# done, such is not the case with the median. What is its
# variance?
#
# This process can make estimating the MSE doable for
# any estimator.

n = 100

MOM = numeric()
MED = numeric()

for(i in 1:1e4) {
  x = rnorm(n, m=4, s=11)
  MOM[i] = mean(x)
  MED[i] = median(x)
}

# Bias:
mean(MOM - 4)
mean(MED - 4)

# Variance:
var(MOM - 4)
var(MED - 4)

# MSE:
mean( (MOM-4)^2 )
mean( (MED-4)^2 )

## Feel free to change n. Note that as n -> Inf, the
#  differences in MSE do not disappear. The mean estimator
#  is more efficient than the median estimator.


### Unif(0, theta)
# The MOM for theta is 2*xbar.
# The MLE for theta is max{X}.
#
# Which is better? The bias can be calculated
# mathematically: bias(MOM) = 0; bias(MLE) = -theta/(n+1).
# Calculating the variance, however, is a bit more
# difficult... doable, but difficult.
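## A sketch, not from the notes: the closed-form results (standard
#  calculations for Unif(0, theta), once the variance work is done)
#  are
#    MSE(MOM) = theta^2/(3*n)
#    MSE(MLE) = 2*theta^2/( (n+1)*(n+2) )
#  Plotting both against n, in the style of the binomial example,
#  shows the MLE is never worse and is strictly better for n >= 3.

theta = 10
n     = 2:50

mseMOM = theta^2/(3*n)
mseMLE = 2*theta^2/( (n+1)*(n+2) )

plot(n, mseMOM, type="l", col="red", lwd=2, ylab="MSE")
lines(n, mseMLE, col="purple", lwd=2)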
#
# This process should look familiar (see the previous code).

n     = 100
theta = 10

MOM = numeric()
MLE = numeric()

for(i in 1:1e4) {
  x = runif(n, min=0, max=theta)
  MOM[i] = 2*mean(x)
  MLE[i] = max(x)
}

# Bias:
mean(MOM - theta)
mean(MLE - theta)

# Variance:
var(MOM - theta)
var(MLE - theta)

# MSE:
mean( (MOM-theta)^2 )
mean( (MLE-theta)^2 )

## So, which of the two estimators should you use if
#  all you care about is bias? Which should you use if
#  you care about MSE?


### Bern(pi)
# Let us draw once from this distribution to estimate
# pi. The MOM is X. Let us introduce the U estimator: 0.500.
# So, which of the two estimators is best?
#
# It can be shown that MSE(MOM) = p(1-p) and MSE(U) = (0.5-p)^2.
#
# Which estimator is better in terms of MSE?
#
# This code is similar to the code for the first example. Why
# are we using that code instead of that from the second
# example?

p = seq(0,1, length=1e4)

mseM = p*(1-p)
mseU = (0.5-p)^2

plot(p,mseM, type="l", col="red", lwd=2)
lines(p,mseU, col="purple", lwd=2)

## So, what is our conclusion?
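## A sketch of one way to make the conclusion precise (the use of
#  uniroot here is my addition, not part of the notes): the two MSE
#  curves cross where p(1-p) = (0.5-p)^2, which has roots at
#  p = (2 -/+ sqrt(2))/4, approximately 0.146 and 0.854.

f  = function(p) p*(1-p) - (0.5-p)^2
lo = uniroot(f, c(0.0, 0.5))$root   # about 0.146
hi = uniroot(f, c(0.5, 1.0))$root   # about 0.854

#  So the constant estimator U = 0.500 has the lower MSE when p is
#  middling (between the two roots), while the MOM has the lower MSE
#  when p is near 0 or 1... the same flavor of conclusion as in the
#  binomial example.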