########################################
#
# Script: 19 April 2011 (20110419.R)
#
########################################

# Today:
#
# ** Monte Carlo and testing tests
#    . Test assumptions violated? Perform it anyway!
#    . No available test? Create one!
#
# Another use for Monte Carlo techniques is to help determine the
# answer to the common question:
#
#     "How large does n have to be?"
#
# If anyone quickly tells you a number, do not trust that person.
# The correct answer is "It depends on your sample." Thus, as
# social scientists, we can even use a t-test to compare two
# samples from an Exponential distribution, as long as
# "n is large enough".
#
# The (supposedly inviolate) assumption of the t-test was that the
# samples were Normally distributed. But we can still use the t-test
# even if the samples are not Normally distributed: by the Central
# Limit Theorem (from Methods I), averages of large samples
# are approximately Normally distributed. So, even for non-Normal
# data, we just need a large sample size.
#
# How large? It depends on how non-Normal your data are.

# Example:
#
# The data are drawn from an Exponential distribution (perhaps
# lifetimes of lightbulbs).
#
# To see how non-Normal an Exponential distribution is, let
# us plot the Normal pdf and an Exponential pdf:

X <- -60:60/10    # From -6 to 6, but with a step size of 0.10
Y <- dnorm(X)     # pdf for the Normal
Z <- dexp(X)      # pdf for the Exponential

plot(X, Y, type="l", lwd=2, col=3, ylim=c(0,1), las=1)
lines(X[X>=0], Z[X>=0], lwd=2, col=2)

# Wow! Look how much the distributions differ! No one would
# ever mistake an Exponential distribution for a Normal, right?
# So, can we use a t-test on Exponential data?
#
# Let's see how bad it is for this sample:

n <- 10
z <- rexp(n)
t.test(z, mu=1)

# Well, it worked in that case (or maybe not). Will it work
# in general? To answer this, we use Monte Carlo.
#
# Three steps to any Monte Carlo experiment:
#   1. Get a random sample
#   2. Calculate the p-value of the test
#   3.
#      Repeat with another random sample
#
# The rest of the steps depend on the program.
#
# So, let us do the experiment:

p <- numeric()    # This will hold the p-values
trials <- 1e4     # This is the number of trials to run
n <- 10           # This is the sample size per trial

for(i in 1:trials) {                 # Start the loop
  z <- rexp(n)                       # Get the random sample
  p[i] <- t.test(z, mu=1)$p.value    # Get the p-value from the test
}                                    # End the loop

# The results
p                     # All of the p-values from the experiment
hist(p, breaks=21)    # The pdf of the p-values

# NOW:
# Remember that, for an appropriate test, the p-values are
# Uniformly distributed from 0 to 1 when the null hypothesis is
# true. (Think about what this means.) So, how bad was the
# distribution of the p-values for a sample size of 10?
#
# Yeah, not good. Note that the first bar (the rejection bar)
# is much higher than it should be.
#
# Conclusion:
#   If your data are exponentially distributed, and your sample
#   is of size 10, you cannot meaningfully use the t-test.

# What about a larger sample size? How large??? Fiddle with the
# sample size, n, until you get the histogram you need:

p <- numeric()
trials <- 1e5
n <- 1000    # The sample size

for(i in 1:trials) {
  z <- rexp(n)
  p[i] <- t.test(z, mu=1)$p.value
}

hist(p, breaks=21)
abline(h=trials/20, lty=2, lwd=2, col="red")

# See how this pdf is much more uniform.
#
# Conclusion:
#   If your data are exponentially distributed, and your sample
#   is of size 1000, you *can* meaningfully use the t-test.
#
# You can do this with other distributions. All it takes is knowing
# your data. =)
#   If your data are count data, what is their distribution?     Poisson
#   If your data are coin-flip data, what is their distribution? Binomial
#   If your data are lifetime data, what is their distribution?  Gamma
# Hmmm... these sound familiar.

##################################################
# What if no test exists?
#
# Remember back a couple of months ago when I told you the three
# needs of a good test statistic?
# Well, the third was that the test statistic had a known
# distribution.
#
# We do not always have that luxury. For instance, there is no
# known distribution for comparing two binomial distributions.
# So, questions such as "Are the success probabilities for these
# two sets of binomial data equal?" cannot be answered exactly. They
# cannot even be answered approximately for small sample sizes using
# the usual Normal approximation.
#
# The only way to determine if two proportions are statistically
# equal is to calculate an empirical confidence interval.
#
# Idea:
#   Create a test statistic that is large when the null hypothesis
#     is unlikely.
#   Find the distribution of that test statistic by repeatedly
#     calculating the test statistic from random data generated
#     under the null hypothesis.
#   Find the appropriate confidence interval from the definition
#     of a confidence interval.

# Example:
#   I throw 5 quarters into the air and count the number of heads.
#     This constitutes one sample of data (XQ).
#   I throw 5 nickels into the air and count the number of heads.
#     This constitutes the second sample of data (XN).
#
# H0: The probability of getting a Head is the same for each type of coin.
#     (pQ == pN)
# HA: The null hypothesis is not correct.
#     (pQ != pN)

# Samples
XQ <- c(3,2,0,2,1,5,3,2,2,2)
XN <- c(4,3,1,2,3,4,2,2,4,4)

# Let us choose the difference in means as our test statistic, TS
TS <- mean(XQ) - mean(XN)
TS

# We see TS = -0.7 for our data.
# Q: Is this too small a value under our H0?
# A: Find an empirical distribution for our test statistic:

T <- numeric()
trials <- 1e4    # Larger is better, but slower (as usual)
n <- 10          # Number of times I threw the coins into the air
sz <- 5          # Number of coins I threw into the air
pr <- 0.51       # Under the null, the probabilities are equal; thus,
                 # they are equal to the pooled probability.
# A general way of calculating this pooled success probability is
#   pr <- (mean(XQ)/sz + mean(XN)/sz)/2

for(i in 1:trials) {
  X1 <- rbinom(n, size=sz, prob=pr)
  X2 <- rbinom(n, size=sz, prob=pr)
  T[i] <- mean(X1) - mean(X2)
}

T                              # The values of the test statistic
hist(T)                        # The pdf of the test statistic
quantile(T, c(0.025,0.975))    # The 95% empirical confidence interval

# As our original test statistic, TS = -0.7, is contained within this
# interval, we conclude that we cannot reject the null hypothesis
# that the success probabilities of the two coins are the same.

# For a different alpha?
alpha <- 0.1
quantile(T, c(alpha/2, 1-alpha/2))

# We cannot reject, at the alpha=0.10 level, the null hypothesis
# that the success probabilities of the two samples are the same.

# Example:
#   I measure the time until a lightbulb dies for 10 Type I lightbulbs.
#   I measure the time until a lightbulb dies for 10 Type II lightbulbs.
#
# H0: The expected lifetimes are the same
# HA: Not H0
#
# Recall that lifetimes are distributed Exponential.

# My sample:
T1 <- c(0.0047, 0.0311, 0.2577, 0.1801, 0.1290,
        0.9671, 0.5793, 0.5175, 0.0486, 0.9676)
T2 <- c(1.1908, 0.1641, 0.5220, 0.0183, 2.0449,
        1.9009, 1.2569, 1.4883, 2.3701, 0.8026)

# Why not use the difference in the average lifetimes as our statistic?
TS <- mean(T1) - mean(T2)
TS

# TS = -0.80762
# As our test statistic has no known distribution, we need to find
# its distribution empirically:

T <- numeric()
trials <- 1e4
n <- 10
avg <- mean(c(T1,T2))    # Under the null, the pooled average lifetime

for(i in 1:trials) {
  X1 <- rexp(n, rate=1/avg)
  X2 <- rexp(n, rate=1/avg)
  T[i] <- mean(X1) - mean(X2)
}

T
hist(T)
quantile(T, c(0.025,0.975))

# Our original test statistic, TS = -0.80762, is not contained in this
# interval. Thus, at the alpha=0.05 level, we reject the null hypothesis
# and conclude that the average lightbulb lifetimes are not the same
# between the two groups.
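# Both examples above follow the same recipe: simulate the test
# statistic many times under the null hypothesis, then check whether
# the observed value falls inside the empirical confidence interval.
# As a sketch, that recipe can be wrapped in a small helper. The name
# empirical.ci and its arguments are my own, not part of the course code.

```r
# empirical.ci: empirical confidence interval for a test statistic,
# given a function rgen() that returns one simulated value of the
# statistic drawn under the null hypothesis.
empirical.ci <- function(rgen, trials=1e4, alpha=0.05) {
  T <- replicate(trials, rgen())           # simulate the statistic
  quantile(T, c(alpha/2, 1 - alpha/2))     # empirical confidence interval
}

# Recreating the lightbulb experiment with the helper:
T1 <- c(0.0047, 0.0311, 0.2577, 0.1801, 0.1290,
        0.9671, 0.5793, 0.5175, 0.0486, 0.9676)
T2 <- c(1.1908, 0.1641, 0.5220, 0.0183, 2.0449,
        1.9009, 1.2569, 1.4883, 2.3701, 0.8026)
avg <- mean(c(T1, T2))                     # pooled mean under the null
ci  <- empirical.ci(function()
         mean(rexp(10, rate=1/avg)) - mean(rexp(10, rate=1/avg)))
ci    # TS = -0.80762 should again fall outside this interval
```

# The same helper covers the coin example: pass it a rgen() built
# from rbinom() instead of rexp().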
##################################################
# Using Monte Carlo methods, we can also estimate the probability
# of events happening when all we have is a point estimate.
#
# Example:
#   We modeled the data and predicted that the vote in favor of
#   Maine's ssm ballot measure would be 45%. That's nice. But a
#   more interesting question to ask is: "What is the probability
#   that the ballot measure will pass in Maine?"
#
# We will do that on Thursday.
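# A preview of that idea, as a sketch. Take the model's point estimate
# of 45% for the Yes vote, and suppose, purely as an assumption for
# this sketch, that the model also reports a standard error of 4
# percentage points (the lecture gives only the point estimate). Then
# we can simulate many plausible outcomes and count how often the
# measure passes:

```r
trials <- 1e5
est <- 0.45    # point estimate of the vote share (from the lecture)
se  <- 0.04    # ASSUMED standard error -- not given in the lecture

share  <- rnorm(trials, mean=est, sd=se)    # simulated vote shares
p.pass <- mean(share > 0.50)                # fraction of simulations that pass
p.pass
```

# Note how the three Monte Carlo steps reappear: draw a random
# outcome, record whether the event happened, repeat.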