#################### # # Filename: 20110823.R # #################### # # Purpose: Demonstrate t-tests and the use of # random numbers in evaluating tests # # Step 1: Create some data x <- rnorm(n=10, m=10,s=12) # The vector x holds 100 pieces of data. We actually # known that this data is Normally distributed; we # designed it that way (the 'norm' in 'rnorm'). We # also know the mean of that population (m=10). Why # might it be useful to know reality for once? # Step 2: State our null hypothesis # # H0: The mean of the population equals 11. # Step 3: Test the null hypothesis t.test(x, mu=11) # Note that p > 0.05, thus we cannot reject the null hyothesis # and we conclude that 11 is reasonable mean for the population. # Note, we said REASONABLE. # Also note, your answer may differ. This is the nature of # random data generation. To ensure that we are on the 'same # page,' we can set the random number generator (RNG) seed: set.seed(111) # Now, our answers will agree. ### Paired-Sample t-test # Step 0: When do we use this test? Refer to notes/book. # Step 1: Create some data x.before <- rnorm(n=10, m=11, s=12) x.after <- rnorm(n=10, m=20, s=12) # These are both valid variable names (and quite clear) # Step 2: State our null hypothesis # # H0: There is no difference between the pre- and post-tests # Step 3: Test the null hypothesis t.test(x.before,x.after, paired=TRUE) # Since p < 0.05, thus we reject the null hypothesis and # conclude that there is a significant difference between pre- # and post-tests at the alpha=0.05 level. # We could have also concluded that it was unreasonable at the # alpha=0.05 level to conclude that there is no difference between # pre- and post-tests. ### Independent-samples t-test # Step 0: When do we use this test? Refer to notes/book. # Step 1: Create some data set.seed(222) x1 <- rnorm(n=10, m=11, s=12) x2 <- rnorm(n=10, m=20, s=12) # Step 2: State our null hypothesis # # H0: There is no difference in means between the two populations # Step 3: Test the null hypothesis t.test(x1,x2) # Note: If we knew that the variance of the two populations was equal, # then we could have used: t.test(x1,x2, var.equal=TRUE) # Since p > 0.05, we fail to reject the null hypothesis and # conclude that the means of the two populations are # reasonably close to each other (at the alpha=0.05 level). ################################################## ### Extensions: # Recall that for these tests, we assume that the data are # Normally distributed. What if they are not? What if they # are distributed according to, say, an Exponential distribution? # Let us see: # Set the seed set.seed(888) # Step 1: Create some data x1 <- rexp(n=10, rate=1) x2 <- rexp(n=10, rate=1) # Step 2: State our null hypothesis # # H0: There is no difference in means between the two populations # Step 3: Test the null hypothesis t.test(x1,x2) # Note: We *know* that the means of the two populations are equal, # since we designed them that way. Thus, we would expect (hope) # to fail to reject the null hypothesis and conclude that it is # reasonable for the two population means to be equal. Is this # what happened?