##### SCA-41 #####
##### Chi-Square Test of Independence #####

### This gives a few examples of the analysis process for testing
### for independence between two categorical variables


### Preamble
source("http://rfs.kvasaheim.com/stat200.R")


### Part 0: Basics

# Is there a relationship between major type and gender?
#
# In tabular form, the data are
#
#   Type   | MNS | HSS | HUM | ART |
#   Female |  12 |  16 |  19 |   9 |
#   Male   |   8 |   9 |  13 |   9 |

# Create the matrix (one row per gender, filled row by row):
data = matrix( c(12,16,19,9, 8,9,13,9), ncol=4, byrow=TRUE)

# Perform the test:
chisq.test(data)

# We did not detect a relationship between gender and major type
# (p-value = 0.8325). We cannot conclude that the major distribution
# differs for the genders.


### Part I: Hair and Eyes

# Is there a relationship between natural hair color
# and natural eye color?

dt = read.csv("http://rfs.kvasaheim.com/data/hairandeyecolor.csv")
attach(dt)

summary(dt)

# Cross-tabulate the two variables, then test the table
table(eye,hair)
chisq.test( table(eye,hair) )

# We did not detect a relationship between hair color and eye
# color (p-value = 0.2926).

# Graphic
assocplot( table(eye,hair) )
mosaicplot( table(eye,hair) )

detach(dt)


### Part II: Kafka and the Cats

# Can we find some factors that may be related to a person's
# choice of favorite Murakami book?

# Data exploration
dt = read.csv("http://rfs.kvasaheim.com/data/murakami.csv")
attach(dt)

summary(dt)

## gender?
table(gender,book)
chisq.test( table(gender,book) )

# Yep. It appears as though males prefer 1Q84 to the other
# books and females prefer Kafka on the Shore.

## eye color?
table(eyecolor,book)
chisq.test( table(eyecolor,book) )

# Yep. It appears as though eye color and book are dependent. Brown-
# eyed people do not like the Wind Up Bird Chronicle, but blue-eyed
# people really seem to like it.

## nationality?
table(nationality,book)
chisq.test( table(nationality,book) )

# Yep. It appears as though nationality and book are dependent. Czechs
# seem to really like 1Q84, and so do the Japanese. Americans do not
# seem to like that book.

## vote?
table(vote,book)
chisq.test( table(vote,book) )

# Nope. It does not appear as though political vote has anything to do
# with which book is most liked.


### NOTE

# The previous "analysis" is frequently done by business planners. The
# correct next step is to gather additional data and test the hypotheses
# constructed above. The usual next step, however, is to treat these
# provisional conclusions as being Truth.
#
# We were just exploring the data. At no point did we conclude anything
# definitively.

detach(dt)
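

### Sketch: What chisq.test() computes in Part 0

# A minimal sketch, not part of the original analysis, of the arithmetic
# behind the Part 0 test. It assumes the usual Pearson statistic with no
# continuity correction (R applies that correction only to 2x2 tables).
# The names obs, expected, stat, dof, and pval are illustrative choices,
# not objects defined elsewhere in this script.

# Observed counts from the Part 0 table
obs = matrix( c(12,16,19,9, 8,9,13,9), ncol=4, byrow=TRUE,
              dimnames=list( c("Female","Male"), c("MNS","HSS","HUM","ART") ) )

# Expected counts under independence:
#   (row total) x (column total) / (grand total)
expected = outer( rowSums(obs), colSums(obs) ) / sum(obs)

# Pearson statistic: sum over all cells of (observed - expected)^2 / expected
stat = sum( (obs - expected)^2 / expected )

# Degrees of freedom: (rows - 1) x (columns - 1)
dof = ( nrow(obs) - 1 ) * ( ncol(obs) - 1 )

# The p-value is the upper-tail area of the chi-square distribution
pval = pchisq( stat, df=dof, lower.tail=FALSE )

# These should agree with what chisq.test(obs) reports
c( statistic=stat, df=dof, p.value=pval )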