##### SCA-22
##### 
##### Two-Sample Proportions Tests
##### 

### This gives a few examples of the analysis process for comparing
### two population proportions.


### Preamble

# Import extra functionality
source("http://rfs.kvasaheim.com/stat200.R")


### Part I: Hats
#
#   I would like to know if males tend to wear hats more often
#   (GREATER) than females.
#
#   To determine this, I measure the number of males wearing
#   hats (and the number of males in my sample) and the number
#   of females wearing hats (and the number of females in my
#   sample).
#
#   Of the 57 males I measured, 15 wore hats. Of the 92 females
#   I measured, 18 wore hats.

prop.test(x=c(15,18), n=c(57,92), alternative="greater")

#   n.b. Here, we are not relying on the alphabetical ordering of
#        group names; the order of the entries in x and n states
#        explicitly which group comes first. Thus, the alternative
#        hypothesis here is p_male > p_female.
#
#   According to the proportions test, we were not able to conclude
#   males wear hats more frequently than females (p-value = 0.2232).
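#
#   A hand check of the p-value above. This is only a sketch: it
#   assumes prop.test() used its default continuity correction
#   (correct=TRUE) and a pooled standard error under the null
#   hypothesis. The names x.hat, n.hat, z.hat, and p.hat are
#   illustrative and are not part of stat200.R.

x.hat <- c(15,18)                           # hat wearers: males, females
n.hat <- c(57,92)                           # sample sizes: males, females
pooled.hat <- sum(x.hat)/sum(n.hat)         # pooled proportion under H0
se.hat <- sqrt( pooled.hat*(1-pooled.hat)*sum(1/n.hat) )
yates.hat <- 0.5*sum(1/n.hat)               # continuity correction
z.hat <- ( x.hat[1]/n.hat[1] - x.hat[2]/n.hat[2] - yates.hat )/se.hat
p.hat <- pnorm(z.hat, lower.tail=FALSE)     # upper tail for "greater"
p.hat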

binom.plot( x=c(15,18), n=c(57,92), names=c("Male","Female"),
	ylab="Proportion Wearing Hats")


### Part II: Politics
#
#   I would like to know if males tend to be Republican more often
#   (GREATER) than females.
#
#   To test this, I ask n=1056 males and n=1056 females who
#   identified as either Democrat or Republican their political
#   party affiliation. Of that sample, a total of 656 males and
#   563 females identified as Republican.

prop.test(x=c(656,563), n=c(1056,1056), alternative="greater")

#   According to the proportions test, we are able to conclude
#   males tend to be Republican more frequently than females
#   (p-value < 0.0001).
#
#   In fact, we are 95% confident that the gender gap is greater
#   than 5.19 percentage points.
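#
#   Where the lower bound quoted above comes from. This is only a
#   sketch: it assumes the default continuity correction and the
#   unpooled standard error that prop.test() uses for its interval.
#   The names x.rep, n.rep, and lower.rep are illustrative.

x.rep <- c(656,563)                         # Republicans: males, females
n.rep <- c(1056,1056)                       # sample sizes: males, females
p.rep <- x.rep/n.rep                        # sample proportions
se.rep <- sqrt( sum( p.rep*(1-p.rep)/n.rep ) )
lower.rep <- (p.rep[1]-p.rep[2]) - qnorm(0.95)*se.rep - 0.5*sum(1/n.rep)
lower.rep                                   # one-sided 95% lower bound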

# The counts must follow the same order as the names: males first
binom.plot( x=c(656,563), n=c(1056,1056), names=c("Male","Female"),
	ylab="Proportion Identifying as Republican")


# SOURCE: http://news.gallup.com/poll/120839/Women-Likely-Democrats-Regardless-Age.aspx


### Part III: Congressional Districts
#
#   I would like to know if the Galesburg Congressional district (IL-17)
#   tends to be more Democratic than the Stillwater, OK, Congressional
#   District (OK-3).
#
#   To test this, I randomly call 2000 people in IL-17 (less expensive)
#   and 500 people in OK-3. Here are the results:
#
#   District:    IL-17 | OK-3 
#   Democrats:    1030 | 185
#   Sample Size:  2000 | 500

prop.test(x=c(1030,185), n=c(2000,500), alternative="greater")

#   Conclusion: Because the p-value is approximately 0, we reject
#   the null hypothesis. We conclude that IL-17 is significantly
#   more Democratic than OK-3. The sample difference is 14.5
#   percentage points (51.5% vs. 37.0%), and we are 95% confident
#   that IL-17 is at least 10.4 percentage points more Democratic
#   than OK-3.
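#
#   The quantities used in a conclusion can also be pulled from the
#   object that prop.test() returns instead of being read off the
#   printout. A sketch; the name res.dist is illustrative.

res.dist <- prop.test(x=c(1030,185), n=c(2000,500), alternative="greater")
res.dist$estimate        # sample proportions: IL-17, then OK-3
res.dist$p.value         # one-sided p-value
res.dist$conf.int[1]     # one-sided 95% lower bound on the difference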

binom.plot( x=c(1030,185), n=c(2000,500), names=c("IL-17","OK-3"),
	xlab="Congressional District", ylab="Democratic Support",
	ylim=c(0.3,0.6), boxcol="lightblue")

# Information: https://en.wikipedia.org/wiki/Cook_Partisan_Voting_Index