##### SCA-23
##### 
##### Two-Sample Variance Tests
##### 

### This script gives a few examples of the analysis process for comparing
### two population variances.
###
### In statistics, these procedures tend to be used to test whether the
### variances of two populations are equal. If they are, then one could use
### the pooled-variance version of the t-test, which is (marginally) more
### powerful.
###
### In financial mathematics, these procedures are used to compare
### investment risk.
###


### Preamble

# Import extra functionality
source("http://rfs.kvasaheim.com/stat200.R")


### Part I: IBM vs. Microsoft
#
#   In SCA-13, we examined both IBM and Microsoft individually. Here,
#   let us compare their risks.
#
#   n.b.: Comparing two variances is a procedure about their ratio.
#

IBM = c(152.64, 153.45, 149.00, 153.94, 152.14, 153.84, 152.69, 
	158.07, 160.91, 148.79)
MSFT = c(89.39, 88.52, 92.33, 90.77, 91.86, 93.08, 96.07, 96.11, 95.35, 92.31)

shapiroTest(IBM)
shapiroTest(MSFT)

#   Fisher's F-test requires that the two populations be Normal.
#   Because the p-value for each is greater than our usual alpha = 0.05,
#   the Shapiro-Wilk test gives no evidence that either population
#   violates this requirement.
#

var.test(IBM,MSFT)

#   According to Fisher's F-test, we do not have sufficient evidence
#   that the risks (variances) differ (p-value = 0.3423).
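#
#   n.b.: The F statistic reported by var.test is just the ratio of the
#         two sample variances, which is why this is a procedure about
#         a ratio. What follows is a minimal sketch of the same
#         calculation by hand. The name fByHand is introduced here for
#         illustration only; it is not part of stat200.R.
#

fByHand = function(x, y) {
  # Test statistic: ratio of the two sample variances
  fStat = var(x) / var(y)
  # Degrees of freedom: numerator first, then denominator
  df1 = length(x) - 1
  df2 = length(y) - 1
  # Two-sided p-value: twice the smaller of the two tail areas
  lowerTail = pf(fStat, df1, df2)
  pValue = 2 * min(lowerTail, 1 - lowerTail)
  c(F = fStat, df1 = df1, df2 = df2, p.value = pValue)
}

#   With the data above, fByHand(IBM, MSFT) should agree with the
#   statistic and p-value from var.test(IBM, MSFT).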

boxplot(IBM,MSFT)


### Part II: OPEC vs. Wealth
#
#   I would like to estimate the average wealth in OPEC states and
#   compare it to that of non-OPEC states. As a part of this
#   analysis, I should also determine if their variabilities 
#   (variances) differ.
#

dt = read.csv("http://rfs.kvasaheim.com/data/gdp.csv")
attach(dt)

#
#   To do this, we would like to use Fisher's F-test, as it is the
#   most powerful of the variance tests available to us. 
#
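#   n.b.: As in Part I, Fisher's F-test assumes that both populations
#         are Normal. What follows is a minimal sketch of how that
#         check could be run within each group. It uses base R's
#         shapiro.test rather than shapiroTest from stat200.R, and the
#         name normalityByGroup is introduced here for illustration.
#

normalityByGroup = function(y, g) {
  # Split the measurements by group, then run Shapiro-Wilk on each piece
  groups = split(y, g)
  # Return one p-value per group, named by group level
  sapply(groups, function(v) shapiro.test(v)$p.value)
}

#   With the data attached above, this could be called as
#   normalityByGroup(gdpcap, OPEC). A p-value below alpha in either
#   group would suggest that the Normality requirement is in doubt.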

var.test(gdpcap~OPEC)

#   According to Fisher's F-test, there is a significant difference
#   between the two variances. In fact, we are 95% confident that the
#   variance in wealth among OPEC member states is between 2.7 and
#   16.5 times that among non-members.
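#
#   n.b.: The confidence interval above is an interval for the ratio of
#         the two variances. What follows is a minimal sketch of how
#         such an interval can be computed by hand from the F
#         distribution. The name varRatioCI is introduced here for
#         illustration only.
#

varRatioCI = function(x, y, conf = 0.95) {
  # Drop missing values so that var() does not return NA
  x = x[!is.na(x)]
  y = y[!is.na(y)]
  alpha = 1 - conf
  # Point estimate: variance of x over variance of y
  fStat = var(x) / var(y)
  df1 = length(x) - 1
  df2 = length(y) - 1
  # Dividing by the upper F quantile gives the lower bound, and vice versa
  c(lower = fStat / qf(1 - alpha / 2, df1, df2),
    upper = fStat / qf(alpha / 2, df1, df2))
}

#   With the data attached above, one could run
#       g = split(gdpcap, OPEC)
#       varRatioCI(g[[1]], g[[2]])
#   Which group ends up in the numerator depends on the order of the
#   levels of OPEC, so check names(g) before interpreting the ratio.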

detach(dt)


### Part III: To Nationalize or Not? 
#
#   I would like to determine if the variability in the quality
#   of the government (FSI) depends on whether the government has
#   nationalized its petroleum industry.
#
#   n.b.: The Failed States Index (FSI) is a measure of how well the 
#         government meets its obligations to its citizens. 
#

dt = read.csv("http://rfs.kvasaheim.com/data/clf.csv")
attach(dt)
names(dt)

var.test(fsi~nationalized)

boxplot(fsi~nationalized)

#   Because the p-value is greater than our usual 0.05, we fail
#   to reject the null hypothesis that the variance of FSI is the
#   same for nationalized states as for non-nationalized states.

detach(dt)