Þ Example – 3 : Two sample t test
o Aim: Smoking is thought to have an adverse effect on human reproduction. Food and cigarette smoke are the largest potential sources of cadmium exposure for members of the general population. Is cadmium transferred to humans via smoking?
o Data: collected from 14 mothers who are smokers and 19 who are non-smokers.
o Question: Does it appear that the mean cadmium level is higher among smokers than non-smokers?
o R commands:
> nons = c(10,8,4,12.8,25,11.8,9.8,12.5,15.4,23.5,9.4,25.1,19.5,25.5,9.8,7.5,11.8,12.2,15)
> s = c(30,30.1,15,24.1,30.5,17.8,16.8,14.8,13.4,28.5,17.5,14.4,12.5,20.4)
> t.test(nons,s,alt="less",var.equal=T)
Two Sample t-test
data: nons and s
t = -2.6805, df = 31, p-value = 0.005834
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -2.306676
sample estimates:
mean of x mean of y
14.13684 20.41429
o Interpretation? Since the p-value (0.0058) is smaller than 0.05, H0 is rejected: the mean cadmium level appears to be higher among smokers than among non-smokers.
o Checking assumption #1 (Normality): QQ plot and Shapiro-Wilk test
Ø qq plot
> par(mfrow=c(2,2))
> qqnorm(nons,lwd=2)
> qqline(nons,lwd=2)
> qqnorm(s,lwd=2)
> qqline(s,lwd=2) # try lwd with different values
> boxplot(nons,col=gray(0.7),lwd=2)
> boxplot(s,col=gray(0.7),lwd=2) #try gray with different values
Ø Clearly not Normal: the points in the QQ plots lie far from the straight lines, and the median lines in the boxplots are not centered within the boxes.
Ø Shapiro-Wilk test: the graphical methods already suggest that the distributions are not Normal; also apply a formal testing procedure:
> shapiro.test(nons)
Shapiro-Wilk normality test
data: nons
W = 0.8903, p-value = 0.03265 #normality is rejected at 5% level
> shapiro.test(s)
Shapiro-Wilk normality test
data: s
W = 0.8533, p-value = 0.02467 #normality is rejected at 5% level
o Checking assumption #2 (Homogeneity of Variances):
Ø Box plot: see the boxplots above. The spreads of the two samples look broadly similar.
Ø Snedecor’s F-test
> var.test(nons,s)
F test to compare two variances
data: nons and s
F = 0.9175, num df = 18, denom df = 13, p-value = 0.8472
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.307567 2.505068
sample estimates:
ratio of variances
0.9175458
Ø Variances of the two samples can be considered equal (p-value = 0.8472 > 0.05, so equality of variances is not rejected)
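As a sanity check, the F statistic reported by var.test is simply the ratio of the two sample variances, which can be reproduced directly (a sketch; the data vectors from above are repeated so the snippet is self-contained):

```r
# Snedecor's F statistic from var.test(x, y) is var(x) / var(y)
nons = c(10,8,4,12.8,25,11.8,9.8,12.5,15.4,23.5,9.4,25.1,19.5,25.5,9.8,7.5,11.8,12.2,15)
s = c(30,30.1,15,24.1,30.5,17.8,16.8,14.8,13.4,28.5,17.5,14.4,12.5,20.4)
var(nons) / var(s)   # matches the "ratio of variances" in the output above
```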
Þ Example – 4 : Refer to the t test in Example 3. Instead of printing the whole output, print only the p-value, or only the value of the test statistic.
> a = t.test(nons,s,alt="less",var.equal=T) # a is a list containing the
# components below
> a$statistic
t
-2.68047
> a$p.value
[1] 0.005834458
> a$conf.int
[1] -Inf -2.306676
attr(,"conf.level")
[1] 0.95
> a$method
[1] " Two Sample t-test"
> a$data.name
[1] "nons and s"
Refer to the R help for t.test: all the components of the output list are described under the 'Value' section.
Alternatively, use the function names to see the components:
> names(a)
[1] "statistic" "parameter" "p.value" "conf.int" "estimate"
[6] "null.value" "alternative" "method" "data.name"
Þ Example – 5 : Two sample t test when the variances are unequal
o Research Hypothesis: exposure to GM (genetically modified) food increases kidney weight.
o Data: Kidney weight of a certain animal fed with GM corn vs. that of the ones fed with non-GM corn.
> nongm = c(0.3167,0.3610,0.3524,0.37,0.29,0.28,0.35) # non-GM corn group (in grams)
> gm = c(4.67,2.79,10.74,5.25,4.91,3.08,1.97) # GM corn group (in grams)
o Statistical Hypothesis: H0: μGM = μnGM , H1: μGM > μnGM
o Assumption checks
Normality check: Through Shapiro-Wilk test at 5% level of significance.
> shapiro.test(nongm)
Shapiro-Wilk normality test
data: nongm
W = 0.839, p-value = 0.2443 #assumption of normality is not rejected
> shapiro.test(gm)
Shapiro-Wilk normality test
data: gm
W = 0.8282, p-value = 0.0769 #assumption of normality is not rejected
Homogeneity of variances check:
> var.test(nongm,gm)
F test to compare two variances
data: nongm and gm
F = 2e-04, num df = 6, denom df = 6, p-value = 7.147e-11 #homogeneity of
# variances is rejected
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
2.627589e-05 8.899532e-04
sample estimates:
ratio of variances
0.0001529193
o Welch two sample t-test: the version of the two sample t test used when the variances are unequal.
> test = t.test(gm,nongm,alt="greater",var.equal=F)
> test
Welch Two Sample t-test
data: gm and nongm
t = 4.0494, df = 6.002, p-value = 0.003363
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
2.310244 Inf
sample estimates:
mean of x mean of y
4.7728571 0.3314429
Þ Example – 6 : Paired t test
o Aim: An automotive engineer compares two different types of metering devices for an electronic fuel injection system, to investigate whether they differ in fuel mileage performance.
o Data: The system is installed on 12 different cars, and the test is run with each metering device on each car: paired data.
> x = c(17.6,19.4,19.5,17.1,15.3,15.9,16.3,18.4,17.3,19.1,17.8,18.2)
> y = c(16.8,20.0,18.2,16.4,16,15.4,16.5,18,16.4,20.1,16.7,17.9)
> d = x-y
o Assumption checks for Normality of the difference (d)
> shapiro.test(d)
Shapiro-Wilk normality test
data: d
W = 0.9342, p-value = 0.4265
o R commands for a paired t test
> t.test(d,mu=0)
One Sample t-test
data: d
t = 1.3448, df = 11, p-value = 0.2058
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.1856942 0.7690275
sample estimates:
mean of x
0.2916667
> t.test(x,y,paired=T)
Paired t-test
data: x and y
t = 1.3448, df = 11, p-value = 0.2058
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.1856942 0.7690275
sample estimates:
mean of the differences
0.2916667
Þ Example – 7: One way Analysis of Variance
o Aim: 3 new methods for teaching tennis serve; A, B, C. C being the currently most applied method.
* From Wikipedia: “An ace occurs when a served ball lands on the other side of the court, within the service box, and the player on that side fails to even touch the ball with his racket.”
o Study: In total, 15 tennis players of similar ranking. 5 of them are trained with method A, 4 with method B, 6 with method C.
o Data:
Method    Observations (number of aces in the grand slams in one year)
A         13, 10, 8, 11, 8
B         13, 11, 14, 14
C         4, 1, 3, 4, 2, 4
o R commands to prepare the data for anova:
> a = c(13,10,8,11,8)
> b = c(13,11,14,14)
> c = c(4,1,3,4,2,4) # 'c' as a variable name works but is best avoided, since c() is a base function
> d = stack(list("a"=a,"b"=b,"c"=c))
> d
values ind
1 13 a
2 10 a
3 8 a
4 11 a
5 8 a
6 13 b
7 11 b
8 14 b
9 14 b
10 4 c
11 1 c
12 3 c
13 4 c
14 2 c
15 4 c
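The same long-format data frame can also be built without stack, using rep to create the group factor (an equivalent sketch, not the method used above):

```r
# Build the long-format data frame directly: values column plus a group factor
d2 = data.frame(values = c(13,10,8,11,8, 13,11,14,14, 4,1,3,4,2,4),
                ind = factor(rep(c("a","b","c"), times = c(5,4,6))))
# d2 has the same two columns (values, ind) as d produced by stack()
```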
o Assumption check
Normality of parent populations:
> shapiro.test(a)
Shapiro-Wilk normality test
data: a
W = 0.91, p-value = 0.4677
> shapiro.test(b)
Shapiro-Wilk normality test
data: b
W = 0.8274, p-value = 0.1612
> shapiro.test(c)
Shapiro-Wilk normality test
data: c
W = 0.8311, p-value = 0.1099
Homogeneity of variances: Bartlett’s test
> bartlett.test(values ~ ind, data =d)
Bartlett test of homogeneity of variances
data: values by ind
Bartlett's K-squared = 1.1858, df = 2, p-value = 0.5527 #homogeneous
# variances
o One way analysis of means:
> oneway.test(values~ind,data=d,var.equal=T)
One-way analysis of means
data: values and ind
F = 50.625, num df = 2, denom df = 12, p-value = 1.415e-06
Þ Example – 8: Multiple Testing
o After rejecting the null hypothesis in Example 7: which pairs are different?
o Test the difference between all possible pairs. Adjust the Type I error level for multiple testing.
o R command:
> pairwise.t.test(d$values,d$ind,p.adj="bonferroni") # takes vectors directly; no data argument
Pairwise comparisons using t tests with pooled SD
data: d$values and d$ind
a b
b 0.054 -
c 3.9e-05 1.9e-06
P value adjustment method: bonferroni
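What the Bonferroni adjustment does can be sketched by hand: each raw p-value is multiplied by the number of comparisons and capped at 1. The raw p-values below are illustrative placeholders, not the exact unadjusted values from the output above:

```r
raw = c(0.02, 1e-5, 1e-6)                   # hypothetical raw p-values for 3 pairs
adj = p.adjust(raw, method = "bonferroni")  # the adjustment pairwise.t.test applies
all(adj == pmin(raw * length(raw), 1))      # same as the manual multiply-and-cap rule
```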
Þ Example – 9: Validity of the model on which the ANOVA is based
o Every ANOVA is based on a model
o In Example 7, the model can be written in two ways:
§ either as a one factor main effect anova model as follows:
Yij = μ + τi + εij
where Yij is the jth observation in group i,
μ is the grand mean (the average without any group effect),
τi is the effect of the ith group, i.e. the difference between the grand
mean and the group mean after exposure to the ith method,
εij are the stochastic error terms with iid N(0,σ2) distribution
§ or as a linear regression
Yij = β0 + β1X1 + β2X2 + εij
where
β0 is the intercept
β1 is the difference between the mean of group B and that of group A (the baseline)
β2 is the difference between the mean of group C and that of group A
How are X1 and X2 defined here?
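One way to answer this is to inspect the design matrix R builds for the regression form of the model; with the default treatment coding, group a is the baseline (a sketch re-creating the data from Example 7):

```r
# Re-create the data and look at the design matrix of the regression form
a = c(13,10,8,11,8); b = c(13,11,14,14); cvec = c(4,1,3,4,2,4)
d = stack(list("a"=a,"b"=b,"c"=cvec))
model.matrix(values ~ ind, data = d)
# X1 is the column "indb": 1 for observations in group b, 0 otherwise
# X2 is the column "indc": 1 for observations in group c, 0 otherwise
```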
o For our specific data set, is it o.k. to assume that this model holds?
If this model does not hold, then it does not make sense to base our analysis on this model.
o How do we check if the model holds for our data set? i.e. how do we check if our data can be represented by this model?
o If the model is correct for our data set, then the observed data and the predictions based on the model should be similar. That is, the differences between the observed values and the values obtained from the model (the residuals) should be symmetric around zero and satisfy iid N(0,σ2).
o Residual analysis and model adequacy:
> fit = lm(values ~ ind, data=d)
> summary(fit)
Call:
lm(formula = values ~ ind, data = d)
Residuals:
Min 1Q Median 3Q Max
-2.000e+00 -1.500e+00 2.326e-16 1.000e+00 3.000e+00
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.0000 0.7303 13.693 1.10e-08 ***
indb 3.0000 1.0954 2.739 0.0180 *
indc -7.0000 0.9888 -7.079 1.28e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.633 on 12 degrees of freedom
Multiple R-squared: 0.894, Adjusted R-squared: 0.8764
F-statistic: 50.63 on 2 and 12 DF, p-value: 1.415e-06
> attributes(fit)
$names
[1] "coefficients" "residuals" "effects" "rank"
[5] "fitted.values" "assign" "qr" "df.residual"
[9] "contrasts" "xlevels" "call" "terms"
[13] "model"
$class
[1] "lm"
> fit$residual
1 2 3 4 5
3.000000e+00 2.326156e-16 -2.000000e+00 1.000000e+00 -2.000000e+00
6 7 8 9 10
1.057097e-17 -2.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00
11 12 13 14 15
-2.000000e+00 2.326156e-16 1.000000e+00 -1.000000e+00 1.000000e+00
> mean(fit$residual)
[1] -4.971428e-17
o The median and mean of the residuals being close to each other and close to 0 suggests a distribution centered around 0.
o R-squared values close to 1 indicate an adequate fit.
o More diagnosis:
> par(mfrow=c(2,2))
> plot(fit)
# automatically gives the following 4 plots simultaneously
Ø Plot 1: Residuals vs. Fitted
o Mean residual is 0 for all the groups.
o The variances of the residuals do not look very different across groups (except perhaps the one high point in the second group).
Ø Plot 2: Normal q-q plot
o Hard to comment on Normality graphically based on such a small number of points. Let’s conduct a formal test:
> shapiro.test(fit$residual)
Shapiro-Wilk normality test
data: fit$residual
W = 0.8714, p-value = 0.03543
o Normality of the residuals is rejected at the 5% significance level (p-value = 0.035 < 0.05). The model is therefore questionable, and so is the validity of the ANOVA F test.
Ø Plot 3: Scale-Location
o Similar to Plot 1.
Ø Plot 4: Residuals vs. Leverage; used to detect influential observations.
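Plot 4 is built from leverage and Cook's distance; the distances can also be inspected numerically (a sketch re-fitting the model from above; the 4/n cutoff is one common rule of thumb, not a strict test):

```r
# Re-fit the one-way model and compute Cook's distance per observation
a = c(13,10,8,11,8); b = c(13,11,14,14); cvec = c(4,1,3,4,2,4)
d = stack(list("a"=a,"b"=b,"c"=cvec))
fit = lm(values ~ ind, data = d)
cd = cooks.distance(fit)
which(cd > 4 / length(cd))   # observations flagged as potentially influential
```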
O.Ilk Dag, STAT 291 lecture notes, 2013, 30 October