Þ Example – 3 : Two sample t test
o Aim: Smoking is thought to have an adverse effect on human reproduction. Food and cigarette smoke are the largest potential sources of cadmium exposure for members of the general population. Is cadmium transferred to humans via smoking?
o Data: collected from 14 mothers who are smokers and 19 who are non-smokers.
o Question: Does it appear that the mean cadmium level is higher among smokers than non-smokers?
o R commands:
> nons = c(10,8,4,12.8,25,11.8,9.8,12.5,15.4,23.5,9.4,25.1,19.5,25.5,9.8,7.5,11.8,12.2,15)
> s = c(30,30.1,15,24.1,30.5,17.8,16.8,14.8,13.4,28.5,17.5,14.4,12.5,20.4)
> t.test(nons,s,alt="less",var.equal=T)
Two Sample t-test
data: nons and s
t = -2.6805, df = 31, p-value = 0.005834
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -2.306676
sample estimates:
mean of x mean of y
14.13684 20.41429
o Interpretation? Since the p-value (0.0058) is smaller than 0.05, H0 is rejected: the mean cadmium level appears to be higher among smokers than among non-smokers.
o Checking assumption #1 (Normality): QQ plot and Shapiro-Wilk test
Ø qq plot
> par(mfrow=c(2,2))
> qqnorm(nons,lwd=2)
> qqline(nons,lwd=2)
> qqnorm(s,lwd=2)
> qqline(s,lwd=2) # try lwd with different values
> boxplot(nons,col=gray(0.7),lwd=2)
> boxplot(s,col=gray(0.7),lwd=2) #try gray with different values
Ø Clearly not Normal: the points in the QQ plots lie far from the straight lines, and the median lines in the boxplots are not centered within the boxes.
Ø Shapiro-Wilk test: the graphical methods already suggest that the distributions are not Normal; also apply a formal testing procedure:
> shapiro.test(nons)
Shapiro-Wilk normality test
data: nons
W = 0.8903, p-value = 0.03265 #normality is rejected at 5% level
> shapiro.test(s)
Shapiro-Wilk normality test
data: s
W = 0.8533, p-value = 0.02467 #normality is rejected at 5% level
o Checking assumption #2 (Homogeneity of Variances):
Ø Box plot: see the boxplots above. The spreads of the two samples look broadly similar.
Ø Snedecor’s F-test
> var.test(nons,s)
F test to compare two variances
data: nons and s
F = 0.9175, num df = 18, denom df = 13, p-value = 0.8472
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.307567 2.505068
sample estimates:
ratio of variances
0.9175458
Ø Variances of the two samples can be considered equal (p-value = 0.8472 > 0.05, so equality of variances is not rejected)
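As a sanity check, the F statistic reported by var.test is simply the ratio of the two sample variances, which can be reproduced directly (a sketch; the data vectors from above are repeated so the snippet is self-contained):

```r
# Snedecor's F statistic from var.test(x, y) is var(x) / var(y)
nons = c(10,8,4,12.8,25,11.8,9.8,12.5,15.4,23.5,9.4,25.1,19.5,25.5,9.8,7.5,11.8,12.2,15)
s = c(30,30.1,15,24.1,30.5,17.8,16.8,14.8,13.4,28.5,17.5,14.4,12.5,20.4)
var(nons) / var(s)   # matches the "ratio of variances" in the output above
```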
Þ Example – 4 : Refer to the t test in Example 3. Instead of printing the whole output, print only the p-value, or only the value of the test statistic.
> a = t.test(nons,s,alt="less",var.equal=T) # a is a list containing the
# components below
> a$statistic
t
-2.68047
> a$p.value
[1] 0.005834458
> a$conf.int
[1] -Inf -2.306676
attr(,"conf.level")
[1] 0.95
> a$method
[1] " Two Sample t-test"
> a$data.name
[1] "nons and s"
Refer to the R help for t.test: all the components of the output list are described under the 'Value' section.
Alternatively, use the function names to see the components:
> names(a)
[1] "statistic" "parameter" "p.value" "conf.int" "estimate"
[6] "null.value" "alternative" "method" "data.name"
Þ Example – 5 : Two sample t test when the variances are unequal
o Research Hypothesis: exposure to GM (genetically modified) food increases kidney weight.
o Data: Kidney weight of a certain animal fed with GM corn vs. that of the ones fed with non-GM corn.
> nongm = c(0.3167,0.3610,0.3524,0.37,0.29,0.28,0.35) # non-GM corn group (in grams)
> gm = c(4.67,2.79,10.74,5.25,4.91,3.08,1.97) # GM corn group (in grams)
o Statistical Hypothesis: H0: μGM = μnGM , H1: μGM > μnGM
o Assumption checks
Normality check: Through Shapiro-Wilk test at 5% level of significance.
> shapiro.test(nongm)
Shapiro-Wilk normality test
data: nongm
W = 0.839, p-value = 0.2443 #assumption of normality is not rejected
> shapiro.test(gm)
Shapiro-Wilk normality test
data: gm
W = 0.8282, p-value = 0.0769 #assumption of normality is not rejected
Homogeneity of variances check:
> var.test(nongm,gm)
F test to compare two variances
data: nongm and gm
F = 2e-04, num df = 6, denom df = 6, p-value = 7.147e-11 #homogeneity of
# variances is rejected
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
2.627589e-05 8.899532e-04
sample estimates:
ratio of variances
0.0001529193
o Welch two sample t-test: the version of the two sample t test used when the variances are unequal.
> test = t.test(gm,nongm,alt="greater",var.equal=F)
> test
Welch Two Sample t-test
data: gm and nongm
t = 4.0494, df = 6.002, p-value = 0.003363
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
2.310244 Inf
sample estimates:
mean of x mean of y
4.7728571 0.3314429
Þ Example – 6 : Paired t test
o Aim: An automotive engineer compares two different types of metering devices for an electronic fuel injection system, to investigate whether they differ in fuel mileage performance.
o Data: The system is installed on 12 different cars, and the test is run with each metering device on each car: paired data.
> x = c(17.6,19.4,19.5,17.1,15.3,15.9,16.3,18.4,17.3,19.1,17.8,18.2)
> y = c(16.8,20.0,18.2,16.4,16,15.4,16.5,18,16.4,20.1,16.7,17.9)
> d = x-y
o Assumption checks for Normality of the difference (d)
> shapiro.test(d)
Shapiro-Wilk normality test
data: d
W = 0.9342, p-value = 0.4265
o R commands for a paired t test
> t.test(d,mu=0)
One Sample t-test
data: d
t = 1.3448, df = 11, p-value = 0.2058
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.1856942 0.7690275
sample estimates:
mean of x
0.2916667
> t.test(x,y,paired=T)
Paired t-test
data: x and y
t = 1.3448, df = 11, p-value = 0.2058
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.1856942 0.7690275
sample estimates:
mean of the differences
0.2916667
Þ Example – 7: One way Analysis of Variance
o Aim: 3 new methods for teaching tennis serve; A, B, C. C being the currently most applied method.
* From Wikipedia: “An ace occurs when a served ball lands on the other side of the court, within the service box, and the player on that side fails to even touch the ball with his racket.”
o Study: In total, 15 tennis players of similar ranking. 5 of them are trained with method A, 4 with method B, 6 with method C.
o Data:
Method    Observations (number of aces in the grand slams in one year)
A         13, 10, 8, 11, 8
B         13, 11, 14, 14
C         4, 1, 3, 4, 2, 4
o R commands to prepare the data for anova:
> a = c(13,10,8,11,8)
> b = c(13,11,14,14)
> c = c(4,1,3,4,2,4) # 'c' as a variable name works but is best avoided, since c() is a base function
> d = stack(list("a"=a,"b"=b,"c"=c))
> d
values ind
1 13 a
2 10 a
3 8 a
4 11 a
5 8 a
6 13 b
7 11 b
8 14 b
9 14 b
10 4 c
11 1 c
12 3 c
13 4 c
14 2 c
15 4 c
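The same long-format data frame can also be built without stack, using rep to create the group factor (an equivalent sketch, not the method used above):

```r
# Build the long-format data frame directly: values column plus a group factor
d2 = data.frame(values = c(13,10,8,11,8, 13,11,14,14, 4,1,3,4,2,4),
                ind = factor(rep(c("a","b","c"), times = c(5,4,6))))
# d2 has the same two columns (values, ind) as d produced by stack()
```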
o Assumption check
Normality of parent populations:
> shapiro.test(a)
Shapiro-Wilk normality test
data: a
W = 0.91, p-value = 0.4677
> shapiro.test(b)
Shapiro-Wilk normality test
data: b
W = 0.8274, p-value = 0.1612
> shapiro.test(c)
Shapiro-Wilk normality test
data: c
W = 0.8311, p-value = 0.1099
Homogeneity of variances: Bartlett’s test
> bartlett.test(values ~ ind, data =d)
Bartlett test of homogeneity of variances
data: values by ind
Bartlett's K-squared = 1.1858, df = 2, p-value = 0.5527 #homogeneous
# variances
o One way analysis of means:
> oneway.test(values~ind,data=d,var.equal=T)
One-way analysis of means
data: values and ind
F = 50.625, num df = 2, denom df = 12, p-value = 1.415e-06
Þ Example – 8: Multiple Testing
o After rejecting the null hypothesis in Example 7: which pairs are different?
o Test the difference between all possible pairs. Adjust the Type I error level for multiple testing.
o R command:
> pairwise.t.test(d$values,d$ind,p.adj="bonferroni") # takes vectors directly; no data argument
Pairwise comparisons using t tests with pooled SD
data: d$values and d$ind
a b
b 0.054 -
c 3.9e-05 1.9e-06
P value adjustment method: bonferroni
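What the Bonferroni adjustment does can be sketched by hand: each raw p-value is multiplied by the number of comparisons and capped at 1. The raw p-values below are illustrative placeholders, not the exact unadjusted values from the output above:

```r
raw = c(0.02, 1e-5, 1e-6)                   # hypothetical raw p-values for 3 pairs
adj = p.adjust(raw, method = "bonferroni")  # the adjustment pairwise.t.test applies
all(adj == pmin(raw * length(raw), 1))      # same as the manual multiply-and-cap rule
```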
Þ Example – 9: Validity of the model on which the ANOVA is based
o Every ANOVA is based on a model
o In Example 7, the model can be written in two ways:
§ either as a one factor main effect anova model as follows:
Yij = μ + τi + εij
where Yij is the jth observation in group i,
μ is the grand mean (the average without any group effect),
τi is the effect of the ith group, i.e. the difference between the grand
mean and the group mean after exposure to the ith method,
εij are the stochastic error terms with iid N(0,σ2) distribution
§ or as a linear regression
Yij = β0 + β1X1 + β2X2 + εij
where
β0 is the intercept
β1 is the difference between the mean of group B and that of group A (the baseline)
β2 is the difference between the mean of group C and that of group A
How are X1 and X2 defined here?
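One way to answer this is to inspect the design matrix R builds for the regression form of the model; with the default treatment coding, group a is the baseline (a sketch re-creating the data from Example 7):

```r
# Re-create the data and look at the design matrix of the regression form
a = c(13,10,8,11,8); b = c(13,11,14,14); cvec = c(4,1,3,4,2,4)
d = stack(list("a"=a,"b"=b,"c"=cvec))
model.matrix(values ~ ind, data = d)
# X1 is the column "indb": 1 for observations in group b, 0 otherwise
# X2 is the column "indc": 1 for observations in group c, 0 otherwise
```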
o For our specific data set, is it o.k. to assume that this model holds?
If this model does not hold, then it does not make sense to base our analysis on this model.
o How do we check if the model holds for our data set? i.e. how do we check if our data can be represented by this model?
o If the model is correct for our data set, then the observed data and the predictions based on the model should be similar. That is, the differences between the observed values and the values obtained from the model (the residuals) should be symmetric around zero and satisfy iid N(0,σ2).
o Residual analysis and model adequacy:
> fit = lm(values ~ ind, data=d)
> summary(fit)
Call:
lm(formula = values ~ ind, data = d)
Residuals:
Min 1Q Median 3Q Max
-2.000e+00 -1.500e+00 2.326e-16 1.000e+00 3.000e+00
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.0000 0.7303 13.693 1.10e-08 ***
indb 3.0000 1.0954 2.739 0.0180 *
indc -7.0000 0.9888 -7.079 1.28e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.633 on 12 degrees of freedom
Multiple R-squared: 0.894, Adjusted R-squared: 0.8764
F-statistic: 50.63 on 2 and 12 DF, p-value: 1.415e-06
> attributes(fit)
$names
[1] "coefficients" "residuals" "effects" "rank"
[5] "fitted.values" "assign" "qr" "df.residual"
[9] "contrasts" "xlevels" "call" "terms"
[13] "model"
$class
[1] "lm"
> fit$residual
1 2 3 4 5
3.000000e+00 2.326156e-16 -2.000000e+00 1.000000e+00 -2.000000e+00
6 7 8 9 10
1.057097e-17 -2.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00
11 12 13 14 15
-2.000000e+00 2.326156e-16 1.000000e+00 -1.000000e+00 1.000000e+00
> mean(fit$residual)
[1] -4.971428e-17
o The median and mean of the residuals being close to each other and close to 0 suggests a distribution centered around 0.
o R-squared values close to 1 indicate an adequate fit.
o More diagnosis:
> par(mfrow=c(2,2))
> plot(fit)
# automatically gives the following 4 plots simultaneously
Ø Plot 1: Residuals vs. Fitted
o Mean residual is 0 for all the groups.
o The variances of the residuals do not look very different across groups (except perhaps the one high point in the second group).
Ø Plot 2: Normal q-q plot
o Hard to comment on Normality graphically based on such a small number of points. Let’s conduct a formal test:
> shapiro.test(fit$residual)
Shapiro-Wilk normality test
data: fit$residual
W = 0.8714, p-value = 0.03543
o Normality of the residuals is rejected at the 5% significance level (p-value = 0.035 < 0.05). The model is therefore questionable, and so is the validity of the ANOVA F test.
Ø Plot 3: Scale-Location
o Similar to Plot 1.
Ø Plot 4: Residuals vs. Leverage; used to detect influential observations.
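Plot 4 is built from leverage and Cook's distance; the distances can also be inspected numerically (a sketch re-fitting the model from above; the 4/n cutoff is one common rule of thumb, not a strict test):

```r
# Re-fit the one-way model and compute Cook's distance per observation
a = c(13,10,8,11,8); b = c(13,11,14,14); cvec = c(4,1,3,4,2,4)
d = stack(list("a"=a,"b"=b,"c"=cvec))
fit = lm(values ~ ind, data = d)
cd = cooks.distance(fit)
which(cd > 4 / length(cd))   # observations flagged as potentially influential
```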
O.Ilk Dag, STAT 291 lecture notes, 2013, 30 October