252solnF2 11/06/03 (Open this document in 'Page Layout' view!)
F. ANALYSIS OF VARIANCE
1. 1-Way Analysis of Variance
Text 11.1-11.6, 11.7**, 11.8 [11.1-11.7, 11.8*] (11.1-11.7, 11.8*) (Same problem, different numbers – both answers will be posted)
2. 2-Way Analysis of Variance
Text 11.15-11.18, 11.23, 11.29-11.32, 11.36 [11.15-11.18, 11.23, 11.28-11.30, 11.34] (11.15-11.18, 11.23, 11.28-11.30, 11.34), F1, F2, F4
3. More than 2-Way Analysis of Variance
F3
4. Kruskal-Wallis Test
Text 12.86-12.87, 12.89 [11.39-11.40, 11.42] (11.39-11.40, 11.42), Downing and Clark 18-12, 18-13 (in chapter 17 in D&C 3rd edition)
5. Friedman Test
Text 12.93-12.95 [11.46-11.48] (11.65-11.67 on CD), Downing and Clark 18-4, 18-6 (in chapter 17 in D&C 3rd edition)
Graded Assignment 4 (Will be posted)
This document includes Exercises 11.15 – 11.36 and Problems F1, F2 and F4.
------
2-Way ANOVA Problems.
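In the conventional 2-way problem, we assume $R$ rows, $C$ columns and $P$ observations per cell, so there are a total of $n = RCP$ observations. The model reads $x_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, where $i = 1, \dots, R$, $j = 1, \dots, C$ and $k = 1, \dots, P$. We test three pairs of hypotheses: (i) $H_0$: All row means equal (all $\alpha_i$ zero); $H_1$: Not all row means equal, (ii) $H_0$: All column means equal (all $\beta_j$ zero); $H_1$: Not all column means equal, (iii) $H_0$: No interaction (all $(\alpha\beta)_{ij}$ zero); $H_1$: Interaction. Thus we get the ANOVA table below.
Source / SS / DF / MS / F / $F_\alpha$ / $H_0$
Rows / SSR / $R-1$ / MSR / MSR/MSW / $F_\alpha^{(R-1,\,RC(P-1))}$ / Row means equal
Columns / SSC / $C-1$ / MSC / MSC/MSW / $F_\alpha^{(C-1,\,RC(P-1))}$ / Column means equal
Interaction / SSI / $(R-1)(C-1)$ / MSI / MSI/MSW / $F_\alpha^{((R-1)(C-1),\,RC(P-1))}$ / No Interaction
Within / SSW / $RC(P-1)$ / MSW
Total / SST / $n-1$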
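In the randomized block design $P = 1$, so there are a total of $n = RC$ observations. The design becomes as below. The only thing that should require explanation is that just as $SSW = SST - SSR - SSC - SSI$ in the ordinary 2-way model, in the randomized block design $SSW = SST - SSR - SSC$, with $(R-1)(C-1)$ degrees of freedom.
Source / SS / DF / MS / F / $F_\alpha$ / $H_0$
Rows (Blocks) / SSR / $R-1$ / MSR / MSR/MSW / $F_\alpha^{(R-1,\,(R-1)(C-1))}$ / Row means equal
Columns (Between) / SSC / $C-1$ / MSC / MSC/MSW / $F_\alpha^{(C-1,\,(R-1)(C-1))}$ / Column means equal
Within (Error) / SSW / $(R-1)(C-1)$ / MSW
Total / SST / $RC-1$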
Exercise 11.15: Solutions to 11.15 – 23 are repeated, heavily edited, from the Instructor’s Solution Manual.
(a) Between: $DF = C - 1$ = 5 – 1 = 4
(b) Blocks: $DF = R - 1$ = 7 – 1 = 6
(c) Error: $DF = (R-1)(C-1)$ = (7 – 1)(5 – 1) = 24
(d) Total: $DF = RC - 1$ = 7 x 5 – 1 = 34
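Exercise 11.16: Given $SSR = 75$ (Blocks), $SSC = 60$ (Between groups) and $SST = 210$, and using the degrees of freedom from Exercise 11.15:
(a) Error: $SSW = SST - SSC - SSR$ = 210 – 60 – 75 = 75
(b) Between groups: $MSC = SSC/(C-1)$ = 60/4 = 15
(c) Blocks: $MSR = SSR/(R-1)$ = 75/6 = 12.5
(d) Error: $MSW = SSW/[(R-1)(C-1)]$ = 75/24 = 3.125
(e) Between groups: $F = MSC/MSW$ = 15/3.125 = 4.80
(f) Blocks: $F = MSR/MSW$ = 12.5/3.125 = 4.00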
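The arithmetic in Exercises 11.15 and 11.16 can be sketched in a few lines of Python. This is only an illustration: the function name and the dictionary layout are our own, and the error sum of squares is assumed to be found by subtraction as in part (a).

```python
def randomized_block_anova(sst, ss_blocks, ss_groups, n_blocks, n_groups):
    """Build a randomized block ANOVA table from the sums of squares."""
    ss_error = sst - ss_blocks - ss_groups      # error SS by subtraction
    df_blocks = n_blocks - 1
    df_groups = n_groups - 1
    df_error = df_blocks * df_groups
    ms_blocks = ss_blocks / df_blocks
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return {
        "blocks": {"SS": ss_blocks, "DF": df_blocks, "MS": ms_blocks,
                   "F": ms_blocks / ms_error},
        "groups": {"SS": ss_groups, "DF": df_groups, "MS": ms_groups,
                   "F": ms_groups / ms_error},
        "error": {"SS": ss_error, "DF": df_error, "MS": ms_error},
        "total": {"SS": sst, "DF": n_blocks * n_groups - 1},
    }


# Values from Exercises 11.15 and 11.16: 7 blocks, 5 groups.
table = randomized_block_anova(sst=210, ss_blocks=75, ss_groups=60,
                               n_blocks=7, n_groups=5)
for source, row in table.items():
    print(source, row)
```

The F ratios still have to be compared with critical values from an F table; the sketch does not look those up.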
Exercise 11.17: (a) If we fill in the randomized block table above, we get the table below.
Source / SS / DF / MS / F / $F_{.05}$ / $H_0$
Rows (Blocks) / 75 / 6 / 12.5 / 4.00s / 2.51 / Row means equal
Columns (Between) / 60 / 4 / 15 / 4.80s / 2.78 / Column means equal
Within (Error) / 75 / 24 / 3.125
Total / 210 / 34
For testing the treatment means:
(b)-(c) Decision rule: If F > 2.78, reject H0. ‘s’ indicates a significant effect.
(d) Decision: Since Fcalc = 4.80 is above the upper critical bound of F = 2.78, reject H0.
For testing the block means:
(e)-(f) Decision rule: If F > 2.51, reject H0.
(g) Decision: Since Fcalc = 4.00 is above the upper critical bound of F = 2.51, reject H0.
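Exercise 11.18: The outline says the following.
For row means, use $\mu_{i\bullet} - \mu_{i'\bullet} = (\bar{x}_{i\bullet} - \bar{x}_{i'\bullet}) \pm q_\alpha^{(R,\,RC(P-1))} \sqrt{MSW/(PC)}$.
For column means, use $\mu_{\bullet j} - \mu_{\bullet j'} = (\bar{x}_{\bullet j} - \bar{x}_{\bullet j'}) \pm q_\alpha^{(C,\,RC(P-1))} \sqrt{MSW/(PR)}$.
Note that if $P = 1$, replace $RC(P-1)$ with $(R-1)(C-1)$.
So for the columns in the problem above use $q_{.05}^{(5,\,24)} \sqrt{MSW/R} = 4.17\sqrt{3.125/7}$ and the ‘critical range’ is 2.786.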
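As a check on the critical range, here is a small Python sketch. It assumes the critical range is the studentized range value times the square root of MSW over the number of observations behind each mean. The value q = 4.17 for 5 means and 24 error degrees of freedom at the 5% level is taken from a studentized range table and is not computed here; the function name is ours.

```python
import math


def critical_range(q, ms_within, obs_per_mean):
    """Tukey-type critical range: q * sqrt(MSW / observations per mean)."""
    return q * math.sqrt(ms_within / obs_per_mean)


# Column (group) means in Exercise 11.17: MSW = 3.125, 7 blocks per column mean.
# q = 4.17 is a table lookup (assumption), not something this code derives.
cr = critical_range(q=4.17, ms_within=3.125, obs_per_mean=7)
print(round(cr, 3))
```

Two column means would be called significantly different if the absolute difference between them exceeds this range.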
Exercise 11.23: $H_0: \mu_A = \mu_B = \mu_C = \mu_D$ (the four brand means are equal). $H_1$: At least one mean differs.
The Minitab output follows with commentary.
————— 11/6/2003 5:51:30 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > Retrieve "C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\252COFFEE.MTW".
Retrieving worksheet from file: C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\252COFFEE.MTW
# Worksheet was saved on Thu Nov 06 2003
Results for: 252COFFEE.MTW
MTB > print c1 c2 c3
# This is the form in which the data was stored on CD.
Data Display
Row Expert Brand Rating
1 1 1 24
2 1 2 26
3 1 3 25
4 1 4 22
5 2 1 27
6 2 2 27
7 2 3 26
8 2 4 24
9 3 1 19
10 3 2 22
11 3 3 20
12 3 4 16
13 4 1 24
14 4 2 27
15 4 3 25
16 4 4 23
17 5 1 22
18 5 2 25
19 5 3 22
20 5 4 21
21 6 1 26
22 6 2 27
23 6 3 24
24 6 4 24
25 7 1 27
26 7 2 26
27 7 3 22
28 7 4 23
29 8 1 25
30 8 2 27
31 8 3 24
32 8 4 21
33 9 1 22
34 9 2 23
35 9 3 20
36 9 4 19
# This was run to be sure that there was one point for each combination of expert and brand.
MTB > table c1 c2
Tabulated Statistics: Expert, Brand
Rows: Expert Columns: Brand
1 2 3 4 All
1 1 1 1 1 4
2 1 1 1 1 4
3 1 1 1 1 4
4 1 1 1 1 4
5 1 1 1 1 4
6 1 1 1 1 4
7 1 1 1 1 4
8 1 1 1 1 4
9 1 1 1 1 4
All 9 9 9 9 36
Cell Contents --
Count
# This was run to check on where the data was. It should look like the data in the problem in the book.
MTB > table c1 c2;
SUBC> data c3.
Tabulated Statistics: Expert, Brand
Rows: Expert Columns: Brand
1 2 3 4
1 24.000 26.000 25.000 22.000
2 27.000 27.000 26.000 24.000
3 19.000 22.000 20.000 16.000
4 24.000 27.000 25.000 23.000
5 22.000 25.000 22.000 21.000
6 26.000 27.000 24.000 24.000
7 27.000 26.000 22.000 23.000
8 25.000 27.000 24.000 21.000
9 22.000 23.000 20.000 19.000
Cell Contents --
Rating:Data
# This was run to make row and column means available for later computations.
MTB > table c1 c2;
SUBC> mean c3.
Tabulated Statistics: Expert, Brand
Rows: Expert Columns: Brand
1 2 3 4 All
1 24.000 26.000 25.000 22.000 24.250
2 27.000 27.000 26.000 24.000 26.000
3 19.000 22.000 20.000 16.000 19.250
4 24.000 27.000 25.000 23.000 24.750
5 22.000 25.000 22.000 21.000 22.500
6 26.000 27.000 24.000 24.000 25.250
7 27.000 26.000 22.000 23.000 24.500
8 25.000 27.000 24.000 21.000 24.250
9 22.000 23.000 20.000 19.000 21.000
All 24.000 25.556 23.111 21.444 23.528
Cell Contents --
Rating:Mean
# Finally, we actually run the computer routine.
MTB > Twoway c3 c2 c1.
Two-way ANOVA: Rating versus Brand, Expert
Analysis of Variance for Rating
Source DF SS MS F P
Brand 3 79.64 26.55 26.42 0.000
Expert 8 153.22 19.15 19.06 0.000
Error 24 24.11 1.00
Total 35 256.97
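For readers without Minitab, the same two-way analysis (one observation per cell) can be sketched in plain Python. This is not Minitab's algorithm, only the standard computational formulas applied to the ratings printed above; the function and key names are our own.

```python
# Ratings from the data display: one row per expert, columns are brands 1-4.
ratings = [
    [24, 26, 25, 22],
    [27, 27, 26, 24],
    [19, 22, 20, 16],
    [24, 27, 25, 23],
    [22, 25, 22, 21],
    [26, 27, 24, 24],
    [27, 26, 22, 23],
    [25, 27, 24, 21],
    [22, 23, 20, 19],
]


def two_way_no_replication(data):
    """Two-way ANOVA with one observation per cell (randomized block)."""
    r, c = len(data), len(data[0])
    n = r * c
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / n            # n times the squared grand mean
    sst = sum(x * x for row in data for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in data) / c - correction
    ss_cols = sum(sum(col) ** 2 for col in zip(*data)) / r - correction
    ss_err = sst - ss_rows - ss_cols
    df_rows, df_cols = r - 1, c - 1
    df_err = df_rows * df_cols
    ms_err = ss_err / df_err
    return {
        "sst": sst, "ss_rows": ss_rows, "ss_cols": ss_cols, "ss_err": ss_err,
        "df_rows": df_rows, "df_cols": df_cols, "df_err": df_err,
        "f_rows": (ss_rows / df_rows) / ms_err,
        "f_cols": (ss_cols / df_cols) / ms_err,
    }


result = two_way_no_replication(ratings)
for key, value in result.items():
    print(f"{key}: {value:.2f}")
```

Here the rows are the experts (blocks) and the columns are the brands, so `ss_cols` and `f_cols` correspond to the Brand line of the Minitab table.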
------
Hand calculations follow.
Expert / Brand A / Brand B / Brand C / Brand D / Sum / $n_i$ / $\bar{x}_{i\bullet}$ / SS / $\bar{x}_{i\bullet}^2$
1 / 24 / 26 / 25 / 22 / 97 / 4 / 24.250 / 2361 / 588.0625
2 / 27 / 27 / 26 / 24 / 104 / 4 / 26.000 / 2710 / 676.0000
3 / 19 / 22 / 20 / 16 / 77 / 4 / 19.250 / 1501 / 370.5625
4 / 24 / 27 / 25 / 23 / 99 / 4 / 24.750 / 2459 / 612.5625
5 / 22 / 25 / 22 / 21 / 90 / 4 / 22.500 / 2034 / 506.2500
6 / 26 / 27 / 24 / 24 / 101 / 4 / 25.250 / 2557 / 637.5625
7 / 27 / 26 / 22 / 23 / 98 / 4 / 24.500 / 2418 / 600.2500
8 / 25 / 27 / 24 / 21 / 97 / 4 / 24.250 / 2371 / 588.0625
9 / 22 / 23 / 20 / 19 / 84 / 4 / 21.000 / 1774 / 441.0000
Sum / 216 / 230 / 208 / 193 / 847 / 36 / $\bar{x}$ = 23.528 / 20185 / 5020.3125
$n_j$ / 9 / 9 / 9 / 9 / 36
$\bar{x}_{\bullet j}$ / 24.00 / 25.556 / 23.111 / 21.444 / 23.528
SS / 5240 / 5906 / 4846 / 4193 / 20185
$\bar{x}_{\bullet j}^2$ / 576 / 653.07 / 534.12 / 459.86 / 2223.05
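Note that $\bar{x}$ is not a sum, but is $\sum x / n = 847/36 = 23.528$. $SST = \sum x^2 - n\bar{x}^2 = 20185 - 36(23.528)^2 = 256.596$, $SSR = C\sum \bar{x}_{i\bullet}^2 - n\bar{x}^2 = 4(5020.3125) - 36(23.528)^2 = 152.846$ and $SSC = R\sum \bar{x}_{\bullet j}^2 - n\bar{x}^2 = 9(2223.05) - 36(23.528)^2 = 79.046$. Because the means are rounded, these values differ slightly from the Minitab output.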
Source / SS / DF / MS / F / $F_{.05}$ / $H_0$
Rows (Blocks) / 152.846 / 8 / 19.1057 / 18.57s / 2.36 / Row means equal
Columns (Brands) / 79.046 / 3 / 26.3487 / 25.61s / 3.01 / Column means equal
Within / 24.704 / 24 / 1.029
Total / 256.596 / 35
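The hand-calculated sums of squares in the table above differ a little from the Minitab output (for example, 256.596 against 256.97 for the total). A short Python sketch, assuming the usual computational formulas $SST = \sum x^2 - n\bar{x}^2$, $SSR = C\sum \bar{x}_{i\bullet}^2 - n\bar{x}^2$ and $SSC = R\sum \bar{x}_{\bullet j}^2 - n\bar{x}^2$, suggests that rounding $\bar{x}$ to 23.528 accounts for most of the gap. The variable names are ours.

```python
# Totals taken from the hand-calculation table.
n, n_rows, n_cols = 36, 9, 4
sum_x, sum_x2 = 847, 20185
sum_row_mean_sq = 5020.3125     # sum of squared row (expert) means
sum_col_mean_sq = 2223.05       # sum of squared column (brand) means, as rounded


def sums_of_squares(grand_mean):
    """Return (SST, SSR, SSC, SSW) for a given value of the grand mean."""
    correction = n * grand_mean ** 2
    sst = sum_x2 - correction
    ssr = n_cols * sum_row_mean_sq - correction
    ssc = n_rows * sum_col_mean_sq - correction
    return sst, ssr, ssc, sst - ssr - ssc


rounded = sums_of_squares(23.528)       # grand mean rounded to 3 places
exact = sums_of_squares(sum_x / n)      # grand mean kept at full precision

print("rounded mean:", [round(v, 3) for v in rounded])
print("exact mean:  ", [round(v, 3) for v in exact])
```

With the unrounded grand mean the total and row sums of squares agree with Minitab to two decimals. The column sum of squares still differs somewhat, because the squared column means in the table were themselves rounded.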
Decision rule for Brands: If F > 3.01, reject H0. The Instructor’s Solution Manual gives the table below, without any indication of how it was computed (Excel?). Its F crit values match the 1 percent points of the F distribution rather than the 5 percent points used here.
Anova: Two-Factor Without Replication
Source of Variation / SS / df / MS / F / P-value / F crit
Rows / 153.2222 / 8 / 19.15278 / 19.06452 / 1.21E-08 / 3.362857
Columns / 79.63889 / 3 / 26.5463 / 26.42396 / 8.86E-08 / 4.718061
Error / 24.11111 / 24 / 1.00463
Total / 256.9722 / 35
Test statistic: F = 26.42
Decision: Since Fcalc = 26.42 is above the critical bound F = 3.01, reject H0. There is adequate evidence to conclude that there is a difference in the mean summed ratings of the four brands of Colombian coffee.
In the last problem we got a formula for a Tukey confidence interval. For the brand (column) means, with MSW = 1.00463 from the table above, use $q_{.05}^{(4,\,24)} \sqrt{MSW/R} = 3.90\sqrt{1.00463/9}$ and the ‘critical range’ is 1.30.
The means are:
A / B / C / D
24.00 / 25.556 / 23.111 / 21.444
And there are $C_2^4 = 6$ ways to pick a pair of means.
Pairs of means that differ at the 0.05 level are marked with * below. We can do these differences as follows. Each box filled in represents the absolute value of (column level mean) - (row level mean).
/ A / B / C
B / 1.556* / ……. / …….
C / 0.889 / 2.445* / …….
D / 2.556* / 4.112* / 1.667*
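The pairwise comparison above can be sketched in Python as follows. The brand means and the critical range of 1.30 are taken from the text; the variable names are ours, and the rule "significant if the absolute difference exceeds the critical range" is the usual Tukey-type rule assumed here.

```python
from itertools import combinations

brand_means = {"A": 24.00, "B": 25.556, "C": 23.111, "D": 21.444}
critical_range = 1.30   # from the text

significant = []
for (name1, mean1), (name2, mean2) in combinations(brand_means.items(), 2):
    diff = abs(mean1 - mean2)
    differs = diff > critical_range
    if differs:
        significant.append((name1, name2))
    print(f"{name1} vs {name2}: |difference| = {diff:.3f}{' *' if differs else ''}")
```

Of the six pairs, only A and C fail to differ by more than the critical range.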
This problem may be fun for someone. The printout below is obviously missing some numbers. What are they?
Exercise 14.40 (from the McClave et al. text): The printout given by the text is reformatted below.
ANALYSIS OF VARIANCE
SOURCE DF SS MS F
A 3 ___ .75 ____
B 1 .95 ______
AB __ ___ .30 ____
ERROR ______
TOTAL 23 6.5
Solution: c) Since MSA is SSA divided by DFA, SSA must be 3 times .75 or 2.25. MSB must be .95 divided by DFB, which is 1, so MSB is .95. For the interaction DFAB is equal to the product of degrees of freedom for A and B, so DFAB must be 3. If there are a total of 23 degrees of freedom, the error degrees of freedom must be 23 - 3 - 1 - 3 = 16. Since MSAB is SSAB divided by DFAB, SSAB must be 3 times .30 or .90. note that the problem has
Our table is now:
SOURCE DF SS MS F
A 3 2.25 .75 ____
B 1 0.95 .95 ____
AB 3 0.90 .30 ____
ERROR 16 ______
TOTAL 23 6.5
Complete the SS column by subtracting 2.25, 0.95 and 0.90 from 6.5 and getting 2.40. Divide 2.40 by 16 to get the error mean square of .15, so that we now have:
SOURCE DF SS MS F
A 3 2.25 .75 ____
B 1 0.95 .95 ____
AB 3 0.90 .30 ____
ERROR 16 2.40 .15
TOTAL 23 6.5
Get the F column by dividing the three top mean squares by the error mean square. Look up the 10% values of F on the F table.
SOURCE DF SS MS F
A 3 2.25 .75 5.00 s
B 1 0.95 .95 6.33 s
AB 3 0.90 .30 2.00 ns
ERROR 16 2.40 .15
TOTAL 23 6.5
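The steps used to fill in the printout can be retraced with a short Python sketch. It starts only from the numbers the printout supplies and uses the relations quoted in the solution (MS = SS/DF, interaction DF as a product, error by subtraction). The variable names are ours.

```python
# Numbers given in the printout.
df_a, df_b, df_total = 3, 1, 23
ms_a, ss_b, ms_ab, ss_total = 0.75, 0.95, 0.30, 6.5

# Fill in the missing entries.
ss_a = df_a * ms_a                          # SSA = DFA * MSA
ms_b = ss_b / df_b                          # MSB = SSB / DFB
df_ab = df_a * df_b                         # interaction DF is the product
ss_ab = df_ab * ms_ab                       # SSAB = DFAB * MSAB
df_error = df_total - df_a - df_b - df_ab   # error DF by subtraction
ss_error = ss_total - ss_a - ss_b - ss_ab   # error SS by subtraction
ms_error = ss_error / df_error

# F ratios: each mean square divided by the error mean square.
f_a = ms_a / ms_error
f_b = ms_b / ms_error
f_ab = ms_ab / ms_error

print(f"A     {df_a:2d} {ss_a:5.2f} {ms_a:4.2f} {f_a:5.2f}")
print(f"B     {df_b:2d} {ss_b:5.2f} {ms_b:4.2f} {f_b:5.2f}")
print(f"AB    {df_ab:2d} {ss_ab:5.2f} {ms_ab:4.2f} {f_ab:5.2f}")
print(f"ERROR {df_error:2d} {ss_error:5.2f} {ms_error:4.2f}")
print(f"TOTAL {df_total:2d} {ss_total:5.2f}")
```

The significance flags in the table still come from comparing each F with the 10% value in an F table, which this sketch does not look up.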
a) DF for factor A is 3, so there are 4 levels. DF for factor B is 1, so there are 2 levels.
b) The total number of observations was 23 + 1 = 24. Since there were 4 levels for factor A and 2 for factor B, there were 4 times 2 or 8 treatments. If we divide 24 by 8, we get 3 observations for each treatment.
d) The sum of squares between treatments is 2.25 + 0.95 + 0.90 = 4.1. 8 treatments means 7 = 3 + 1 + 3 degrees of freedom. The mean square for all treatments is thus 4.1 divided by 7, which equals 0.5857. To get an F statistic divide MST = 0.5857 by MSW = 0.15 and get 3.90. This F has 7 and 16 degrees of freedom. To test the null hypothesis "all treatment means are equal" at the 10% significance level, use $F_{.10}^{(7,\,16)} = 2.13$. Since 3.90 is larger than the value from the table, reject the null hypothesis and conclude that the treatment means differ.
e) From the ANOVA table we see that there is no significant interaction, but that factors A and B both have significant effects. Because there was no significant interaction, we can be sure that the tests of the two main factors are meaningful.
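Parts a), b) and d) are simple counting and division, sketched below in Python from the completed table. The variable names are ours, and the treatment mean square is assumed to pool the A, B and AB lines as in the solution to d).

```python
# Entries from the completed ANOVA table.
df_a, df_b, df_ab, df_total = 3, 1, 3, 23
ss_a, ss_b, ss_ab, ms_error = 2.25, 0.95, 0.90, 0.15

# a) Levels of each factor are degrees of freedom plus one.
levels_a = df_a + 1
levels_b = df_b + 1

# b) Treatments and observations per treatment.
treatments = levels_a * levels_b
n_obs = df_total + 1
obs_per_treatment = n_obs // treatments

# d) Pooled test of all treatment means.
ss_treatments = ss_a + ss_b + ss_ab
df_treatments = treatments - 1
ms_treatments = ss_treatments / df_treatments
f_treatments = ms_treatments / ms_error

print(levels_a, levels_b, treatments, obs_per_treatment)
print(round(ms_treatments, 4), round(f_treatments, 2))
```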
Problem F1: In a 2-way analysis of variance there are 5 rows, 10 columns and 5 observations per cell.
a. SST = 1000, SSW = 100, SSR = 200, SSC = 300
b. SST = 272.5, SSW = 100, SSR = 6, SSC = 22.5
Complete the ANOVA using a 1 percent significance level. State the hypotheses you test.
Solution: a)
Source / SS / DF / MS / F / Sig. / $H_0$
Rows / 200 / 4 / 50.00 / 100.00 / s / Row means equal
Columns / 300 / 9 / 33.33 / 66.67 / s / Column means equal
Interaction / 400 / 36 / 11.11 / 22.22 / s / No Interaction
Within / 100 / 200 / 0.50
Total / 1000 / 249
Note that $n = RCP = 5(10)(5) = 250$, the total number of observations. ‘s’ (significant) means ‘reject $H_0$.’ ‘ns’ (not significant) means ‘do not reject $H_0$.’
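The table for part a) can be reproduced with a short Python sketch. It assumes the interaction sum of squares is found by subtraction and uses the degrees of freedom of the conventional 2-way layout with several observations per cell. The function and key names are ours.

```python
def two_way_anova(sst, ssw, ssr, ssc, rows, cols, per_cell):
    """Complete a 2-way ANOVA table (with interaction) from sums of squares."""
    ssi = sst - ssw - ssr - ssc                    # interaction SS by subtraction
    df = {
        "rows": rows - 1,
        "columns": cols - 1,
        "interaction": (rows - 1) * (cols - 1),
        "within": rows * cols * (per_cell - 1),
    }
    ss = {"rows": ssr, "columns": ssc, "interaction": ssi, "within": ssw}
    ms = {source: ss[source] / df[source] for source in ss}
    f = {source: ms[source] / ms["within"]
         for source in ("rows", "columns", "interaction")}
    return {"SS": ss, "DF": df, "MS": ms, "F": f,
            "total_df": rows * cols * per_cell - 1}


# Problem F1, part a): 5 rows, 10 columns, 5 observations per cell.
anova_a = two_way_anova(sst=1000, ssw=100, ssr=200, ssc=300,
                        rows=5, cols=10, per_cell=5)
for source in ("rows", "columns", "interaction"):
    print(source, anova_a["SS"][source], anova_a["DF"][source],
          round(anova_a["MS"][source], 2), round(anova_a["F"][source], 2))
print("within", anova_a["SS"]["within"], anova_a["DF"]["within"],
      anova_a["MS"]["within"])
```

The sums of squares of part b) can be passed to the same function. The F ratios must still be compared with 1 percent critical values from an F table.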