Analysis of Variance Notes
· Experiments versus Studies
· Types of Experiments
· Assumptions & Assumption Checks
· Types of Analysis
· NCSS
1. Experiments versus Studies
1.1 Terminology
· Factors versus Independent Variables
Example: hours studied and major are two factors affecting Grade
· Treatments –
Example: specific combinations of hours studied and teaching method
1.2 Purpose
· Observational Study –
o Correlational –
o Observe values of X
· Experiment –
o Cause-Effect
o Control values of X
· Designs
o Balanced
o Unbalanced
2. Types of Experimental Designs
2.1 Randomized Design
· one factor
· two factor
2.2 Randomized Block Design
2.3 Examples
· Teaching Method only
· Teaching method and hours studied
· Teaching method within major
3. Assumptions & Assumption Checks
3.1 Assumptions
· Same Variance
· Independence
· Normality
3.2 Assumption checks
· Modified Levene – comparing absolute differences from the group center (median)
· Normality Tests and Box Plots
4. Analysis
· Sources of Variability and degrees of freedom
· Tests of effects of
o One factor designs
o Each factor in two factor designs
o Combination effects in two factor designs
· Tests of Assumptions
· Tests and estimation of differences in averages
4.1 Sources of variability and degrees of freedom
· Total: Values around overall average: divisor of (n-1)
· Factor: Variation of factor averages: divisor of (# of averages – 1)
· Interaction: Combination effects: divisor of (product of factor divisors)
· Error: Randomness: divisor of (n – # of averages or combinations of averages)
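The divisors listed above can be tabulated with a short helper. This is a minimal sketch for completely randomized designs only; the function name and the sample sizes are hypothetical and chosen purely for illustration.

```python
def anova_degrees_of_freedom(n, levels_a, levels_b=None):
    """Return the divisors (degrees of freedom) for each source of variability.

    n        : total number of observations
    levels_a : number of levels (averages) of the first factor
    levels_b : number of levels of the second factor, or None for one factor
    """
    df = {"total": n - 1, "factor_a": levels_a - 1}
    if levels_b is None:
        # One factor: error divisor is n minus the number of averages.
        df["error"] = n - levels_a
    else:
        df["factor_b"] = levels_b - 1
        # Interaction divisor is the product of the factor divisors.
        df["interaction"] = (levels_a - 1) * (levels_b - 1)
        # Error divisor is n minus the number of combinations.
        df["error"] = n - levels_a * levels_b
    return df


# Hypothetical designs: 12 observations, 3 levels; then a 2 x 3 layout.
one_factor = anova_degrees_of_freedom(n=12, levels_a=3)
two_factor = anova_degrees_of_freedom(n=12, levels_a=2, levels_b=3)
print(one_factor)
print(two_factor)
```

In both cases the factor, interaction, and error divisors add up to the total divisor, which is a quick check on the bookkeeping.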
4.2 Tests of effects of
4.2.1 One Factor – Completely Randomized Design or Independent Sample Study
4.2.1.1 Test Template:
· Null hypothesis: average value of Y is the same for all levels of the factor
· Alternative: at least two are different
· Test Statistic: Compares variation of factor averages to variation of random data
Among-Group variation to within-group variation
· Rejection Region: Reject Ho if the ratio above (the F ratio) is large, that is, F > F table
Two degrees of freedom: numerator degrees of freedom and denominator degrees of freedom
· Conclusion: We can (not) say the average value of Y differs for at least two levels of the factor.
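The test statistic in this template can be computed from raw data as follows. This is a minimal sketch using only the Python standard library; the function name and the data are made up for illustration, and the F table value still has to be looked up separately.

```python
def one_way_anova(groups):
    """Compute MSA, MSW and the F ratio for a one-factor design.

    groups: list of lists, one list of Y values per factor level.
    """
    values = [y for g in groups for y in g]
    n = len(values)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]

    # Among-group variation: factor averages around the overall average.
    ss_among = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: values around their own group average.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_among = len(groups) - 1   # numerator degrees of freedom
    df_within = n - len(groups)  # denominator degrees of freedom
    msa = ss_among / df_among
    msw = ss_within / df_within
    return {"MSA": msa, "MSW": msw, "F": msa / msw, "df": (df_among, df_within)}


# Hypothetical data: three factor levels, three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

Reject Ho when the printed F exceeds the F table value for the printed pair of degrees of freedom.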
4.2.1.2 Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier (n = ____ )
MSA = sample factor variability = 21.095
MSW = sample error variability = 6.094
· Null hypothesis: μ1 = μ2 = μ3 = μ4 (average value of ______ is the
same for all ______)
· Alternative: at least two are different
· Test Statistic: MSA/MSW =
· Rejection Region: Reject Ho if F > F table with
· Numerator degrees of freedom = ______ and denominator d.f. = _____
· F-Table = ______
· Conclusion: We can (not) say that the average ______differs for at least two ______
4.2.2 One Factor – Randomized Block
4.2.2.1 Test Template:
Same as in 4.2.1.1 but divisor degrees of freedom =
(# of factor means-1)*( # of block means-1)
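The block version of the test removes block-to-block variation from the error term before forming the F ratio. The sketch below assumes one observation per block and treatment combination; the function name and data are hypothetical.

```python
def randomized_block_anova(table):
    """F test for the factor in a randomized block design.

    table[i][j] is the response for block i (row) under factor level j (column).
    """
    b = len(table)     # number of blocks
    k = len(table[0])  # number of factor levels
    grand = sum(sum(row) for row in table) / (b * k)
    factor_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]

    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_factor = b * sum((m - grand) ** 2 for m in factor_means)
    ss_block = k * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_factor - ss_block

    df_factor = k - 1
    df_error = (k - 1) * (b - 1)  # (# of factor means - 1) * (# of block means - 1)
    msa = ss_factor / df_factor
    mse = ss_error / df_error
    return {"MSA": msa, "MSE": mse, "F": msa / mse, "df": (df_factor, df_error)}


# Hypothetical ratings: 3 blocks (rows) by 3 factor levels (columns).
block_result = randomized_block_anova([[1, 2, 6], [2, 4, 6], [3, 3, 9]])
print(block_result)
```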
4.2.2.2 Example: Y = Rating of a restaurant’s service
Factor = 4 Restaurants
Block = all restaurants reviewed by same 6 raters (n = ____ )
MSA = sample factor variability = 595.8
MSE = sample error variability = 14.986
· Null hypothesis: μ1 = μ2 = μ3 = μ4 (average value of ______ is the
same for all ______)
· Alternative: at least two are different
· Test Statistic: MSA/MSE =
· Rejection Region: Reject Ho if F > F table with
· Numerator degrees of freedom = ______ and denominator d.f. = _____
· F-Table = ______
· Conclusion: We can (not) say that the average ______differs for at least two ______
4.2.3 Two Factors – Interaction or combination effects
4.2.3.1 Test Template:
· Null hypothesis: (no interaction) difference in average value of Y between any two levels of factor one does not depend on the level of factor two
· Alternative: (interaction) difference in average value of Y between any two levels of factor one does depend on the level of factor two
· Test Statistic: Compares variation of interaction to variation of random data
Among-Group variation to within-group variation
· Rejection Region: Reject Ho if the ratio above (the F ratio) is large, that is, F > F table
Two degrees of freedom:
numerator d.f. = product of factor 1 and factor 2 d.f.
denominator d.f. = n – number of combinations of factors 1 and 2
· Conclusion: we can (not) say that the difference in average value of Y between any two levels of factor one does depend on the level of factor two.
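The interaction F ratio in this template can be sketched for a balanced two-factor design as below. The function name and data are hypothetical, and the code assumes the same number of replicates in every combination.

```python
def interaction_f(cells):
    """F test for interaction in a balanced two-factor design.

    cells[i][j] is the list of replicate Y values at level i of factor 1
    and level j of factor 2 (same number of replicates in every cell).
    """
    a = len(cells)        # levels of factor 1
    b = len(cells[0])     # levels of factor 2
    r = len(cells[0][0])  # replicates per combination
    n = a * b * r

    cell_mean = [[sum(c) / r for c in row] for row in cells]
    grand = sum(sum(c) for row in cells for c in row) / n
    a_mean = [sum(cell_mean[i]) / b for i in range(a)]
    b_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]

    # Interaction: what is left of each cell average after removing
    # the two factor effects.
    ss_ab = r * sum(
        (cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # Error: values around their own combination average.
    ss_e = sum(
        (y - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )

    df_ab = (a - 1) * (b - 1)  # product of the factor d.f.
    df_e = n - a * b           # n minus number of combinations
    msab = ss_ab / df_ab
    mse = ss_e / df_e
    return {"MSAB": msab, "MSE": mse, "F": msab / mse, "df": (df_ab, df_e)}


# Hypothetical 2 x 2 design with 2 replicates per combination.
interaction_result = interaction_f([[[1, 3], [4, 6]], [[2, 4], [9, 11]]])
print(interaction_result)
```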
4.2.3.2 Example: Y = length of a ball bearing's life
Factor 1 = heat treatment (high or low)
Factor 2 = ring osculation (high or low)
Obtain samples of size 2 from each combination (n = ____ )
MSAB = sample interaction variability = 3280.5
MSE = sample error variability = 61
· Null hypothesis: (no interaction) difference in average value of ______ between any two levels of ______ does not depend on the level of ______
· Alternative: (interaction) difference in average value of ______ between any two levels of ______ does depend on the level of ______
· Test Statistic: Compares variation of interaction to variation of random data
F = MSAB / MSE =
· Rejection Region: Reject Ho if the ratio above (the F ratio) is large, that is, F > F table
Two degrees of freedom:
numerator d.f. = product of factor 1 and factor 2 d.f. =
denominator d.f. = n – number of combinations of factors 1 and 2 =
· Conclusion: we can (not) say that the difference in average value of ______ between any two levels of ______ does depend on the level of ______.
4.2.4 One of the two factors – Completely Randomized Design or Independent Sample Study – NO SIGNIFICANT INTERACTION
4.2.4.1 Test Template:
same as in the one-factor test but
denominator d.f. = n – (# of levels of factor 1)*(# of levels of factor 2)
4.2.4.2 Example: Y = rating of a photographic plate
Factor A = 2 levels of development strength,
Factor B = 2 levels of development time (10 and 14 minutes)
Randomly assign 4 plates to each of the 4 combinations
MSA = sample variability of factor A (strength) = 1.5625
MSB = sample variability of factor B (time) = 56.5625
MSE = sample error variability = 2.229
(no interaction was found – testing time effect)
· Null hypothesis: μ1 = μ2 (average value of ______ is the
same for all ______)
· Alternative: at least two are different
· Test Statistic: MSB/MSE =
· Rejection Region: Reject Ho if F > F table with
· Numerator degrees of freedom = ______ and denominator d.f. = _____
· F-Table = ______
· Conclusion: We can (not) say that the average ______differs for at least two ______
4.3 Tests of Assumptions
4.3.1 Equal Variance –
4.3.1.1 Test Template:
· Null hypothesis: variation of Y is the same for all levels of the factor
· Alternative: at least two are different
· Compute the absolute difference between each value in a group and the median of that group
· Test Statistic and rejection region: same as for the factor tests, applied to these absolute differences
· Conclusion: We can (not) say the variation of Y differs for at least two levels of the factor.
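The two steps of this template (absolute differences from the group median, then the ordinary factor F test on those differences) can be sketched as follows. The function name and data are hypothetical, and the sketch is not claimed to match the NCSS output digit for digit.

```python
from statistics import median


def modified_levene_f(groups):
    """Modified Levene F ratio: one-factor ANOVA on |Y - group median|."""
    # Step 1: absolute difference between each value and its group median.
    deviations = [[abs(y - median(g)) for y in g] for g in groups]

    # Step 2: the usual among/within comparison, applied to the deviations.
    values = [d for g in deviations for d in g]
    n = len(values)
    grand = sum(values) / n
    means = [sum(g) / len(g) for g in deviations]
    ss_among = sum(len(g) * (m - grand) ** 2 for g, m in zip(deviations, means))
    ss_within = sum((d - m) ** 2 for g, m in zip(deviations, means) for d in g)

    df_among = len(groups) - 1
    df_within = n - len(groups)
    ms_diff = ss_among / df_among
    mse = ss_within / df_within
    return {"MSDiff": ms_diff, "MSE": mse, "F": ms_diff / mse,
            "df": (df_among, df_within)}


# Hypothetical data: spread grows from the first group to the third.
levene_result = modified_levene_f([[1, 2, 3], [2, 4, 6], [1, 5, 9]])
print(levene_result)
```

A small F ratio here means the data give no evidence against the equal variance assumption.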
4.3.1.2 Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier (n = ____ )
MSDifference = sample factor variability = 0.59
MSE = sample error variability = 2.2853
· Null hypothesis: σ1² = σ2² = σ3² = σ4² (variability of ______ is the
same for all ______)
· Alternative: at least two are different
· Test Statistic: MSDiff/MSE =
· Rejection Region: Reject Ho if F > F table with
· Numerator degrees of freedom = ______ and denominator d.f. = _____
· F-Table = ______
· Conclusion: We can (not) say that the variability of ______differs for at least two ______
4.3.2 Normality –
4.3.2.1 Test Template:
· Null hypothesis: distribution of Y is normal for all levels of the factor
· Alternative: at least one is not normal
· Test Statistic and rejection region: use the normality tests in NCSS; if the p-value is less than alpha, reject normality.
· Conclusion: We can (not) say the distribution of Y is not normal for at least one level of the factor.
4.3.2.2 Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier (n = ____ )
Assumption Test                    Prob Level
Skewness Normality of Residuals    0.605780
Kurtosis Normality of Residuals    0.548522
Omnibus Normality of Residuals     0.731126
· Null hypothesis: distribution of ______ is normally distributed for all ______
· Alternative: distribution of ______ is non-normally distributed for at least one level of ______
· Test Statistic: p-value
· Rejection region: p-value < alpha
· Conclusion: We can (not) say that the distribution of ______ is non-normally distributed for at least one level of ______
4.4 Testing the difference in means
4.4.1 Experimentwise error versus comparison error
4.4.2 Testing one factor
Use NCSS. The output will tell you which means are statistically different
Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier
Tukey-Kramer Multiple-Comparison Test
Response: strength
Term A: supplier
Alpha=0.050 Error Term=S(A) DF=16 MSE=6.094 Critical Value=4.046122
                       Different
Group  Count  Mean     From Groups
1      5      19.52    2
4      5      21.16
3      5      22.84
2      5      24.26    1
Conclusions: We can say that the average value of (Y) ______ for (factor level) ______ differs from that for (factor level) ______.
<Repeat for each difference>
The average (Y) for the other (factor levels) ______are not significantly different.
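The "Different From Groups" column can be checked by hand from the numbers in the output above. The sketch below takes the means, counts, MSE, and critical value from that output and assumes the reported critical value is the studentized range value used in the usual Tukey-Kramer margin; the function name is hypothetical.

```python
from itertools import combinations
from math import sqrt

# Values copied from the NCSS output above.
means = {1: 19.52, 2: 24.26, 3: 22.84, 4: 21.16}
counts = {1: 5, 2: 5, 3: 5, 4: 5}
mse = 6.094
critical_value = 4.046122


def tukey_kramer_pairs(means, counts, mse, q):
    """Return the pairs of groups whose averages differ significantly."""
    different = []
    for g1, g2 in combinations(sorted(means), 2):
        # Tukey-Kramer margin; with equal counts it reduces to q * sqrt(MSE / n).
        margin = q * sqrt(mse / 2 * (1 / counts[g1] + 1 / counts[g2]))
        if abs(means[g1] - means[g2]) > margin:
            different.append((g1, g2))
    return different


pairs = tukey_kramer_pairs(means, counts, mse, critical_value)
print(pairs)  # [(1, 2)]
```

The margin works out to about 4.47, so only suppliers 1 and 2 (a difference of 4.74) are declared different, which agrees with the output.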
4.4.3 Same procedure works for Randomized Block and Two-factor studies without interaction.
4.5 Nonparametric tests
4.5.1 Kruskal-Wallis test
· One-factor designs
· Compares medians instead of means
· Test similar to ANOVA but does not require normality
· Using NCSS: if p-value < alpha, reject equality of medians
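For reference, the Kruskal-Wallis statistic replaces the Y values with their ranks in the combined sample. The sketch below computes the H statistic only, without the correction for ties and without a p-value; the function names and data are hypothetical.

```python
def average_ranks(values):
    """Rank the values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic (no tie correction) for a one-factor design."""
    values = [y for g in groups for y in g]
    ranks = average_ranks(values)
    n = len(values)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical data: three completely separated groups.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 4))  # 7.2
```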
4.5.2 Friedman’s Test
· Randomized block designs
· Compares medians instead of means
· Test similar to ANOVA but does not require normality
· Using NCSS: if p-value < alpha, reject equality of medians
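Friedman's statistic ranks the factor levels separately within each block and then compares the rank sums. The sketch below assumes there are no ties within a block and returns the statistic only, not a p-value; the function name and data are hypothetical.

```python
def friedman_statistic(table):
    """Friedman statistic for a randomized block design.

    table[i][j] is the response for block i under factor level j.
    Assumes no tied values within a block.
    """
    b = len(table)     # number of blocks
    k = len(table[0])  # number of factor levels
    rank_sums = [0] * k
    for row in table:
        ordered = sorted(row)
        for j, y in enumerate(row):
            # Rank of y within its own block (1 = smallest).
            rank_sums[j] += ordered.index(y) + 1
    return 12 / (b * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * b * (k + 1)


# Hypothetical data: 4 blocks (rows) by 3 factor levels (columns).
friedman = friedman_statistic([[1, 2, 3], [1, 3, 2], [1, 2, 3], [2, 1, 3]])
print(friedman)  # 4.5
```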
5. NCSS
5.1 Data format: place all the values of Y in one column and let the next column(s) be the values of the factor(s).
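If the data start out grouped by factor level, they can be reshaped into this one-row-per-observation layout before being entered. The sketch below is an illustration using the Python standard library; the data values are made up, and the column names simply mirror the response and factor names used in the supplier example.

```python
import csv
import io

# Hypothetical Y values grouped by factor level (supplier).
groups = {"1": [18.5, 20.1], "2": [24.0, 25.3], "3": [22.7, 23.1]}

# One row per observation: the Y value first, then the factor level.
rows = [(y, level) for level, ys in groups.items() for y in ys]

# Write the rows as comma-separated text with a header line.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["strength", "supplier"])
writer.writerows(rows)
print(buffer.getvalue())
```

For a two-factor design the same idea applies, with a third column holding the level of the second factor.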
5.2 Approach
5.2.1 One factor designs
· Click on Analysis, ANOVA, one-way anova
· Choose the dependent variable and factor
· In reports, uncheck EMS report and check Tukey-Kramer Test
5.2.2 Randomized Block Designs
· Click on Analysis, ANOVA, Analysis of Variance
· Choose
o First, the dependent variable
o Second, for factor 1, the block variable; choose Random from the Type list
o Third, for factor 2, the factor of interest (Fixed type)
· In reports, uncheck EMS report and check Tukey-Kramer Test
5.2.3 Two-factor designs
· Click on Analysis, ANOVA, Analysis of Variance
· Choose
o First, the dependent variable
o Second, factor 1 Type Fixed
o Third, factor 2 Type fixed
o If interaction exists, tests for two-factor interaction
· In reports, uncheck EMS report and check Tukey-Kramer Test