
Analysis of Variance Notes

· Experiments versus Studies

· Types of Experiments

· Assumptions & Assumption Checks

· Types of Analysis

· NCSS

1. Experiments versus Studies

1.1 Terminology

· Factors versus Independent Variables

Example: hours studied and major are two factors affecting Grade

· Treatments –

Example: specific combinations of hours studied and teaching method

1.2 Purpose

· Observational Study –

o Correlational –

o Observe values of X

· Experiment –

o Cause-Effect

o Control values of X

· Designs

o Balanced

o Unbalanced

2. Types of Experimental Designs

2.1 Randomized Design

· one factor

· two factor

2.2 Randomized Block Design

2.3 Examples

· Teaching Method only

· Teaching method and hours studied

· Teaching method within major

3. Assumptions & Assumption Checks

3.1 Assumptions

· Same Variance

· Independence

· Normality

3.2 Assumption checks

· Modified Levene – compares each value's absolute difference from the center (median) of its group

· Normality Tests and Box Plots

4. Analysis

· Sources of Variability and degrees of freedom

· Tests of effects of

o One factor designs

o Each factor in two factor designs

o Combination effects in two factor designs

· Tests of Assumptions

· Tests and estimation of differences in averages

4.1 Sources of variability and degrees of freedom

· Total: Values around overall average: divisor of (n-1)

· Factor: Variation of factor averages: divisor of (# averages – 1)

· Interaction: Combination effects: divisor of (product of factor divisors)

· Error: Randomness: divisor of (n - # of averages or combination of averages)
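The divisors listed above can be tabulated with a small helper. This is a minimal sketch rather than NCSS output: the function name `anova_df` and its arguments are illustrative assumptions, and a balanced design is assumed throughout.

```python
def anova_df(n, levels_a, levels_b=None, blocked=False):
    """Degrees of freedom (divisors) for balanced one- and two-factor layouts.

    n        -- total number of observations
    levels_a -- number of averages for the factor of interest
    levels_b -- number of averages for a second factor or block (optional)
    blocked  -- True for a randomized block design (no interaction term)
    """
    df = {"total": n - 1, "factor_a": levels_a - 1}
    if levels_b is None:
        # One factor: error divisor is n minus the number of averages.
        df["error"] = n - levels_a
    elif blocked:
        # Randomized block: error divisor is the product of the two divisors.
        df["block"] = levels_b - 1
        df["error"] = (levels_a - 1) * (levels_b - 1)
    else:
        # Two factors: interaction divisor is the product of the factor
        # divisors; error divisor is n minus the number of combinations.
        df["factor_b"] = levels_b - 1
        df["interaction"] = (levels_a - 1) * (levels_b - 1)
        df["error"] = n - levels_a * levels_b
    return df


# 4 groups of 5 observations, one factor
one_factor = anova_df(n=20, levels_a=4)
# 2 x 2 combinations with 2 observations each
two_factor = anova_df(n=8, levels_a=2, levels_b=2)
print(one_factor)
print(two_factor)
```

In each layout the factor, interaction (or block) and error divisors add up to the total divisor n - 1, which is a quick way to check a hand-built table.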

4.2 Tests of effects of

4.2.1 One Factor – Completely Randomized Design or Independent Sample Study

4.2.1.1 Test Template:

· Null hypothesis: average value of Y is the same for all levels of the factor

· Alternative: at least two are different

· Test Statistic: Compares variation of factor averages to variation of random data

Among-Group variation to within-group variation

· Rejection Region: Above ratio is large (F ratio) > F table

Two degrees of freedom: numerator degrees of freedom and divisor degrees of freedom

· Conclusion: We can (not) say the average value of Y differs for at least two levels of the factor.

4.2.1.2 Example: Y = tensile strength of a product

Factor = 4 Suppliers

Obtain samples of size 5 from each supplier (n = ____ )

MSA = sample factor variability = 21.095

MSW = sample error variability = 6.094

· Null hypothesis: μ1 = μ2 = μ3 = μ4 (average value of ______ is the same for all ______)

· Alternative: at least two are different

· Test Statistic: MSA/MSW =

· Rejection Region: Reject Ho if F > F table with

· Numerator degrees of freedom = ______ and denominator d.f. = _____

· F-Table = ______

· Conclusion: We can (not) say that the average ______differs for at least two ______
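As a rough check on the arithmetic in this example, the F ratio and its degrees of freedom can be computed from the two mean squares. The function name `f_ratio_test` is an illustrative assumption. The example does not state alpha, so alpha = 0.05 is assumed here, and 3.24 is the F-table entry for 3 and 16 degrees of freedom at that level.

```python
def f_ratio_test(ms_factor, ms_error, f_table):
    """Return the F ratio and whether it falls in the rejection region."""
    f_ratio = ms_factor / ms_error
    return f_ratio, f_ratio > f_table


groups, per_group = 4, 5          # 4 suppliers, 5 samples each
n = groups * per_group            # total sample size
df_num = groups - 1               # numerator d.f.
df_den = n - groups               # denominator d.f.

# Mean squares from the example; 3.24 is the assumed alpha = 0.05 table value.
f_ratio, reject = f_ratio_test(21.095, 6.094, f_table=3.24)
print(f"n={n}, d.f.=({df_num}, {df_den}), F={f_ratio:.3f}, reject Ho: {reject}")
```

A different alpha would change only the table value passed as `f_table`; the ratio and the degrees of freedom stay the same.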

4.2.2 One Factor – Randomized Block

4.2.2.1 Test Template:

Same as in 4.2.1.1 but divisor degrees of freedom =

(# of factor means-1)*( # of block means-1)

4.2.2.2 Example: Y = Rating of a restaurant’s service

Factor = 4 Restaurants

Block = all restaurants reviewed by same 6 raters (n = ____ )

MSA = sample factor variability = 595.8

MSE = sample error variability = 14.986

· Null hypothesis: μ1 = μ2 = μ3 = μ4 (average value of ______ is the same for all ______)

· Alternative: at least two are different

· Test Statistic: MSA/MSE =

· Rejection Region: Reject Ho if F > F table with

· Numerator degrees of freedom = ______ and denominator d.f. = _____

· F-Table = ______

· Conclusion: We can (not) say that the average ______differs for at least two ______

4.2.3 Two Factors – Interaction or combination effects

4.2.3.1 Test Template:

· Null hypothesis: (no interaction) difference in average value of Y between any two levels of factor one does not depend on the level of factor two

· Alternative: (interaction) difference in average value of Y between any two levels of factor one does depend on the level of factor two

· Test Statistic: Compares variation of interaction to variation of random data

Among-Group variation to within-group variation

· Rejection Region: Above ratio is large (F ratio) > F table

Two degrees of freedom:

numerator d.f. = product of factor 1 and 2 d.f.

denominator d.f. = n – number of combinations of factor 1 and 2

· Conclusion: we can (not) say that the difference in average value of Y between any two levels of factor one does depend on the level of factor two.
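The idea of "no interaction" can be made concrete as a difference of differences between cell averages. The sketch below uses made-up cell averages for a hypothetical 2 x 2 layout (they do not come from any example in these notes), and the function name `interaction_contrast` is an assumption for illustration.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 table of cell averages.

    cell_means[i][j] is the average of Y at level i of factor one and
    level j of factor two.
    """
    # Effect of factor one (level 1 minus level 0) at each level of factor two
    effect_at_b0 = cell_means[1][0] - cell_means[0][0]
    effect_at_b1 = cell_means[1][1] - cell_means[0][1]
    return effect_at_b1 - effect_at_b0


# Hypothetical averages: factor one adds 10 regardless of factor two
parallel = [[20.0, 30.0],
            [30.0, 40.0]]
# Hypothetical averages: factor one adds 10 at b0 but 40 at b1
crossing = [[20.0, 30.0],
            [30.0, 70.0]]

print(interaction_contrast(parallel))   # 0.0
print(interaction_contrast(crossing))   # 30.0
```

A contrast near zero is what the null hypothesis describes. Whether a nonzero sample value is large relative to random variation still has to be judged with the F test.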

4.2.3.2 Example: Y = length of a ball bearing's life

Factor 1 = heat treatment (high or low)

Factor 2 = ring osculation (high or low)

Obtain samples of size 2 from each combination (n = ____ )

MSAB = sample interaction variability = 3280.5

MSE = sample error variability = 61

· Null hypothesis: (no interaction) difference in average value of ______between any two levels of ______does not depend on the level of ______

· Alternative: (interaction) difference in average value of ______between any two levels of ______does depend on the level of ______

· Test Statistic: Compares variation of interaction to variation of random data

F= MSAB / MSE =

· Rejection Region: Above ratio is large (F ratio) > F table

Two degrees of freedom:

numerator d.f. = product of factor 1 and 2 d.f. =

denominator d.f. = n – number of combinations of factor 1 and 2 =

· Conclusion: we can (not) say that the difference in average value of ______between any two levels of ______does depend on the level of ______.

4.2.4 One of the two factors – Completely Randomized Design or Independent Sample Study – NO SIGNIFICANT INTERACTION

4.2.4.1 Test Template:

same as in the one-factor test but

divisor d.f. = n – (# of levels of factor 1)*(# of levels of factor 2)

4.2.4.2 Example: Y = rating of a photographic plate

Factor A = 2 levels of development strength,

Factor B = 2 levels of development time (10 and 14 minutes)

Randomly assign 4 plates to each of the 4 combinations

MSA = sample variability of factor A (strength) = 1.5625

MSB = sample variability of factor B (time) = 56.5625

MSE = sample error variability = 2.229

(no interaction was found – testing time effect)

· Null hypothesis: μ1 = μ2 (average value of ______ is the same for all ______)

· Alternative: at least two are different

· Test Statistic: MSB/MSE =

· Rejection Region: Reject Ho if F > F table with

· Numerator degrees of freedom = ______ and denominator d.f. = _____

· F-Table = ______

· Conclusion: We can (not) say that the average ______differs for at least two ______

4.3 Tests of Assumptions

4.3.1 Equal Variance –

4.3.1.1 Test Template:

· Null hypothesis: variation of Y is the same for all levels of the factor

· Alternative: at least two are different

· Compute the absolute difference between each value in a group and the median of the group

Test Statistic and rejection region: same as for the factor tests

· Conclusion: We can (not) say the variation of Y differs for at least two levels of the factor.
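A minimal sketch of the modified Levene computation described above, using only the Python standard library. The data are made up for illustration, and the helper names `abs_dev_from_median` and `one_way_f` are assumptions, not NCSS functions.

```python
from statistics import mean, median


def abs_dev_from_median(groups):
    """Replace each value by its absolute difference from its group median."""
    return [[abs(y - median(g)) for y in g] for g in groups]


def one_way_f(groups):
    """One-factor F ratio: among-group mean square over within-group mean square."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = mean(y for g in groups for y in g)
    ss_among = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_among / (k - 1)) / (ss_within / (n - k))


# Hypothetical data: three groups with the same center but different spread
data = [
    [10.0, 11.0, 12.0, 13.0, 14.0],
    [8.0, 10.0, 12.0, 14.0, 16.0],
    [2.0, 7.0, 12.0, 17.0, 22.0],
]
deviations = abs_dev_from_median(data)
f_levene = one_way_f(deviations)
print(deviations)
print(round(f_levene, 3))
```

The F ratio computed on the absolute differences is compared with the F table in the same way as for the factor tests, with (number of groups - 1) and (n - number of groups) degrees of freedom.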

4.3.1.2 Example: Y = tensile strength of a product

Factor = 4 Suppliers

Obtain samples of size 5 from each supplier (n = ____ )

MSDifference = sample factor variability = 0.59

MSE = sample error variability = 2.2853

· Null hypothesis: σ₁² = σ₂² = σ₃² = σ₄² (variability of ______ is the same for all ______)

· Alternative: at least two are different

· Test Statistic: MSDifference/MSE =

· Rejection Region: Reject Ho if F > F table with

· Numerator degrees of freedom = ______ and denominator d.f. = _____

· F-Table = ______

· Conclusion: We can (not) say that the variability of ______differs for at least two ______

4.3.2 Normality –

4.3.2.1 Test Template:

· Null hypothesis: distribution of Y is normal for all levels of the factor

· Alternative: at least one is not normal

Test Statistic and rejection region: use the normality tests in NCSS; if the p-value is less than alpha, reject normality.

· Conclusion: We can (not) say the distribution of Y is not normal for at least one level of the factor.

4.3.2.2 Example: Y = tensile strength of a product

Factor = 4 Suppliers

Obtain samples of size 5 from each supplier (n = ____ )

Assumption Test                      Prob-Level
Skewness Normality of Residuals      0.605780
Kurtosis Normality of Residuals      0.548522
Omnibus Normality of Residuals       0.731126

· Null hypothesis: distribution of ______ is normally distributed for all ______

· Alternative: distribution of ______ is non-normally distributed for at least one level of ______

· Test Statistic: p-value

· Rejection region: p-value < alpha

· Conclusion: We can (not) say that the distribution of ______ is non-normally distributed for at least one level of ______

4.4 Testing the difference in means

4.4.1 Experimentwise error versus comparison error

4.4.2 Testing one factor

Use NCSS. The output will tell you which means are statistically different

Example: Y = tensile strength of a product

Factor = 4 Suppliers

Obtain samples of size 5 from each supplier

Tukey-Kramer Multiple-Comparison Test

Response: strength

Term A: supplier

Alpha=0.050 Error Term=S(A) DF=16 MSE=6.094 Critical Value=4.046122

                          Different
Group   Count   Mean      From Groups
1       5       19.52     2
4       5       21.16
3       5       22.84
2       5       24.26     1

Conclusions: We can say that the average value of (Y) ______ for (factor level) ______ differs from (factor level) ______.

The averages of (Y) for the other (factor levels) ______ are not significantly different.
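The "Different From Groups" column can be reproduced by hand from the numbers in the output above. The sketch below assumes the usual Tukey-Kramer rule, in which two means differ when their absolute difference exceeds critical value × sqrt((MSE / 2) × (1/n_i + 1/n_j)). The variable names are illustrative, and this is not NCSS code.

```python
from itertools import combinations
from math import sqrt

mse = 6.094                 # MSE from the output
critical_value = 4.046122   # Critical Value from the output
# group -> (count, mean), copied from the output
groups = {1: (5, 19.52), 4: (5, 21.16), 3: (5, 22.84), 2: (5, 24.26)}

different = []
for (g1, (n1, m1)), (g2, (n2, m2)) in combinations(groups.items(), 2):
    # Smallest difference in means that counts as significant for this pair
    critical_range = critical_value * sqrt((mse / 2) * (1 / n1 + 1 / n2))
    if abs(m1 - m2) > critical_range:
        different.append(tuple(sorted((g1, g2))))

print(different)
```

With five observations in every group, the critical range is the same for every pair (about 4.47), so only the pair with the largest gap in means, groups 1 and 2, is flagged.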

4.4.3 Same procedure works for Randomized Block and Two-factor studies without interaction.

4.5 Nonparametric tests

4.5.1 Kruskal-Wallis test

· One-factor designs

· Compares medians instead of means

· Test similar to ANOVA but does not require normality

· Using NCSS: if p-value < alpha, reject equality of medians
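For intuition about what this test computes, the Kruskal-Wallis statistic can be sketched with the standard library: rank all values together, then compare the rank sums of the groups. The data below are made up, the function names are assumptions, and no correction for ties is applied. With k groups, H is compared with a chi-square table value with k - 1 degrees of freedom (5.991 for alpha = 0.05 and 2 degrees of freedom).

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    pooled = [y for g in groups for y in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical data: three groups of three observations
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3), h > 5.991)
```

Because only ranks enter the statistic, the result does not depend on the values being normally distributed.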

4.5.2 Friedman’s Test

· Randomized block designs

· Compares medians instead of means

· Test similar to ANOVA but does not require normality

· Using NCSS: if p-value < alpha, reject equality of medians

5. NCSS

5.1 Data format: place all the values of Y in one column and let the next column(s) be the values of the factor(s).
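A minimal sketch of this layout, assuming the raw data start out as one list of Y values per factor level. The supplier labels and strength values are made up, and the column names `strength` and `supplier` are chosen only for illustration. The rows are written as comma-separated text; how that text is then loaded into NCSS is outside this sketch.

```python
import csv
import io

# Hypothetical raw data: one list of Y values per factor level
by_supplier = {
    "A": [18.5, 24.0, 17.2],
    "B": [26.3, 25.3, 24.0],
}

# One row per observation: Y in the first column, the factor in the next
rows = [(y, level) for level, values in by_supplier.items() for y in values]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["strength", "supplier"])
writer.writerows(rows)
print(buffer.getvalue())
```

For a two-factor or block design, each row would carry one more column holding the level of the second factor or the block.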

5.2 Approach

5.2.1 One factor designs

· Click on Analysis, ANOVA, One-Way ANOVA

· Choose the dependent variable and factor

· In reports, uncheck EMS report and check Tukey-Kramer Test

5.2.2 Randomized Block Designs

· Click on Analysis, ANOVA, Analysis of Variance

· Choose

o First, the dependent variable

o Second, for factor 1, the block; choose Random from the Type list

o Third, for factor 2, the factor of interest; choose Fixed from the Type list

· In reports, uncheck EMS report and check Tukey-Kramer Test

5.2.3 Two-factor designs

· Click on Analysis, ANOVA, Analysis of Variance

· Choose

o First, the dependent variable

o Second, factor 1, Type Fixed

o Third, factor 2, Type Fixed

o If interaction may exist, include the test for the two-factor interaction

· In reports, uncheck EMS report and check Tukey-Kramer Test