Random and Mixed Effects ANOVA

A classification variable in ANOVA may be either “fixed” or “random.” The meanings of “fixed” and “random” are the same as they were when we discussed the distinction between regression and correlation analysis. With a fixed variable we treat the observed values of the variable as the entire population of interest. Another way to state this is to note that the sampling fraction is one. The sampling fraction is the number of values in the sample divided by the number of values in the population.

Suppose that one of the classification variables in which I am interested is the diagnosis given to a patient. There are three levels of this variable, 1 (melancholic depression), 2 (postpartum depression), and 3 (seasonal affective disorder). Since I consider these values (1, 2, and 3) the entire population of interest, the variable is fixed.

Suppose that a second classification variable is dose of experimental therapeutic drug. The population of values of interest ranges from 0 units to 100 units. I randomly chose five levels from a uniform population that ranges from 0 to 100, using this SAS code:

Data Sample; Do Value=1 To 5;Dose=round(100*Uniform(0)); Output; End; run;

Proc Print; run;

quit;

Obs Value Dose

1 1 12

2 2 23

3 3 54

4 4 64

5 5 98

In my research, I shall use the values 12, 23, 54, 64, and 98 units of the drug. There is an uncountably large number of possible values between 0 and 100, so my sampling fraction is 5/∞ = 0. Put another way, dose of drug is a random effects variable.
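Because Uniform(0) takes its seed from the system clock, each run of the code above draws a different set of doses. If you want a draw that can be reproduced, one option is sketched below. It assumes the newer RAND function with CALL STREAMINIT; the seed 12345 and the data set name SampleSeeded are arbitrary choices of mine, and this generator will not reproduce the five doses shown above.

```sas
* Reproducible version of the dose sampling: a fixed seed gives the same draw on every run;
data SampleSeeded;
  call streaminit(12345);              /* arbitrary seed, chosen for illustration */
  do Value = 1 to 5;
    Dose = round(100*rand('uniform')); /* uniform on (0,1), scaled to 0-100 units */
    output;
  end;
run;

proc print data=SampleSeeded; run;
```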

The Group x Dose ANOVA here will be “mixed effects,” because there is a mixture of fixed and random effects. When calculating the F ratios, we need to consider the expected values for the mean squares in both numerator and denominator. We want the denominator (error term) to have an expected mean square that contains everything in the numerator except the effect being tested.

Howell (Statistical Methods for Psychology, 7th edition, page 433) shows the expected values for the mean squares. They are:

Main effect of Group (fixed): Group, Interaction, Error

Main effect of Dose (random): Dose, Error

Group x Dose Interaction: Interaction, Error

Within Cells Error (MSE): Error
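In symbols, the components listed above are usually written as shown below. This is a sketch in conventional notation rather than a quotation from Howell. It assumes a balanced design with n scores per cell, a levels of Group, and b levels of Dose. The symbol \theta^2_\alpha stands for the fixed-effect component for Group, and the \sigma^2 terms are variance components.

```latex
\begin{align*}
E(MS_{Group}) &= \sigma^2_e + n\,\sigma^2_{\alpha\beta} + nb\,\theta^2_\alpha \\
E(MS_{Dose}) &= \sigma^2_e + na\,\sigma^2_\beta \\
E(MS_{Group \times Dose}) &= \sigma^2_e + n\,\sigma^2_{\alpha\beta} \\
E(MS_{Error}) &= \sigma^2_e
\end{align*}
```

Reading down the list, the interaction mean square contains everything in the Group mean square except the Group component, and the error mean square contains everything in the Dose and interaction mean squares except the component being tested.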

The F for the main effect of group will be $F = MS_{Group} / MS_{Group \times Dose}$. Under the null, group has zero effect, and the expected value of F is (0 + interaction + error)/(interaction + error) = 1. If group has an effect, the expected value of F > 1.

The F for the main effect of dose will be $F = MS_{Dose} / MS_{Error}$. Under the null, dose has no effect, and the expected value of F is (0 + error)/error = 1. If dose has an effect, the expected value of F > 1.

The F for the Group x Dose interaction will be $F = MS_{Group \times Dose} / MS_{Error}$. Under the null, the interaction has no effect, and the expected value of F is (0 + error)/error = 1. If the interaction has an effect, the expected value of F > 1.

You can use the TEST statement in PROC GLM to construct the appropriate F tests.

An Example

Download the Excel file ANOVA-MixedEffects.xls, available at http://core.ecu.edu/psyc/wuenschk/StatData/StatData.htm .

Bring it into SAS. If you do not know how to do this, read my document “Excel to SAS.”

Run this code:

proc glm; class group dose; model score = group|dose / ss3;

Test H = group E = group*dose;

title 'Mixed Effects ANOVA: Group is fixed, dose is random'; run;

------

Mixed Effects ANOVA: Group is fixed, dose is random

The GLM Procedure

Dependent Variable: Score

Source DF Sum of Squares Mean Square F Value Pr > F

Model 14 978.986667 69.927619 15.20 <.0001

Error 60 276.000000 4.600000

Corrected Total 74 1254.986667

R-Square Coeff Var Root MSE Score Mean

0.780077 28.02388 2.144761 7.653333

Source DF Type III SS Mean Square F Value Pr > F

Group 2 246.7466667 123.3733333 26.82 <.0001

Dose 4 612.5866667 153.1466667 33.29 <.0001

Group*Dose 8 119.6533333 14.9566667 3.25 0.0039

Tests of Hypotheses Using the Type III MS for Group*Dose as an Error Term

Source DF Type III SS Mean Square F Value Pr > F

Group 2 246.7466667 123.3733333 8.25 0.0114

The appropriate F statistics are:

Group: F(2, 8) = 8.25

Dose: F(4, 60) = 33.29

Group x Dose: F(8, 60) = 3.25.
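These F ratios can be recomputed by hand from the Type III mean squares in the output above. The data step below is a minimal sketch of that arithmetic. The data set name FCheck and the variable names are my own, chosen only for illustration. PROBF returns the cumulative F distribution, so one minus that value is the upper-tail p.

```sas
* Recompute the mixed-model F ratios from the Type III mean squares printed above;
data FCheck;
  MS_Group = 123.3733333;  /* df = 2  */
  MS_Dose  = 153.1466667;  /* df = 4  */
  MS_GxD   = 14.9566667;   /* df = 8  */
  MS_Error = 4.6;          /* df = 60 */

  /* Fixed effect (Group) is tested against the interaction mean square */
  F_Group = MS_Group / MS_GxD;
  p_Group = 1 - probf(F_Group, 2, 8);

  /* Random effect (Dose) and the interaction are tested against within-cells error */
  F_Dose = MS_Dose / MS_Error;
  p_Dose = 1 - probf(F_Dose, 4, 60);
  F_GxD  = MS_GxD / MS_Error;
  p_GxD  = 1 - probf(F_GxD, 8, 60);
run;

proc print data=FCheck noobs;
  var F_Group p_Group F_Dose p_Dose F_GxD p_GxD;
run;
```

The printed F values should agree with the three listed above.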

Random Command

GLM has a RANDOM statement that can be used to identify random effects. Unfortunately, it does not result in the proper selection of error terms for a mixed model.

proc glm; class group dose; model score = group|dose / ss3; Random dose group*dose / Test;

title 'Mixed Effects ANOVA: Group is fixed, dose is random'; run;

Tests of Hypotheses for Mixed Model Analysis of Variance

Dependent Variable: Score Score

Source DF Type III SS Mean Square F Value Pr > F

Group 2 246.746667 123.373333 8.25 0.0114

Dose 4 612.586667 153.146667 10.24 0.0031

Error 8 119.653333 14.956667

Error: MS(Group*Dose)

Source DF Type III SS Mean Square F Value Pr > F

Group*Dose 8 119.653333 14.956667 3.25 0.0039

Error: MS(Error) 60 276.000000 4.600000

Notice that SAS has used the interaction MS as the error term for both main effects. It should have used it only for the main effect of group. SPSS UNIANOVA has the same problem. Howell (7th ed.) notes this in footnote 2 on page 434.

Power Considerations

Interaction mean squares typically have far fewer degrees of freedom than do error mean squares. This can cost one considerable power when an interaction mean square is used as the denominator of an F ratio. Howell (5th edition, page 445) suggested one possible way around this problem. If the interaction effect has a large p value (.25 or more), drop it from the model. This will result in the interaction SS and the interaction df being pooled together with the error SS and the error df. The resulting pooled error term, $MS_{pooled} = (SS_{Group \times Dose} + SS_{Error}) / (df_{Group \times Dose} + df_{Error})$, is then used as the denominator for testing both main effects. For the data used above, the interaction was significant, so this would not be appropriate. If the interaction had not been even close to significant, this code would produce the appropriate analysis:

proc glm; class group dose; model score = group dose / ss3;

title 'Main Effects Only, Interaction Pooled With Within-Cells Error'; run;
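To show what pooling does to the error term, here is the arithmetic with the sums of squares from the example output. This is only an illustration of the computation. As noted above, the interaction in these data was significant, so pooling would not actually be appropriate here.

```latex
\begin{align*}
SS_{pooled} &= SS_{Group \times Dose} + SS_{Error} = 119.6533 + 276.0000 = 395.6533 \\
df_{pooled} &= df_{Group \times Dose} + df_{Error} = 8 + 60 = 68 \\
MS_{pooled} &= \frac{SS_{pooled}}{df_{pooled}} = \frac{395.6533}{68} \approx 5.82
\end{align*}
```

The pooled term has 68 degrees of freedom, compared with 8 for the interaction mean square. That gain in denominator df is the source of the added power.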

Subjects – the Hidden Random Effect

We pretend that we have randomly sampled subjects from the population to which we wish to generalize our results – or we restrict our generalizations to that abstract population for which our sample could be considered random. Accordingly, subjects is a random variable. In the typical ANOVA, independent samples and fixed effects, subjects is lurking there as a random effect. What we call “error” is simply the effect of subjects, which is nested within the cells. ANOVA is only necessary when we have at least one random effect (typically subjects) and we wish to generalize our results to the entire population of subjects from which we randomly sampled. If we were to consider subjects to be a fixed variable, then we would have the entire population and would not need ANOVA – the means, standard deviations, etc. computed with our data would be parameters, not statistics, and there would be no need for inferential statistics.

Nested and Crossed Factors

Suppose one factor was Households and another was Neighborhoods. Households would be nested within Neighborhoods – each household is in only one neighborhood. If you know the identity of the household, you also know the identity of the neighborhood.

Neighborhood 1 / Neighborhood 2 / Neighborhood 3
H1 / H4 / H7
H2 / H5 / H8
H3 / H6 / H9

Now suppose that one factor is Teachers, the other is Schools, and each teacher taught at each of the three schools. Teachers and Schools are crossed.

School 1 / School 2 / School 3
T1 / T1 / T1
T2 / T2 / T2
T3 / T3 / T3
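If you are unsure whether two factors in a data set are nested or crossed, cross-tabulating them will show you. The sketch below builds two small made-up data sets that mirror the layouts above. The data set names Nested and Crossed and the numeric codes are hypothetical, used only for illustration.

```sas
* Households nested within neighborhoods: households 1-3, 4-6, and 7-9;
data Nested;
  do Neighborhood = 1 to 3;
    do h = 1 to 3;
      Household = 3*(Neighborhood - 1) + h;
      output;
    end;
  end;
  drop h;
run;

* Teachers crossed with schools: every teacher appears at every school;
data Crossed;
  do School = 1 to 3;
    do Teacher = 1 to 3;
      output;
    end;
  end;
run;

proc freq data=Nested;
  tables Household*Neighborhood / norow nocol nopercent;
run;

proc freq data=Crossed;
  tables Teacher*School / norow nocol nopercent;
run;
```

In the nested table each household row has a nonzero count in only one neighborhood column. In the crossed table every teacher-by-school cell is filled.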

Between Subjects (Independent Samples) Designs

The subjects factor is nested within the grouping factor(s).

Group 1 / Group 2 / Group 3
S1 / S4 / S7
S2 / S5 / S8
S3 / S6 / S9

Within Subjects (Repeated Measures, Related Samples, Randomized Blocks) Designs

With this design, the subjects (or blocks or plots) factor is crossed with the other factor.

Condition 1 / Condition 2 / Condition 3
S1 / S1 / S1
S2 / S2 / S2
S3 / S3 / S3

Omega Squared

If you want to use ω² with data from a mixed-effects or random-effects ANOVA, you will need to read pages 438-440 in Howell (7th ed.). After reading that, you just might decide that η² is adequate.

Karl L. Wuensch, Dept. of Psychology, East Carolina Univ., Greenville, NC USA

19. December 2010