Script
ANOVA – Randomized Blocks
Slide 1
· Welcome back. In this module we present another method, known as the randomized block approach, for evaluating whether population means differ based on a particular factor. These models are sometimes called 2-Factor Models Without Replication, for reasons we shall discuss momentarily.
Slide 2
· One of the goals in many ANOVA models is to reduce the overall variability by reducing the variability that is unexplained.
· We will be more sure of our results and require fewer observations if there is less unexplained variability.
· We may also feel that more than one factor is affecting the outcome of an experiment.
· Both of these concerns are addressed by what are called multifactor designs.
· In this module, although we will be interested in only one factor that may affect the population means, we will “block” our responses to reduce the overall variability and offer a more fair comparison of the outcomes.
o Excel calls these randomized block models Two-Factor Without Replication models.
Slide 3
· In a randomized block experiment, we divide the population into subsets, or blocks, each of which possesses a common characteristic. Then for each treatment, we select one element at random from each block.
· This matching or blocking is our attempt to reduce the overall variability
· Consider again our teaching mode example where we wished to determine if different teaching modes affected the grade on a final exam.
o When we took the sample we could have been unlucky. We may have, by chance, selected students who were, quote, good in math when we sampled those taking the course by lecture, and those not so good in math when we sampled those taking the course by text reading. Because we have a gut feel that those who are good in math will, in general, do better than those who are not so good, a sample chosen completely at random might end up measuring this difference rather than the effect of teaching mode.
Slide 4
· To level the playing field so to speak, we may choose to model this situation using a randomized block approach. In a randomized block model, each element in the first group shares a common characteristic, each element in the second group has a different common characteristic, each element in the third group has still a different common characteristic, and so forth.
· So blocking “pairs things up”. We discussed this earlier when we did “matched pairs” experiments for the differences in two population means. But here, since there are more than two populations, we do not call it “pairing” anymore; we call it “blocking”.
· Thus for a randomized block model, we assume that
o The experiment itself consists of “b” blocks for each of k treatments.
o Then one observation is selected randomly and independently from each block for each treatment level. The distribution of possible outcomes for each treatment-block combination is assumed to have a normal distribution.
o And the standard deviations of each treatment-block combination, although unknown, are assumed to be equal.
Slide 5
· Let’s formalize our example of determining if different teaching modes affect the outcomes on a common final exam in statistics.
· This time the modeler actually does the following. He classifies each student taking the course in each teaching mode by the score that the student received on the math portion of the standard SAT tests. These classifications (which are the blocks) are:
o Scores of 700 or more
o Scores between 600 and 700
o Scores between 500 and 600
o Scores between 400 and 500
o Scores of less than 400
· So the experiment consists of a sample of 20 students. Four scored 700 or more: one of whom took the course by lecture, a second who took it by text reading, a third who took it by videotape, and a fourth who took it over the internet. Four more scored between 600 and 700, again with one student in each of the four teaching modes, and so forth for the remaining blocks.
o Now we get the sample treatment means by averaging the scores of the five students who took the course by lecture, the five that took it by text reading, the five that took it by videotape, and the five that took it over the internet.
o But this time we also get a set of “block” means by averaging the scores of the four students who scored over 700, the four who scored between 600 and 700, the four who scored between 500 and 600, the four who scored between 400 and 500, and the four who scored less than 400.
o And we calculate the grand mean by averaging the scores of all 20 students.
Slide 6
· Let’s look at the results generated by one such random sample of outcomes.
· Remember we have 4 treatments: Lecture, Text, Videotape, and Internet
· And we have 5 blocks based on SAT scores
o Those that scored 700 or more on the SATs
o Those that scored between 600 and 700
o Between 500 and 600
o Between 400 and 500
o And those that scored less than 400
· As we said, we begin by averaging the scores for each of the four teaching modes – these are called the treatment means.
· Then we average the scores from each of the 5 SAT levels – these are called the block means.
· And we get the average of all the scores which we call the grand mean.
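· As a rough sketch, the three kinds of averages just described can be computed in a few lines of Python. The scores below are made up purely for illustration (they are not the sample shown on the slide), and the names scores, treatment_means, block_means, and grand_mean are assumptions of this sketch. Each row is one SAT block and each column is one teaching mode.

```python
# Hypothetical exam scores, NOT the data from the slide.
# Rows: SAT blocks from highest to lowest. Columns: teaching modes.
treatments = ["Lecture", "Text", "Videotape", "Internet"]
scores = [
    [94, 86, 91, 89],  # 700 or more
    [85, 77, 80, 78],  # 600 to 700
    [76, 70, 75, 71],  # 500 to 600
    [70, 61, 68, 65],  # 400 to 500
    [62, 53, 60, 55],  # less than 400
]

k = len(treatments)  # number of treatments
b = len(scores)      # number of blocks

# Treatment means: average down each column (b scores per treatment).
treatment_means = [sum(row[j] for row in scores) / b for j in range(k)]

# Block means: average across each row (k scores per block).
block_means = [sum(row) / k for row in scores]

# Grand mean: average of all k * b scores.
grand_mean = sum(sum(row) for row in scores) / (k * b)

print("Treatment means:", treatment_means)
print("Block means:", block_means)
print("Grand mean:", grand_mean)
```

Because every block contributes exactly one score to every treatment, the grand mean is also the average of the treatment means and the average of the block means.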
Slide 7
· Now we have a revised breakdown for the sums of squares and the degrees of freedom.
· Let’s begin with the degrees of freedom
o As always the total degrees of freedom is n minus 1
o And the degrees of freedom due to treatment is the number of treatments, k, minus 1
o But the remainder of the degrees of freedom, this time, is split into two parts. Block degrees of freedom, which is the number of blocks minus 1,
o and finally the remainder of the degrees of freedom, or the degrees of freedom due to error, obtained by subtracting the treatment and block degrees of freedom from the total degrees of freedom.
· There is a similar division for the Sums of squares
o The total sum of squares calculated in the same manner as before
o The treatment sum of squares also calculated in the same manner as before.
o But the remainder this time is broken into block sum of squares, calculated in a manner similar to the treatment sum of squares, but this time viewing the blocks as treatments.
o And the sum of squares due to error, obtained by subtracting the treatment sum of squares and block sum of squares from the total sum of squares.
Slide 8
· Reviewing
· We have k treatments resulting in k-1 degrees of freedom for treatment which has a sum of squares SSTr,
· We have b blocks resulting in b-1 degrees of freedom for blocks which has a sum of squares SSBl,
· Thus we have a total of n equal to k times b observations, which have n minus 1 degrees of freedom and a total sum of squares of SST
· Subtracting
· gives the information about the error term
o subtracting the degrees of freedom for treatment and blocks from the total degrees of freedom gives the degrees of freedom due to error
o and subtracting the sum of squares for treatment and blocks from the total sums of squares gives the sum of squares due to error.
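· Written out as formulas, the breakdown just reviewed looks roughly like this. The symbols are assumptions of this sketch rather than notation taken from the slides: x_{ij} is the observation for block i and treatment j, \bar{x}_{\cdot j} is the mean of treatment j, \bar{x}_{i\cdot} is the mean of block i, and \bar{\bar{x}} is the grand mean.

```latex
\begin{align*}
SST  &= \sum_{i=1}^{b}\sum_{j=1}^{k}\left(x_{ij}-\bar{\bar{x}}\right)^{2}, & df_{T}  &= n-1 = kb-1 \\
SSTr &= b\sum_{j=1}^{k}\left(\bar{x}_{\cdot j}-\bar{\bar{x}}\right)^{2},   & df_{Tr} &= k-1 \\
SSBl &= k\sum_{i=1}^{b}\left(\bar{x}_{i\cdot}-\bar{\bar{x}}\right)^{2},    & df_{Bl} &= b-1 \\
SSE  &= SST-SSTr-SSBl,                                                     & df_{E}  &= (n-1)-(k-1)-(b-1) = (k-1)(b-1)
\end{align*}
```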
Slide 9
· As before, given the sums of squares and degrees of freedom we can calculate mean square values.
· The mean square due to treatment is the sum of squares due to treatment divided by the treatment degrees of freedom
· The mean square due to blocks is the sum of squares due to blocks divided by the block degrees of freedom
· And the mean square due to error is the sum of squares due to error divided by the degrees of freedom due to error
· Now we did these calculations to answer one question, “Can we conclude that teaching mode produces differences in the mean final exam scores?”
o As before this is an F-test performed by comparing an F-statistic derived from mean squares due to treatment over mean squares due to error to some critical F-value, which for alpha equal to .05 is F sub .05, with degrees of freedom due to treatment in the numerator and degrees of freedom due to error in the denominator.
· Now we performed this experiment by blocking, because we felt that individuals in each block perform differently.
· This, in fact, can also be tested by performing an F-test by comparing an F-statistic derived from mean squares due to blocks over mean squares due to error to some critical F-value, which for alpha equal to .05 is F sub .05, with degrees of freedom due to blocks in the numerator and degrees of freedom due to error in the denominator.
Slide 10
· For the teaching mode example
· The calculation of the sums of squares goes like this.
o The total sum of squares is found by subtracting the grand mean from each individual observation, squaring the differences, and summing them up. This turns out to be 2534.
o The sum of squares due to treatment is found by treating each of the five observations for each treatment as if they were all the mean for that treatment and doing the same thing. So we would have five 80 minus 75-squares, five 69 minus 75-squares and so forth. This gives a treatment sum of squares of 430.
o We repeat the above process for the blocks; that is, we treat each of the four observations for each block as if they were all the mean for that block. So we would have four 88 minus 75-squares, four 79 minus 75-squares, and so forth, giving a value for the sum of squares due to blocks of 1576.
o The sum of squares due to error is found by subtracting the treatment and block sums of squares from the total sum of squares: 2534 minus 430 minus 1576, giving 528.
· We now turn to the degrees of freedom
o The total degrees of freedom is n minus 1, which in this case is 20 minus 1 or 19.
o The degrees of freedom due to treatments is the number of levels of the treatment, 4, minus 1 giving 3.
o The degrees of freedom due to blocks is the number of blocks, 5, minus 1 giving 4.
o And the degrees of freedom due to error is the difference between the total degrees of freedom and those due to treatments and blocks; in this case 19 minus 3 minus 4 or 12.
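· As a quick check, the subtraction steps above can be reproduced in Python using only the totals quoted in this script. The variable names are assumptions of this sketch.

```python
# Values quoted in the script for the teaching mode example.
k = 4        # treatments (teaching modes)
b = 5        # blocks (SAT score ranges)
n = k * b    # one observation per treatment-block pair

sst = 2534   # total sum of squares
sstr = 430   # treatment sum of squares
ssbl = 1576  # block sum of squares

# The error terms are whatever is left after removing treatments and blocks.
sse = sst - sstr - ssbl

df_total = n - 1
df_treatment = k - 1
df_block = b - 1
df_error = df_total - df_treatment - df_block

print("SSE =", sse)
print("Degrees of freedom (total, treatment, block, error):",
      df_total, df_treatment, df_block, df_error)
```

Note that the error degrees of freedom always equal (k minus 1) times (b minus 1), which here is 3 times 4.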
Slide 11
· We can now do the required F-tests.
· To determine if scores are affected by teaching mode, we are doing a test of H0, all the treatment means are equal, versus Ha, that at least one differs from the others.
· We select alpha equal to .05
· And the test is: reject H0 and accept Ha, that is, conclude that there are differences based on teaching mode, if
· We get an F-statistic, calculated by mean squares due to treatment over mean squares due to error, that exceeds F sub .05, with treatment or 3 degrees of freedom in the numerator and error or 12 degrees of freedom in the denominator. From an F-table we see this value is 3.49.
· Performing the calculations we find that,
o The mean square due to treatment is 143.33 and the mean square due to error is 44.
o Thus the value of the F-statistic is 143.33 divided by 44 which is 3.26
o Since 3.26 is not greater than 3.49
o We cannot conclude that teaching mode affects the average exam scores.
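· The arithmetic for this test can be sketched in Python as follows. The sums of squares and degrees of freedom are the ones quoted in this script, the critical value 3.49 is simply copied from the F-table reading above rather than computed, and the variable names are assumptions of this sketch.

```python
# Treatment test for the teaching mode example, using values from the script.
sstr, df_treatment = 430, 3
sse, df_error = 528, 12
f_critical = 3.49  # F sub .05 with 3 and 12 degrees of freedom, as read from the table

mstr = sstr / df_treatment  # mean square due to treatment
mse = sse / df_error        # mean square due to error
f_stat = mstr / mse

reject_h0 = f_stat > f_critical

print(f"MSTr = {mstr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```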
Slide 12
· We mentioned that we could also do a test to see if there really was a difference in performance based on SAT scores.
· This is a test of H0, the block means are the same versus Ha that at least one differs from the others.
· We select alpha equal to .05
· And the test is: reject H0 and accept Ha, that is, conclude that there are differences based on SAT scores, if
· We get an F-statistic, this time calculated by mean squares due to block over mean squares due to error, that exceeds F sub .05, with block or 4 degrees of freedom in the numerator and error or 12 degrees of freedom in the denominator. From an F-table we see this value is 3.26.
· Performing the calculations we find that,
o The mean square due to blocks is 394 and the mean square due to error is 44.
o Thus the value of the F-statistic is 394 divided by 44 which is 8.95
o Since 8.95 is greater than 3.26
o We can conclude that SAT scores affect the average exam scores.
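· Putting the treatment test and the block test side by side gives a small summary table. The sketch below builds one in Python from the numbers quoted in this script; the critical values are copied from the F-table readings above, and the layout and names are assumptions of this sketch rather than the format of any particular software.

```python
# Summary table for both F-tests, using values quoted in the script.
sst, df_total = 2534, 19
sse, df_error = 528, 12
mse = sse / df_error

# (source, sum of squares, degrees of freedom, tabled critical F at alpha = .05)
factors = [
    ("Treatments (teaching mode)", 430, 3, 3.49),
    ("Blocks (SAT score)", 1576, 4, 3.26),
]

results = {}
print(f"{'Source':<28}{'SS':>6}{'df':>4}{'MS':>9}{'F':>7}{'F crit':>8}  Decision")
for name, ss, df, f_crit in factors:
    ms = ss / df          # mean square for this factor
    f_stat = ms / mse     # compare against the error mean square
    reject = f_stat > f_crit
    results[name] = (ms, f_stat, reject)
    decision = "reject H0" if reject else "do not reject H0"
    print(f"{name:<28}{ss:>6}{df:>4}{ms:>9.2f}{f_stat:>7.2f}{f_crit:>8.2f}  {decision}")

print(f"{'Error':<28}{sse:>6}{df_error:>4}{mse:>9.2f}")
print(f"{'Total':<28}{sst:>6}{df_total:>4}")
```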
Slide 13
· This analysis is easy to perform using Excel. First we enter the data systematically into a rectangular range of cells. Here the range is from cell A1 to cell E6, with labels in the first row and the first column.
· Then from DATA ANALYSIS in the TOOLS menu we select Anova: Two Factor Without Replication. Excel is saying that there are two factors here, teaching mode and SAT score. Without replication means that we only take one observation for each teaching mode-SAT score pair: one from over 700 and lecture, one from less than 400 and internet, and so forth.
· In the resulting dialogue box, we enter the whole range from A1 to E6 for the input range, we check Labels since there is a row and a column of labels, and we select a cell, A8, where we wish the output to begin. Note that you must either include exactly one row and one column of labels or leave the box unchecked and highlight only the data points from B2 to E6.