R. Mead (1977, The design of experiments, Cambridge University Press) describes a randomized complete block experiment with rice in which ten different spacing treatments were compared. The ten spacings were all possible pairings of four inter-seedling distances (15 cm, 20 cm, 24 cm and 30 cm).
Table 1: Rice spacing data (Mead, 1977, p. 13)

                            Block
Spacing            I       II      III     IV
30 cm × 30 cm     5.95    5.30    6.50    6.35
30 cm × 24 cm     7.10    6.45    6.60    5.75
30 cm × 20 cm     7.00    6.50    6.35    8.90
30 cm × 15 cm     8.10    5.50    6.60    7.50
24 cm × 24 cm     8.85    7.65    7.00    7.90
24 cm × 20 cm     7.65    6.90    8.25    8.30
24 cm × 15 cm     7.80    6.75    8.20    7.25
20 cm × 20 cm     8.05    6.65    8.10    8.05
20 cm × 15 cm     9.30    8.75    8.75    8.00
15 cm × 15 cm     9.35    8.10    7.60    7.75
We first analyse this experiment by an ANOVA treating spacing as a qualitative factor with ten levels.
Source       DF      Type I SS    Mean Square   F Value   Pr > F
Block         3     5.88000000     1.96000000      4.26   0.0138
Treatment     9    23.14225000     2.57136111      5.59   0.0002
Error        27    12.41375000     0.45976852
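The sums of squares above can be cross-checked outside SAS. The following is a minimal sketch in Python (not part of the original SAS analysis), assuming the balanced layout of Table 1 with one plot per spacing in each block. The names `DATA` and `rcb_anova` are introduced here for illustration only.

```python
# Rice yields from Table 1: key = (x1, x2) in cm, values = yields in blocks I-IV.
DATA = {
    (30, 30): [5.95, 5.30, 6.50, 6.35],
    (30, 24): [7.10, 6.45, 6.60, 5.75],
    (30, 20): [7.00, 6.50, 6.35, 8.90],
    (30, 15): [8.10, 5.50, 6.60, 7.50],
    (24, 24): [8.85, 7.65, 7.00, 7.90],
    (24, 20): [7.65, 6.90, 8.25, 8.30],
    (24, 15): [7.80, 6.75, 8.20, 7.25],
    (20, 20): [8.05, 6.65, 8.10, 8.05],
    (20, 15): [9.30, 8.75, 8.75, 8.00],
    (15, 15): [9.35, 8.10, 7.60, 7.75],
}


def rcb_anova(data):
    """Return {source: (df, SS)} for a balanced randomized complete block design."""
    rows = list(data.values())
    t, b = len(rows), len(rows[0])            # number of treatments and blocks
    grand_total = sum(sum(r) for r in rows)
    cf = grand_total ** 2 / (t * b)           # correction for the mean
    ss_total = sum(y * y for r in rows for y in r) - cf
    ss_trt = sum(sum(r) ** 2 for r in rows) / b - cf
    ss_block = sum(sum(col) ** 2 for col in zip(*rows)) / t - cf
    ss_error = ss_total - ss_block - ss_trt
    return {
        "Block": (b - 1, ss_block),
        "Treatment": (t - 1, ss_trt),
        "Error": ((t - 1) * (b - 1), ss_error),
    }


anova = rcb_anova(DATA)
ms_error = anova["Error"][1] / anova["Error"][0]
for source, (df, ss) in anova.items():
    ms = ss / df
    f_text = "" if source == "Error" else f"{ms / ms_error:8.2f}"
    print(f"{source:<10}{df:>3}{ss:>13.5f}{ms:>12.5f}{f_text}")
```

Run as a script, this should reproduce the Block, Treatment and Error lines of the table above, up to rounding.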
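There is clearly a significant treatment effect (F = 5.59, P = 0.0002). Because the treatment levels are quantitative, this effect can be explored by regression. The most important aspect is the area per seedling, that is, the product of the two inter-seedling distances. A linear regression of yield on area per seedling, with the remaining 8 treatment degrees of freedom assigned to lack of fit, yields the following result: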
Source        DF      Type I SS    Mean Square   F Value   Pr > F
Block          3     5.88000000     1.96000000      4.26   0.0138
Area           1    16.78629155    16.78629155     36.51   <.0001
Lack-of-fit    8     6.35595845     0.79449481      1.73   0.1372
Error         27    12.41375000     0.45976852
The lack-of-fit test is not significant (F = 1.73, P = 0.1372), so there is no evidence against a linear relationship with area. Dropping the lack-of-fit effect, the fitted linear regression, with area in cm², is
y = 9.11 – 0.00336(area)
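Because every spacing occurs exactly once in each block, the regression on area can be checked from the ten treatment means alone: the block-adjusted sums of squares are four times those computed on the means. A minimal Python sketch under that assumption follows; the variable names are illustrative and not taken from the SAS program.

```python
# Rice yields from Table 1: key = (x1, x2) in cm, values = yields in blocks I-IV.
DATA = {
    (30, 30): [5.95, 5.30, 6.50, 6.35],
    (30, 24): [7.10, 6.45, 6.60, 5.75],
    (30, 20): [7.00, 6.50, 6.35, 8.90],
    (30, 15): [8.10, 5.50, 6.60, 7.50],
    (24, 24): [8.85, 7.65, 7.00, 7.90],
    (24, 20): [7.65, 6.90, 8.25, 8.30],
    (24, 15): [7.80, 6.75, 8.20, 7.25],
    (20, 20): [8.05, 6.65, 8.10, 8.05],
    (20, 15): [9.30, 8.75, 8.75, 8.00],
    (15, 15): [9.35, 8.10, 7.60, 7.75],
}
BLOCKS = 4

areas = [x1 * x2 for x1, x2 in DATA]                  # area per seedling, cm^2
means = [sum(ys) / BLOCKS for ys in DATA.values()]    # treatment means

n = len(areas)
a_bar = sum(areas) / n
y_bar = sum(means) / n

# Centred sums of squares and cross-products on the treatment means.
sxx = sum((a - a_bar) ** 2 for a in areas)
sxy = sum((a - a_bar) * (m - y_bar) for a, m in zip(areas, means))

slope = sxy / sxx
intercept = y_bar - slope * a_bar                     # averaged over blocks

# Each mean is based on BLOCKS plots, hence the multiplier.
ss_area = BLOCKS * sxy ** 2 / sxx
ss_treatment = BLOCKS * sum((m - y_bar) ** 2 for m in means)
ss_lack_of_fit = ss_treatment - ss_area

print(f"slope       = {slope:.5f}")
print(f"intercept   = {intercept:.3f}")
print(f"SS area     = {ss_area:.5f}")
print(f"SS lack fit = {ss_lack_of_fit:.5f}")
```

The slope and the two sums of squares should agree with the ANOVA table; the intercept comes out close to the 9.11 quoted above, with any small difference attributable to rounding.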
Mead (1977, p. 322) compared the means fitted by the linear regression with the observed means (Table 2). It appears that yield is underestimated when the spacing is more nearly square. In other words, for the same allocated area, growth conditions seem to be most favourable when the area is square; the more rectangular the area, the less favourable the growth conditions.
Table 2: Observed and fitted means from linear regression on area for rice spacing data (Mead, 1977, p. 322).

Spacing          Observed mean   Fitted mean   Deviation   Shape × 1000*
30 cm × 30 cm        6.02           6.07         -0.05          0.00
30 cm × 24 cm        6.48           6.66         -0.18          6.23
30 cm × 20 cm        7.19           7.07         +0.12         20.62
30 cm × 15 cm        6.92           7.58         -0.66         60.66
24 cm × 24 cm        7.85           7.35         +0.50          0.00
24 cm × 20 cm        7.78           7.64         +0.14          4.16
24 cm × 15 cm        7.50           7.89         -0.39         27.74
20 cm × 20 cm        7.71           7.75         -0.04          0.00
20 cm × 15 cm        8.70           8.09         +0.61         10.36
15 cm × 15 cm        8.20           8.35         -0.15          0.00

* The shape index is described in the text below. The column shows 1000 × (shape − 1), which is zero for a square spacing.
To test this hypothesis formally, it is useful to compute an index reflecting the shape of the spacing. One idea is to compute the perimeter of the rectangular area defined by four adjacent hills and relate it to the minimal perimeter that can be obtained from the same area. This minimum is achieved when the area is square. If $x_1$ and $x_2$ denote the length and width of the rectangle (for the first spacing $x_1 = x_2 = 30$ cm), then the perimeter is $2(x_1 + x_2)$, while the area is $x_1 x_2$. If the rectangle were a square with area $x_1 x_2$, its side would be $\sqrt{x_1 x_2}$ and its perimeter $4\sqrt{x_1 x_2}$. Thus, shape may be assessed by the ratio of the two perimeters,
$$\text{shape} = \frac{2(x_1 + x_2)}{4\sqrt{x_1 x_2}} = \frac{x_1 + x_2}{2\sqrt{x_1 x_2}}.$$
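As a numerical check of the index, the sketch below evaluates it for the ten spacings. It is written in Python purely for illustration; `shape_index` and `SPACINGS` are hypothetical names and not part of the original SAS program.

```python
from math import sqrt


def shape_index(x1, x2):
    """Perimeter of an x1-by-x2 rectangle divided by the perimeter of the
    square with the same area; equals 1 for a square and exceeds 1 otherwise."""
    return (x1 + x2) / (2 * sqrt(x1 * x2))


# The ten spacings of the experiment (inter-seedling distances in cm).
SPACINGS = [(30, 30), (30, 24), (30, 20), (30, 15), (24, 24),
            (24, 20), (24, 15), (20, 20), (20, 15), (15, 15)]

for x1, x2 in SPACINGS:
    s = shape_index(x1, x2)
    # The last column rescales the index as 1000 * (shape - 1).
    print(f"{x1} x {x2}: shape = {s:.5f}, 1000*(shape - 1) = {1000 * (s - 1):.2f}")
```

For the most elongated spacing, 30 cm by 15 cm, the index is about 1.061, meaning the perimeter is roughly 6% longer than that of a square of the same area.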
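For a square, the index equals unity. The more elongated the rectangular area, the larger the shape index becomes. The SAS statements and Table 2 use the index minus one, which is zero for a square; this shift changes only the intercept of the fitted model, not the slope or the tests. Adding this regressor variable, we obtain the following ANOVA: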
Source        DF      Type I SS    Mean Square   F Value   Pr > F
Block          3     5.88000000     1.96000000      4.26   0.0138
Area           1    16.78629155    16.78629155     36.51   <.0001
Shape          1     2.49372217     2.49372217      5.42   0.0276
Lack-of-fit    7     3.86223628     0.55174804      1.20   0.3363
Error         27    12.41375000     0.45976852
The shape index is significant (F = 5.42, P = 0.0276), so adding it improves the model. Again, the lack of fit is non-significant (P = 0.3363), suggesting that the fit is adequate. The F-value for lack of fit has dropped from 1.73 to 1.20, although both values were non-significant. With the lack-of-fit term dropped, the fitted multiple regression model is
y = 9.40 – 0.00357(area) – 13.8(shape – 1),
where the shape index enters minus one, as computed in the SAS statements, so that the intercept 9.40 refers to a square spacing of zero area.
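The two-regressor fit can be checked in the same way as a balanced design allows: solve the normal equations on the ten treatment means and multiply the sums of squares by the number of blocks. The following Python sketch makes that assumption and uses the shape index minus one as the regressor, mirroring the SAS data step; the helper `cross` and the variable names are illustrative only.

```python
from math import sqrt

# Rice yields from Table 1: key = (x1, x2) in cm, values = yields in blocks I-IV.
DATA = {
    (30, 30): [5.95, 5.30, 6.50, 6.35],
    (30, 24): [7.10, 6.45, 6.60, 5.75],
    (30, 20): [7.00, 6.50, 6.35, 8.90],
    (30, 15): [8.10, 5.50, 6.60, 7.50],
    (24, 24): [8.85, 7.65, 7.00, 7.90],
    (24, 20): [7.65, 6.90, 8.25, 8.30],
    (24, 15): [7.80, 6.75, 8.20, 7.25],
    (20, 20): [8.05, 6.65, 8.10, 8.05],
    (20, 15): [9.30, 8.75, 8.75, 8.00],
    (15, 15): [9.35, 8.10, 7.60, 7.75],
}
BLOCKS = 4

spacings = list(DATA)
means = [sum(DATA[s]) / BLOCKS for s in spacings]
area = [x1 * x2 for x1, x2 in spacings]
# Shape index minus one, so that a square spacing has the value zero.
shape = [(x1 + x2) / (2 * sqrt(x1 * x2)) - 1 for x1, x2 in spacings]


def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Centred sum of cross-products of two equally long sequences."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))


saa, sas, sss = cross(area, area), cross(area, shape), cross(shape, shape)
say, ssy = cross(area, means), cross(shape, means)

# Solve the 2 x 2 normal equations by Cramer's rule.
det = saa * sss - sas ** 2
b_area = (say * sss - sas * ssy) / det
b_shape = (saa * ssy - sas * say) / det
intercept = mean(means) - b_area * mean(area) - b_shape * mean(shape)

# Sequential (Type I) sums of squares: area first, then shape given area.
ss_area = BLOCKS * say ** 2 / saa
ss_model = BLOCKS * (b_area * say + b_shape * ssy)
ss_shape = ss_model - ss_area

print(f"intercept = {intercept:.2f}")
print(f"b_area    = {b_area:.8f}")
print(f"b_shape   = {b_shape:.4f}")
print(f"SS shape  = {ss_shape:.5f}")
```

The coefficients and the sum of squares for shape should match the SAS output; the standard errors are not computed here because they also require the error mean square.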
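In conclusion, it should be stressed that the 95% confidence interval for the shape coefficient is rather wide (roughly –26.1 to –1.5), so the estimate is not very precise. The estimates below come from the model without the lack-of-fit term, so the error has 34 rather than 27 degrees of freedom; this is why the P-value for shape (0.0288) differs slightly from that in the ANOVA (0.0276).

                              Standard
Parameter       Estimate         Error    t Value    Pr > |t|      95% Confidence Limits
Area         -0.00357209    0.00057472      -6.22      <.0001    -0.00474007   -0.00240411
Shape       -13.82037776    6.05522213      -2.28      0.0288   -26.12606969   -1.51468583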
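The confidence limits follow the usual form, estimate plus or minus a t quantile times the standard error. As a worked check for the shape coefficient, assuming 34 error degrees of freedom (40 observations less 1 for the intercept, 3 for blocks and 2 for the regressors) and taking the 97.5% quantile of the t distribution with 34 degrees of freedom as approximately 2.032 from standard tables:

```latex
\hat{\beta}_{\text{shape}} \pm t_{0.975,\,34}\,\mathrm{SE}(\hat{\beta}_{\text{shape}})
  \approx -13.820 \pm 2.032 \times 6.055
  \approx -13.820 \pm 12.30
  \approx (-26.1,\ -1.5)
```

The half-width of about 12.3 is nearly as large as the estimate itself, which is what makes the interval wide.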
SAS statements
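data;
  /* One input line per spacing: distances x1, x2 (cm), then the yields in blocks 1-4 */
  input x1 x2 @@;
  trt = x1*1000 + x2;                    /* unique treatment label for each spacing */
  shape = (x1 + x2)/2/sqrt(x1*x2) - 1;   /* shape index minus one: zero for a square */
  shape2 = shape*1000;                   /* rescaled as in Table 2 */
  do block = 1 to 4;
    input y @@;
    output;
  end;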
datalines;
30 30 5.95 5.30 6.50 6.35
30 24 7.10 6.45 6.60 5.75
30 20 7.00 6.50 6.35 8.90
30 15 8.10 5.50 6.60 7.50
24 24 8.85 7.65 7.00 7.90
24 20 7.65 6.90 8.25 8.30
24 15 7.80 6.75 8.20 7.25
20 20 8.05 6.65 8.10 8.05
20 15 9.30 8.75 8.75 8.00
15 15 9.35 8.10 7.60 7.75
;
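/* ANOVA with spacing as a qualitative factor */
proc glm;
  class block trt;
  model y = block trt / ss1;
run;

/* Linear regression on area (x1*x2); trt fitted last gives the lack of fit */
proc glm;
  class block trt;
  model y = block x1*x2 trt / ss1;
run;

/* Regression on area only; ESTIMATE gives the intercept averaged over blocks */
proc glm;
  class block trt;
  model y = block x1*x2 / ss1 solution;
  estimate 'intercept' intercept 4 block 1 1 1 1 / divisor=4;
run;

/* Area and shape; trt fitted last gives the lack of fit */
proc glm;
  class block trt;
  model y = block x1*x2 shape trt / ss1;
run;

/* Final model: area and shape, with parameter estimates and confidence limits */
proc glm;
  class block trt;
  model y = block x1*x2 shape / ss1 solution clparm;
  estimate 'intercept' intercept 4 block 1 1 1 1 / divisor=4;
run;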