252grass4-051 4/13/05

Name:

Class days and time:

Please include this on what you hand in!

Graded Assignment 4

The data set GOLFBALL is in problem 11.14 of the text (9th or 10th edition) or on the CD. You must answer at least the following questions for versions A, B and C of the problem.

Only neat and legible papers with written answers in complete sentences will be read!

a) At a 5% level, is there evidence of a difference in the average distance traveled by the golf balls with different designs? Why?
c) What assumptions are necessary in (a)?
e) What golf ball design should be chosen?

Do this problem in Excel as follows.

Use columns A, B, C, D, E and F on the Excel spreadsheet for data.

In the first row of columns B, C, D and F put in Des 1, Des 2, Des 3 and Des 4. Label column E with Des 4a and column A with ‘golfer.’ Starting in cell A2, put in the letters A through J to identify the golfers – unless, of course, you want to suggest some names.

Now put in the data in columns B, C, D and F, skipping column E.

Version A

To fill column E, in cell E2 write =F2. After you press 'Enter', this cell should read '213.9'.

Use the fill handle on cell E2 to make column E identical to column F except for the heading. Do not go on unless this is true. Save your data as gdataA.xls

Use the 'Tools' pull-down menu and pick 'Data Analysis'. (If you cannot find this, use Tools and Add-Ins to put in the analysis packs.)

Pick 'ANOVA: Single Factor'. Set input range to $B$1:$E$11. Select 'New worksheet ply' and 'columns', check 'labels in first row', hit 'OK' and save your results as gresltA.xls.

Version B

In order to check for the effect of the fact that the data is blocked by golfers, repeat the analysis using 'ANOVA: Two-Factor Without Replication'. Set input range to $A$1:$E$11, check 'labels,' and save your results as gresltB.xls.

Version C

Take the last digit of your student number (if it's zero, use 10) and add 5 to it. Make sure that I know the value of this number. Go back to your original data or use the 'file' pull-down menu to open gdataA.xls.

To fill column E this time, in cell E2 write =F2+ followed by your number (for example, =F2+10 if your number is 10). Now highlight cell E2 and use the fill handle to make column E equal to column F plus your number. Do not go on unless this is true. Save your data as gdataC.xls.
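The rule for the number added in Version C can be written as a short function. This is only an illustrative sketch in Python; the function name and the sample student number are hypothetical and are not part of the assignment.

```python
def version_c_increment(student_number: str) -> int:
    """Last digit of the student number (0 counts as 10), plus 5."""
    last_digit = int(student_number.strip()[-1])
    if last_digit == 0:
        last_digit = 10
    return last_digit + 5


# Column F (Des 4) from the data set; column E is column F plus the increment.
des4 = [213.90, 231.10, 221.28, 221.53, 229.43,
        235.45, 213.54, 228.35, 214.51, 225.09]

increment = version_c_increment("0000005")  # hypothetical student number
column_e = [round(x + increment, 2) for x in des4]
print(increment, column_e[0])
```

A last digit of 5 gives an increment of 10, which is the value used for the Des 4c column in the solution.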

Run the one-way ANOVA again and save your results as gresltC.xls

Submit the data and results with your Student number. The most effective way to do this is to paste the results into a Word document and then add neat hand or typed notes. Indicate what hypotheses were tested, what the p-value was and whether, using the p-value, you would reject the null if (i) the significance level was 5% and (ii) the significance level was 10%, explaining why. You will have two answers for each of your two problems.

For your version C ANOVA do a Scheffe confidence interval and a Tukey-Kramer interval or procedure for each of the possible differences between means and report which are different at the 5% level according to each of the 2 methods. Answer as much as you can of the questions in Problem 11.14. The extra credit below may be needed for a really complete answer.

Extra Credit: Take the data from your last ANOVA and perform a Levene test on it using the third example in 252mvarex. as a pattern for your calculations. Make sure that you explain what is being tested and what you conclude. Hand in separately – this will be treated as extra credit on your next take-home exam. See below for all of this.

Extra Extra Credit: Do Bartlett and Levene tests using the example in 252mvar as your pattern. It turns out that your ANOVA has just enough columns to do this test. See below for all of this.

Data for 1st and 2nd ANOVA

gdataA

Golfer / Des 1 / Des 2 / Des 3 / Des 4a / Des 4
1 / 206.32 / 203.81 / 217.08 / 213.9 / 213.9
2 / 223.85 / 223.85 / 230.55 / 231.1 / 231.1
3 / 207.94 / 206.75 / 221.43 / 221.28 / 221.28
4 / 224.79 / 223.97 / 227.95 / 221.53 / 221.53
5 / 206.19 / 205.68 / 218.04 / 229.43 / 229.43
6 / 229.75 / 234.3 / 213.84 / 235.45 / 235.45
7 / 204.45 / 204.49 / 224.13 / 213.54 / 213.54
8 / 228.51 / 219.5 / 224.87 / 228.35 / 228.35
9 / 209.65 / 210.86 / 211.82 / 214.51 / 214.51
10 / 221.44 / 233 / 229.49 / 225.09 / 225.09

Results for 1st ANOVA

gresltA

Anova: Single Factor
SUMMARY
Groups / Count / Sum / Average / Variance
Des 1 / 10 / 2162.89 / 216.289 / 104.6483
Des 2 / 10 / 2166.21 / 216.621 / 139.6653
Des 3 / 10 / 2219.2 / 221.92 / 43.08287
Des 4a / 10 / 2234.18 / 223.418 / 60.3002
ANOVA
Source of Variation / SS / df / MS / F / P-value / F crit
Between Groups / 397.9091 / 3 / 132.6364 / 1.525886 / 0.224397 / 2.866266
Within Groups / 3129.27 / 36 / 86.92417
Total / 3527.179 / 39
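
As a cross-check on the Excel table above, the single-factor ANOVA can be recomputed by hand. The following is a minimal Python sketch, not the method the assignment asks for; the function and variable names are my own, and the critical value is simply copied from the Excel output.

```python
# Distances by design, copied from the gdataA table (Des 4a equals Des 4).
designs = {
    "Des 1": [206.32, 223.85, 207.94, 224.79, 206.19,
              229.75, 204.45, 228.51, 209.65, 221.44],
    "Des 2": [203.81, 223.85, 206.75, 223.97, 205.68,
              234.30, 204.49, 219.50, 210.86, 233.00],
    "Des 3": [217.08, 230.55, 221.43, 227.95, 218.04,
              213.84, 224.13, 224.87, 211.82, 229.49],
    "Des 4a": [213.90, 231.10, 221.28, 221.53, 229.43,
               235.45, 213.54, 228.35, 214.51, 225.09],
}


def one_way_anova(groups):
    """Return (SSB, SSW, df_between, df_within, F) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_b = len(groups) - 1
    df_w = len(all_values) - len(groups)
    f_stat = (ssb / df_b) / (ssw / df_w)
    return ssb, ssw, df_b, df_w, f_stat


ssb, ssw, df_b, df_w, f_stat = one_way_anova(list(designs.values()))
F_CRIT = 2.866266  # 5% critical value for (3, 36) df, from the Excel output
print(f"SSB={ssb:.4f} SSW={ssw:.2f} F={f_stat:.6f}")
print("reject H0 at 5%" if f_stat > F_CRIT else "do not reject H0 at 5%")
```

The computed F ratio falls below the critical value, which agrees with the p-value of 0.224397 being above 5%.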


Results for 2nd ANOVA

gresltB

Anova: Two-Factor Without Replication
SUMMARY / Count / Sum / Average / Variance
1 / 4 / 841.11 / 210.2775 / 38.96229
2 / 4 / 909.35 / 227.3375 / 16.26729
3 / 4 / 857.4 / 214.35 / 65.66647
4 / 4 / 898.24 / 224.56 / 7.024667
5 / 4 / 859.34 / 214.835 / 127.2787
6 / 4 / 913.34 / 228.335 / 99.43723
7 / 4 / 846.61 / 211.6525 / 87.47603
8 / 4 / 901.23 / 225.3075 / 17.81042
9 / 4 / 846.84 / 211.71 / 4.272733
10 / 4 / 909.02 / 227.255 / 25.50057
Des 1 / 10 / 2162.89 / 216.289 / 104.6483
Des 2 / 10 / 2166.21 / 216.621 / 139.6653
Des 3 / 10 / 2219.2 / 221.92 / 43.08287
Des 4a / 10 / 2234.18 / 223.418 / 60.3002
ANOVA
Source of Variation / SS / df / MS / F / P-value / F crit
Rows / 2058.09 / 9 / 228.6766 / 5.763988 / 0.00018 / 2.250131
Columns / 397.9091 / 3 / 132.6364 / 3.343212 / 0.033859 / 2.960351
Error / 1071.18 / 27 / 39.67334
Total / 3527.179 / 39
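
The two-factor table above splits the same total sum of squares into a golfer (rows) part, a design (columns) part and an error part. A minimal Python sketch of that split follows; the function name and the dictionary keys are my own, and it is meant as a check on the Excel output rather than a replacement for it.

```python
# Rows are golfers 1-10; columns are Des 1, Des 2, Des 3, Des 4a (from gdataA).
data = [
    [206.32, 203.81, 217.08, 213.90],
    [223.85, 223.85, 230.55, 231.10],
    [207.94, 206.75, 221.43, 221.28],
    [224.79, 223.97, 227.95, 221.53],
    [206.19, 205.68, 218.04, 229.43],
    [229.75, 234.30, 213.84, 235.45],
    [204.45, 204.49, 224.13, 213.54],
    [228.51, 219.50, 224.87, 228.35],
    [209.65, 210.86, 211.82, 214.51],
    [221.44, 233.00, 229.49, 225.09],
]


def two_way_no_replication(table):
    """Sums of squares and F ratios for a randomized block layout."""
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(row[j] for row in table) / r for j in range(c)]
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    df_rows, df_cols = r - 1, c - 1
    ms_error = ss_error / (df_rows * df_cols)
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }


result = two_way_no_replication(data)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The column sum of squares is the same as in the single-factor run. Only the error term shrinks, because the golfer-to-golfer variation has been taken out of it, and that is why the F ratio for columns rises.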

Answer: In the first ANOVA we get a p-value of .224397. Since this is above any significance level we are likely to use, we do not reject the null hypothesis that the mean distance that the golf balls go is the same for all designs. In the second ANOVA, the p-value for columns (.033859) is much lower, so we can reject the original null hypothesis at the 5% significance level. Note that there is a very significant difference between golfers (the p-value for rows is .00018). Too bad that this version completely differs from what it said in the problem.


Data and results for 3rd ANOVA

gdataC

Golfer / Des 1 / Des 2 / Des 3 / Des 4c / Des 4
1 / 206.32 / 203.81 / 217.08 / 223.9 / 213.9
2 / 223.85 / 223.85 / 230.55 / 241.1 / 231.1
3 / 207.94 / 206.75 / 221.43 / 231.28 / 221.28
4 / 224.79 / 223.97 / 227.95 / 231.53 / 221.53
5 / 206.19 / 205.68 / 218.04 / 239.43 / 229.43
6 / 229.75 / 234.3 / 213.84 / 245.45 / 235.45
7 / 204.45 / 204.49 / 224.13 / 223.54 / 213.54
8 / 228.51 / 219.5 / 224.87 / 238.35 / 228.35
9 / 209.65 / 210.86 / 211.82 / 224.51 / 214.51
10 / 221.44 / 233 / 229.49 / 235.09 / 225.09

gresltC

Anova: Single Factor
SUMMARY
Groups / Count / Sum / Average / Variance
Des 1 / 10 / 2162.89 / 216.289 / 104.6483
Des 2 / 10 / 2166.21 / 216.621 / 139.6653
Des 3 / 10 / 2219.2 / 221.92 / 43.08287
Des 4c / 10 / 2334.18 / 233.418 / 60.3002
ANOVA
Source of Variation / SS / df / MS / F / P-value / F crit
Between Groups / 1919.109 / 3 / 639.703 / 7.359323 / 0.000573 / 2.866266
Within Groups / 3129.27 / 36 / 86.92417
Total / 5048.379 / 39

The modified problem is giving us some very real differences in the average distance the various golf ball designs go. The p-value is low enough to cause a rejection of the null hypothesis at any usual significance level.
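
To see why adding a constant to one column changes the result so much, the one-way ANOVA can be rerun for several increments. The sketch below is an illustration in Python with names of my own choosing. It shows that the within-group sum of squares does not move while the between-group sum of squares grows with the increment.

```python
DES1 = [206.32, 223.85, 207.94, 224.79, 206.19,
        229.75, 204.45, 228.51, 209.65, 221.44]
DES2 = [203.81, 223.85, 206.75, 223.97, 205.68,
        234.30, 204.49, 219.50, 210.86, 233.00]
DES3 = [217.08, 230.55, 221.43, 227.95, 218.04,
        213.84, 224.13, 224.87, 211.82, 229.49]
DES4 = [213.90, 231.10, 221.28, 221.53, 229.43,
        235.45, 213.54, 228.35, 214.51, 225.09]


def one_way(groups):
    """Return (SSB, SSW, F) for a list of equal- or unequal-sized samples."""
    values = [x for g in groups for x in g]
    grand = sum(values) / len(values)
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    f_stat = (ssb / (len(groups) - 1)) / (ssw / (len(values) - len(groups)))
    return ssb, ssw, f_stat


results = {}
for inc in (0, 5, 10):
    shifted = [x + inc for x in DES4]  # Des 4 plus the chosen increment
    results[inc] = one_way([DES1, DES2, DES3, shifted])
    ssb, ssw, f_stat = results[inc]
    print(f"increment {inc:2d}: SSB={ssb:9.3f} SSW={ssw:9.3f} F={f_stat:.4f}")
```

An increment of 10 corresponds to the gresltC table above. An increment of 5 corresponds to the Minitab session later in this document, which reports F = 3.72.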


Types of contrast between means.

Individual Confidence Interval

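If we desire a single interval, we use the formula for the difference between two means when the variance is known. For example, if we want the difference between means of column 1 and column 2, $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}^{(n-m)} \, s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, where $s = \sqrt{MSW}$. Here $n$ is the total number of observations, $m$ is the number of columns, and $n_1$ and $n_2$ are the counts in the two columns being compared.

Scheffé Confidence Interval

If we desire intervals that will simultaneously be valid for a given confidence level for all possible intervals between column means, use $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm \sqrt{(m-1) F_{\alpha}^{(m-1,\,n-m)}} \; s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.

Tukey Confidence Interval

This also applies to all possible differences.

$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm q_{\alpha}^{(m,\,n-m)} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$. This gives rise to Tukey’s HSD (Honestly Significant Difference) procedure. Two sample means $\bar{x}_1$ and $\bar{x}_2$ are significantly different if $|\bar{x}_1 - \bar{x}_2|$ is greater than $q_{\alpha}^{(m,\,n-m)} \frac{s}{\sqrt{2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.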

Contrasts

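From the Excel output, $MSW = 86.92417$ and $n - m = 36$. So $s = \sqrt{86.92417} = 9.3233$. Assume $\alpha = .05$. We will need $t_{.025}^{(36)} = 2.028$, $F_{.05}^{(3,36)} = 2.866$ and $q_{.05}^{(4,36)}$. The table says $q_{.05}^{(4,30)} \approx 3.85$ and $q_{.05}^{(4,40)} \approx 3.79$. Since 36 is about halfway between 30 and 40, take a halfway point between the two table values and say $q_{.05}^{(4,36)} \approx 3.82$. Every column has 10 observations, so $\sqrt{\frac{1}{10} + \frac{1}{10}} = \sqrt{.2}$ in every comparison and the error parts are $2.028(9.3233)\sqrt{.2} = 8.46$ (individual), $\sqrt{3(2.866)}\,(9.3233)\sqrt{.2} = 12.23$ (Scheffe) and $3.82\frac{9.3233}{\sqrt{2}}\sqrt{.2} = 11.26$ (Tukey). The contrasts follow.

From the Excel output, $\bar{x}_1 - \bar{x}_2 = 216.289 - 216.621 = -0.332$.

Individual: $\mu_1 - \mu_2 = -0.332 \pm 8.46$ ns

Scheffe: $\mu_1 - \mu_2 = -0.332 \pm 12.23$ ns

Tukey: $\mu_1 - \mu_2 = -0.332 \pm 11.26$ ns

From the Excel output, $\bar{x}_1 - \bar{x}_3 = 216.289 - 221.92 = -5.631$.

Individual: $\mu_1 - \mu_3 = -5.631 \pm 8.46$ ns

Scheffe: $\mu_1 - \mu_3 = -5.631 \pm 12.23$ ns

Tukey: $\mu_1 - \mu_3 = -5.631 \pm 11.26$ ns

From the Excel output, $\bar{x}_1 - \bar{x}_4 = 216.289 - 233.418 = -17.129$.

Individual: $\mu_1 - \mu_4 = -17.129 \pm 8.46$ s

Scheffe: $\mu_1 - \mu_4 = -17.129 \pm 12.23$ s

Tukey: $\mu_1 - \mu_4 = -17.129 \pm 11.26$ s

From the Excel output, $\bar{x}_2 - \bar{x}_3 = 216.621 - 221.92 = -5.299$.

Individual: $\mu_2 - \mu_3 = -5.299 \pm 8.46$ ns

Scheffe: $\mu_2 - \mu_3 = -5.299 \pm 12.23$ ns

Tukey: $\mu_2 - \mu_3 = -5.299 \pm 11.26$ ns

From the Excel output, $\bar{x}_2 - \bar{x}_4 = 216.621 - 233.418 = -16.797$.

Individual: $\mu_2 - \mu_4 = -16.797 \pm 8.46$ s

Scheffe: $\mu_2 - \mu_4 = -16.797 \pm 12.23$ s

Tukey: $\mu_2 - \mu_4 = -16.797 \pm 11.26$ s

From the Excel output, $\bar{x}_3 - \bar{x}_4 = 221.92 - 233.418 = -11.498$.

Individual: $\mu_3 - \mu_4 = -11.498 \pm 8.46$ s

Scheffe: $\mu_3 - \mu_4 = -11.498 \pm 12.23$ ns

Tukey: $\mu_3 - \mu_4 = -11.498 \pm 11.26$ s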

Conclusion: I have included individual confidence intervals here for completeness. The analysis of variance definitely tells us that the means are not the same, regardless of the significance level we might want to use, because the p-value is small. If we compare the differences in sample means we find that there is no significant difference between the means for the first 3 designs, but that most of the intervals show design 4 to be superior. The intervals are labeled ‘ns’ for not significant and ‘s’ for significant depending on whether the error part of the interval is larger or smaller than the absolute difference between sample means.
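
The ‘s’ and ‘ns’ labels can be checked mechanically by comparing each absolute difference in sample means with the error part of each interval. The Python sketch below does this for the version C output. The means, MSW and F critical value come from the gresltC table. The t value (2.028) and the interpolated q value (3.82) are table values that I am assuming here, and the variable names are my own.

```python
from itertools import combinations
from math import sqrt

means = {1: 216.289, 2: 216.621, 3: 221.92, 4: 233.418}  # gresltC averages
MSW = 86.92417       # within-groups mean square from gresltC
N_PER_GROUP = 10     # observations in every column
M = 4                # number of columns

T_CRIT = 2.028       # assumed t table value, 36 df, 2.5% in each tail
F_CRIT = 2.866266    # F crit for (3, 36) df, from the Excel output
Q_CRIT = 3.82        # assumed value, interpolated between 30 and 40 df

s = sqrt(MSW)
se = s * sqrt(1 / N_PER_GROUP + 1 / N_PER_GROUP)

# Error part of each kind of interval (identical for every pair here).
margins = {
    "Individual": T_CRIT * se,
    "Scheffe": sqrt((M - 1) * F_CRIT) * se,
    "Tukey": Q_CRIT * se / sqrt(2),
}

results = {}
for i, j in combinations(sorted(means), 2):
    diff = means[i] - means[j]
    results[(i, j)] = {
        name: "s" if abs(diff) > margin else "ns"
        for name, margin in margins.items()
    }
    print(f"Des {i} - Des {j}: diff = {diff:8.3f}  {results[(i, j)]}")
```

Because the Scheffe margin is the widest of the three, it is the only method that fails to separate design 3 from design 4.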

Extra Credit: Take the data from your last ANOVA and perform a Levene test on it using the third example in 252mvarex. as a pattern for your calculations using Minitab. Make sure that you explain what is being tested and what you conclude.

To do this copy your data into rows 1-10 of columns 1-5. Remember that your column labels should be written in above the columns. Just to make sure that you are in the right place, print out your data and run a one-way ANOVA using:

print c1-c5

AOVOneway c2-c5

The test is simply:

vartest c2-c5;

unstacked.

Don’t give me results without explaining them.

————— 4/5/2005 11:23:01 PM ————————————————————

Welcome to Minitab, press F1 for help.

MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\2gr4-051C.MTW".

Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My

Documents\Minitab\2gr4-051C.MTW'

Worksheet was saved on Tue Apr 05 2005

Results for: 2gr4-051C.MTW

MTB > describe c2-c5

Descriptive Statistics: Des 1, Des 2, Des 3, Des 4

Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3

Des 1 10 0 216.29 3.23 10.23 204.45 206.29 215.55 225.72

Des 2 10 0 216.62 3.74 11.82 203.81 205.38 215.18 226.23

Des 3 10 0 221.92 2.08 6.56 211.82 216.27 222.78 228.34

Des 4 10 0 228.42 2.46 7.77 218.54 219.36 228.31 234.85

Variable Maximum

Des 1 229.75

Des 2 234.30

Des 3 230.55

Des 4 240.45

MTB > AOVOneway c2-c5.

One-way ANOVA: Des 1, Des 2, Des 3, Des 4

Source DF SS MS F P

Factor 3 971.0 323.7 3.72 0.020 The Error line is identical to our previous ANOVA; the Factor line differs because this worksheet adds 5, not 10, to Des 4.

Error 36 3129.3 86.9

Total 39 4100.3

S = 9.323 R-Sq = 23.68% R-Sq(adj) = 17.32%

MTB > vartest c2-c5;

SUBC> unstacked.

Test for Equal Variances: Des 1, Des 2, Des 3, Des 4

95% Bonferroni confidence intervals for standard deviations

N Lower StDev Upper

Des 1 10 6.40248 10.2298 22.6233

Des 2 10 7.39650 11.8180 26.1357

Des 3 10 4.10804 6.5638 14.5159

Des 4 10 4.86006 7.7653 17.1731

Bartlett's Test (normal distribution)

Test statistic = 3.51, p-value = 0.320

Levene's Test (any continuous distribution)

Test statistic = 3.78, p-value = 0.019


We have very interesting results.
If we are justified in using ANOVA, then the Bartlett test should be the correct one to use and should be more powerful than the Levene test. This is not what seems to be happening. The Levene test, with a p-value of 1.9%, which is below 5%, rejects the null hypothesis of equal variances. The Bartlett test, with a much higher p-value of 32%, does not.
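
Both test statistics can be recomputed directly from the data. The Python sketch below is an illustration with function names of my own, not the Minitab procedure itself. The Levene statistic is computed as a one-way ANOVA on absolute deviations from each column's median, which reproduces the 3.78 reported above. The chi-square critical value of 7.815 (3 degrees of freedom, 5%) is a table value I am assuming, and the F critical value is copied from the Excel ANOVA output. Adding a constant to Des 4 does not change either statistic, so the gdataC values are used.

```python
from math import log
from statistics import median, variance

groups = [
    [206.32, 223.85, 207.94, 224.79, 206.19,
     229.75, 204.45, 228.51, 209.65, 221.44],   # Des 1
    [203.81, 223.85, 206.75, 223.97, 205.68,
     234.30, 204.49, 219.50, 210.86, 233.00],   # Des 2
    [217.08, 230.55, 221.43, 227.95, 218.04,
     213.84, 224.13, 224.87, 211.82, 229.49],   # Des 3
    [223.90, 241.10, 231.28, 231.53, 239.43,
     245.45, 223.54, 238.35, 224.51, 235.09],   # Des 4c
]


def levene_median(samples):
    """One-way ANOVA F ratio on absolute deviations from group medians."""
    devs = [[abs(x - median(g)) for x in g] for g in samples]
    values = [d for g in devs for d in g]
    grand = sum(values) / len(values)
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in devs)
    ssw = sum((d - sum(g) / len(g)) ** 2 for g in devs for d in g)
    return (ssb / (len(devs) - 1)) / (ssw / (len(values) - len(devs)))


def bartlett(samples):
    """Bartlett's chi-square statistic with the usual correction factor."""
    k = len(samples)
    n_total = sum(len(g) for g in samples)
    variances = [variance(g) for g in samples]
    pooled = sum((len(g) - 1) * v for g, v in zip(samples, variances)) / (n_total - k)
    numerator = (n_total - k) * log(pooled) - sum(
        (len(g) - 1) * log(v) for g, v in zip(samples, variances)
    )
    correction = 1 + (sum(1 / (len(g) - 1) for g in samples) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction


levene_stat = levene_median(groups)
bartlett_stat = bartlett(groups)
F_CRIT = 2.866266    # (3, 36) df at 5%, from the Excel output
CHI2_CRIT = 7.815    # assumed table value, 3 df at 5%
print(f"Levene   = {levene_stat:.2f} (reject equal variances: {levene_stat > F_CRIT})")
print(f"Bartlett = {bartlett_stat:.2f} (reject equal variances: {bartlett_stat > CHI2_CRIT})")
```

In both tests the null hypothesis is that the four designs have equal variances, so the two decisions printed here disagree in the same way as the Minitab p-values.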