Assignments:

  1. A. In order to discover the average number of children, Fred, a young college professor at a large Midwestern university, conducted a survey in which a random sample of 1,000 students in Psychology 101 were asked how many children (including themselves) were in their family. The researcher added all the data together and divided by 1,000 and uncovered the answer – 3.5. The most recent census has found that the average number of children in a U.S. family is 2.5.
  a. Why are these two numbers different?
  b. Should Fred have designed his study differently to get the right answer?
  c. How?
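
The difference between counting families and counting children can be illustrated with a short calculation. This is only a sketch: the `family_share` distribution below is a made-up assumption for illustration, not census data. It contrasts an average in which each family counts once with an average in which a family of k children is reached k times.

```python
# Hypothetical share of families (with at least one child) having k children.
# These numbers are illustrative assumptions, not census figures.
family_share = {1: 0.30, 2: 0.35, 3: 0.20, 4: 0.10, 5: 0.05}

def mean_per_family(share):
    """Average number of children when each family is counted once."""
    return sum(k * p for k, p in share.items())

def mean_per_child(share):
    """Average family size as reported by children: a family with k
    children is reached k times, so it carries weight k."""
    total_children = sum(k * p for k, p in share.items())
    return sum(k * k * p for k, p in share.items()) / total_children

print(round(mean_per_family(family_share), 2))
print(round(mean_per_child(family_share), 2))
```

Under these assumed shares the per-child average is larger than the per-family average, and families with no children can never appear in a sample of students at all.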

B. Prenatal screening for Down syndrome is usually recommended for mothers over the age of 35. A non-invasive test is about 95% accurate. That is, if the fetus has Down syndrome, it will be detected 95% of the time. And if the fetus does not have Down’s, the test will correctly say so 80% of the time. We know that Down’s is not very common, affecting only about one in every 200 fetuses whose mothers are over age 35.

a. What is the probability that if the test says the fetus has Down’s, that the test is correct?

b. What is the probability that if the test says the fetus doesn’t have Down’s, that the test is correct?
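
Both questions are applications of Bayes' rule. The sketch below reads the problem's figures as a sensitivity of 0.95, a specificity of 0.80, and a prevalence of 1/200; the function name `predictive_values` is introduced here for illustration only.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (P(condition | positive test), P(no condition | negative test))."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

ppv, npv = predictive_values(prevalence=1 / 200, sensitivity=0.95, specificity=0.80)
print(f"P(Down syndrome | positive test)    = {ppv:.3f}")
print(f"P(no Down syndrome | negative test) = {npv:.4f}")
```

Because the condition is rare, false positives from the large unaffected group dominate the positive results, which is why the first probability is so much smaller than the test's stated accuracy.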

C. In a survey of hospitals it was found that those hospitals that had the highest proportion of female births tended to also have the fewest births of any of the hospitals in the survey. The Jones family, having already had a son, decided to boost their chances of a daughter by going to one of the hospitals that, so far this year, had the highest likelihood of female births.
a. Is this a sensible strategy?
b. If so, why? If not, why not?
c. How does this shed light on why the best performing mutual funds are usually small?
d. Should this guide our investment strategy? If so, why? If not, why not?
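
The link between small size and extreme results can be explored with an exact binomial calculation. A minimal sketch, assuming each birth is independently a girl with probability 1/2 (a simplification) and using hypothetical hospital sizes of 10, 100, and 1,000 births:

```python
from math import comb

def prob_at_least(n_births, min_girls):
    """P(at least min_girls girls among n_births), assuming each birth is
    independently a girl with probability 1/2 (a simplifying assumption)."""
    favorable = sum(comb(n_births, k) for k in range(min_girls, n_births + 1))
    return favorable / 2 ** n_births

# Chance that at least 60% of births are girls, for hospitals of different sizes
for n in (10, 100, 1000):
    print(n, prob_at_least(n, 6 * n // 10))
```

The same arithmetic applies to any proportion or average computed from few observations, which is the connection to small mutual funds in part (c).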

2. A. Find data displays in the mass media that illustrate at least two of the most common errors. You can find one display with multiple flaws, or two displays with one flaw apiece. Redo the displays correctly. Explain (i) where you found the displays, (ii) what you believe the point of the display was, (iii) what were the flaws, and (iv) what you did to fix them.

(e.g. see

B. What were the key lessons in Arbuthnot’s (1710) paper? Compare the explanations for the change in the number of christenings in 1704 with that in 1665-1666.

  1. Find one wonderful display in the mass media. Explain (i) where you found the display, (ii) what you believe the point of the display was, (iii) why you think it is wonderful.

3A. With your knowledge of improved methods of multivariate display, develop a display of the following data set:

Antibiotic
Bacteria / Penicillin / Streptomycin / Neomycin / Gram Staining
Aerobacter aerogenes / 870 / 1 / 1.6 / negative
Brucella abortus / 1 / 2 / 0.02 / negative
Brucella anthracis / 0.001 / 0.01 / 0.007 / positive
Diplococcus pneumoniae / 0.005 / 11 / 10 / positive
Escherichia coli / 100 / 0.4 / 0.1 / negative
Klebsiella pneumoniae / 850 / 1.2 / 1 / negative
Mycobacterium tuberculosis / 800 / 5 / 2 / negative
Proteus vulgaris / 3 / 0.1 / 0.1 / negative
Pseudomonas aeruginosa / 850 / 2 / 0.4 / negative
Salmonella (Eberthella) typhosa / 1 / 0.4 / 0.008 / negative
Salmonella schottmuelleri / 10 / 0.8 / 0.09 / negative
Staphylococcus albus / 0.007 / 0.1 / 0.001 / positive
Staphylococcus aureus / 0.03 / 0.03 / 0.001 / positive
Streptococcus fecalis / 1 / 1 / 0.1 / positive
Streptococcus hemolyticus / 0.001 / 14 / 10 / positive
Streptococcus viridans / 0.005 / 10 / 40 / positive

The entries of the table are the minimum inhibitory concentration (MIC) in μg/ml, a measure of the effectiveness of the antibiotic. The MIC represents the concentration of antibiotic required to prevent growth in vitro. The covariate “gram staining” describes the reaction of the bacteria to Gram staining. Gram-positive bacteria are those that are stained dark blue or violet; Gram-negative bacteria do not react that way.

  1. Smoothing problem – One might think that if life expectancy is great the murder rate cannot be. But although murder does not take a huge toll on a population perhaps it is an indicant of other life-threatening processes going on in society.

(a) Plot life expectancy as a function of murder rate, then

(b) smooth life expectancy by adding the 53h smooth to the plot. What have you learned?

(c) Make a separate plot of residuals from the smooth vs. murder rate. What has this taught you?

(d) Add a straight-line fit to the plot. Does this help us to understand things better? Or does it hide things that the smooth has told us? Explain.

STATE NAME / LIFE EXPECT. / MURDER / HSGRAD / INCOME / ILLITERACY
Alabama / 69.1 / 15.1 / 41.3 / 3624 / 2.1
Alaska / 69.3 / 11.3 / 66.7 / 6315 / 1.5
Arizona / 70.6 / 7.8 / 58.1 / 4530 / 1.8
Arkansas / 70.7 / 10.1 / 39.9 / 3378 / 1.9
California / 71.7 / 10.3 / 62.6 / 5114 / 1.1
Colorado / 72.1 / 6.8 / 63.9 / 4884 / 0.7
Connecticut / 72.5 / 3.1 / 56.0 / 5348 / 1.1
Delaware / 70.1 / 6.2 / 54.6 / 4809 / 0.9
Florida / 70.7 / 10.7 / 52.6 / 4815 / 1.3
Georgia / 68.5 / 13.9 / 40.6 / 4091 / 2.0
Hawaii / 73.6 / 6.2 / 61.9 / 4963 / 1.9
Idaho / 71.9 / 5.3 / 59.5 / 4119 / 0.6
Illinois / 70.1 / 10.3 / 52.6 / 5107 / 0.9
Indiana / 70.9 / 7.1 / 52.9 / 4458 / 0.7
Iowa / 72.6 / 2.3 / 59.0 / 4628 / 0.5
Kansas / 72.6 / 4.5 / 59.9 / 4669 / 0.6
Kentucky / 70.1 / 10.6 / 38.5 / 3712 / 1.6
Louisiana / 68.8 / 13.2 / 42.2 / 3545 / 2.8
Maine / 70.4 / 2.7 / 54.7 / 3694 / 0.7
Maryland / 70.2 / 8.5 / 52.3 / 5299 / 0.9
Massachusetts / 71.8 / 3.3 / 58.5 / 4755 / 1.1
Michigan / 70.6 / 11.1 / 52.8 / 4751 / 0.9
Minnesota / 73.0 / 2.3 / 57.6 / 4675 / 0.6
Mississippi / 68.1 / 12.5 / 41.0 / 3098 / 2.4
Missouri / 70.7 / 9.3 / 48.8 / 4254 / 0.8
Montana / 70.6 / 5.0 / 59.2 / 4347 / 0.6
Nebraska / 72.6 / 2.9 / 59.3 / 4508 / 0.6
Nevada / 69.0 / 11.5 / 65.2 / 5149 / 0.5
NewHampshire / 71.2 / 3.3 / 57.6 / 4281 / 0.7
NewJersey / 70.9 / 5.2 / 52.5 / 5237 / 1.1
NewMexico / 70.3 / 9.7 / 55.2 / 3601 / 2.2
NewYork / 70.6 / 10.9 / 52.7 / 4903 / 1.4
NorthCarolina / 69.2 / 11.1 / 38.5 / 3875 / 1.8
NorthDakota / 72.8 / 1.4 / 50.3 / 5087 / 0.8
Ohio / 70.8 / 7.4 / 53.2 / 4561 / 0.8
Oklahoma / 71.4 / 6.4 / 51.6 / 3983 / 1.1
Oregon / 72.1 / 4.2 / 60.0 / 4660 / 0.6
Pennsylvania / 70.4 / 6.1 / 50.2 / 4449 / 1.0
RhodeIsland / 71.9 / 2.4 / 46.4 / 4558 / 1.3
SouthCarolina / 68.0 / 11.6 / 37.8 / 3635 / 2.3
SouthDakota / 72.1 / 1.7 / 53.3 / 4167 / 0.5
Tennessee / 70.1 / 11.0 / 41.8 / 3821 / 1.7
Texas / 70.9 / 12.2 / 47.4 / 4188 / 2.2
Utah / 72.9 / 4.5 / 67.3 / 4022 / 0.6
Vermont / 71.6 / 5.5 / 57.1 / 3907 / 0.6
Virginia / 70.1 / 9.5 / 47.8 / 4701 / 1.4
Washington / 71.7 / 4.3 / 63.5 / 4864 / 0.6
WestVirginia / 69.5 / 6.7 / 41.6 / 3617 / 1.4
Wisconsin / 72.5 / 3.0 / 54.5 / 4468 / 0.7
Wyoming / 70.3 / 6.9 / 62.9 / 4566 / 0.6
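
For parts (b) and (c) of the smoothing problem, the 53H smooth can be sketched as running medians of 5, then running medians of 3, then hanning (a running weighted mean with weights 1/4, 1/2, 1/4). The version below is a simplified illustration: it leaves the end values unsmoothed rather than applying Tukey's end-point rules, and it uses only the first ten states of the table to keep the example short.

```python
from statistics import median

def running_median(values, span):
    """Running median of odd span; the ends are left unsmoothed
    (a simplification of Tukey's end-point rules)."""
    half = span // 2
    out = list(values)
    for i in range(half, len(values) - half):
        out[i] = median(values[i - half:i + half + 1])
    return out

def hanning(values):
    """Running weighted mean with weights 1/4, 1/2, 1/4; ends copied."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = 0.25 * values[i - 1] + 0.5 * values[i] + 0.25 * values[i + 1]
    return out

def smooth_53h(values):
    """Medians of 5, then medians of 3, then hanning."""
    return hanning(running_median(running_median(values, 5), 3))

# (murder rate, life expectancy) for Alabama through Georgia, from the table above
states = [(15.1, 69.1), (11.3, 69.3), (7.8, 70.6), (10.1, 70.7), (10.3, 71.7),
          (6.8, 72.1), (3.1, 72.5), (6.2, 70.1), (10.7, 70.7), (13.9, 68.5)]
states.sort()  # the smooth runs along the x-axis, so order by murder rate first
murder = [m for m, _ in states]
life = [e for _, e in states]
smooth = smooth_53h(life)
residuals = [y - s for y, s in zip(life, smooth)]  # for the plot in part (c)
```

Plotting `smooth` and `residuals` against `murder` with any plotting tool gives the displays asked for; the full analysis would use all 50 states.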
  1. A. Exact exponential growth – Fred and Alice were born the same year, and each began life with $500. Fred added $100 each year but kept his treasure under his mattress so he earned no interest. Alice added nothing, but earned interest at 7.5% annually. After 25 years, Fred and Alice are getting married. Who has more money? How much does each have? Alice’s cousin Charlie thinks that Fred is a paranoid loser and that Alice is cheap. He used a combined strategy and added $100 a year and obtained 7.5% interest. How much did he have after 25 years? All three continued with their strategies in the hopes of using the money to fund retirement. How much did each have at age 65?
  2. Generate accumulations for each person for 65 years
  3. Plot both series.
  4. Answer the questions.
  5. Fit a linear function to Fred’s accumulation.
  6. Based on this experiment, which retirement savings strategy works better: (a) add money regularly or (b) start early?
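
The accumulations can be generated with a simple recurrence. This sketch assumes that interest is credited once a year on the previous balance and that the $100 deposit is added afterwards; the problem does not state the timing, so a different convention would shift Charlie's figures slightly. The function name `accumulate` is introduced here for illustration.

```python
def accumulate(start, deposit, rate, years):
    """Year-end balances, index 0 = starting amount. Assumes interest is
    credited on the prior balance and the deposit is added afterwards."""
    balances = [start]
    for _ in range(years):
        balances.append(balances[-1] * (1 + rate) + deposit)
    return balances

YEARS = 65
fred = accumulate(500, 100, 0.0, YEARS)       # deposits, no interest
alice = accumulate(500, 0, 0.075, YEARS)      # interest, no deposits
charlie = accumulate(500, 100, 0.075, YEARS)  # both

for age in (25, 65):
    print(age, round(fred[age], 2), round(alice[age], 2), round(charlie[age], 2))
```

Fred's series is a straight line and Alice's is a geometric series, so plotting the three lists against age on both a linear and a logarithmic scale is one way to compare them.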
  1. In Table 2 below are a number of state statistics. Some are correct and some are made up.
  1. Through plots, correlations and regression lines discuss the relationship between the correct data and their imaginary counterparts.
  2. Compare the four NAEP scores and see if the mean NAEP score adequately represents all states.
  3. How would you characterize Gore and Bush states vis-à-vis their income and academic performance?
  4. Has this characterization changed for the 2004 election?
  5. And what about obesity (Table 3)? Include in your answer some discussion of fat blue states and thin red ones (i.e. states with large residuals).

Table 2. Correct state data on income and academic accomplishment
State / Median Income / Math-4 / Rdg-4 / Math-8 / Rdg-8 / mean NAEP / '00 election / IQ / FakeIncome
Massachusetts / $50,587 / 242 / 228 / 287 / 273 / 257 / Gore / 111 / 24059
New Hampshire / $53,549 / 243 / 228 / 286 / 271 / 257 / Bush / 102 / 18834
Vermont / $41,929 / 242 / 226 / 286 / 271 / 256 / Gore / 102 / 20049
Minnesota / $54,931 / 242 / 223 / 291 / 268 / 256 / Gore / 113 / 26979
Connecticut / $53,325 / 241 / 228 / 284 / 267 / 255 / Gore / 99 / 18287
North Dakota / $36,717 / 238 / 222 / 287 / 270 / 254 / Bush / 111 / 26457
South Dakota / $38,755 / 237 / 222 / 285 / 270 / 254 / Bush / 100 / 18226
Montana / $33,900 / 236 / 223 / 286 / 270 / 254 / Bush / 100 / 18727
Wyoming / $40,499 / 241 / 222 / 284 / 267 / 253 / Bush / 102 / 20398
Iowa / $41,827 / 238 / 223 / 284 / 268 / 253 / Gore / 109 / 23534
New Jersey / $53,266 / 239 / 225 / 281 / 268 / 253 / Gore / 103 / 21451
Virginia / $49,974 / 239 / 223 / 282 / 268 / 253 / Bush / 99 / 18202
Kansas / $42,523 / 242 / 220 / 284 / 266 / 253 / Bush / 101 / 20253
Maine / $37,654 / 238 / 224 / 282 / 268 / 253 / Gore / 99 / 19508
Colorado / $49,617 / 235 / 224 / 283 / 268 / 252 / Bush / 104 / 21608
Wisconsin / $46,351 / 237 / 221 / 284 / 266 / 252 / Gore / 105 / 22974
Ohio / $43,332 / 238 / 222 / 282 / 267 / 252 / Bush / 107 / 20299
North Carolina / $38,432 / 242 / 221 / 281 / 262 / 252 / Bush / 106 / 21218
Nebraska / $43,566 / 236 / 221 / 282 / 266 / 251 / Bush / 101 / 21278
Washington / $44,252 / 238 / 221 / 281 / 264 / 251 / Gore / 92 / 15353
Indiana / $41,581 / 238 / 220 / 281 / 265 / 251 / Bush / 105 / 22934
Missouri / $43,955 / 235 / 222 / 279 / 267 / 251 / Bush / 92 / 16854
New York / $42,432 / 236 / 222 / 280 / 265 / 251 / Gore / 90 / 16558
Delaware / $50,878 / 236 / 224 / 277 / 265 / 250 / Gore / 90 / 16062
Utah / $48,537 / 235 / 219 / 281 / 264 / 250 / Bush / 89 / 17423
Oregon / $42,704 / 236 / 218 / 281 / 264 / 250 / Gore / 100 / 20629
Idaho / $38,613 / 235 / 218 / 280 / 264 / 249 / Bush / 96 / 19376
Pennsylvania / $43,577 / 236 / 219 / 279 / 264 / 249 / Gore / 99 / 20124
Michigan / $45,335 / 236 / 219 / 276 / 264 / 249 / Gore / 99 / 18624
Illinois / $45,906 / 233 / 216 / 277 / 266 / 248 / Gore / 93 / 17667
Maryland / $55,912 / 233 / 219 / 278 / 262 / 248 / Gore / 95 / 19084
Kentucky / $37,893 / 229 / 219 / 274 / 266 / 247 / Bush / 94 / 18043
Texas / $40,659 / 237 / 215 / 277 / 259 / 247 / Bush / 98 / 18835
South Carolina / $38,460 / 236 / 215 / 277 / 258 / 246 / Bush / 87 / 15325
Florida / $38,533 / 234 / 218 / 271 / 257 / 245 / Bush / 87 / 16067
West Virginia / $30,072 / 231 / 219 / 271 / 260 / 245 / Bush / 92 / 16534
Alaska / $55,412 / 233 / 212 / 279 / 256 / 245 / Bush / 92 / 17892
Rhode Island / $44,311 / 230 / 216 / 272 / 261 / 245 / Gore / 89 / 15989
Oklahoma / $35,500 / 229 / 214 / 272 / 262 / 244 / Bush / 98 / 19397
Georgia / $43,316 / 230 / 214 / 270 / 258 / 243 / Bush / 93 / 15065
Arkansas / $32,423 / 229 / 214 / 266 / 258 / 242 / Bush / 98 / 21603
Tennessee / $36,329 / 228 / 212 / 268 / 258 / 241 / Bush / 90 / 16198
Arizona / $41,554 / 229 / 209 / 271 / 255 / 241 / Bush / 92 / 18130
Nevada / $46,289 / 228 / 207 / 268 / 252 / 239 / Bush / 92 / 15439
Hawaii / $49,775 / 227 / 208 / 266 / 251 / 238 / Gore / 94 / 17341
California / $48,113 / 227 / 206 / 267 / 251 / 238 / Gore / 94 / 17119
Louisiana / $33,312 / 226 / 205 / 266 / 253 / 238 / Bush / 99 / 20266
Alabama / $36,771 / 223 / 207 / 262 / 253 / 236 / Bush / 90 / 15712
Mississippi / $32,447 / 223 / 205 / 261 / 255 / 236 / Bush / 90 / 16220
New Mexico / $35,251 / 223 / 203 / 263 / 252 / 235 / Gore / 85 / 14088
NAEP data were gathered in February, 2003.

Table 3

State / % Obese / Voted For / State / % Obese / Voted For
Hawaii / 17 / Kerry / Wisconsin / 22 / Kerry
Colorado / 17 / Bush / Nevada / 22 / Bush
Connecticut / 18 / Kerry / Alaska / 23 / Bush
Massachusetts / 18 / Kerry / Iowa / 23 / Bush
New Hampshire / 18 / Kerry / Kansas / 23 / Bush
Utah / 18 / Bush / Missouri / 23 / Bush
California / 19 / Kerry / Nebraska / 23 / Bush
Maryland / 19 / Kerry / North Dakota / 23 / Bush
New Jersey / 19 / Kerry / Ohio / 23 / Bush
Rhode Island / 19 / Kerry / Oklahoma / 23 / Bush
Vermont / 19 / Kerry / Pennsylvania / 24 / Kerry
Florida / 19 / Bush / Arkansas / 24 / Bush
Montana / 19 / Bush / Georgia / 24 / Bush
Oregon / 20 / Kerry / Indiana / 24 / Bush
Arizona / 20 / Bush / North Carolina / 24 / Bush
Idaho / 20 / Bush / Virginia / 24 / Bush
New Mexico / 20 / Bush / Michigan / 25 / Kerry
Wyoming / 20 / Bush / Kentucky / 25 / Bush
Maine / 21 / Kerry / Tennessee / 25 / Bush
New York / 21 / Kerry / Alabama / 26 / Bush
Washington / 21 / Kerry / Louisiana / 26 / Bush
D.C. / 21 / Kerry / South Carolina / 26 / Bush
South Dakota / 21 / Bush / Texas / 26 / Bush
Delaware / 22 / Kerry / Mississippi / 27 / Bush
Illinois / 22 / Kerry / West Virginia / 28 / Bush
Minnesota / 22 / Kerry
Fat data from NY Times, Feb. 1, 2004, page 12; Centers for Disease Control & Prevention.
  1. What is the pricing structure of convertibles? How would you answer someone who asked “How much does a convertible cost?” Do the costs of convertibles fall into specific groups? A transformation is most useful in revealing the underlying price structure. Include informative displays and a narrative explaining both what you did and what you found.

Car / Price
Acura NSX-T / $88,725
Aston Martin DB7 Volante / $136,300
Audi Cabrio / $35,100
Bentley Azure / $329,400
BMW 318i / $33,720
BMW 328i / $41,960
BMW Z3 1.9 / $29,995
BMW Z3 2.8 / $36,470
Chevrolet Camaro / $22,295
Chevrolet Camaro RS / $23,695
Chevrolet Camaro Z28 / $26,045
Chevrolet Cavalier LS / $18,435
Chevrolet Corvette convertible / $46,000
Chrysler Sebring JX / $20,685
Chrysler Sebring JXi / $25,295
Dodge Viper RT/10 / $66,700
Ferrari F355 Spider / $137,075
Ferrari F50 / $487,000
Ford Mustang / $21,280
Ford Mustang Cobra / $28,660
Ford Mustang GT / $24,510
Honda del Sol / $15,475
Jaguar XK8 / $70,480
Lamborghini Diablo Roadster VT / $275,100
Mazda Miata M-Edition / $24,935
Mazda MX-5 Miata / $19,575
Mercedes-Benz SL320 / $80,195
Mercedes-Benz SL500 / $90,495
Mercedes-Benz SL600 / $123,795
Mercedes-Benz SLK230 / $40,295
Mitsubishi Eclipse Spyder GS / $20,360
Mitsubishi Eclipse Spyder GS-T Turbo / $26,200
Pontiac Firebird / $23,609
Pontiac Firebird Formula / $27,049
Pontiac Firebird Trans Am / $28,969
Pontiac Sunfire SE / $19,399
Porsche 911 Cabriolet / $73,765
Porsche 911 Carrera 4 Cabriolet / $79,115
Porsche Boxster / $40,745
Saab 900 SE Talladega Turbo / $42,520
Saab 900 SE Turbo / $41,995
Saab 900 SE V6 / $43,495
Saab 900S / $36,195
Toyota Celica GT / $24,858
Toyota Paseo / $17,188
Volkswagen Cabrio / $18,425
Volkswagen Cabrio Highline / $22,175
Source: The New York Times, 8-Jun-97, Section 11, page 1
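
One transformation worth trying on prices that span more than an order of magnitude is the logarithm. The sketch below is illustrative only: it takes a subset of nine cars from the table and looks at the gaps between consecutive log prices. Treating a large gap as a hint of a group boundary is an assumption of this sketch, not a rule given in the assignment.

```python
from math import log10

# A subset of the prices listed above (the remaining rows can be added the same way).
prices = {
    "Honda del Sol": 15475,
    "Ford Mustang": 21280,
    "BMW Z3 2.8": 36470,
    "Porsche Boxster": 40745,
    "Jaguar XK8": 70480,
    "Mercedes-Benz SL500": 90495,
    "Ferrari F355 Spider": 137075,
    "Bentley Azure": 329400,
    "Ferrari F50": 487000,
}

# Sort the cars by log10(price)
log_prices = sorted((log10(p), car) for car, p in prices.items())

# Gaps between consecutive log prices; unusually large gaps may hint at group boundaries.
gaps = [(b[0] - a[0], a[1], b[1]) for a, b in zip(log_prices, log_prices[1:])]

for lp, car in log_prices:
    print(f"{lp:.2f}  {car}")
```

A dot plot or stem-and-leaf display of the log prices for all the cars would be a natural next step.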
  1. In the table below are life insurance premiums. Find the underlying policy that Jackson National applied in setting rates for the four groups shown.
    HINT: plotting rates will help you uncover a sensible transformation, after which some sort of decomposition may be helpful. Accompany your result with a descriptive narrative.

Jackson National's 10 Year Level-term Policy
Monthly Life Insurance Premiums for $100,000

Age / Male NonSmoker / Male Smoker / Female NonSmoker / Female Smoker
30 / 12.34 / 22.34 / 10.85 / 17.71
31 / 12.51 / 23.23 / 11.03 / 17.89
32 / 12.69 / 24.21 / 11.29 / 18.07
33 / 12.78 / 25.19 / 11.46 / 18.25
34 / 13.04 / 26.26 / 11.55 / 18.42
35 / 13.21 / 27.41 / 11.81 / 18.60
36 / 13.74 / 29.01 / 12.16 / 19.49
37 / 14.35 / 30.71 / 12.51 / 20.47
38 / 14.96 / 32.57 / 12.95 / 21.54
39 / 15.58 / 34.53 / 13.39 / 22.61
40 / 16.28 / 36.67 / 13.91 / 23.85
41 / 17.15 / 39.25 / 14.44 / 25.10
42 / 17.94 / 42.10 / 15.05 / 26.43
43 / 18.81 / 45.12 / 15.66 / 27.86
44 / 19.78 / 48.51 / 16.28 / 29.37
45 / 20.83 / 52.07 / 17.06 / 30.97
46 / 22.14 / 55.18 / 17.85 / 32.66
47 / 23.63 / 58.56 / 18.73 / 34.44
48 / 25.20 / 62.21 / 19.60 / 36.40
49 / 27.04 / 66.04 / 20.65 / 38.45
50 / 28.79 / 70.13 / 21.70 / 40.67
51 / 30.63 / 74.67 / 22.75 / 43.25
52 / 32.64 / 79.57 / 23.89 / 46.01
53 / 34.83 / 84.73 / 25.11 / 49.04
54 / 37.01 / 90.34 / 26.43 / 52.24
55 / 39.55 / 96.30 / 27.83 / 55.71
56 / 42.53 / 103.06 / 29.14 / 58.65
57 / 45.76 / 110.27 / 30.71 / 61.68
58 / 49.35 / 118.01 / 32.29 / 64.97
59 / 53.20 / 126.38 / 34.13 / 68.35
60 / 57.31 / 135.37 / 35.88 / 72.00
61 / 63.35 / 150.77 / 39.29 / 79.30
62 / 70.18 / 168.03 / 43.23 / 87.40
63 / 77.61 / 187.35 / 47.51 / 96.39
64 / 85.93 / 208.97 / 52.41 / 106.36
65 / 95.38 / 233.18 / 57.75 / 117.48
66 / 106.49 / 260.59 / 63.18 / 130.56
67 / 119.26 / 291.30 / 69.30 / 145.25
68 / 133.70 / 325.74 / 75.95 / 161.71
69 / 149.89 / 364.37 / 83.30 / 180.05
70 / 168.09 / 407.62 / 91.44 / 200.52
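
Following the hint, one way to start is to take logarithms and look at how quickly the log premium changes with age in each group. The sketch below uses only five ages from the table and is meant as a starting point, not as the intended solution; the choice of the natural log and of ten-year steps is an assumption made for brevity.

```python
from math import log

# Monthly premiums at a few ages, copied from the table above:
# age -> (male nonsmoker, male smoker, female nonsmoker, female smoker)
premiums = {
    30: (12.34, 22.34, 10.85, 17.71),
    40: (16.28, 36.67, 13.91, 23.85),
    50: (28.79, 70.13, 21.70, 40.67),
    60: (57.31, 135.37, 35.88, 72.00),
    70: (168.09, 407.62, 91.44, 200.52),
}
groups = ("male nonsmoker", "male smoker", "female nonsmoker", "female smoker")

ages = sorted(premiums)
for g, name in enumerate(groups):
    logs = [log(premiums[a][g]) for a in ages]
    # Average change in log premium per year of age over each decade
    slopes = [(logs[i + 1] - logs[i]) / (ages[i + 1] - ages[i])
              for i in range(len(ages) - 1)]
    print(name, [round(s, 3) for s in slopes])

# Smoker-to-nonsmoker premium ratio for men at each age
male_ratio = {a: premiums[a][1] / premiums[a][0] for a in ages}
print({a: round(r, 2) for a, r in male_ratio.items()})
```

If the slopes within a group are roughly constant over some age range, the premiums grow at a roughly constant percentage rate there, which suggests decomposing the log rates into age and group effects.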
  1. Fit a linear model to an average male’s growth and compare its predictions with the growth of Robert Wadlow. What characterizes Wadlow’s deviance (Slope? Intercept? Both?). Would a logistic function fit the data better? An advanced project might be to fit the sum of two logistics; try it only if you feel adventuresome.

Age / Average Male HT (in) / Robt. Wadlow HT (in)
1 / 30.1
2 / 34.1
3 / 37.9
4 / 41.4
5 / 44.6 / 60.0
6 / 47.4
7 / 49.9
7.5 / 50.9
8 / 51.9 / 72.0
9 / 53.7 / 74.5
10 / 55.5 / 77.0
11 / 57.5 / 79.0
12 / 60.4 / 82.5
13 / 63.8 / 86.0
13.5 / 65.4
14 / 66.8 / 89.5
14.5 / 67.8
15 / 68.6 / 92.0
15.5 / 69.2
16 / 69.6 / 94.5
16.5 / 69.9
17 / 70.1 / 96.5
17.5 / 70.2
18 / 70.3 / 99.5
18.5 / 70.4
19 / 70.5 / 101.5
20 / / 103.5
21 / / 104.5
22 / / 107.0
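
The linear fits can be computed with ordinary least squares. This is a minimal sketch: it uses only the whole-year ages from the table, and the helper `least_squares` is written out here for illustration rather than taken from any particular library.

```python
def least_squares(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Whole-year ages only, copied from the table above.
avg_age = list(range(1, 20))
avg_ht = [30.1, 34.1, 37.9, 41.4, 44.6, 47.4, 49.9, 51.9, 53.7, 55.5,
          57.5, 60.4, 63.8, 66.8, 68.6, 69.6, 70.1, 70.3, 70.5]
wadlow_age = [5] + list(range(8, 23))
wadlow_ht = [60.0, 72.0, 74.5, 77.0, 79.0, 82.5, 86.0, 89.5, 92.0,
             94.5, 96.5, 99.5, 101.5, 103.5, 104.5, 107.0]

avg_intercept, avg_slope = least_squares(avg_age, avg_ht)
wad_intercept, wad_slope = least_squares(wadlow_age, wadlow_ht)
print(f"average male: {avg_intercept:.1f} + {avg_slope:.2f} * age")
print(f"Wadlow:       {wad_intercept:.1f} + {wad_slope:.2f} * age")

# Wadlow's deviation from the average-male line at each of his ages
deviation = [h - (avg_intercept + avg_slope * a) for a, h in zip(wadlow_age, wadlow_ht)]
```

Plotting the residuals of the average-male fit against age is one way to judge whether a straight line is adequate or a logistic curve would fit better.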


  1. A. Decompose the table below using iterative median polish and display the final result in a compelling tabular format. Then display the result graphically. Accompany your result with a verbal description of what you have found.

Infant mortality rates in the United States, all races, 1964-1966

(Entries are numbers of deaths per 1000 live births)

Education of father

Region / 8 / 9 to 11 / 12 / 13-15 / 16
Northeast / 25.3 / 25.3 / 18.2 / 18.3 / 16.3
North Central / 32.1 / 29.0 / 18.8 / 24.3 / 19.0
South / 38.8 / 31.0 / 19.3 / 15.7 / 16.8
West / 25.4 / 21.1 / 20.3 / 24.0 / 17.5
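
Iterative median polish alternately removes row medians and column medians from the table until little is left to remove. The sketch below uses a fixed number of sweeps instead of a convergence check, which is a simplification; the function name `median_polish` and the `n_iter` parameter are introduced here for illustration.

```python
from statistics import median

def median_polish(table, n_iter=10):
    """Decompose table[i][j] = overall + row[i] + col[j] + resid[i][j]
    by alternately sweeping out row and column medians."""
    resid = [list(r) for r in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall, row, col = 0.0, [0.0] * n_rows, [0.0] * n_cols
    for _ in range(n_iter):
        for i in range(n_rows):              # sweep out row medians
            m = median(resid[i])
            row[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col)                      # move the median column effect to overall
        overall += m
        col = [c - m for c in col]
        for j in range(n_cols):              # sweep out column medians
            m = median(resid[i][j] for i in range(n_rows))
            col[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row)                      # move the median row effect to overall
        overall += m
        row = [r - m for r in row]
    return overall, row, col, resid

# Rows: Northeast, North Central, South, West; columns: education of father
rates = [[25.3, 25.3, 18.2, 18.3, 16.3],
         [32.1, 29.0, 18.8, 24.3, 19.0],
         [38.8, 31.0, 19.3, 15.7, 16.8],
         [25.4, 21.1, 20.3, 24.0, 17.5]]

overall, row_eff, col_eff, resid = median_polish(rates)
```

Replacing `median` with `statistics.mean` in the same function gives a mean-based decomposition for comparison.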
  1. Reanalyze the data in (A) above using means. Compare the results of the two analyses.
  2. Find a table of reasonable size (e.g. at least 5 x 10) in a scientific journal of your choice (e.g. Journal of the American Medical Association, Science, Nature, Psychological Bulletin, etc.) and:
    (i) Revise it according to the rules in Reference 20 (or Ref. 14, chapter 10).
    (ii) Describe what you found that was not obvious initially.
    Be sure to include the initial table, the revision, and details about where the table came from and what inferences the original authors were making from the table.
  1. A. In the relatively recent past there was a news article in the paper that reported that circumcision among men helped to prevent cervical cancer among women.
  a. Describe what sorts of data were likely to have been used to derive this causal conclusion.
  b. What would be the ideal data gathering experiment to allow such an inference?
  c. How close is (a) to (b)?
  B. Schools sometimes advise parents that their child’s academic future would be rosier if she/he repeated kindergarten.
  a. What sorts of prior evidence do you think the teacher was using to justify such a recommendation?
  b. What would be the ideal data gathering experiment to allow such an inference?
  c. How close is (a) to (b)?
  1. M&M (12.38 in 7th edition) Do poets die young? Parts a, b, c,
  2. Cereals were analyzed by their protein content. It was also noted that different kinds of cereal were placed on different shelves. The mean and standard deviation of protein content are shown in the table below by shelf position, as are the results of an analysis of variance and box plots of the results.

Analysis of Variance
Source / Sum of Squares / DF / Mean Square / F-ratio / P-value
Shelf / 12.4 / 2 / 6.2 / 5.8 / 0.004
Error / 78.7 / 74 / 1.1
Total / 91.1 / 76
Means and Std. Deviations
Shelf Level / n / Mean / Standard Deviation
1 / 20 / 2.65 / 1.46
2 / 21 / 1.90 / 0.99
3 / 36 / 2.86 / 0.72

a) What are the null and alternative hypotheses being tested in the ANOVA?

b) What do the ANOVA results say about the null hypothesis? Be sure to report in terms of protein content and shelves.

c) Can we conclude that cereals on shelf 2 have a lower mean protein content than cereals on shelf 3? Can we conclude that cereals on shelf 2 have a lower mean protein content than cereals on shelf 1? What can we conclude?

d) To check for significant differences between the shelf means we can use a Bonferroni test. Do so and show all of the pairwise comparisons. What does it say about the questions in part c?
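
The pairwise t statistics for the Bonferroni comparisons can be computed directly from the summary tables. This sketch uses the pooled error mean square from the ANOVA table and an overall level of 0.05 (an assumption; the problem does not state a level). It stops short of the decision step: each |t| still has to be compared with the two-sided critical value of a t distribution with 74 degrees of freedom at the per-comparison level, taken from a table or statistical software.

```python
from math import sqrt
from itertools import combinations

# Summary statistics copied from the tables above.
shelves = {1: (20, 2.65), 2: (21, 1.90), 3: (36, 2.86)}  # shelf -> (n, mean)
ss_shelf, df_shelf = 12.4, 2
ss_error, df_error = 78.7, 74

mse = ss_error / df_error               # pooled error mean square
f_ratio = (ss_shelf / df_shelf) / mse   # should agree with the reported F-ratio

pairs = list(combinations(shelves, 2))
n_pairs = len(pairs)
alpha_per_test = 0.05 / n_pairs         # Bonferroni: split the overall level across comparisons

t_stats = {}
for a, b in pairs:
    (n_a, mean_a), (n_b, mean_b) = shelves[a], shelves[b]
    se = sqrt(mse * (1 / n_a + 1 / n_b))   # standard error of the difference in means
    t_stats[(a, b)] = (mean_a - mean_b) / se

for pair, t in t_stats.items():
    print(pair, round(t, 2))
print("per-comparison alpha:", round(alpha_per_test, 4))
```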

  1. M&M (12.38 in 7th edition) – this time answer question (f), doing all pair-wise comparisons using the Bonferroni inequality with an overall α = 0.05.
  1. University of Pennsylvania Professor Ted Hershberg uses the results obtained by North Carolina researcher William Sanders in his plans to revamp American public education. Specifically, he cites Sanders’ finding that the quality of teachers is the largest factor in students’ performance; that big improvements in student performance are caused by their teacher. Sanders makes this inference by looking at the gain (value-added) in test scores for each student over the year that student was in a specific teacher’s class, and he adjusts for all other factors by using them as covariates.
  a. What issues would concern you about this inference?
  b. How would you design a study that would allow such inferences?
  c. How close do you think the data-gathering scheme from Sanders’ observational study in Tennessee comes to the ideal case you have described in (b)?