STAT 110: Assignment #10 (42 pts.)

Due Tuesday, May 22nd

Names:

1) Bias in the U.S. Open Tennis opening round seeding?

Watch the following two videos:

These videos discuss the “random” assignment of opponents to the two highest ranked players in the U.S. Open Tennis Tournament in both men’s and women’s tennis. Seeding for this tournament is supposed to be done as follows:

  • The top 32 players are fairly assigned spots in two separate brackets.
  • The remaining 96 players (ranked 33 to 128) are randomly assigned to the games in the brackets.
  • If they are truly randomly assigned to the games in the brackets, we would expect the players assigned to face the top two players in both the men’s and women’s tournaments to have an average rank of 80.5. For example, Federer might get the 40th ranked player and Nadal might get the player ranked 121st, which results in an average rank of 80.5 for the opponents assigned to the top two players.

Note: Of course, as these are random draws, it is just as likely that these top two players might get assigned to opponents ranked 33 and 34 (average = 33.5) or 127 and 128 (average = 127.5), for example, in any given year. On average, however, we would expect the ranks assigned to the top two players to be 80.5.
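The 80.5 figure can be checked directly: it is the mean of the 96 ranks from 33 to 128. A minimal Python sketch (the variable names are illustrative and not part of the assignment):

```python
from statistics import mean

# Ranks of the unseeded players who can be drawn against a top seed.
unseeded_ranks = range(33, 129)  # 33, 34, ..., 128

# Under a fair draw every rank is equally likely, so the expected
# opponent rank is the plain average of these 96 values.
expected_rank = mean(unseeded_ranks)
print(expected_rank)  # 80.5

# The example pairing from the text: opponents ranked 40 and 121.
example_average = mean([40, 121])
print(example_average)  # 80.5
```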

During the years 2001-2010, the average ranks of the opponents assigned to the top two men’s players were as follows:

Year / Seed / Player / Opponent / Opponent rank (33-128) / Average Rank (seeds 1 and 2)
2001 / 1 / Kuerten / Vacek / 127 / 125.5
2001 / 2 / Agassi / Bryan / 124 / 125.5
2002 / 1 / Hewitt / Coutelot / 94 / 77
2002 / 2 / Safin / Kiefer / 60 / 77
2003 / 1 / Agassi / Corretja / 81 / 76
2003 / 2 / Federer / Acasuso / 71 / 76
2004 / 1 / Federer / Costa / 41 / 84.5
2004 / 2 / Roddick / Jenkins / 128 / 84.5
2005 / 1 / Federer / Minar / 73 / 90.5
2005 / 2 / Nadal / Reynolds / 108 / 90.5
2006 / 1 / Federer / Wang / 98 / 99
2006 / 2 / Nadal / Philippoussis / 100 / 99
2007 / 1 / Federer / Jenkins / 125 / 115.5
2007 / 2 / Nadal / Jones / 106 / 115.5
2008 / 1 / Nadal / Phau / 109 / 103.5
2008 / 2 / Federer / Gonzalez / 98 / 103.5
2009 / 1 / Federer / Britton / 128 / 108
2009 / 2 / Murray / Gulbis / 88 / 108
2010 / 1 / Nadal / Gabashvili / 94 / 92.5
2010 / 2 / Federer / Dabul / 91 / 92.5
Average rank over the 10 years: / 97.2

Note that the average rank assigned to the top two players over the 10 years was 97.2.
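The yearly averages and the 10-year figure can be recomputed from the opponent ranks in the men’s table. The sketch below is only an arithmetic check; the dictionary layout and names are assumptions made for illustration.

```python
# Opponent ranks for the (seed 1, seed 2) men's players, copied from the table.
mens_opponent_ranks = {
    2001: (127, 124),
    2002: (94, 60),
    2003: (81, 71),
    2004: (41, 128),
    2005: (73, 108),
    2006: (98, 100),
    2007: (125, 106),
    2008: (109, 98),
    2009: (128, 88),
    2010: (94, 91),
}

# Average opponent rank for the top two seeds in each year.
yearly_averages = {
    year: (seed1 + seed2) / 2
    for year, (seed1, seed2) in mens_opponent_ranks.items()
}

# Average of the ten yearly averages.
ten_year_average = sum(yearly_averages.values()) / len(yearly_averages)

print(yearly_averages[2001])       # 125.5
print(round(ten_year_average, 2))  # 97.2
```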

During one of the videos, it is stated that only 3 of the 1,000 simulated draws produced a 10-year average as large as the one for the men’s side of the tournament. Can you confirm ESPN’s findings using the results of 1,000 simulations from TinkerPlots, which are shown below?

  a) Recall that the average rank assigned to the top two players over the 10 years in the men’s tournament was 97.2. Also, recall that the p-value is by definition the probability we would observe results at least this extreme just by chance, assuming the seeding is truly done at random. Determine the p-value associated with the observed average of 97.2. (2 pts.)
    p-value: ______
  b) On the basis of your results, do you agree with ESPN’s claim that the U.S. Open pairings for the top two men’s seeds over the past 10 years are highly unlikely to happen by “Luck of the Draw”? Explain. (3 pts.)
  c) During the years 2001-2010, the average ranks of the opponents assigned to the top two women’s players were as follows:

Year / Seed / Player / Opponent / Opponent rank (33-128) / Average Rank (seeds 1 and 2)
2001 / 1 / Hingis / Granville / 127 / 120
2001 / 2 / Capriati / Hopmans / 113 / 120
2002 / 1 / S. Williams / Morariu / 126 / 122.5
2002 / 2 / V. Williams / Lucic / 119 / 122.5
2003 / 1 / Clijsters / Liu / 127 / 119.5
2003 / 2 / Henin / Kapros / 112 / 119.5
2004 / 1 / Henin / Vaidisova / 111 / 97
2004 / 2 / Mauresmo / Irvin / 83 / 97
2005 / 1 / Sharapova / Daniilidou / 60 / 48.5
2005 / 2 / Davenport / Li / 37 / 48.5
2006 / 1 / Mauresmo / Barrois / 117 / 84
2006 / 2 / Henin / Camerin / 51 / 84
2007 / 1 / Henin / Goerges / 116 / 81.5
2007 / 2 / Sharapova / Vinci / 47 / 81.5
2008 / 1 / Ivanovic / Dushevina / 59 / 92.5
2008 / 2 / Jankovic / Vandeweghe / 126 / 92.5
2009 / 1 / Safina / Rogowska / 118 / 108.5
2009 / 2 / S. Williams / Glatch / 99 / 108.5
2010 / 1 / Wozniacki / Gullickson / 123 / 110.5
2010 / 2 / Clijsters / Arn / 98 / 110.5
Average rank over the 10 years: / 98.45

Note that the women’s 10-year average rank actually observed in the U.S. Open was 98.45.
Give an estimated p-value for this result as well. Note that you can reuse the same simulation results you used for the men’s p-value; you don’t need to run the simulation again. (2 pts.)

p-value: ______

  d) How does the fact that both the men’s and women’s pairings have an extreme result factor into it? The ESPN investigative reporter makes this point in both videos. Explain. (2 pts.)
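The simulation described in this problem can be approximated outside TinkerPlots. The sketch below is one possible version, not ESPN’s or TinkerPlots’ actual procedure: it assumes that each year the two top seeds draw two distinct opponents uniformly at random from ranks 33 to 128, and that the p-value is estimated as the share of simulated 10-year averages at least as large as the observed one. The function names and the fixed seed are hypothetical choices made for reproducibility.

```python
import random


def simulate_ten_year_average(rng, years=10):
    """Average opponent rank for the top two seeds over `years` fair draws."""
    yearly = []
    for _ in range(years):
        # Assumption: two distinct unseeded players (ranks 33-128) are drawn.
        first, second = rng.sample(range(33, 129), 2)
        yearly.append((first + second) / 2)
    return sum(yearly) / years


def simulate_draws(n_sims=1000, seed=0):
    """Return `n_sims` simulated 10-year averages under a fair draw."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    return [simulate_ten_year_average(rng) for _ in range(n_sims)]


def estimate_p_value(simulated, observed):
    """Share of simulated averages at least as large as the observed one."""
    return sum(avg >= observed for avg in simulated) / len(simulated)


simulated = simulate_draws()
p_men = estimate_p_value(simulated, 97.2)     # observed men's average
p_women = estimate_p_value(simulated, 98.45)  # observed women's average
print(p_men, p_women)
```

The estimates change from run to run if the seed is changed, so they should be read as approximations in the same spirit as ESPN’s 1,000 draws rather than exact probabilities.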
2) The Bayley Scales of Infant Development yield scores on two indices: the Psychomotor Development Index (PDI) and the Mental Development Index (MDI). These can be used to assess a child’s level of function in each of these areas at approximately one year of age. Among healthy infants, both indices have a mean value of 100. As part of a study assessing the development and neurologic status of children who have undergone reparative heart surgery during the first three months of life, the Bayley Scales were administered to a random sample of 144 one-year-old infants born with congenital heart disease.
    The file PDI_MDI.JMP contains data collected on the following variables:
  • PDI = psychomotor development index
  • MDI = mental development index

Research Question: Is there evidence that children born with congenital heart disease who undergo reparative heart surgery during the first three months of life have a mean PDI score less than 100 (which is the mean for healthy infants)?

  a) Use JMP to find both the mean and the standard deviation of the PDI and MDI scores. Enter these values in the following table. (2 pts)

Variable / Mean / Standard Deviation
PDI
MDI
  b) Is the t-test an appropriate analysis procedure for these data? Hint: consider checking the assumptions behind the t-test for a single mean. (1 pt)
  c) Set up the null and alternative hypotheses to test the research question. State them in words and using statements about μ. (2 pts)

Ho:

Ha:

  d) Find the p-value to test the research question. (1 pt)

p-value: ______

  e) Write a conclusion to address the research question in the context of the problem. (2 pts)
  f) Use JMP to find the 95% CI for the mean PDI score for children born with congenital heart disease who undergo reparative heart surgery during the first three months of life. Interpret this interval. (2 pts)
  g) Does this interval agree with your conclusion given in part e? Explain your reasoning. (2 pts.)
  h) Next, you have been asked to investigate the following:

Research Question: Is there evidence that children born with congenital heart disease who undergo reparative heart surgery during the first three months of life have a mean MDI score different from 100, the mean for healthy infants?

Set up the null and alternative hypotheses to test the research question given in part h. Again, state them in words and using μ notation. (2 pts)

Ho:

Ha:

  i) Find the p-value to test the research question given in part h. (1 pt)

p-value: ______

  j) Write a conclusion in the context of the problem. (2 pts)
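For the one-sample tests and the confidence interval in this problem, the values JMP reports are presumably based on the standard one-sample t formulas sketched below. The symbols are notation introduced here for illustration: $\bar{x}$ and $s$ are the sample mean and standard deviation of the index being tested, $n = 144$ is the sample size, $\mu_0 = 100$ is the mean for healthy infants, and $t^{*}_{143}$ is the critical value for 95% confidence with 143 degrees of freedom.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1 = 143,
\qquad
\text{95\% CI: } \bar{x} \pm t^{*}_{143} \, \frac{s}{\sqrt{n}}
```

The same statistic serves both research questions; only the direction of the alternative hypothesis, and therefore the tail area used for the p-value, changes.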
3) Consumer Reports tested 14 brands of vanilla yogurt and found the following numbers of calories per serving:

160  200  220  230  120  180  140

130  170  190   80  120  100  170

A diet guide claims that you will get 120 calories from a serving of vanilla yogurt. Do these data provide evidence the diet guide’s claim is false? Use either a hypothesis test or confidence interval to address this research question. (4 pts)

Be sure to check the normality assumption!
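As a cross-check on software output, the summary statistics, the one-sample t statistic, and a 95% confidence interval for the yogurt data can be computed by hand. The sketch below uses only the Python standard library; the critical value 2.160 is the two-sided 95% value for 13 degrees of freedom taken from a t table, and the variable names are illustrative. It does not replace the normality check, which still needs a plot such as a normal quantile plot.

```python
from math import sqrt
from statistics import mean, stdev

# Calories per serving for the 14 brands listed above.
calories = [160, 200, 220, 230, 120, 180, 140,
            130, 170, 190, 80, 120, 100, 170]
claimed_mean = 120  # the diet guide's claim

n = len(calories)
x_bar = mean(calories)
s = stdev(calories)        # sample standard deviation (n - 1 divisor)
std_error = s / sqrt(n)

# One-sample t statistic for H0: mu = 120.
t_stat = (x_bar - claimed_mean) / std_error

# Two-sided 95% critical value for df = n - 1 = 13, from a t table.
T_CRIT_DF13 = 2.160
margin = T_CRIT_DF13 * std_error
ci = (x_bar - margin, x_bar + margin)

print(f"n = {n}, mean = {x_bar:.2f}, sd = {s:.2f}")
print(f"t = {t_stat:.3f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

Comparing the absolute value of `t_stat` with the critical value, or checking whether 120 falls inside `ci`, gives the same two-sided decision at the 5% level.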

4) Using the STAT 110 student survey data, conduct a test to determine if the mean GPA differs from a B average (i.e., 3.00) for two groups of students: those who skip at least one class per week and those who do not. To do this in JMP, select Distribution and put Skip Classes? in the By box and GPA in the Y, Columns box.

a) Complete the table below. (2 pts.)

GPA Statistics

Skip Classes? / Mean / Standard Deviation
Yes
No

b) Conduct an appropriate test to determine if the mean GPA of students who skip at least one class per week is different from 3.00. State your conclusions. (3 pts.)

c) Conduct an appropriate test to determine if the mean GPA of students who do not skip classes is different from 3.00. State your conclusions. (3 pts.)

d) Examine and interpret confidence intervals for the mean GPA of both groups of students. What do you conclude about their mean GPAs on the basis of these intervals? (4 pts.)
