Problem Set 4: HRP/STAT 261: Due February 22, 2012

1. A study was undertaken to examine the prevalence of abnormal hematologic (blood cell) profiles in elite cross-country skiers at the 2001 World Ski Championships. Abnormal hematologic profiles (measured as increased red blood cells or hemoglobin) may indicate blood doping. Sixty-eight percent of all skiers and 92% of those finishing in the top 10 places were tested. Hemoglobin levels in the athletes were compared against established reference data (hemoglobin concentration is normally distributed). Values >2 SD above average were classified as “abnormal,” and values >3 SD above average were classified as “highly abnormal.” Results for the top 50 finishers in each of the 9 races of the Championships are represented in the figure below (assume each individual skier competed only once):

Each of the 9 races is represented by a column. The hematologic result for an athlete is placed in the position of their race result (1st to 50th) in each column. A black oval indicates a “highly abnormal” (>3 SD) hematologic profile in the skier who obtained that race result. A speckled oval indicates an “abnormal” (>2 SD but <3 SD) hematologic profile in the skier who obtained that race result. A white oval indicates a “normal” hematologic profile in the skier who obtained that race result. A blank area indicates that a sample was not obtained from the athlete who achieved that race result.

Note: Missing data are missing at random and can be ignored. (Selection for drug testing is random, but top finishers have a higher probability of being selected.)

I’ve converted these data for you into a SAS-usable form (shown at the end of this document). You can import the data from the Excel file posted on the course website.

(a) What is the probability of having a “highly abnormal” test result by “decade” of finishing place (1-10, 11-20, 21-30, 31-40, 41-50)?

What is the probability of having an “abnormal” test result in each decade of finishing place (1-10, 11-20, 21-30, 31-40, 41-50)?
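As a rough illustration of the grouped calculation, here is a minimal Python sketch (not the SAS workflow used in the course). It assumes rows laid out as (Place, Ab, HiAb, Frequency), as in the data listing at the end of this document. The function names and the toy rows are hypothetical and are not the problem-set data.

```python
from collections import defaultdict

def decade(place):
    """Map finishing place 1-10 to 1, 11-20 to 2, ..., 41-50 to 5."""
    return (place - 1) // 10 + 1

def proportion_by_decade(rows, column):
    """rows: (place, ab, hiab, frequency) tuples.
    column: 1 to count Ab results, 2 to count HiAb results."""
    events = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        d = decade(row[0])
        events[d] += row[column] * row[3]  # tested skiers with the result
        totals[d] += row[3]                # all tested skiers in the decade
    return {d: events[d] / totals[d] for d in sorted(totals)}

# Hypothetical toy rows for illustration only (NOT the problem-set data).
toy_rows = [(3, 0, 0, 6), (3, 0, 1, 2), (7, 1, 0, 2), (12, 0, 0, 9), (12, 0, 1, 1)]
hiab_by_decade = proportion_by_decade(toy_rows, column=2)
ab_by_decade = proportion_by_decade(toy_rows, column=1)
```

Weighting each row by its Frequency is what makes the grouped listing equivalent to one record per tested skier.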

(b) Plot decade of finishing place against the logit of the outcome “highly abnormal” test result (Note: a hand-drawn sketch would be fine; if you want to plot this in SAS, you can adapt code from the logit plot macro from lab 4, but it will not work directly because of the grouped nature of the data).
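For reference, the logit of a proportion p is ln(p / (1 - p)). The Python sketch below computes the values one would plot against decade. The helper names are hypothetical, and the 0.5 continuity correction in `empirical_logit` is a common optional adjustment that this problem set does not itself require.

```python
import math

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def empirical_logit(events, total):
    """Log-odds from counts, adding 0.5 to each cell so that a group
    with zero events (or zero non-events) still gives a finite value."""
    return math.log((events + 0.5) / (total - events + 0.5))
```

If the relationship between decade and the logit looks roughly linear, treating finishing place as a linear predictor in a logistic model is more defensible.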

(c) Calculate the odds ratio that represents the increase in the odds of a “highly abnormal” test result for every one-unit increase in finishing place (e.g., going from 15th place to 16th place or from 40th to 41st place). Note: use “abnormal” (speckled ovals) and “normal” skiers combined as the reference group.
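In SAS, this kind of grouped fit would typically use PROC LOGISTIC with a FREQ statement. Purely as an illustration of what such a fit does, here is a minimal Python sketch of a frequency-weighted logistic regression with one predictor, solved by Newton-Raphson. The function name and the toy data are hypothetical, and the sketch omits convergence checks and standard errors.

```python
import math

def fit_logistic(rows, iterations=25):
    """rows: list of (x, y, freq) with y in {0, 1}.
    Returns (intercept, slope) maximizing the weighted log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g) and information matrix (h) at the current estimate.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, freq in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = freq * p * (1.0 - p)
            g0 += freq * (y - p)
            g1 += freq * (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(h) * g for the 2x2 system.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data (NOT the problem-set data): a binary predictor,
# so exp(slope) should match the 2x2 odds ratio (20 * 30) / (20 * 10) = 3.
toy = [(0, 1, 10), (0, 0, 30), (1, 1, 20), (1, 0, 20)]
b0, b1 = fit_logistic(toy)
odds_ratio_per_unit = math.exp(b1)
```

For the skier data, x would be finishing place, y the HiAb indicator, and freq the Frequency column, with the Ab = 1 and normal rows both coded y = 0.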

(d) Calculate the odds ratio that represents the increase in the odds of a “highly abnormal” test result for every ten-unit increase in finishing place (e.g., going from 15th place to 25th place or from 38th to 48th place).
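In a logistic model that is linear in finishing place, the odds ratio for a k-unit change is exp(k × beta), where beta is the per-unit log-odds coefficient. A small Python sketch, with a hypothetical function name and a made-up coefficient:

```python
import math

def odds_ratio_for_step(beta, step):
    """Odds ratio for a `step`-unit increase in a linear predictor
    whose per-unit log-odds coefficient is `beta`."""
    return math.exp(beta * step)

# Made-up coefficient for illustration: a per-unit odds ratio of 2
# compounds to 2 ** 10 over ten units.
example = odds_ratio_for_step(math.log(2.0), 10)
```

Equivalently, the ten-unit odds ratio is the one-unit odds ratio raised to the tenth power.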

(e) Calculate the odds ratio that represents the increase in the odds of a highly abnormal test result for every jump in “decade” of finishing place (e.g., going from an 11-20 finisher to a 21-30 finisher).

(f) Compare the odds ratios in (d) and (e). Explain why they differ.

(g) Calculate the odds ratio that represents the increase in the odds of having a “highly abnormal” test result for top-ten finishers (compared to all other finishers).

(h) Calculate the odds ratio that represents the increase in the odds of being in the top ten given that you have an “abnormal” test result.

(i) Calculate the odds ratio that represents the increase in the odds of being in the top ten given that you have a “highly abnormal” test result.
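For a binary predictor and a binary outcome, the odds ratio can be read directly off a 2x2 table as the cross-product ratio. The Python sketch below uses a hypothetical function name and made-up counts, not the problem-set data.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the 2x2 table
                   outcome+   outcome-
        exposed       a          b
        unexposed     c          d
    """
    return (a * d) / (b * c)

# Made-up counts for illustration only.
a, b, c, d = 20, 20, 10, 30
or_outcome_given_exposure = odds_ratio(a, b, c, d)
# Swapping the roles of rows and columns transposes the table (b and c trade places).
or_exposure_given_outcome = odds_ratio(a, c, b, d)
```

Transposing the table leaves the cross-product unchanged, which is worth keeping in mind when comparing odds ratios computed in the two directions.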

(j) Briefly interpret these results.

DATA FOR SAS:

Place Ab HiAb Frequency
1 0 0 6
1 0 1 3
2 0 0 4
2 0 1 4
3 0 0 2
3 1 0 1
3 0 1 5
4 0 0 1
4 1 0 4
4 0 1 2
5 0 0 5
5 0 1 4
6 0 0 4
6 1 0 1
6 0 1 3
7 0 0 4
7 0 1 5
8 0 0 3
8 1 0 3
8 0 1 2
9 0 0 4
9 0 1 4
10 0 0 5
10 0 1 3
11 0 0 6
11 1 0 1
12 0 0 6
12 1 0 1
13 0 0 5
13 0 1 1
14 0 0 5
14 1 0 2
15 0 0 8
15 1 0 1
16 0 0 4
16 1 0 1
16 0 1 1
17 0 0 5
17 1 0 2
18 0 0 4
18 1 0 4
19 0 0 2
19 1 0 2
19 0 1 1
20 0 0 4
20 0 1 1
21 0 0 5
21 1 0 1
22 0 0 2
22 1 0 1
22 0 1 3
23 0 0 4
24 0 0 4
24 1 0 1
24 0 1 1
25 0 0 1
25 1 0 3
25 0 1 1
26 0 0 5
26 1 0 1
27 0 0 2
27 1 0 2
28 0 0 3
28 1 0 2
29 0 0 4
29 1 0 1
29 0 1 2
30 0 0 3
30 0 1 1
31 0 0 4
31 1 0 2
32 0 0 3
32 0 1 1
33 0 0 4
34 0 0 4
34 1 0 1
35 0 0 4
35 0 1 1
36 0 0 3
36 1 0 1
36 0 1 2
37 0 0 4
37 1 0 1
38 0 0 4
38 1 0 3
39 0 0 4
39 0 1 1
40 0 0 3
41 0 0 6
41 1 0 1
42 0 0 3
42 1 0 1
43 0 0 3
43 1 0 1
43 0 1 1
44 0 0 1
44 1 0 2
45 0 0 5
46 0 0 2
46 1 0 2
47 0 0 3
47 1 0 2
48 0 0 6
49 0 0 3
49 1 0 1
50 0 0 3