Applied Statistical Methods
HSRP 734
Summer 2008
Homework 2 (55 points total)
Solution Key
Due: 6/5/2008
General instructions: 1) you may discuss any and all portions of the assignment with other members of the class. However, the homework you turn in must be your own. 2) For problems that require a statistical package, you must do your own programming and provide your output with your answers. 3) A final answer is not sufficient. Show all your work and/or SAS output. When you give SAS output, clearly indicate your answers to the questions.
Q1. Describe desirable properties of the odds ratio (2 points).
The odds ratio is easy to estimate from a 2x2 table. One of its best properties is that the sample odds ratio estimates the population odds ratio whether the data come from a cross-sectional, cohort or case-control study. That is not true of other measures such as the risk ratio, which cannot be estimated directly from case-control data. Thus, with the odds ratio (and again later with logistic regression), we can answer a prospective question, estimating the population odds ratio, even from retrospective data.
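The invariance property can be illustrated with a small sketch. The counts below are hypothetical (they are not from the homework data): a cohort is compared with a case-control sample that keeps every case but only 1 in 10 non-cases. Under these assumptions the odds ratio is unchanged while the risk ratio is not.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """Risk ratio of column 1 for row 1 vs. row 2."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical cohort (not from the homework data):
# rows = exposed, unexposed; columns = diseased, not diseased.
cohort = (30, 70, 10, 90)

# Hypothetical case-control sample drawn from that cohort:
# keep every case, but only 1 in 10 of the non-cases.
case_control = (30, 7, 10, 9)

or_cohort = odds_ratio(*cohort)
or_case_control = odds_ratio(*case_control)
rr_cohort = risk_ratio(*cohort)
rr_case_control = risk_ratio(*case_control)

print(f"OR: cohort = {or_cohort:.3f}, case-control = {or_case_control:.3f}")
print(f"RR: cohort = {rr_cohort:.3f}, case-control = {rr_case_control:.3f}")
```

Sampling the non-cases rescales one column of the table, and that factor cancels in the cross-product ratio but not in the row-wise risks.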
Q2. Open the “q2.sas7bdat” dataset in SAS Enterprise and answer the following questions (47 points):
Here in the q2 dataset, agegrp represents the 3 age groups that have already been created in SAS. Trt is the treatment group variable (Trt = treatment, zControl = control). Died = Y if the patient had died by 1 year into the study and = zN if the patient was alive at 1 year. Count = the frequency count for each combination of age group, treatment group and mortality.
2a. What is the overall 1 year mortality rate of those in the treatment group (2 points)? What is the overall 1 year mortality rate of those in the control group (2 points)?
We can get the group mortality rates in Enterprise by going to Describe->Table Analysis using count as the Frequency count, and trt, died as Table variables:
Which gives:
Table of trt by died
Each cell shows: Frequency (Percent, Row Pct, Col Pct)
trt / died = Y / died = zN / Total
Trt / 612 (26.10, 64.35, 34.44) / 339 (14.46, 35.65, 59.68) / 951 (40.55)
zControl / 1165 (49.68, 83.57, 65.56) / 229 (9.77, 16.43, 40.32) / 1394 (59.45)
Total / 1777 (75.78) / 568 (24.22) / 2345 (100.00)
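The mortality rates are given by the Row %’s in the “Y” column for “died”: treatment group = 612/951 = 64.35%; control group = 1165/1394 = 83.57%.
Thus the 1 year mortality rate in the treatment group was about 19 percentage points lower than in the control group.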
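As a check on the table, the two mortality rates can be recomputed directly from the cell counts. This is only a minimal sketch of the arithmetic; the variable names are illustrative.

```python
# Counts from the trt by died table (died = Y, died = zN).
trt_died, trt_alive = 612, 339
ctl_died, ctl_alive = 1165, 229

# 1 year mortality rate = deaths / group total (the Row Pct in the Y column).
risk_trt = trt_died / (trt_died + trt_alive)
risk_ctl = ctl_died / (ctl_died + ctl_alive)

# Absolute difference, in percentage points.
risk_diff_points = 100 * (risk_trt - risk_ctl)

print(f"Treatment mortality: {100 * risk_trt:.2f}%")
print(f"Control mortality:   {100 * risk_ctl:.2f}%")
print(f"Difference:          {risk_diff_points:.2f} percentage points")
```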
2b. What is the relative risk (risk ratio) of death for treatment vs. control (3 points)? Estimate the 95% confidence interval around this estimate (3 points). Interpret the estimate of relative risk (2 points).
We can get this in Enterprise by re-running the same 2x2 table but checking the “Relative risk for 2x2 tables” option under Measures of Association in the Table Statistics-> Association menu to get an appropriate risk ratio and confidence interval.
Doing so gives:
Table of trt by died
Each cell shows: Frequency (Row Pct)
trt / died = Y / died = zN / Total
Trt / 612 (64.35) / 339 (35.65) / 951
zControl / 1165 (83.57) / 229 (16.43) / 1394
Total / 1777 / 568 / 2345
Statistics for Table of trt by died
Estimates of the Relative Risk (Row1/Row2)
Type of Study / Value / 95% Lower Confidence Limit / 95% Upper Confidence Limit
Case-Control (Odds Ratio) / 0.3549 / 0.2923 / 0.4309
Cohort (Col1 Risk) / 0.7700 / 0.7305 / 0.8117
Cohort (Col2 Risk) / 2.1699 / 1.8752 / 2.5110
Sample Size = 2345
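Thus the RR = 0.7700 with a 95% CI = (0.7305, 0.8117), taken from the Cohort (Col1 Risk) row because died = Y is column 1. The risk of death was therefore 23% lower in the treatment group than in the control group, and because the interval excludes 1 this reduction is statistically significant.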
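The relative risk and its interval can also be recomputed by hand. The sketch below assumes the usual large-sample interval built on the log scale with z = 1.96; that assumption reproduces the SAS limits for these data to four decimals, but it is not taken from the SAS documentation.

```python
import math


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio (row 1 vs. row 2, column 1) with a log-scale Wald CI.

    The table is [[a, b], [c, d]]; column 1 is the outcome of interest.
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR) for two independent binomial rows.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Rows: Trt, zControl; columns: died = Y, died = zN.
rr, rr_lower, rr_upper = risk_ratio_ci(612, 339, 1165, 229)
print(f"RR = {rr:.4f}, 95% CI = ({rr_lower:.4f}, {rr_upper:.4f})")
```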
2c. Estimate the overall odds ratio with a 95% confidence interval (6 points). Interpret the odds ratio (is there evidence of treatment effect?) (2 points).
The odds ratio was given in the last output and was OR=0.3549 (95% CI = [0.2923, 0.4309]). Thus, the odds of death were reduced by about 65% in the treatment group compared to control. Because the confidence interval excludes 1, there is evidence of a treatment effect.
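The same kind of hand check works for the odds ratio. The sketch assumes the standard log-scale (Woolf) interval with z = 1.96, which agrees with the SAS limits above to four decimals.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for [[a, b], [c, d]] with a log-scale (Woolf) Wald CI."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Rows: Trt, zControl; columns: died = Y, died = zN.
crude_or, or_lower, or_upper = odds_ratio_ci(612, 339, 1165, 229)
print(f"OR = {crude_or:.4f}, 95% CI = ({or_lower:.4f}, {or_upper:.4f})")
```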
2d. Give the age group specific odds ratios for treatment vs. control along with their 95% confidence intervals (6 points). Which age groups possibly show a more beneficial treatment effect? (2 points)
We can run separate 2x2 tables for each age group and estimate their ORs and 95% CIs by going to a Table Analysis again but now using agegrp as a variable for “Group analysis by.”
Doing so gives the following relevant output:
agegrp=18-30
Table of trt by died
Each cell shows: Frequency (Row Pct)
trt / died = Y / died = zN / Total
Trt / 153 (64.56) / 84 (35.44) / 237
zControl / 378 (79.75) / 96 (20.25) / 474
Total / 531 / 180 / 711
Statistics for Table of trt by died
Estimates of the Relative Risk (Row1/Row2)
Type of Study / Value / 95% Lower Confidence Limit / 95% Upper Confidence Limit
Case-Control (Odds Ratio) / 0.4626 / 0.3267 / 0.6550
Cohort (Col1 Risk) / 0.8095 / 0.7291 / 0.8989
Cohort (Col2 Risk) / 1.7500 / 1.3658 / 2.2422
Sample Size = 711
agegrp=31-45
Table of trt by died
Each cell shows: Frequency (Row Pct)
trt / died = Y / died = zN / Total
Trt / 207 (63.69) / 118 (36.31) / 325
zControl / 386 (83.91) / 74 (16.09) / 460
Total / 593 / 192 / 785
Statistics for Table of trt by died
Estimates of the Relative Risk (Row1/Row2)
Type of Study / Value / 95% Lower Confidence Limit / 95% Upper Confidence Limit
Case-Control (Odds Ratio) / 0.3363 / 0.2403 / 0.4707
Cohort (Col1 Risk) / 0.7590 / 0.6928 / 0.8316
Cohort (Col2 Risk) / 2.2570 / 1.7515 / 2.9084
Sample Size = 785
agegrp=46-70
Table of trt by died
Each cell shows: Frequency (Row Pct)
trt / died = Y / died = zN / Total
Trt / 252 (64.78) / 137 (35.22) / 389
zControl / 401 (87.17) / 59 (12.83) / 460
Total / 653 / 196 / 849
Statistics for Table of trt by died
Estimates of the Relative Risk (Row1/Row2)
Type of Study / Value / 95% Lower Confidence Limit / 95% Upper Confidence Limit
Case-Control (Odds Ratio) / 0.2706 / 0.1920 / 0.3816
Cohort (Col1 Risk) / 0.7431 / 0.6852 / 0.8060
Cohort (Col2 Risk) / 2.7458 / 2.0883 / 3.6104
Sample Size = 849
Thus,
Age = 18-30: OR=0.4626, 95% CI = (0.3267, 0.6550)
Age = 31-45: OR=0.3363, 95% CI = (0.2403, 0.4707)
Age = 46-70: OR=0.2706, 95% CI = (0.1920, 0.3816)
It appears that the older age groups show a slightly more beneficial treatment effect, since the OR moves further below 1 as age increases, although the confidence intervals overlap.
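The age-group specific odds ratios can be recomputed from the three stratum tables. This is a sketch under the same assumption as before (log-scale Woolf interval, z = 1.96); the dictionary layout is just one convenient way to hold the counts.

```python
import math

# Counts per age group: (Trt died, Trt alive, zControl died, zControl alive).
strata = {
    "18-30": (153, 84, 378, 96),
    "31-45": (207, 118, 386, 74),
    "46-70": (252, 137, 401, 59),
}

stratum_results = {}
for agegrp, (a, b, c, d) in strata.items():
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    stratum_results[agegrp] = (odds_ratio, lower, upper)
    print(f"Age = {agegrp}: OR = {odds_ratio:.4f}, "
          f"95% CI = ({lower:.4f}, {upper:.4f})")
```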
2e. Conduct a stratified analysis of treatment and mortality, stratifying by age groups (12 points). Determine the appropriate statistics to report, with justification (3 points). Does it appear that controlling for age group matters? (2 points) Does the direction of the effect change by age group? (1 point) Is there a treatment effect on 1 year mortality? (1 point)
We can conduct a stratified analysis on the 2x2x3 table formed by treatment x death x age group using the CMH approach. We first check the directions and magnitudes of the age-group specific odds ratios. We also conduct a formal test of the homogeneity of the age-group specific OR’s using a Breslow-Day test.
We already computed the crude and age-group specific odds ratios in parts 2c) and 2d).
We can compute the CMH test, ORMH and the Breslow Day test in Enterprise by adding agegrp as a 3rd and last variable into the Table Analysis as a Table variable under Task roles and then as the 3rd variable in Tables. Then, we can get the CMH analysis by choosing CMH statistics under Table Statistics->Association menu.
Doing so gives the following output:
Summary Statistics for trt by died
Controlling for agegrp
Statistic / Alternative Hypothesis / DF / Value / Prob
1 / Nonzero Correlation / 1 / 117.2836 / <.0001
2 / Row Mean Scores Differ / 1 / 117.2836 / <.0001
3 / General Association / 1 / 117.2836 / <.0001
Estimates of the Common Relative Risk (Row1/Row2)
Type of Study / Method / Value / 95% Lower Confidence Limit / 95% Upper Confidence Limit
Case-Control (Odds Ratio) / Mantel-Haenszel / 0.3446 / 0.2831 / 0.4194
Case-Control (Odds Ratio) / Logit / 0.3469 / 0.2847 / 0.4227
Cohort (Col1 Risk) / Mantel-Haenszel / 0.7664 / 0.7271 / 0.8078
Cohort (Col1 Risk) / Logit / 0.7646 / 0.7255 / 0.8059
Cohort (Col2 Risk) / Mantel-Haenszel / 2.2234 / 1.9157 / 2.5805
Cohort (Col2 Risk) / Logit / 2.1821 / 1.8804 / 2.5321
Breslow-Day Test for Homogeneity of the Odds Ratios
Chi-Square / 4.6855
DF / 2
Pr > ChiSq / 0.0961
Total Sample Size = 2345
Which can be summarized by: ORMH = 0.3446 (95% CI = (0.2831, 0.4194)), CMH Chi-square = 117.2836 with p-value < 0.0001, and the Breslow-Day test p-value = 0.0961.
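The Breslow-Day p-value can be checked by hand from the reported statistic. A chi-square variable with 2 degrees of freedom has upper-tail probability exp(-x/2), so no statistical library is needed for this particular case (the shortcut does not hold for other degrees of freedom).

```python
import math

# Breslow-Day statistic from the SAS output; DF = 2 (3 age groups minus 1).
breslow_day_chisq = 4.6855

# Upper-tail probability of a chi-square with exactly 2 degrees of freedom.
breslow_day_p = math.exp(-breslow_day_chisq / 2)

print(f"Breslow-Day p-value = {breslow_day_p:.4f}")
```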
Here, the age-group specific odds ratios do appear to be somewhat different (>10% change) from the crude OR. Thus age group could be a potential confounder or effect modifier of the association between treatment group and mortality. Since the Breslow-Day test was not significant (p=0.0961), we could report the ORMH=0.34, which adjusts for age group, along with the CMH Chi-square test statistic = 117.2836 with p-value<0.0001. The p-value of this test implies there is a significant association of treatment group and mortality after adjusting for age group.
However, the age-group specific odds ratios appear to be different from each other, such that age group could be considered an effect modifier. Thus, if our substantive considerations also dictate that these OR’s are meaningfully different from one another, we should report the three separate odds ratios for each age group and their associated Chi-square tests. All three individual Chi-square tests were significant with p-value < 0.0001 (output not shown), indicating a significant association of treatment group and mortality within each age group.
Concluding either of these with justified reasoning was acceptable.
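For reference, the Mantel-Haenszel odds ratio and the CMH Chi-square statistic can be reproduced from the three age-group tables. The sketch below uses the textbook formulas (no continuity correction, hypergeometric variance within each stratum); these are assumed here because they match the SAS values above, not because the SAS internals were inspected.

```python
# Counts per age group: (Trt died, Trt alive, zControl died, zControl alive).
strata = [
    (153, 84, 378, 96),    # 18-30
    (207, 118, 386, 74),   # 31-45
    (252, 137, 401, 59),   # 46-70
]

mh_numerator = 0.0       # sum of a*d/n over strata
mh_denominator = 0.0     # sum of b*c/n over strata
observed_minus_expected = 0.0
variance_sum = 0.0

for a, b, c, d in strata:
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d

    mh_numerator += a * d / n
    mh_denominator += b * c / n

    # Expected count and hypergeometric variance of cell a under no association.
    expected_a = row1 * col1 / n
    observed_minus_expected += a - expected_a
    variance_sum += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))

or_mh = mh_numerator / mh_denominator
cmh_chisq = observed_minus_expected ** 2 / variance_sum

print(f"ORMH = {or_mh:.4f}")
print(f"CMH Chi-square = {cmh_chisq:.4f} (DF = 1)")
```

Comparing ORMH with the crude OR of 0.3549 and with the three age-group specific ORs is one way to judge how much adjusting for age group changes the estimate.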
Q3. In a study of the effectiveness of a drug for curing high blood pressure, five hospitals were selected as study sites. A total of 600 patients were recruited and each was randomly assigned to either the treatment or the control group. The outcome is dichotomized as reduction in blood pressure or no reduction of blood pressure. The researcher decided to perform an outcome by treatment by site analysis. In the 2x2x5 CMH analysis, it is reported that the Breslow-Day test gave a p-value of 0.076 for hospital, and the CMH Chi-square test gave a p-value <0.0001. What does that mean in plain language (6 points)?
This means that, barring any flipping of the direction of the site-specific OR’s, there was no evidence against homogeneity of the OR’s across hospitals (Breslow-Day p=0.076), so a single site-adjusted summary is reasonable. In an analysis adjusting for site, there was a significant association of treatment and outcome (CMH Chi-square p<0.0001) in the study of 600 patients in 5 sites. Without seeing the odds ratios it is hard to comment further on the direction of this effect for treatment vs. control.