STA 6505, Fall 2008, Homework #3 Solutions
Chapter 3: 3.4acd, 3.9b, 3.11a, 3.13ab (no need to discuss how the small-sample C.I. was calculated; it is somewhat complicated)
3.4. We have the following contingency table, with explanatory variable X = race, having two levels, Black and White, and response variable Y = party affiliation, having three levels, Democrat, Independent, and Republican.
Party Affiliation
Race / Democrat / Independent / Republican
Black / 103 / 15 / 11
White / 341 / 105 / 405
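a) Using $X^2$ and $G^2$, test the hypothesis of independence between Party Affiliation and Race. Report the p-values and interpret.
Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2, 3.
HA: $\pi_{ij} \neq \pi_{i+}\pi_{+j}$, for some i and j.
Step 2: We have n = 980, I = 2, J = 3, and we choose $\alpha$ = 0.05.
Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = (I - 1)(J - 1) = 2.
Step 4: We will reject the null hypothesis if either $X^2 > \chi^2_{2,\,0.05} = 5.99$ or $G^2 > \chi^2_{2,\,0.05} = 5.99$.
Step 5: From the table above or the SAS output below, we have $X^2$ = 79.4310 with a p-value < 0.0001, and $G^2$ = 90.3311 with a p-value < 0.0001.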
Comparison Between Race and Party
The FREQ Procedure
Table of race by party
race party
Frequency|Democrat|Independ|Republic|  Total
         |        |ent     |an      |
---------+--------+--------+--------+
Black    |    103 |     15 |     11 |    129
---------+--------+--------+--------+
White    |    341 |    105 |    405 |    851
---------+--------+--------+--------+
Total         444      120      416      980
Statistics for Table of race by party
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 2 79.4310 <.0001
Likelihood Ratio Chi-Square 2 90.3311 <.0001
Mantel-Haenszel Chi-Square 1 79.3336 <.0001
Phi Coefficient 0.2847
Contingency Coefficient 0.2738
Cramer's V 0.2847
Sample Size = 980
Step 6: We reject the null hypothesis at the 0.05 level of significance. We have sufficient evidence to conclude that Race and Party Affiliation are not independent.
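As a cross-check on the SAS output, both statistics can be recomputed from the counts. The following is a minimal Python sketch, not part of the original solution; the function name `independence_stats` is illustrative. It assumes the usual estimated expected counts (row total)(column total)/n, and it uses the fact that for d.f. = 2 the chi-square upper-tail probability has the closed form exp(-x/2).

```python
import math

# Observed counts from the 3.4 table
# (rows: Black, White; columns: Democrat, Independent, Republican).
observed = [[103, 15, 11],
            [341, 105, 405]]


def independence_stats(table):
    """Return (X2, G2, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    g2 = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            mu_ij = row_totals[i] * col_totals[j] / n  # estimated expected count
            x2 += (n_ij - mu_ij) ** 2 / mu_ij
            if n_ij > 0:  # an empty cell contributes 0 to G2
                g2 += 2 * n_ij * math.log(n_ij / mu_ij)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, g2, df


x2, g2, df = independence_stats(observed)

# Closed form valid only for df = 2: P(chi-square > x) = exp(-x / 2).
p_x2 = math.exp(-x2 / 2)
p_g2 = math.exp(-g2 / 2)

print(f"X2 = {x2:.4f}, G2 = {g2:.4f}, df = {df}")
print(f"p-values: {p_x2:.3g} (X2), {p_g2:.3g} (G2)")
```

The printed statistics should agree with the Chi-Square and Likelihood Ratio Chi-Square rows of the SAS table (79.4310 and 90.3311), and both p-values fall far below 0.0001.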
c) Partition chi-squared into components regarding the choice between Democrat and Independent and between those two combined and Republican. Interpret.
First subtable; X = Race, Y = Party Affiliation (Democrat v. Independent).
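Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.
HA: $\pi_{ij} \neq \pi_{i+}\pi_{+j}$, for some i and j.
Step 2: We have n = 564, I = 2, J = 2, and we choose $\alpha$ = 0.05.
Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = (I - 1)(J - 1) = 1.
Step 4: We will reject the null hypothesis if either $X^2 > \chi^2_{1,\,0.05} = 3.84$ or $G^2 > \chi^2_{1,\,0.05} = 3.84$.
Step 5: From the table above or the SAS output below, we have $X^2$ = 6.5350 with a p-value = 0.0106, and $G^2$ = 7.1561 with a p-value = 0.0075.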
Comparison Between Race and Party
The FREQ Procedure
Table of race by party
race party
Frequency|Democrat|Independ|  Total
         |        |ent     |
---------+--------+--------+
Black    |    103 |     15 |    118
---------+--------+--------+
White    |    341 |    105 |    446
---------+--------+--------+
Total         444      120      564
Statistics for Table of race by party
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 1 6.5350 0.0106
Likelihood Ratio Chi-Square 1 7.1561 0.0075
Continuity Adj. Chi-Square 1 5.9044 0.0151
Mantel-Haenszel Chi-Square 1 6.5234 0.0106
Phi Coefficient 0.1076
Contingency Coefficient 0.1070
Cramer's V 0.1076
Statistics for Table of race by party
Statistic Value ASE
------------------------------------------------------
Gamma 0.3578 0.1299
Kendall's Tau-b 0.1076 0.0361
Stuart's Tau-c 0.0717 0.0246
Somers' D C|R 0.1083 0.0367
Somers' D R|C 0.1070 0.0362
Pearson Correlation 0.1076 0.0361
Spearman Correlation 0.1076 0.0361
Lambda Asymmetric C|R 0.0000 0.0000
Lambda Asymmetric R|C 0.0000 0.0000
Lambda Symmetric 0.0000 0.0000
Uncertainty Coefficient C|R 0.0123 0.0087
Uncertainty Coefficient R|C 0.0124 0.0087
Uncertainty Coefficient Symmetric 0.0123 0.0087
Estimates of the Relative Risk (Row1/Row2)
Type of Study Value 95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio) 2.1144 1.1789 3.7921
Cohort (Col1 Risk) 1.1417 1.0476 1.2442
Cohort (Col2 Risk) 0.5400 0.3270 0.8916
Sample Size = 564
Step 6: We reject the null hypothesis at the 0.05 level of significance. We have sufficient evidence to conclude that Race and Party Affiliation are not independent. The phi coefficient is 0.1076, showing a relatively weak positive association between Race and Party Affiliation, when Party Affiliation includes only Democrat v. Independent.
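The phi coefficient quoted in Step 6 can be recovered directly from the 2 x 2 counts. Below is a small Python sketch, assuming the standard definition phi = (n11 n22 - n12 n21) / sqrt(n1+ n2+ n+1 n+2); the helper name `phi_coefficient` is hypothetical and does not come from the original solution. It also checks the identity X2 = n * phi^2 that links phi to the Pearson statistic in a 2 x 2 table.

```python
import math


def phi_coefficient(n11, n12, n21, n22):
    """Signed phi coefficient for a 2 x 2 table of counts."""
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    return (n11 * n22 - n12 * n21) / math.sqrt(row1 * row2 * col1 * col2)


# First subtable: rows Black, White; columns Democrat, Independent.
n = 103 + 15 + 341 + 105
phi = phi_coefficient(103, 15, 341, 105)
x2_from_phi = n * phi ** 2  # Pearson X2 implied by phi

print(f"phi = {phi:.4f}")
print(f"n * phi^2 = {x2_from_phi:.4f}")
```

The sign of phi depends on how the rows and columns are ordered; with Black and Democrat listed first, a positive value means Black respondents are relatively more likely to be Democrats than Independents.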
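Second subtable; X = Race, Y = Party Affiliation (Democrat/Independent v. Republican).
Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.
HA: $\pi_{ij} \neq \pi_{i+}\pi_{+j}$, for some i and j.
Step 2: We have n = 980, I = 2, J = 2, and we choose $\alpha$ = 0.05.
Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = (I - 1)(J - 1) = 1.
Step 4: We will reject the null hypothesis if either $X^2 > \chi^2_{1,\,0.05} = 3.84$ or $G^2 > \chi^2_{1,\,0.05} = 3.84$.
Step 5: From the table above or the SAS output below, we have $X^2$ = 69.9721 with a p-value < 0.0001, and $G^2$ = 83.1750 with a p-value < 0.0001.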
Comparison Between Race and Party Affiliation
The FREQ Procedure
Table of race by party
race party
Frequency|Democrat|Republic|  Total
         |/Indepen|an      |
         |dent    |        |
---------+--------+--------+
Black    |    118 |     11 |    129
---------+--------+--------+
White    |    446 |    405 |    851
---------+--------+--------+
Total         564      416      980
Statistics for Table of race by party
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 1 69.9721 <.0001
Likelihood Ratio Chi-Square 1 83.1750 <.0001
Continuity Adj. Chi-Square 1 68.3822 <.0001
Mantel-Haenszel Chi-Square 1 69.9007 <.0001
Phi Coefficient 0.2672
Contingency Coefficient 0.2582
Cramer's V 0.2672
Statistics for Table of race by party
Statistic Value ASE
------------------------------------------------------
Gamma 0.8138 0.0545
Kendall's Tau-b 0.2672 0.0222
Stuart's Tau-c 0.1786 0.0185
Somers' D C|R 0.3906 0.0300
Somers' D R|C 0.1828 0.0188
Pearson Correlation 0.2672 0.0222
Spearman Correlation 0.2672 0.0222
Lambda Asymmetric C|R 0.0000 0.0000
Lambda Asymmetric R|C 0.0000 0.0000
Lambda Symmetric 0.0000 0.0000
Uncertainty Coefficient C|R 0.0623 0.0117
Uncertainty Coefficient R|C 0.1090 0.0191
Uncertainty Coefficient Symmetric 0.0792 0.0143
Estimates of the Relative Risk (Row1/Row2)
Type of Study Value 95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio) 9.7411 5.1758 18.3332
Cohort (Col1 Risk) 1.7454 1.6065 1.8963
Cohort (Col2 Risk) 0.1792 0.1014 0.3167
Sample Size = 980
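Step 6: We reject the null hypothesis at the 0.05 level of significance. We have sufficient evidence to conclude that Race and Party Affiliation are not independent. In particular, the phi coefficient is 0.2672, showing a somewhat weak positive association between Race and Party Affiliation when Party Affiliation is dichotomized as Democrat/Independent v. Republican. The two likelihood-ratio components sum to the overall statistic from part a): 7.1561 + 83.1750 = 90.3311. The second component is by far the larger, so most of the association reflects the split between Republicans and everyone else rather than the choice between Democrat and Independent.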
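The partition in part c) works because the likelihood-ratio statistic is exactly additive over these two subtables. The following Python sketch (an illustration, not the original author's code; the function name `g2` is an assumption) recomputes the three likelihood-ratio statistics from the counts and compares the sum of the components with the overall value.

```python
import math


def g2(table):
    """Likelihood-ratio statistic for independence in a two-way table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return 2 * sum(
        n_ij * math.log(n_ij * n / (rows[i] * cols[j]))
        for i, r in enumerate(table)
        for j, n_ij in enumerate(r)
        if n_ij > 0  # empty cells contribute 0
    )


# Rows: Black, White; columns: Democrat, Independent, Republican.
full = [[103, 15, 11],
        [341, 105, 405]]

# Component 1: Democrat v. Independent (Republicans dropped).
dem_vs_ind = [[r[0], r[1]] for r in full]
# Component 2: Democrat and Independent combined v. Republican.
demind_vs_rep = [[r[0] + r[1], r[2]] for r in full]

g2_full = g2(full)
g2_first = g2(dem_vs_ind)
g2_second = g2(demind_vs_rep)

print(f"G2 overall       = {g2_full:.4f}")
print(f"G2 first + second = {g2_first:.4f} + {g2_second:.4f} = {g2_first + g2_second:.4f}")
```

The Pearson statistics in the SAS output do not add up in the same way: 6.5350 + 69.9721 = 76.5071, which differs from the overall value 79.4310. This is why the partition is normally reported in terms of the likelihood-ratio statistic.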
d) Summarize association by constructing a 95% confidence interval for the odds ratio between Race and whether a Democrat or Republican. Interpret.
If we look only at Democrats and Republicans, we have n = 860, and the estimated odds of a Black person being a Democrat are 11.1210 times the odds of a White person being a Democrat. A 95% confidence interval for the odds ratio is (5.8747, 21.0524). Because this interval does not contain 1, we conclude that the odds ratio differs significantly from 1 at the 0.05 level.
Estimates of the Relative Risk (Row1/Row2)
Type of Study Value 95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio) 11.1210 5.8747 21.0524
Cohort (Col1 Risk) 1.9766 1.7911 2.1813
Cohort (Col2 Risk) 0.1777 0.1010 0.3129
Sample Size = 860
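The odds ratio and its confidence limits in the SAS output can be reproduced with the large-sample (Wald) interval on the log scale. This is a minimal Python sketch under that assumption; the function name `odds_ratio_ci` is hypothetical, and z = 1.96 is used as a rounded normal quantile, so the last digit may differ slightly from SAS.

```python
import math


def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Sample odds ratio and Wald confidence limits for a 2 x 2 table."""
    theta = (n11 * n22) / (n12 * n21)
    # Standard error of log(theta): square root of summed reciprocal counts.
    se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    lower = math.exp(math.log(theta) - z * se_log)
    upper = math.exp(math.log(theta) + z * se_log)
    return theta, lower, upper


# Rows: Black, White; columns: Democrat, Republican (Independents dropped).
theta, lower, upper = odds_ratio_ci(103, 11, 341, 405)
print(f"odds ratio = {theta:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Building the interval for log(theta) and then exponentiating is what makes it asymmetric around 11.1210: the upper limit is much farther from the estimate than the lower limit.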
3.9. The table below classifies a sample of psychiatric patients by their diagnosis and by whether their treatment prescribed drugs.
Diagnosis / Drugs / No drugs
Schizophrenia / 105 / 8
Affective disorder / 12 / 2
Neurosis / 18 / 19
Personality disorder / 47 / 52
Special symptoms / 0 / 13
b) Partition chi-squared into three components to describe differences and similarities among the diagnoses, by comparing i) the first two rows, ii) the third and fourth rows, and iii) the last row to the first and second rows combined and the third and fourth rows combined.
i) The comparison of X = Diagnosis v. Y = Treatment, with X having two values: 1 = Schizophrenia and 2 = Affective Disorder.
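Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.
HA: $\pi_{ij} \neq \pi_{i+}\pi_{+j}$, for some i and j.
Step 2: We have n = 127, I = 2, J = 2, and we choose $\alpha$ = 0.05.
Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = (I - 1)(J - 1) = 1.
Step 4: We will reject the null hypothesis if either $X^2 > \chi^2_{1,\,0.05} = 3.84$ or $G^2 > \chi^2_{1,\,0.05} = 3.84$.
Step 5: From the table above or the SAS output below, we have $X^2$ = 0.8917 with a p-value = 0.3450, and $G^2$ = 0.7530 with a p-value = 0.3855.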
Relationship Between Diagnosis
And Treatment
The FREQ Procedure
Table of diag by drug
diag drug
Frequency        |Drugs   |No Drugs|  Total
-----------------+--------+--------+
Schizophrenia    |    105 |      8 |    113
-----------------+--------+--------+
Affective Disord |     12 |      2 |     14
er               |        |        |
-----------------+--------+--------+
Total                 117       10      127
Statistics for Table of diag by drug
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 1 0.8917 0.3450
Likelihood Ratio Chi-Square 1 0.7530 0.3855
Continuity Adj. Chi-Square 1 0.1750 0.6757
Mantel-Haenszel Chi-Square 1 0.8847 0.3469
Phi Coefficient 0.0838
Contingency Coefficient 0.0835
Cramer's V 0.0838
WARNING: 25% of the cells have expected counts less
than 5. Chi-Square may not be a valid test.
Statistics for Table of diag by drug
Statistic Value ASE
------------------------------------------------------
Gamma 0.3725 0.3648
Kendall's Tau-b 0.0838 0.1110
Stuart's Tau-c 0.0283 0.0384
Somers' D C|R 0.0721 0.0966
Somers' D R|C 0.0974 0.1296
Pearson Correlation 0.0838 0.1110
Spearman Correlation 0.0838 0.1110
Lambda Asymmetric C|R 0.0000 0.0000
Lambda Asymmetric R|C 0.0000 0.0000
Lambda Symmetric 0.0000 0.0000
Uncertainty Coefficient C|R 0.0108 0.0265
Uncertainty Coefficient R|C 0.0085 0.0211
Uncertainty Coefficient Symmetric 0.0095 0.0234
Estimates of the Relative Risk (Row1/Row2)
Type of Study Value 95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio) 2.1875 0.4157 11.5117
Cohort (Col1 Risk) 1.0841 0.8701 1.3506
Cohort (Col2 Risk) 0.4956 0.1166 2.1054
Sample Size = 127
Step 6: We fail to reject the null hypothesis at the 0.05 level of significance. We do not have sufficient evidence to conclude that there is a relationship between Diagnosis and Treatment when Diagnosis is dichotomized as either 1 = Schizophrenia or 2 = Affective Disorder.
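The SAS warning for this subtable ("25% of the cells have expected counts less than 5") can be checked by computing the estimated expected counts under independence. A short Python sketch, offered only as an illustration and assuming the usual formula (row total)(column total)/n; the variable names are my own:

```python
# Rows: Schizophrenia, Affective disorder; columns: Drugs, No drugs.
observed = [[105, 8],
            [12, 2]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Estimated expected count for each cell under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Share of cells whose expected count falls below 5.
n_cells = len(row_totals) * len(col_totals)
small = [e for row in expected for e in row if e < 5]
share_small = len(small) / n_cells

for row in expected:
    print([round(e, 2) for e in row])
print(f"{share_small:.0%} of cells have an expected count below 5")
```

Only the Affective disorder / No drugs cell is small, with an expected count of about 1.1, so the chi-square approximation for this component should be read with some caution. The conclusion in Step 6 is unchanged either way, since neither p-value is close to 0.05.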
ii) The comparison of X = Diagnosis v. Y = Treatment, with X having two values: 1 = Neurosis and 2 = Personality Disorder.
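Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.
HA: $\pi_{ij} \neq \pi_{i+}\pi_{+j}$, for some i and j.
Step 2: We have n = 136, I = 2, J = 2, and we choose $\alpha$ = 0.05.
Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = (I - 1)(J - 1) = 1.
Step 4: We will reject the null hypothesis if either $X^2 > \chi^2_{1,\,0.05} = 3.84$ or $G^2 > \chi^2_{1,\,0.05} = 3.84$.
Step 5: From the table above or the SAS output below, we have $X^2$ = 0.0149 with a p-value = 0.9029, and $G^2$ = 0.0149 with a p-value = 0.9029.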
Relationship Between Diagnosis
And Treatment
The FREQ Procedure
Table of diag by drug
diag drug
Frequency        |Drugs   |No Drugs|  Total
-----------------+--------+--------+
Neurosis         |     18 |     19 |     37
-----------------+--------+--------+
Personality Diso |     47 |     52 |     99
rder             |        |        |
-----------------+--------+--------+
Total                  65       71      136
Statistics for Table of diag by drug
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 1 0.0149 0.9029
Likelihood Ratio Chi-Square 1 0.0149 0.9029
Continuity Adj. Chi-Square 1 0.0000 1.0000
Mantel-Haenszel Chi-Square 1 0.0148 0.9033
Phi Coefficient 0.0105
Contingency Coefficient 0.0105
Cramer's V 0.0105
Statistics for Table of diag by drug
Statistic Value ASE
------------------------------------------------------
Gamma 0.0235 0.1927
Kendall's Tau-b 0.0105 0.0858
Stuart's Tau-c 0.0093 0.0763
Somers' D C|R 0.0117 0.0963
Somers' D R|C 0.0093 0.0764
Pearson Correlation 0.0105 0.0858
Spearman Correlation 0.0105 0.0858
Lambda Asymmetric C|R 0.0000 0.0000
Lambda Asymmetric R|C 0.0000 0.0000
Lambda Symmetric 0.0000 0.0000
Uncertainty Coefficient C|R 0.0001 0.0013
Uncertainty Coefficient R|C 0.0001 0.0015
Uncertainty Coefficient Symmetric 0.0001 0.0014
Estimates of the Relative Risk (Row1/Row2)
Type of Study Value 95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio) 1.0482 0.4923 2.2318
Cohort (Col1 Risk) 1.0247 0.6934 1.5143
Cohort (Col2 Risk) 0.9777 0.6785 1.4087
Sample Size = 136
Step 6: We fail to reject the null hypothesis at the 0.05 level of significance. We do not have sufficient evidence to conclude that there is a relationship between Diagnosis and Treatment, when Diagnosis is dichotomized as either 1 = Neurosis or 2 = Personality Disorder.
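For the two 2 x 2 components i) and ii), the Pearson statistic and its p-value can be verified without statistical software. The sketch below is an illustration in Python, not the original author's method. It assumes the shortcut formula n(ad - bc)^2 / (product of the four margins) and the fact that a chi-square variable with 1 d.f. is a squared standard normal, so its upper-tail probability is erfc(sqrt(x/2)); the helper name `pearson_2x2` is hypothetical.

```python
import math


def pearson_2x2(a, b, c, d):
    """Pearson X2 and its 1-d.f. p-value for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    x2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 d.f., P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(x2 / 2))
    return x2, p_value


# i) Schizophrenia v. Affective disorder (columns: Drugs, No drugs).
x2_i, p_i = pearson_2x2(105, 8, 12, 2)
# ii) Neurosis v. Personality disorder.
x2_ii, p_ii = pearson_2x2(18, 19, 47, 52)

print(f"i)  X2 = {x2_i:.4f}, p = {p_i:.4f}")
print(f"ii) X2 = {x2_ii:.4f}, p = {p_ii:.4f}")
```

Both p-values are well above 0.05, matching the conclusions in i) and ii) that the rows within each pair do not differ detectably in how often drugs were prescribed.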
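iii) The comparison of X = Diagnosis v. Y = Treatment, with X having three values: 1 = Schizophrenia/Affective Disorder, 2 = Neurosis/Personality Disorder, or 3 = Special Symptoms.
Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2, 3 and j = 1, 2.
HA: $\pi_{ij} \neq \pi_{i+}\pi_{+j}$, for some i and j.
Step 2: We have n = 276, I = 3, J = 2, and we choose $\alpha$ = 0.05.
Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = (I - 1)(J - 1) = 2.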