STA 6505, Fall 2008, Homework #3 Solutions

Chapter 3: 3.4acd, 3.9b, 3.11a, 3.13ab (no need to discuss how the small-sample C.I. was calculated; it is somewhat complicated)

3.4. We have the following contingency table, with explanatory variable X = race, having two levels, Black and White, and response variable Y = party affiliation, having three levels, Democrat, Independent, and Republican.

Party Affiliation
Race / Democrat / Independent / Republican
Black / 103 / 15 / 11
White / 341 / 105 / 405

a) Using $X^2$ and $G^2$, test the hypothesis of independence between Party Affiliation and Race. Report the p-values and interpret.

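Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2, 3.

HA: $\pi_{ij} \ne \pi_{i+}\pi_{+j}$, for some i and j.

Step 2: We have n = 980, I = 2, J = 3, and we choose $\alpha$ = 0.05.

Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = (I - 1)(J - 1) = 2.

Step 4: We will reject the null hypothesis if either $X^2 \ge \chi^2_{2,\,0.05} = 5.99$ or $G^2 \ge \chi^2_{2,\,0.05} = 5.99$.

Step 5: From the table above or the SAS output below, we have $X^2 = 79.4310$ with a p-value < 0.0001, and $G^2 = 90.3311$ with a p-value < 0.0001.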

Comparison Between Race and Party

The FREQ Procedure

Table of race by party

race party

Frequency | Democrat | Independent | Republican | Total
----------+----------+-------------+------------+------
Black     |      103 |          15 |         11 |   129
White     |      341 |         105 |        405 |   851
----------+----------+-------------+------------+------
Total     |      444 |         120 |        416 |   980

Statistics for Table of race by party

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 2 79.4310 <.0001

Likelihood Ratio Chi-Square 2 90.3311 <.0001

Mantel-Haenszel Chi-Square 1 79.3336 <.0001

Phi Coefficient 0.2847

Contingency Coefficient 0.2738

Cramer's V 0.2847

Sample Size = 980

Step 6: We reject the null hypothesis at the 0.05 level of significance. We have sufficient evidence to conclude that Race and Party Affiliation are not independent.
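
As a cross-check on the SAS values, both statistics can be recomputed directly from the observed counts. The following is a minimal Python sketch and not part of the original SAS solution; the function name chi_square_stats is hypothetical, and the closed-form p-value exp(-x/2) is used only because d.f. = 2 here.

```python
import math

# Observed counts from the table in 3.4 (rows: Black, White;
# columns: Democrat, Independent, Republican).
observed = [[103, 15, 11],
            [341, 105, 405]]


def chi_square_stats(table):
    """Return (X2, G2, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    g2 = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            mu_ij = row_totals[i] * col_totals[j] / n  # expected count under H0
            x2 += (n_ij - mu_ij) ** 2 / mu_ij
            if n_ij > 0:  # a zero cell contributes 0 to G2
                g2 += 2 * n_ij * math.log(n_ij / mu_ij)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, g2, df


x2, g2, df = chi_square_stats(observed)
# For df = 2 the chi-square upper-tail probability is exactly exp(-x / 2).
p_x2 = math.exp(-x2 / 2)
p_g2 = math.exp(-g2 / 2)
print(f"X2 = {x2:.4f}, G2 = {g2:.4f}, df = {df}")
print(f"p-values: {p_x2:.3g}, {p_g2:.3g}")
```

Both p-values come out far below 0.0001, in agreement with the "<.0001" entries in the SAS listing.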

c) Partition chi-squared into components regarding the choice between Democrat and Independent and between those two combined and Republican. Interpret.

First subtable; X = Race, Y = Party Affiliation (Democrat v. Independent).

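Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.

HA: $\pi_{ij} \ne \pi_{i+}\pi_{+j}$, for some i and j.

Step 2: We have n = 564, I = 2, J = 2, and we choose $\alpha$ = 0.05.

Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = 1.

Step 4: We will reject the null hypothesis if either $X^2 \ge \chi^2_{1,\,0.05} = 3.84$ or $G^2 \ge \chi^2_{1,\,0.05} = 3.84$.

Step 5: From the table above or the SAS output below, we have $X^2 = 6.5350$ with a p-value = 0.0106, and $G^2 = 7.1561$ with a p-value = 0.0075.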

Comparison Between Race and Party

The FREQ Procedure

Table of race by party

race party

Frequency | Democrat | Independent | Total
----------+----------+-------------+------
Black     |      103 |          15 |   118
White     |      341 |         105 |   446
----------+----------+-------------+------
Total     |      444 |         120 |   564

Statistics for Table of race by party

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 1 6.5350 0.0106

Likelihood Ratio Chi-Square 1 7.1561 0.0075

Continuity Adj. Chi-Square 1 5.9044 0.0151

Mantel-Haenszel Chi-Square 1 6.5234 0.0106

Phi Coefficient 0.1076

Contingency Coefficient 0.1070

Cramer's V 0.1076

Statistic Value ASE

------------------------------------------------------

Gamma 0.3578 0.1299

Kendall's Tau-b 0.1076 0.0361

Stuart's Tau-c 0.0717 0.0246

Somers' D C|R 0.1083 0.0367

Somers' D R|C 0.1070 0.0362

Pearson Correlation 0.1076 0.0361

Spearman Correlation 0.1076 0.0361

Lambda Asymmetric C|R 0.0000 0.0000

Lambda Asymmetric R|C 0.0000 0.0000

Lambda Symmetric 0.0000 0.0000

Uncertainty Coefficient C|R 0.0123 0.0087

Uncertainty Coefficient R|C 0.0124 0.0087

Uncertainty Coefficient Symmetric 0.0123 0.0087

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits

-----------------------------------------------------------------

Case-Control (Odds Ratio) 2.1144 1.1789 3.7921

Cohort (Col1 Risk) 1.1417 1.0476 1.2442

Cohort (Col2 Risk) 0.5400 0.3270 0.8916

Sample Size = 564

Step 6: We reject the null hypothesis at the 0.05 level of significance. We have sufficient evidence to conclude that Race and Party Affiliation are not independent. The phi coefficient is 0.1076, showing a relatively weak positive association between Race and Party Affiliation, when Party Affiliation includes only Democrat v. Independent.
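
The phi coefficient quoted above can be recovered by hand from the four cell counts. The following is a small Python sketch added for illustration (the variable names are assumptions, not SAS names); it also shows the 2x2 identity $X^2 = n\phi^2$ that links phi to the Pearson statistic.

```python
import math

# First subtable (rows: Black, White; columns: Democrat, Independent).
a, b = 103, 15
c, d = 341, 105

# For a 2x2 table, phi = (ad - bc) / sqrt(product of the four marginal totals).
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
n = a + b + c + d
x2 = n * phi ** 2  # Pearson X2 equals n * phi^2 for a 2x2 table
print(f"phi = {phi:.4f}, X2 = {x2:.4f}")
```

The sign of phi depends only on the ordering of the rows and columns, so "positive" here reflects the Black/White and Democrat/Independent ordering used in the table.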

Second subtable; X = Race, Y = Party Affiliation (Democrat/Independent v. Republican).

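Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.

HA: $\pi_{ij} \ne \pi_{i+}\pi_{+j}$, for some i and j.

Step 2: We have n = 980, I = 2, J = 2, and we choose $\alpha$ = 0.05.

Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = 1.

Step 4: We will reject the null hypothesis if either $X^2 \ge \chi^2_{1,\,0.05} = 3.84$ or $G^2 \ge \chi^2_{1,\,0.05} = 3.84$.

Step 5: From the table above or the SAS output below, we have $X^2 = 69.9721$ with a p-value < 0.0001, and $G^2 = 83.1750$ with a p-value < 0.0001.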

Comparison Between Race and Party Affiliation

The FREQ Procedure

Table of race by party

race party

Frequency | Democrat/Independent | Republican | Total
----------+----------------------+------------+------
Black     |                  118 |         11 |   129
White     |                  446 |        405 |   851
----------+----------------------+------------+------
Total     |                  564 |        416 |   980

Statistics for Table of race by party

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 1 69.9721 <.0001

Likelihood Ratio Chi-Square 1 83.1750 <.0001

Continuity Adj. Chi-Square 1 68.3822 <.0001

Mantel-Haenszel Chi-Square 1 69.9007 <.0001

Phi Coefficient 0.2672

Contingency Coefficient 0.2582

Cramer's V 0.2672

Statistic Value ASE

------------------------------------------------------

Gamma 0.8138 0.0545

Kendall's Tau-b 0.2672 0.0222

Stuart's Tau-c 0.1786 0.0185

Somers' D C|R 0.3906 0.0300

Somers' D R|C 0.1828 0.0188

Pearson Correlation 0.2672 0.0222

Spearman Correlation 0.2672 0.0222

Lambda Asymmetric C|R 0.0000 0.0000

Lambda Asymmetric R|C 0.0000 0.0000

Lambda Symmetric 0.0000 0.0000

Uncertainty Coefficient C|R 0.0623 0.0117

Uncertainty Coefficient R|C 0.1090 0.0191

Uncertainty Coefficient Symmetric 0.0792 0.0143

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits

-----------------------------------------------------------------

Case-Control (Odds Ratio) 9.7411 5.1758 18.3332

Cohort (Col1 Risk) 1.7454 1.6065 1.8963

Cohort (Col2 Risk) 0.1792 0.1014 0.3167

Sample Size = 980

Step 6: We reject the null hypothesis at the 0.05 level of significance. We have sufficient evidence to conclude that Race and Party Affiliation are not independent. In particular, the phi coefficient is 0.2672, showing a somewhat weak positive correlation between Race and Party Affiliation when Party Affiliation is dichotomized as Democrat/Independent v. Republican.
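
The point of the partition is that the two subtable statistics account for the overall one. From the SAS listings, the $G^2$ components add exactly, 7.1561 + 83.1750 = 90.3311, while the $X^2$ components add only approximately, 6.5350 + 69.9721 = 76.5071 against 79.4310 for the full table. The following Python sketch, added as an illustration with a hypothetical helper g2, reproduces the $G^2$ check from the counts.

```python
import math


def g2(table):
    """Likelihood-ratio statistic G2 for a two-way table of counts."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return 2 * sum(n_ij * math.log(n_ij * n / (rows[i] * cols[j]))
                   for i, r in enumerate(table)
                   for j, n_ij in enumerate(r) if n_ij > 0)


full = [[103, 15, 11], [341, 105, 405]]
dem_vs_ind = [[103, 15], [341, 105]]
dem_ind_vs_rep = [[103 + 15, 11], [341 + 105, 405]]

g2_full = g2(full)
g2_parts = g2(dem_vs_ind) + g2(dem_ind_vs_rep)
print(f"{g2(dem_vs_ind):.4f} + {g2(dem_ind_vs_rep):.4f} = {g2_parts:.4f}")
print(f"G2 for the full table: {g2_full:.4f}")
```

Most of the overall association therefore comes from the Democrat/Independent v. Republican comparison, with a much smaller share from Democrat v. Independent.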

d) Summarize association by constructing a 95% confidence interval for the odds ratio between Race and whether a Democrat or Republican. Interpret.

If we look only at Democrats and Republicans, we have n = 860, and the estimated odds of a Black person being a Democrat are 11.1210 times the odds of a White person being a Democrat. A 95% confidence interval for the odds ratio is (5.8747, 21.0524). Because this interval does not contain 1, we conclude that the odds ratio differs significantly from 1 at the 0.05 level.

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits

-----------------------------------------------------------------

Case-Control (Odds Ratio) 11.1210 5.8747 21.0524

Cohort (Col1 Risk) 1.9766 1.7911 2.1813

Cohort (Col2 Risk) 0.1777 0.1010 0.3129

Sample Size = 860
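
The odds ratio and its interval can be reproduced by hand with the usual large-sample interval on the log scale. This Python sketch is an illustration under the assumption that the SAS limits are the standard Wald limits for the log odds ratio; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Democrat v. Republican only (rows: Black, White; columns: Democrat, Republican).
n11, n12 = 103, 11
n21, n22 = 341, 405

odds_ratio = (n11 * n22) / (n12 * n21)
# Large-sample standard error of the log odds ratio.
se_log_or = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
lower = math.exp(math.log(odds_ratio) - z * se_log_or)
upper = math.exp(math.log(odds_ratio) + z * se_log_or)
print(f"odds ratio = {odds_ratio:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The interval is symmetric on the log scale, which is why the point estimate is not at the midpoint of the reported limits.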

3.9. The table below classifies a sample of psychiatric patients by their diagnosis and by whether their treatment prescribed drugs.

Diagnosis / Drugs / No drugs
Schizophrenia / 105 / 8
Affective disorder / 12 / 2
Neurosis / 18 / 19
Personality disorder / 47 / 52
Special symptoms / 0 / 13

b) Partition chi-squared into three components to describe differences and similarities among the diagnoses, by comparing i) the first two rows, ii) the third and fourth rows, and iii) the last row to the first and second rows combined and the third and fourth rows combined.

i) The comparison of X = Diagnosis v. Y = Treatment, with X having two values: 1 = Schizophrenia and 2 = Affective Disorder.

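Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.

HA: $\pi_{ij} \ne \pi_{i+}\pi_{+j}$, for some i and j.

Step 2: We have n = 127, I = 2, J = 2, and we choose $\alpha$ = 0.05.

Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = 1.

Step 4: We will reject the null hypothesis if either $X^2 \ge \chi^2_{1,\,0.05} = 3.84$ or $G^2 \ge \chi^2_{1,\,0.05} = 3.84$.

Step 5: From the table above or the SAS output below, we have $X^2 = 0.8917$ with a p-value = 0.3450, and $G^2 = 0.7530$ with a p-value = 0.3855.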

Relationship Between Diagnosis

And Treatment

The FREQ Procedure

Table of diag by drug

diag drug

Frequency          | Drugs | No Drugs | Total
-------------------+-------+----------+------
Schizophrenia      |   105 |        8 |   113
Affective Disorder |    12 |        2 |    14
-------------------+-------+----------+------
Total              |   117 |       10 |   127

Statistics for Table of diag by drug

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 1 0.8917 0.3450

Likelihood Ratio Chi-Square 1 0.7530 0.3855

Continuity Adj. Chi-Square 1 0.1750 0.6757

Mantel-Haenszel Chi-Square 1 0.8847 0.3469

Phi Coefficient 0.0838

Contingency Coefficient 0.0835

Cramer's V 0.0838

WARNING: 25% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Statistic Value ASE

------------------------------------------------------

Gamma 0.3725 0.3648

Kendall's Tau-b 0.0838 0.1110

Stuart's Tau-c 0.0283 0.0384

Somers' D C|R 0.0721 0.0966

Somers' D R|C 0.0974 0.1296

Pearson Correlation 0.0838 0.1110

Spearman Correlation 0.0838 0.1110

Lambda Asymmetric C|R 0.0000 0.0000

Lambda Asymmetric R|C 0.0000 0.0000

Lambda Symmetric 0.0000 0.0000

Uncertainty Coefficient C|R 0.0108 0.0265

Uncertainty Coefficient R|C 0.0085 0.0211

Uncertainty Coefficient Symmetric 0.0095 0.0234

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits

-----------------------------------------------------------------

Case-Control (Odds Ratio) 2.1875 0.4157 11.5117

Cohort (Col1 Risk) 1.0841 0.8701 1.3506

Cohort (Col2 Risk) 0.4956 0.1166 2.1054

Sample Size = 127

Step 6: We fail to reject the null hypothesis at the 0.05 level of significance. We do not have sufficient evidence to conclude that there is a relationship between Diagnosis and Treatment when Diagnosis is dichotomized as either 1 = Schizophrenia or 2 = Affective Disorder.
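
The SAS warning for this subtable (25% of cells with expected counts below 5) can be checked by computing the expected counts under independence. The following Python sketch is added for illustration only; the cutoff of 5 is taken from the SAS warning, and the variable names are hypothetical.

```python
# Subtable i) (rows: Schizophrenia, Affective disorder; columns: Drugs, No drugs).
observed = [[105, 8],
            [12, 2]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence: mu_ij = n_i+ * n_+j / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]
small = [(i, j) for i, row in enumerate(expected)
         for j, mu in enumerate(row) if mu < 5]
share_small = len(small) / (len(row_totals) * len(col_totals))

for row in expected:
    print([round(mu, 2) for mu in row])
print(f"{share_small:.0%} of cells have expected count below 5")
```

Only the Affective disorder / No drugs cell falls below 5, so the chi-square approximation for this comparison should be read with some caution.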

ii) The comparison of X = Diagnosis v. Y = Treatment, with X having two values: 1 = Neurosis and 2 = Personality Disorder.

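Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2 and j = 1, 2.

HA: $\pi_{ij} \ne \pi_{i+}\pi_{+j}$, for some i and j.

Step 2: We have n = 136, I = 2, J = 2, and we choose $\alpha$ = 0.05.

Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = 1.

Step 4: We will reject the null hypothesis if either $X^2 \ge \chi^2_{1,\,0.05} = 3.84$ or $G^2 \ge \chi^2_{1,\,0.05} = 3.84$.

Step 5: From the table above or the SAS output below, we have $X^2 = 0.0149$ with a p-value = 0.9029, and $G^2 = 0.0149$ with a p-value = 0.9029.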

Relationship Between Diagnosis

And Treatment

The FREQ Procedure

Table of diag by drug

diag drug

Frequency            | Drugs | No Drugs | Total
---------------------+-------+----------+------
Neurosis             |    18 |       19 |    37
Personality Disorder |    47 |       52 |    99
---------------------+-------+----------+------
Total                |    65 |       71 |   136

Statistics for Table of diag by drug

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 1 0.0149 0.9029

Likelihood Ratio Chi-Square 1 0.0149 0.9029

Continuity Adj. Chi-Square 1 0.0000 1.0000

Mantel-Haenszel Chi-Square 1 0.0148 0.9033

Phi Coefficient 0.0105

Contingency Coefficient 0.0105

Cramer's V 0.0105

Statistic Value ASE

------------------------------------------------------

Gamma 0.0235 0.1927

Kendall's Tau-b 0.0105 0.0858

Stuart's Tau-c 0.0093 0.0763

Somers' D C|R 0.0117 0.0963

Somers' D R|C 0.0093 0.0764

Pearson Correlation 0.0105 0.0858

Spearman Correlation 0.0105 0.0858

Lambda Asymmetric C|R 0.0000 0.0000

Lambda Asymmetric R|C 0.0000 0.0000

Lambda Symmetric 0.0000 0.0000

Uncertainty Coefficient C|R 0.0001 0.0013

Uncertainty Coefficient R|C 0.0001 0.0015

Uncertainty Coefficient Symmetric 0.0001 0.0014

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits

-----------------------------------------------------------------

Case-Control (Odds Ratio) 1.0482 0.4923 2.2318

Cohort (Col1 Risk) 1.0247 0.6934 1.5143

Cohort (Col2 Risk) 0.9777 0.6785 1.4087

Sample Size = 136

Step 6: We fail to reject the null hypothesis at the 0.05 level of significance. We do not have sufficient evidence to conclude that there is a relationship between Diagnosis and Treatment, when Diagnosis is dichotomized as either 1 = Neurosis or 2 = Personality Disorder.

iii) The comparison of X = Diagnosis v. Y = Treatment, with X having three values: 1 = Schizophrenia/Affective Disorder, 2 = Neurosis/Personality Disorder, or 3 = Special Symptoms.

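Step 1: H0: $\pi_{ij} = \pi_{i+}\pi_{+j}$, for all i = 1, 2, 3 and j = 1, 2.

HA: $\pi_{ij} \ne \pi_{i+}\pi_{+j}$, for some i and j.

Step 2: We have n = 276, I = 3, J = 2, and we choose $\alpha$ = 0.05.

Step 3: The test statistic is either $X^2 = \sum_{i,j} (n_{ij} - \hat{\mu}_{ij})^2 / \hat{\mu}_{ij}$ or $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat{\mu}_{ij})$, where $\hat{\mu}_{ij} = n_{i+} n_{+j} / n$ for all i, j, and under the null hypothesis, either statistic has an approximate chi-square distribution with d.f. = 2.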