STAT 701 – Assignment #7 (Due between Oct. 20th)
(49 points)
1 - Prostate-Specific Antigen (PSA) Levels and Cancer Diagnosis
Babaian et al., "The Role of Prostate-Specific Antigen as Part of the Diagnostic Triad and as a Guide When to Perform a Biopsy", Cancer, 68 (1991), state that prostate-specific antigen (PSA), found in the ductal epithelial cells of the prostate, is specific for prostatic tissue and is detectable in serum from men with normal prostates and men with either benign or malignant diseases of this gland. They determined the PSA values in a sample of 124 men who underwent a prostate biopsy. Sixty-seven of the men had elevated PSA values (> 4 ng/ml). Of these, 46 were diagnosed as having cancer. Ten of the 57 men with PSA values ≤ 4 ng/ml had cancer.
Research Question: On the basis of these data may we conclude that, in general, men with elevated PSA values are more likely to have prostate cancer?
Use both the standard normal test statistic and Fisher's Exact Test to test the hypothesis of interest. To perform Fisher's Exact Test you will need to enter the data in JMP using three columns. The first column should contain the PSA level (high or low), the second column should contain the result of the prostate cancer diagnosis (yes or no), and the last column (Freq) should specify the appropriate cell frequencies. Make sure you place Freq in the appropriate field in the Fit Y by X dialog box.
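For readers who want to check the JMP output independently, the three-column layout described above can be sketched in Python. This is only an illustration using the standard library: the row labels and the helper names `to_2x2` and `fisher_upper_tail` are made up for this sketch, and the two "no cancer" counts (21 and 47) are obtained by subtraction from the totals given in the problem.

```python
from math import comb

# Rows mirror the three JMP columns: PSA level, cancer diagnosis, Freq.
# 21 = 67 - 46 and 47 = 57 - 10 are derived from the counts in the problem.
rows = [
    ("high", "yes", 46),
    ("high", "no", 21),
    ("low", "yes", 10),
    ("low", "no", 47),
]

def to_2x2(rows):
    """Collapse (psa, cancer, freq) rows into the four cell counts a, b, c, d."""
    cell = {(psa, cancer): freq for psa, cancer, freq in rows}
    return (cell[("high", "yes")], cell[("high", "no")],
            cell[("low", "yes")], cell[("low", "no")])

def fisher_upper_tail(a, b, c, d):
    """One-sided Fisher p-value P(X >= a), where X is the top-left cell count
    under the hypergeometric null with all table margins held fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    largest = min(row1, col1)
    favourable = sum(comb(row1, k) * comb(n - row1, col1 - k)
                     for k in range(a, largest + 1))
    return favourable / comb(n, col1)

a, b, c, d = to_2x2(rows)
p_one_sided = fisher_upper_tail(a, b, c, d)
print(f"cells = {(a, b, c, d)}, one-sided Fisher p-value = {p_one_sided:.3g}")
```

The upper tail is used because the research question is one-sided (elevated PSA associated with more cancer); JMP reports left, right, and two-sided versions, so compare against the matching one.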
a) Conduct a standard normal test and find a 95% CI for the difference in the proportion of men from each PSA group that have prostate cancer. (6 pts.)
b) Give/show the results of Fisher's Exact Test from JMP (2 pts.)
c) Summarize the results of parts (a) and (b) in terms of the question of interest. (2 pts.)
d) Conduct hypothesis tests to determine if there is sufficient evidence to suggest that the relative risk (RR) and odds ratio (OR) are greater than 1. Briefly summarize your results. (4 pts.)
e) Find and interpret 95% CI’s for both the RR and OR for cancer associated with elevated PSA levels. You may use JMP to do this, but it is worth trying to do one of them “by hand”. Your interpretations of these CI’s should relate back to the question of interest. (4 pts.)
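As a guide to the "by hand" arithmetic in parts (a), (d), and (e), here is a minimal Python sketch. It assumes the usual large-sample forms (pooled standard error for the test, unpooled Wald interval for the difference, and intervals for the RR and OR built on the log scale); these may differ in detail from the forms in the course notes, and the variable names are only illustrative.

```python
from math import sqrt, log, exp
from statistics import NormalDist

# Counts from the problem statement: men with cancer / men in each PSA group.
x1, n1 = 46, 67   # elevated PSA (> 4 ng/ml)
x2, n2 = 10, 57   # PSA <= 4 ng/ml

Z = NormalDist()
z_crit = Z.inv_cdf(0.975)          # roughly 1.96 for a 95% interval
p1, p2 = x1 / n1, x2 / n2

# (a) Pooled z statistic for H0: p1 = p2 versus Ha: p1 > p2.
p_pool = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_value = 1 - Z.cdf(z)

# (a) Unpooled (Wald) 95% CI for p1 - p2.
se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci_diff = (p1 - p2 - z_crit * se_diff, p1 - p2 + z_crit * se_diff)

# (d), (e) Relative risk: test and CI on the log scale.
rr = p1 / p2
se_log_rr = sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
z_rr = log(rr) / se_log_rr         # tests H0: RR = 1 versus Ha: RR > 1
ci_rr = (exp(log(rr) - z_crit * se_log_rr), exp(log(rr) + z_crit * se_log_rr))

# (d), (e) Odds ratio from the 2x2 cell counts a, b, c, d.
a, b, c, d = x1, n1 - x1, x2, n2 - x2
odds_ratio = (a * d) / (b * c)
se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z_or = log(odds_ratio) / se_log_or  # tests H0: OR = 1 versus Ha: OR > 1
ci_or = (exp(log(odds_ratio) - z_crit * se_log_or),
         exp(log(odds_ratio) + z_crit * se_log_or))

print(f"z = {z:.2f}, one-sided p = {p_value:.2g}, CI for difference = {ci_diff}")
print(f"RR = {rr:.2f}, CI = {ci_rr};  OR = {odds_ratio:.2f}, CI = {ci_or}")
```

An interval for the RR or OR that lies entirely above 1 points in the same direction as a significant one-sided test; the interpretation in terms of the research question is still left to you.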
2 – HIV Status and IV Drug Use History of Women in NY Prison System
In a study of HIV infection among women entering the New York State prison system, 475 inmates were cross-classified with respect to HIV seropositivity and their histories of intravenous drug use. The variables you will be working with are coded as follows:
· IV Drug Use – indicator of previous intravenous drug use (Yes or No)
· HIV Status – results of HIV seropositivity test (positive or negative)
and the study results are contained in the data file Prison HIV-Drug Use.JMP.
Research Question: Is there evidence that intravenous drug use is associated with HIV seropositivity?
a) Among women who have used drugs intravenously, what proportion are HIV-positive? Among women who have not used drugs intravenously, what proportion are HIV-positive? (2 pts.)
b) Use Fisher’s Exact Test to determine if being HIV-positive is positively associated with a previous history of intravenous drug use for this population of women. State your conclusion along with a supporting p-value. (2 pts.)
c) Find a 95% CI for the risk difference and interpret. This difference is also referred to as the attributable risk: AR = p_exposed - p_unexposed. (3 pts.)
d) Use your answers to calculate the relative risk (RR) for being HIV-positive associated with intravenous drug use for this population of women. Also find a 95% CI for the RR. Interpret. (4 pts.)
e) Compute the odds ratio (OR) for being HIV-positive associated with intravenous drug use for this population of women. Also find a 95% CI for the OR. Interpret. (4 pts.)
f) Number Needed to Harm (NNH) – This is described in the Powerpoint that goes along with this material. You can also go to the following website, which is the first hit you will get when you Google Search: Number Needed to Harm.
http://en.wikipedia.org/wiki/Number_needed_to_harm
Read through the Wikipedia entry on this website and then find the Number Needed to Harm for this study. Also find a 95% CI for NNH and interpret. (4 pts.)
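The link between the attributable risk in part (c) and the NNH in part (f) can be sketched as follows. This is a minimal illustration, assuming NNH = 1 / AR and a CI obtained by inverting the limits of a Wald interval for AR. The function name is made up for this sketch and the counts in the usage lines are hypothetical; they are NOT the values in Prison HIV-Drug Use.JMP.

```python
from math import sqrt
from statistics import NormalDist

def risk_difference_and_nnh(x_exp, n_exp, x_unexp, n_unexp, level=0.95):
    """Return AR, its Wald CI, NNH = 1/AR, and the CI for NNH obtained by
    inverting the AR limits. Only meaningful when the AR interval is above 0."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    p_e, p_u = x_exp / n_exp, x_unexp / n_unexp
    ar = p_e - p_u
    se = sqrt(p_e * (1 - p_e) / n_exp + p_u * (1 - p_u) / n_unexp)
    ar_lo, ar_hi = ar - z_crit * se, ar + z_crit * se
    if ar_lo <= 0:
        raise ValueError("AR interval includes 0; inverting its limits is not valid")
    # Taking reciprocals reverses the order of the limits.
    return {"ar": ar, "ar_ci": (ar_lo, ar_hi),
            "nnh": 1 / ar, "nnh_ci": (1 / ar_hi, 1 / ar_lo)}

# Hypothetical counts for illustration only: 30 of 100 exposed and
# 10 of 200 unexposed subjects have the outcome.
result = risk_difference_and_nnh(30, 100, 10, 200)
print(result)
```

With your own counts from the JMP file, the NNH reads as the number of exposed individuals per one additional case attributable to the exposure.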
3 – Retirement Status and Heart Disease
In their paper “Retirement and Primary Cardiac Arrest in Males”, Siscovick et al. (1990) were interested in investigating the potential association between retirement status and heart disease. One concern might be the age of the subjects, as an older person is more likely to be retired and also more likely to have heart disease. In one study, therefore, 127 victims of cardiac arrest were matched on a number of characteristics, including age, with 127 healthy control subjects; retirement status was then ascertained for each subject. The following data were obtained.
  / Retired / Not Retired / Total
Retired / 27 / 12 / 39
Not Retired / 20 / 68 / 88
Total / 47 / 80 / 127
Is there evidence to suggest that the proportion of retired individuals is higher for males who have had a cardiac arrest when compared to similar males who are healthy? Conduct an appropriate test to answer this question and summarize your result. (4 pts.)
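Because the subjects are matched, only the discordant pairs (the two off-diagonal counts, 12 and 20) carry information about the difference in proportions. A minimal Python sketch of McNemar's test, in both its large-sample and exact binomial forms, is given here as one possible "appropriate test". The function name is made up, and which off-diagonal cell belongs to the cardiac arrest victims must be read from the table's row and column labels; that choice sets the sign of z and hence the direction of a one-sided test.

```python
from math import sqrt, comb
from statistics import NormalDist

def mcnemar(b, c):
    """McNemar's test from the two discordant pair counts b and c.
    Returns (z, chi-square, two-sided normal p-value, two-sided exact p-value)."""
    z = (b - c) / sqrt(b + c)
    p_normal = 2 * (1 - NormalDist().cdf(abs(z)))
    # Exact version: under H0 the count b is Binomial(b + c, 1/2).
    n, k = b + c, max(b, c)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    p_exact = min(1.0, 2 * upper_tail)
    return z, z * z, p_normal, p_exact

# Off-diagonal counts from the table above, in the order they appear.
z, chi2, p_normal, p_exact = mcnemar(12, 20)
print(f"z = {z:.3f}, chi-square = {chi2:.3f}, "
      f"two-sided p (normal) = {p_normal:.3f}, two-sided p (exact) = {p_exact:.3f}")
```

For a one-sided alternative, halve the two-sided normal p-value only if the observed difference is in the hypothesized direction.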
4 – Use of Oral Contraceptives and Thromboembolism
In their paper “Thromboembolism and Oral Contraceptives: an Epidemiologic Case-Control Study” Sartwell et al. (1969) conducted a case-control study to examine the potential relationship between thromboembolism and oral contraceptive use. The cases were 175 women of reproductive age (15 – 44), discharged alive from 43 hospitals in five cities after initial attacks of idiopathic (i.e. of unknown cause) thrombophlebitis, pulmonary embolism, or cerebral thrombosis or embolism. The controls were matched with their cases for hospital, residence, time of hospitalization, race, age, marital status, parity, and pay status. More specifically, the controls were female patients from the same hospital during the same 6-month interval. The controls were within 5 years of age and matched on parity (0, 1, 2, 3, or more prior pregnancies). The hospital pay status (ward, semi-private, or private) was the same. The data for oral contraceptive use are presented in the table below:
Case OC Use? / Control OC Use?: Yes / Control OC Use?: No / Total
Yes / 10 / 57 / 67
No / 13 / 95 / 108
Total / 23 / 152 / 175
a) Do these data provide evidence that the population proportion of oral contraceptive use amongst cases is greater than the population proportion of oral contraceptive use amongst similar controls? (4 pts.)
b) OR for Dependent Samples or Matched-Pair Data
(This is NOT in Powerpoint, you are seeing it here for the first time)
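The estimated odds ratio (OR) from matched-pair data is given by
OR = b / c,
where b is the number of pairs in which only the case was exposed and c is the number of pairs in which only the control was exposed,
and one form of the 95% CI for the OR is as follows:
exp( ln(b / c) ± 1.96 · sqrt(1/b + 1/c) )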
Find a 95% CI for the OR in this study and interpret. (4 pts.)
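A minimal Python sketch of the matched-pair calculation follows. It assumes the commonly used estimator (the ratio of the two discordant pair counts) and a Wald interval on the log scale; the form intended in this assignment may differ, and the function and argument names are only illustrative.

```python
from math import sqrt, log, exp
from statistics import NormalDist

def matched_pair_or(case_only, control_only, level=0.95):
    """Matched-pair OR = (pairs where only the case was exposed) /
    (pairs where only the control was exposed), with a log-scale Wald CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    or_hat = case_only / control_only
    se_log = sqrt(1 / case_only + 1 / control_only)
    lower = exp(log(or_hat) - z_crit * se_log)
    upper = exp(log(or_hat) + z_crit * se_log)
    return or_hat, (lower, upper)

# Discordant pairs from the table: 57 pairs where only the case used OCs,
# 13 pairs where only the control used OCs.
or_hat, (lower, upper) = matched_pair_or(57, 13)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The concordant pairs (10 and 95) do not enter the calculation, which is the main difference from the OR for independent samples.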