BSTA 6652 Survival Analysis Winter, 2016
Problem Set 5
Reading: Klein: Chapter 12; SAS textbook: Chapter 4
ATTACH YOUR SAS CODE WITH YOUR ANSWERS.
The data in BMTH.txt (posted on the class webpage) were collected on 43 bone marrow transplant patients at the Ohio State University Bone Marrow Transplant Unit. All patients had either Hodgkin’s disease (HOD) or non-Hodgkin’s lymphoma (NHL) and were given either an Allogeneic (Allo) transplant from an HLA-matched sibling donor or an Autogeneic (Auto) transplant. Also included are two possible explanatory variables: Karnofsky score at transplant and the waiting time in months from diagnosis to transplant. The goal is to study the difference in the leukemia-free survival rate between patients given an Allo or Auto transplant, adjusting for the patient’s disease state (HOD or NHL) and other covariates. The variables in this data set are as follows:
GRAFT / Transplant type (1 = Allo, 2 = Auto)
Disease / Disease state (1 = NHL, 2 = HOD)
Time / Survival time in days
Status / Status of patient (0 = alive, 1 = dead or relapse)
Score / Karnofsky score at transplant
Wait / Waiting time in months from diagnosis to transplant
The data set BMTH.txt is posted online.
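As a starting point, a minimal sketch of reading the file into SAS is shown here. The file path, the column order (taken from the variable list above), and the absence of a header row are all assumptions; check them against the actual file before relying on this.

```sas
/* Sketch only: path, column order, and no-header layout are assumptions. */
data bmth;
  infile "BMTH.txt";   /* adjust the path; add firstobs=2 if the file has a header line */
  input graft disease time status score wait;
run;

/* Print the first few records to confirm the variables were read as intended. */
proc print data=bmth(obs=5);
run;
```

The data set name bmth is arbitrary. If the printed records do not match the variable descriptions, the column order in the input statement needs to be changed.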
(a) Create a SAS dataset including all these variables and ONE new variable for the combination of GRAFT and Disease. Write your SAS code here.
(b) Fit the Accelerated Failure Time (AFT) model including all covariates under the assumption of Lognormal survival times. Write the fitted AFT model and your SAS code.
(c) Fit the Accelerated Failure Time (AFT) model including all covariates under the assumption of Log-logistic survival times. Write the fitted AFT model and your SAS code.
(d) Fit the Accelerated Failure Time (AFT) model including all covariates under the assumption of Weibull survival times. Write the fitted AFT model and your SAS code.
(e) Which model is the best among the three models in parts (b), (c), and (d)? Defend your answer. Remove all prognostic variables that are not significant at the 10% level and fit the reduced model.
i. Write the fitted reduced model.
ii. Interpret based on the reduced model how each variable is associated with the survivorship.
(f) Check the goodness of fit of the reduced model using
i. A probability plot and residual analysis
ii. A comparison with the Generalized Gamma AFT model
Does the reduced model fit the data adequately?
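For reference, the three AFT models in parts (b) through (d) share one log-linear form and differ only in the error distribution. The symbols below are notation assumed here, not taken from the problem statement; the parameterization follows the one commonly used by SAS PROC LIFEREG.

```latex
% Log-linear form of the AFT model with covariates x_1, ..., x_p
\[
  \log T = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \sigma W
\]
% Error distribution W determines the distribution of T:
%   W standard normal                  =>  T lognormal
%   W standard logistic                =>  T log-logistic
%   W standard (minimum) extreme value =>  T Weibull
%
% The generalized gamma model adds a shape parameter \delta:
%   \delta = 0 gives the lognormal model, \delta = 1 gives the Weibull model.
% For a reduced model nested in the generalized gamma model with the same
% covariates, a likelihood ratio statistic can be formed as
\[
  \chi^2 = 2\,\bigl[\ell_{\text{gamma}} - \ell_{\text{reduced}}\bigr],
  \qquad \text{1 degree of freedom}
\]
```

Here the two log-likelihoods are the maximized values from the generalized gamma fit and the reduced fit. The log-logistic model is not a special case of the generalized gamma model, so this likelihood ratio comparison applies only when the reduced model is Weibull or lognormal.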