Supplementary Material To The Manuscript:

Genome-Wide Approach to Identify Risk Factors for Therapy-Related Myeloid Leukemia

Alessia Bogni, Cheng Cheng, Wei Liu, Wenjian Yang, Jessica Pfeffer, Suraj Mukatira, Deborah French, James R. Downing, Ching-Hon Pui, and Mary V. Relling

Departments of Pharmaceutical Sciences, Biostatistics, Hartwell Center, Pathology, and Hematology/Oncology, St. Jude Children’s Research Hospital; Colleges of Medicine and Pharmacy, The University of Tennessee, Memphis, TN, USA.

This work was supported by NCI CA 51001, CA 78224, CA 36401, CA 21765 and the NIH/NIGMS Pharmacogenetics Research Network and Database (U01 GM61393, U01 GM61374; http://pharmgkb.org/) from the National Institutes of Health; by a Center of Excellence grant from the State of Tennessee; and by American Lebanese Syrian Associated Charities (ALSAC). C-H Pui is the American Cancer Society F.M. Kirby Clinical Research Professor.

Corresponding author: Mary V. Relling, Department of Pharmaceutical Sciences, St. Jude Children’s Research Hospital, 332 North Lauderdale, Memphis TN 38105-2794. Phone: 901-495-2348; Fax: 901-525-6869: e-mail:


B-Lineage: Cumulative Incidence Regression Model Results

Using Fine and Gray’s cumulative incidence regression model, we identified 256 probe sets significantly associated with time to t-ML development (α = 0.01; Supplementary Table S1). Principal component analysis (PCA) indicated that the expression of these 256 probe sets differentiated patients who developed t-ML from patients who remained in remission or relapsed (Supplementary Figure S2). On the basis of the differential expression of the 256 probe sets in patients who developed t-ML and in those who did not, we classified the 228 patients into three groups by hierarchical clustering (Supplementary Table S5). All 13 t-ML patients were classified into cluster 1, with a strong association between clustering and type of event (exact Chi-square test, P < 0.0001). With these 256 probe sets, the 7-year cumulative incidence of t-ML was higher in cluster 1 (100%±7.7%) than in cluster 2 (0%) or cluster 3 (0%) (Gray’s test, P < 0.0001; Supplementary Figure S3). No association between genetic subtype of ALL and patient clusters defined by gene expression and hierarchical clustering was observed (Monte Carlo exact Chi-square test, P = 0.01). To address whether these 256 probe sets and the three clusters could have been obtained by chance, we performed the permutation test as described, which yielded a P-value of 0.031 (k = 13).
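The cumulative incidences quoted above treat relapse and death as events that compete with t-ML. As a rough illustration of what such an estimate involves, the following Python sketch computes a nonparametric cumulative incidence at a fixed time horizon. It is not the Fine and Gray regression model or Gray’s test used in this analysis, and the function name, the event coding, and the toy follow-up data are assumptions made only for this example.

```python
def cumulative_incidence(times, events, cause, horizon):
    """Nonparametric cumulative incidence of `cause` up to `horizon`.

    events: 0 = censored (e.g. still in remission), any other integer
    is an event type (assumed coding: 1 = t-ML, 2 = competing event).
    """
    data = sorted(zip(times, events))   # order subjects by follow-up time
    n_at_risk = len(data)
    surv = 1.0   # probability of being free of ANY event just before t
    cif = 0.0    # running cumulative incidence of the cause of interest
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        d_cause = d_any = 0
        j = i
        # Gather every subject whose follow-up ends exactly at time t
        while j < len(data) and data[j][0] == t:
            if data[j][1] == cause:
                d_cause += 1
            if data[j][1] != 0:
                d_any += 1
            j += 1
        cif += surv * d_cause / n_at_risk     # hazard of `cause` at t, weighted
        surv *= 1.0 - d_any / n_at_risk       # update overall event-free survival
        n_at_risk -= j - i                    # events and censored leave the risk set
        i = j
    return cif


# Toy follow-up data in years (hypothetical, not from the study)
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]
print(cumulative_incidence(toy_times, toy_events, cause=1, horizon=7))
```

A competing event removes a patient from the risk set without counting toward the incidence of t-ML, which is why this estimate differs from one minus a Kaplan-Meier curve that censors competing events.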

B-Lineage: Cox Regression Model Results

Using a Cox regression model, we identified 83 probe sets significantly associated with t-ML development (α = 0.01; Supplementary Table S2). PCA of these 83 probe sets differentiated the patients who went on to develop t-ML from those who did not (Supplementary Figure S4). On the basis of the differential expression of the 83 probe sets, hierarchical clustering classified the 228 patients into three groups (Supplementary Table S6), with 12 of the 13 t-ML patients in the same group and a strong association between clustering and type of event (exact Chi-square test, P < 0.0001). With these 83 probe sets, the 7-year cumulative incidence of t-ML was higher in cluster 3 (92%±11%) than in cluster 2 (2%±2%) or cluster 1 (0%) (Gray’s test, P < 0.0001; Supplementary Figure S5). The permutation test yielded a P-value of 0.045 (k = 12).

Statistical Analysis Of The Entire Group Of Patients (Including T- As Well As B-Lineage ALL)

There were 267 patients with T- or B-lineage ALL enrolled on the treatment protocols Total XIIIA and XIIIB who had evaluable gene expression profiles. The frequencies of t-ML and other events were as follows: 14 patients developed t-ML, 47 patients experienced a competing event (relapse or death), and 206 patients were in complete remission at the time of the analysis. Of the 14 patients who developed t-ML, only one had T-lineage ALL as the primary malignancy.

Using the same techniques described in the main manuscript for the largest immunophenotypic group (B-lineage ALL), we found 238 probe sets using Fine and Gray’s cumulative incidence regression model and 84 probe sets using a Cox regression model that were significantly associated with t-ML development in the entire group of 267 patients (T- and B-lineage ALL) (Supplementary Tables S7 and S8, respectively). PCA showed that either the 238 or the 84 probe sets distinguished patients with t-ML from the others (Supplementary Figures S6 and S7, respectively). On the basis of the differential expression of the 238 probe sets or the 84 probe sets, the 267 patients were classified into groups by hierarchical clustering.

Using the 238 probe sets, hierarchical clustering classified all 14 t-ML patients into the same cluster. There was a strong association between clustering and type of event (exact Chi-square test, P < 0.0001; Supplementary Table S9). The 7-year cumulative incidences of t-ML by cluster were 100%±7.1% for cluster 3 and 0% for clusters 1 and 2 (Gray’s test, P < 0.0001; Supplementary Figure S8). The permutation test yielded a P-value < 0.001 (k = 14).

Using the 84 probe sets, hierarchical clustering classified 10 of the 14 t-ML patients into cluster 3 (the total number of patients in the cluster was 10). There was a strong association between clustering and type of event (exact Chi-square test, P < 0.0001; Supplementary Table S10). The 7-year cumulative incidences of t-ML by cluster were 100%±10% for cluster 3, 1.8%±1.3% for cluster 1, and 1.4%±1% for cluster 2 (Gray’s test, P < 0.0001; Supplementary Figure S9). The permutation test yielded a P-value of 0.13 (k = 10).

Sixty-eight probe sets were common to those selected by the two models (Supplementary Table S11) and were used to repeat the hierarchical clustering analysis. PCA of the 68 probe sets separated the t-ML patients from the others (Supplementary Figure S10). Hierarchical clustering classified 13 t-ML patients into cluster 3, which contained 14 patients in total. There was a strong association between clustering and type of event (exact Chi-square test, P < 0.0001; Supplementary Table S12). The 7-year cumulative incidences of t-ML by cluster were 100%±10.7% for cluster 3, 0.5%±0.5% for cluster 1, and 0% for cluster 2 (Gray’s test, P < 0.0001; Supplementary Figure S11). The permutation test yielded a P-value of 0.001 (k = 13).

Real Time RT-PCR

Gene expression was verified by real-time reverse transcription polymerase chain reaction (RT-PCR) assays in a subset of patient samples for four genes (POLIIa, Myb, SLC25A6, and HIST1H4C) that discriminated between patients who did and did not develop t-ML.

One μg of total RNA was treated with DNase I according to the manufacturer’s instructions (Invitrogen, Carlsbad, California). Complementary DNA (cDNA) was then generated using SuperScript II RNase H- reverse transcriptase with random hexamers as primers (Invitrogen). Controls containing either no template or no reverse transcriptase were included as negative controls in each run. To quantify gene expression in each cDNA sample, aliquots of the RT reaction mixture (20 μl) were used for real-time PCR with specific probes designed for POLIIa, Myb, SLC25A6, HIST1H4C, and RNase P (Applied Biosystems, Foster City, California). The expression level of RNase P was used for normalization.

Real-time RT-PCR was performed as described (1). To estimate the amount of each mRNA in the patient samples, we used linear regression analysis based on a standard curve representing six serial dilutions of cDNA (100 ng, 20 ng, 0.4 ng, 0.08 ng, 0.016 ng and 0.0032 ng) obtained from Nalm6 human leukemia cells (American Type Culture Collection, Rockville, Maryland). In the standard curve, fluorescent signal intensities were plotted against the number of PCR cycles on a semi-logarithmic scale.
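The standard-curve step can be sketched in Python as follows. This is a minimal illustration that assumes the common practice of regressing the threshold cycle (Ct) on the log10 of the input cDNA amount and then inverting the fitted line for an unknown sample. The text states only that linear regression on a standard curve was used, so the regression variables, the function names, and all Ct readings below are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_ct(ct, slope, intercept):
    """Invert the standard curve Ct = slope * log10(amount) + intercept."""
    return 10 ** ((ct - intercept) / slope)


# Standard amounts (ng cDNA) as listed in the text; Ct readings are made up.
standard_ng = [100, 20, 0.4, 0.08, 0.016, 0.0032]
standard_ct = [18.1, 20.4, 26.2, 28.5, 30.9, 33.2]   # hypothetical readings

slope, intercept = fit_line([math.log10(a) for a in standard_ng], standard_ct)

# Estimated cDNA-equivalent amount for a hypothetical patient sample
sample_ng = amount_from_ct(24.0, slope, intercept)
print(slope, intercept, sample_ng)
```

In this sketch the slope is negative because a larger starting amount reaches the detection threshold in fewer cycles.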

Each normalized and log-transformed real-time RT-PCR gene expression level (POLIIa, Myb, SLC25A6, and HIST1H4C) was compared with the log-transformed gene expression signal from the Affymetrix MAS 5.0 output. The correlation between real-time RT-PCR and Affymetrix GeneChip® was 71% for POLIIa, 72% for SLC25A6, 73% for Myb, and 73% for HIST1H4C, consistent with expression levels determined using the gene expression array (Supplementary Figure S1).
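The comparison described here can be sketched as follows: normalize each RT-PCR amount to RNase P, log-transform both measurements, and compute a correlation coefficient across samples. The text does not state which correlation coefficient or log base was used, so the Pearson coefficient, the base-2 logarithm, and all sample values below are assumptions made only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-sample values for one gene (not study data)
target_amount = [0.8, 1.5, 2.9, 0.4, 5.1]      # RT-PCR amount of the target gene
rnase_p_amount = [1.0, 1.1, 0.9, 1.2, 1.0]     # RT-PCR amount of RNase P
array_signal = [410.0, 650.0, 1800.0, 300.0, 2500.0]   # array signal, same samples

# Normalize to RNase P, then log-transform both platforms
pcr_log = [math.log2(t / r) for t, r in zip(target_amount, rnase_p_amount)]
array_log = [math.log2(s) for s in array_signal]

r = pearson(pcr_log, array_log)
print(r)
```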

Reference List

1 Cheok MH, Yang W, Pui CH, Downing JR, Cheng C, Naeve CW, Relling MV, Evans WE. Treatment-specific changes in gene expression discriminate in vivo drug response in human leukemia cells. Nat Genet 2003; 34:85-90.


Supplementary Table S1: Candidate genes associated with t-ML development after ALL treatment. Using a cumulative incidence regression model among the 228 B-lineage patients, we found 256 probe sets significantly associated with time to t-ML development (α = 0.01).

Probe ID / Gene Symbol / Description / Relative Expression / P value
1005_at / DUSP1 / dual specificity phosphatase 1 / -0.61 / 0.0088
1108_s_at / EPHA1 / EPH receptor A1 / 0.84 / 0.0029
1130_at / MAP2K1 / mitogen-activated protein kinase kinase 1 / 2.89 / 0.0012
1189_at / CDK8 / cyclin-dependent kinase 8 / -0.80 / 0.0049
1280_i_at / --- / --- / 0.69 / 0.0096
1352_at / IL8RA / interleukin 8 receptor, alpha / 1.10 / 0.0003
1388_g_at / VDR / vitamin D (1,25- dihydroxyvitamin D3) receptor / -0.67 / 0.0034
1403_s_at / CCL5 / chemokine (C-C motif) ligand 5 / 0.89 / 0.0026
152_f_at / HIST2H4 / histone 2, H4 / 0.89 / 0.0019
1550_at / MRPL28 / mitochondrial ribosomal protein L28 / -0.76 / 0.0010
1551_g_at / MRPL28 / mitochondrial ribosomal protein L28 / 2.36 / 0.0081
1652_at / PIM2 / pim-2 oncogene / -1.07 / 0.0067
1685_at / SPHAR / S-phase response (cyclin-related) / 1.02 / 0.0075
1722_at / MAP2K5 / mitogen-activated protein kinase kinase 5 / 0.49 / 0.0021
1801_at / BARD1 / BRCA1 associated RING domain 1 / -1.29 / 0.0061
1832_at / MCC / mutated in colorectal cancers / -0.42 / 0.0019
1852_at / TNF / tumor necrosis factor (TNF superfamily, member 2) / 0.98 / 0.0054
1877_g_at / --- / --- / 0.95 / 0.0063
1913_at / CCNG2 / cyclin G2 / -1.46 / 0.0040
1920_s_at / CCNG1 / cyclin G1 / -1.13 / 0.0002
2017_s_at / CCND1 / cyclin D1 (PRAD1: parathyroid adenomatosis 1) / -0.92 / 0.0008
2053_at / CDH2 / cadherin 2, type 1, N-cadherin (neuronal) / -0.83 / 0.0000
229_at / CEBPZ / CCAAT/enhancer binding protein zeta / 2.55 / 0.0026
278_at / NPR2 / natriuretic peptide receptor B/guanylate cyclase B (atrionatriuretic peptide receptor B) / 1.09 / 0.0000
293_at / --- / --- / 0.76 / 0.0021
299_i_at / --- / --- / 0.50 / 0.0075
31315_at / IGLC2 / Immunoglobulin lambda variable 3-21 / -0.70 / 0.0022
31348_at / RFC1 / replication factor C (activator 1) 1, 145kDa / 0.55 / 0.0064
31472_s_at / CD44 / CD44 antigen (homing function and Indian blood group system) / 0.80 / 0.0068
31488_s_at / PGK1 / phosphoglycerate kinase 1 / 2.05 / 0.0034
31506_s_at / DEFA1 /// DEFA3 / defensin, alpha 1, myeloid-related sequence /// defensin, alpha 3, neutrophil-specific / -0.65 / 0.0014
31603_at / AIM2 / absent in melanoma 2 / 0.61 / 0.0082
31738_at / --- / --- / 0.60 / 0.0028
31773_at / --- / --- / 0.72 / 0.0098
40791_at / POLR2A / RNA polymerase II (220kD) / -2.892 / 0.0080
31903_at / SS18L1 / synovial sarcoma translocation gene on chromosome 18-like 1 / -0.81 / 0.0048
31936_s_at / LKAP / limkain b1 / -0.99 / 0.0058
31943_g_at / TULP3 / tubby like protein 3 / 0.79 / 0.0049
31948_at / RPS21 / ribosomal protein S21 / 1.68 / 0.0020
31987_at / --- / DKFZp564G103 (from clone DKFZp564G103) / 0.90 / 0.0038
32023_at / LOC346157 / similar to dJ153G14.3 (novel C2H2 type Zinc Finger protein) / 0.58 / 0.0053
32038_s_at / SRP46 / Splicing factor, arginine/serine-rich, 46kD / -0.77 / 0.0021
32048_at / --- / Gene from PAC 886K2, chromosome 1 / 1.32 / 0.0090
32102_at / SACS / spastic ataxia of Charlevoix-Saguenay (sacsin) / 1.49 / 0.0003
32162_r_at / --- / --- / 1.38 / 0.0019
32170_g_at / FBXO21 / F-box protein 21 / -0.90 / 0.0001
32197_at / SLC25A11 / solute carrier family 25 (mitochondrial carrier; oxoglutarate carrier), member 11 / 1.88 / 0.0094
32230_at / EIF3S2 / eukaryotic translation initiation factor 3, subunit 2 beta, 36kDa / 2.30 / 0.0035
32234_at / TOR1A / torsin family 1, member A (torsin A) / 1.33 / 0.0057
32253_at / RERE / arginine-glutamic acid dipeptide (RE) repeats / -2.15 / 0.0005
32297_s_at / KLRC1 /// KLRC2 / killer cell lectin-like receptor subfamily C, member 1 /// killer cell lectin-like receptor subfamily C, member 2 / 1.15 / 0.0015
32354_at / NPAS3 / neuronal PAS domain protein 3 / 0.75 / 0.0018
32391_g_at / LOC388714 / similar to Putative dimethylaniline monooxygenase [N-oxide forming] 6 (Flavin-containing monooxygenase 6) (FMO 6) (Dimethylaniline oxidase 6) / 1.38 / 0.0000
32440_at / RPL17 / ribosomal protein L17 / 2.89 / 0.0015
32528_at / CLPP / ClpP caseinolytic protease, ATP-dependent, proteolytic subunit homolog (E. coli) / 2.16 / 0.0036
32551_at / EFEMP1 / EGF-containing fibulin-like extracellular matrix protein 1 / 0.89 / 0.0002
32566_at / CHPF / chondroitin polymerizing factor / -1.14 / 0.0046
32588_s_at / ZFP36L2 / zinc finger protein 36, C3H type-like 2 / -0.76 / 0.0015
326_i_at / --- / --- / 2.47 / 0.0024
32611_at / PBP / prostatic binding protein / 2.42 / 0.0054
32619_at / FETUB / Fetuin B / 0.61 / 0.0093
32647_at / VTI1B / vesicle transport through interaction with t-SNAREs homolog 1B (yeast) / -1.65 / 0.0000
32658_at / VPS52 / vacuolar protein sorting 52 (yeast) / -3.00 / 0.0002