SUPPLEMENTARY INFORMATION TO:

Combined high resolution array-based comparative genomic hybridization and expression profiling of ETV6/RUNX1-positive acute lymphoblastic leukemias reveal a high incidence of cryptic Xq duplications and identify several putative target genes within the commonly gained region

Henrik Lilljebjörn*1, Markus Heidenblad1, Björn Nilsson1, Carin Lassen1, Andrea Horvat1, Jesper Heldrup2, Mikael Behrendtz3, Bertil Johansson1, Anna Andersson1, and Thoas Fioretos1

1Department of Clinical Genetics, Lund University Hospital, Lund, Sweden; 2Department of Pediatrics, Lund University Hospital, Lund, Sweden; 3Department of Pediatrics, Linköping University Hospital, Linköping, Sweden

Section I: Supplementary Methods

-Array CGH

-Data Analyses

-RT-PCR

-RQ-PCR

-Bisulfite sequencing

Section II: Legends to Supplementary Figures

-Legend to Supplementary Figure 1.

-Legend to Supplementary Figure 2.

-Legend to Supplementary Figure 3.

-Legend to Supplementary Figure 4.

-Legend to Supplementary Figure 5.

Section III: Supplementary Figures

-Supplementary Figure 1.

-Supplementary Figure 2.

-Supplementary Figure 3.

-Supplementary Figure 4.

-Supplementary Figure 5.

Section IV: Supplementary Tables

-Supplementary Table 1.

Section V: References

Supplementary Methods

Array CGH

In brief, DNA was extracted from bone marrow (BM) cells at the time of diagnosis or relapse and analyzed using 32K array slides containing 32 433 bacterial artificial chromosome (BAC) clones (BACPAC Resources, Oakland, CA) covering 98% of the human genome at a 100 kb resolution (Swegene DNA Microarray Resource Center, Lund University). Pretreatment of the slides was performed using the Pronto Microarray Reagent System (Promega, Madison, WI). DNA was labeled using the BioPrime Array CGH Genomic Labeling system (Invitrogen, Carlsbad, CA), and purified with the CyScribe GFX purification kit (GE Healthcare, Little Chalfont, UK). 1.5 µg Cy3-labeled patient DNA, 1.5 µg Cy5-labeled male reference genomic DNA (Promega), and 100 µg Cot-I DNA (Invitrogen) were mixed and dried in a speed vacuum centrifuge. The samples were dissolved in 57 µl hybridization solution (50% formamide, 10% dextran sulphate, 2 x SSC, 2% SDS, 4 µg/µl yeast tRNA) and hybridized to the slides in a damp environment at 37 °C for 48-72 h. Posthybridization washes were performed in wash solution 1 (2 x SSC, 0.1% SDS) for 15 min at room temperature followed by wash solution 2 (50% formamide, 2 x SSC) for 15 min at 45 °C, wash solution 1 for 30 min at 45 °C, and finally 0.2 x SSC for 15 min at room temperature followed by scanning in an Agilent G2565AA microarray scanner (Agilent Technologies, Santa Clara, CA).

Data analyses

The images from the CGH and gene expression arrays were analyzed using the GenePix Pro 4.0 software (Axon Instruments, Foster City, CA), and the obtained data matrices were uploaded to the BioArray Software Environment (BASE) (1). For the array CGH, each spot was corrected for background by calculating median feature minus median local background for both channels. Spots with a signal-to-noise ratio lower than 3 were excluded from further analyses. The data were normalized using the Lowess method and visualized with the MATLAB toolbox CGH-plotter (2), using a moving average over three clones and deletion/amplification limits of log2 ratios below -0.2 and above 0.2, respectively. Identification of genomic imbalances was based on CGH-plotter images followed by visual inspection of log2 values. Only imbalances present in two or more cases or detected in 10 or more overlapping BACs and not described as a copy number polymorphism in the database of common genetic variants (3) were considered. Data analysis of the expression profiles for the ETV6/RUNX1-positive ALLs included in the present study was performed as described previously (4). When investigating chromosome-specific expression patterns, high hyperdiploid (>51 chromosomes) leukemias were excluded in order to avoid introducing a bias from the high expression of genes located on the gained chromosomes in this ALL subset. The remaining data were mean-centered and analyzed with a BASE implementation of the MATLAB toolbox CGH-plotter (2) as above. To compare expression of genes on the X chromosome between t(12;21)-positive and -negative ALLs, all expression values were converted to z-scores, representing the relative expression of each gene, using z_gi = (x_gi - m_g) / s_g, where x_gi represents the expression value (converted to log scale) for gene g in case i, and m_g and s_g represent the mean and standard deviation, respectively, of gene g in all t(12;21)-negative cases. When the expression of individual genes was studied, the normalized raw data were used.
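The smoothing, thresholding, and z-score steps described above can be illustrated with a minimal Python sketch. It is not the CGH-plotter algorithm itself, which performs additional steps; it only shows a three-clone moving average with gain/loss limits and the z-score formula. The function names, the symmetric limits of -0.2 and 0.2, the handling of edge clones, and the use of the sample standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev


def moving_average(values, window=3):
    """Centered moving average; edge clones average only the neighbours that exist."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def call_imbalances(log2_ratios, limit=0.2, window=3):
    """Label each clone as gain, loss, or balanced from its smoothed log2 ratio.

    Assumes symmetric limits (+limit for gain, -limit for loss).
    """
    calls = []
    for value in moving_average(log2_ratios, window):
        if value > limit:
            calls.append("gain")
        elif value < -limit:
            calls.append("loss")
        else:
            calls.append("balanced")
    return calls


def z_scores(log_expression_by_case, reference_cases):
    """Compute z_gi = (x_gi - m_g) / s_g for one gene g.

    m_g and s_g are taken over the reference cases only (here: the
    t(12;21)-negative cases). The sample standard deviation is an assumption.
    """
    reference = [log_expression_by_case[case] for case in reference_cases]
    m_g = mean(reference)
    s_g = stdev(reference)
    return {case: (x - m_g) / s_g for case, x in log_expression_by_case.items()}
```

With hypothetical log-scale values of 1.0, 2.0, and 3.0 in three negative cases (mean 2, standard deviation 1), a positive case with value 5.0 receives a z-score of 3, meaning its expression lies three reference standard deviations above the reference mean.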

RT-PCR

Expression of the SPANX genes was investigated by RT-PCR using previously published primers (5). The forward primers were: 5’-CTGCCRCWGACATTGAAGAA-3’ (SPANXA, C and D) and 5’-ACTGTAGACATCGAAGAACC-3’ (SPANXB). The common reverse primer was: 5’-TTGATTCTGTTCTCTCGGGC-3’. Amplification of ACTB was used as a control of cDNA quality. Total RNA was extracted from BM at the time of diagnosis or relapse, and 0.5 µg RNA was reverse transcribed using M-MLV Reverse Transcriptase (Invitrogen) according to the manufacturer’s protocol. One µl cDNA was added to a 50 µl reaction containing 15 pmol of each primer, 2 mM MgCl2, 10 nmol of each dNTP, and 1 U Platinum Taq DNA Polymerase (Invitrogen) in 1 x PCR buffer and amplified by 35 cycles of 95 °C for 1 min, 60 °C for 1 min, and 72 °C for 2 min.
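The forward primer for SPANXA, C and D contains the IUPAC degenerate codes R (A or G) and W (A or T), so it represents a mixture of four oligonucleotides. The short Python sketch below enumerates them; the function name is hypothetical and the code is only an illustration of the notation, not part of the published protocol.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every non-degenerate sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combination) for combination in product(*choices)]


for sequence in expand_primer("CTGCCRCWGACATTGAAGAA"):
    print(sequence)
```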

RQ-PCR

Expression of SPANXB, HMGB3, FAM50A, HTATSF1, and RAP2C was investigated by RQ-PCR using pre-designed real time PCR assays (Applied Biosystems) or with primers designed with Primer Express Software (Applied Biosystems). The ΔΔCt method was used to calculate relative expression of each gene, normalized against 18S rRNA expression.
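As an illustration of the ΔΔCt calculation, the following Python sketch computes the relative expression of a target gene normalized against a reference gene such as 18S rRNA. The calibrator sample, the function name, and the assumption of perfect amplification efficiency (a doubling per cycle) are not specified in the text above and are introduced here only for the example.

```python
def relative_expression(ct_target, ct_reference, ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the delta-delta-Ct method.

    Assumes 100% amplification efficiency, so each cycle difference
    corresponds to a two-fold difference in template amount.
    """
    # Normalize the target gene against the reference gene in each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the sample with the (hypothetical) calibrator sample.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold three cycles earlier
# in the sample than in the calibrator, at equal reference gene Ct.
print(relative_expression(24.0, 10.0, 27.0, 10.0))  # prints 8.0
```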

Bisulfite sequencing

The methylation status of all CpG dinucleotides 600 bp upstream of the SPANXB initiating codon in four t(12;21)-positive and three t(12;21)-negative cases was examined using bisulfite sequencing. Unmethylated cytosine residues were converted to uracil using the EpiTect bisulfite kit (Qiagen), and the treated DNA was used as PCR template together with the following primers: Forward: 5’-GGTAAAAGAATATGGGTTGA-3’, Reverse: 5’-TAATTCTTCAATATCTACAATAAAT-3’. The PCR products were purified using the QIAquick PCR purification kit (Qiagen) and cloned into pCR4-TOPO vectors (Invitrogen) according to the manufacturer’s instructions. For each case, plasmid DNA from at least 5 colonies was sequenced using M13 primers and the BigDye Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems).
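The principle behind reading methylation from the sequenced clones can be sketched in Python. After bisulfite treatment and PCR, an unmethylated cytosine reads as T, whereas a methylated cytosine remains C. The sketch assumes that each clone read is already aligned to the unconverted reference on the same strand, without insertions or deletions; the function names and the toy sequences are hypothetical.

```python
def cpg_positions(reference):
    """Return the 0-based positions of the C in every CpG dinucleotide."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def call_methylation(reference, clone_read):
    """Call each CpG in one clone: True (methylated), False (unmethylated), None (unclear)."""
    calls = {}
    for position in cpg_positions(reference):
        base = clone_read[position]
        if base == "C":
            calls[position] = True    # protected from conversion, hence methylated
        elif base == "T":
            calls[position] = False   # converted to uracil and read as T
        else:
            calls[position] = None    # sequencing error or ambiguous base
    return calls


def methylated_fraction(reference, clone_reads):
    """Fraction of informative clones that are methylated at each CpG."""
    fractions = {}
    for position in cpg_positions(reference):
        informative = [
            call_methylation(reference, read)[position] for read in clone_reads
        ]
        informative = [call for call in informative if call is not None]
        fractions[position] = (
            sum(informative) / len(informative) if informative else None
        )
    return fractions
```

For a toy reference ACGTTCGA with CpG sites at positions 1 and 5, the clone reads ACGTTTGA and ATGTTTGA give a methylated fraction of 0.5 at the first site and 0.0 at the second.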

Legend to Supplementary Figure 1. The der(6)t(X;6) in case 1 identified by array CGH and FISH. (A) DNA copy number profile for chromosome X. BACs are plotted along the chromosome and the log2 scale is indicated above the profile; chromosome bands are indicated on the left. Chromosome regions with log2 ratios above 0.2 are colored in red and indicate gain of material. Thus, the segment distal of Xq21.31 is gained. (B) DNA copy number profile for chromosome 6; chromosome regions with log2 ratios below -0.2 are colored in green and indicate loss of material. Thus, the segment distal of 6q14.1 is hemizygously deleted. (C) Metaphase FISH using whole chromosome paint probes for chromosomes X (red) and 6 (green). In addition to one X chromosome and one chromosome 6, a der(6)t(X;6) is seen, confirming the array CGH findings.

Legend to Supplementary Figure 2. Impact of genomic imbalances on gene expression patterns. DNA copy number and expression profiles of chromosomes 6 and X in case 8. BACs and cDNAs are plotted along the chromosome and the log2 scale is indicated above the profile; chromosome bands are indicated on the left. Chromosome regions with log2 values below -0.2 are colored in green and regions with log2 values above 0.2 are colored in red. (A) DNA copy number profile of chromosome 6, showing a hemizygous deletion distal of 6q14.1. (B) Centered gene expression array data for chromosome 6, visualized with CGH-plotter. Genes in the deleted region of 6q (see image A) show a reduced expression level. (C) Array CGH plot of chromosome X, revealing a dup(X)(q21.31q28). (D) Centered gene expression data for chromosome X, visualized with CGH-plotter; a higher expression of genes in the duplicated region is clearly seen.

Legend to Supplementary Figure 3. Cumulative distribution of differential gene expression in the minimally gained Xq region of ETV6/RUNX1-positive ALLs.

T-statistics were calculated for the expression of genes in ETV6/RUNX1-positive ALLs both with (6 cases) and without (10 cases) gain of Xq versus ETV6/RUNX1-negative cases. The cumulative distributions of the t-statistics were then plotted to illustrate the effect of dup(Xq) on gene expression. (A) Cumulative plot of t-statistics of genes in the minimally gained region for t(12;21)-positive ALLs with dup(Xq) (blue) and without dup(Xq) (green). The blue curve is clearly right-shifted compared to the green curve, demonstrating the relative overexpression in t(12;21)-positive cases with Xq duplication as compared to t(12;21)-positive cases without dup(Xq). (B) Cumulative plot of t-statistics of genes in the minimally gained region (blue) and the whole genome (green) for t(12;21)-positive ALLs with gain of Xq. The blue curve is right-shifted compared to the green curve, consistent with a relative enrichment of overexpressed genes in the minimally gained region. (C) Cumulative plot of t-statistics of genes in the minimally gained region (blue) and the whole genome (green) for t(12;21)-positive ALLs without gain of Xq. In sharp contrast to (B), the blue curve is considerably closer to the green curve, illustrating that there is no relative overexpression of genes within the minimally gained region for ETV6/RUNX1-positive cases without dup(Xq).
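The construction of such a cumulative plot can be sketched as follows: a t-statistic is computed per gene between two groups of cases, and the empirical cumulative distribution of the resulting values is then drawn. The legend does not state which form of the t-statistic was used; the Python sketch below assumes Welch's unequal-variance form, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for one gene (assumed form; positive if group_a is higher)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def ecdf(values):
    """Empirical cumulative distribution as (value, cumulative fraction) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]
```

Plotting the pairs returned by ecdf for two gene sets on the same axes gives curves of the kind described in the legend; a curve lying further to the right indicates generally higher t-statistics, that is, relative overexpression in the first group.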

Legend to Supplementary Figure 4. RT-PCR identifies t(12;21)-specific expression of SPANXB. Ten ETV6/RUNX1-negative and five ETV6/RUNX1-positive ALLs were investigated with RT-PCR using primers specific for SPANXB. Three of the ETV6/RUNX1-positive cases were males and two of these displayed a gain of Xq by array CGH. The ten ETV6/RUNX1-negative ALLs comprised one case with MLL rearrangement, two cases with the TCF3/PBX1 fusion gene, two cases with the BCR/ABL1 fusion gene, three with a normal karyotype (NK), and two cases without cytogenetic information (indicated n/a). NTC indicates no template control and M the 100 bp DNA ladder. A 260 bp DNA fragment, corresponding to SPANXB, was amplified in all five ETV6/RUNX1-positive cases, regardless of sex. Low expression of SPANXB was also seen in four of the ETV6/RUNX1-negative ALLs; the two BCR/ABL1-positive cases, one case with normal karyotype, and one case without cytogenetic information. A 900 bp DNA fragment, consistent with genomic SPANXB, was seen in four cases. ACTB was used as a control of cDNA quality.

Legend to Supplementary Figure 5. RQ-PCR analysis of SPANXB, HMGB3, FAM50A, HTATSF1, and RAP2C. The relative expression levels of the five most differentially expressed genes between t(12;21)-positive and -negative ALLs in the data set of Ross et al. (6) were investigated with RQ-PCR in eight ETV6/RUNX1-negative and five ETV6/RUNX1-positive ALLs. A significant increase in expression is seen in ETV6/RUNX1-positive cases for the genes SPANXB (A; p=0.00078), HMGB3 (B; p=0.033), and FAM50A (C; p=0.023), but not for the genes HTATSF1 (D) and RAP2C (E) in the investigated cases. The most prominent difference is seen for SPANXB, whose high expression in ETV6/RUNX1-positive cases is in sharp contrast to the low, or non-existent, expression in ETV6/RUNX1-negative cases.

Supplementary Figure 1.

Supplementary Figure 2.

Supplementary Figure 3.

Supplementary Figure 4.

Supplementary Figure 5.

Supplementary Table 1. The top ten probe sets within the minimally gained Xq region differing between ETV6/RUNX1-positive and other ALLs in the data set of Ross et al. (6)

Nr / Probe set / Entrez Gene ID / Gene symbol / Cytoband / t / q
1 / 220922_s_at / 30014, 64663, 64694, 728712 / SPANXA1, SPANXC, SPANXB, SPANXA2 / Xq27.1 / 9.776 / 0.00001
2 / 220921_at / 64649 / SPANXB / Xq27.1 / 6.230 / 0.00001
3 / 203744_at / 3149 / HMGB3 / Xq28 / 6.024 / 0.00001
4 / 203262_s_at / 9130 / FAM50A / Xq28 / 5.962 / 0.00001
5 / 220217_x_at / 64663 / SPANXC / Xq27.1 / 5.949 / 0.00001
6 / 202602_s_at / 27336 / HTATSF1 / Xq26.1 / 5.613 / 0.000015
7 / 224032_x_at / 30014, 728712 / SPANXA1, SPANXA2 / Xq27.1 / 5.434 / 0.000015
8 / 218668_s_at / 57826 / RAP2C / Xq25 / 5.392 / 0.000015
9 / 225383_at / 10838 / ZNF275 / Xq28 / 5.270 / 0.000038
10 / 225556_at / 203547 / LOC203547 / Xq28 / 5.136 / 0.000045

1. Saal LH, Troein C, Vallon-Christersson J, Gruvberger S, Borg Å, and Peterson C. BioArray Software Environment (BASE): a platform for comprehensive management and analysis of microarray data. Genome Biology 2002;3:SOFTWARE0003.

2. Autio R, Hautaniemi S, Kauraniemi P, Yli-Harja O, Astola J, Wolf M, et al. CGH-Plotter: MATLAB toolbox for CGH-data analysis. Bioinformatics 2003;19:1714-1715.

3. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, et al. Detection of large-scale variation in the human genome. Nature Genetics 2004;36:949-951.

4. Andersson A, Olofsson T, Lindgren D, Nilsson B, Ritz C, Edén P, et al. Molecular signatures in childhood acute leukemia and their correlations to expression patterns in normal hematopoietic subpopulations. Proceedings of the National Academy of Sciences of the United States of America 2005;102:19069-19074.

5. Westbrook VA, Schoppee PD, Diekman AB, Klotz KL, Allietta M, Hogan KT, et al. Genomic organization, incidence, and localization of the SPAN-x family of cancer-testis antigens in melanoma tumors and cell lines. Clin Cancer Res 2004;10:101-112.

6. Ross ME, Zhou X, Song G, Shurtleff SA, Girtman K, Williams WK, et al. Classification of pediatric acute lymphoblastic leukemia by gene expression profiling. Blood 2003;102:2951-2959.
