Supplementary Material

References included in Table 1

1.  Teicher BA. Hypoxia and drug resistance. Cancer Metastasis Rev 1994; 13: 139-168.

2.  Montagner M, Enzo E, Forcato M, Zanconato F, Parenti A, Rampazzo E et al. SHARP1 suppresses breast cancer metastasis by promoting degradation of hypoxia-inducible factors. Nature 2012; 487: 380-384.

3.  Oloumi A, MacPhail SH, Johnston PJ, Banáth JP, Olive PL. Changes in subcellular distribution of topoisomerase IIalpha correlate with etoposide resistance in multicell spheroids and xenograft tumors. Cancer Res 2000; 60: 5747-5753.

4.  Turner JG, Engel R, Derderian JA, Jove R, Sullivan DM. Human topoisomerase IIalpha nuclear export is mediated by two CRM-1-dependent nuclear export signals. J Cell Sci 2004; 117: 3061-3071.

5.  Li Y, Zou L, Li Q, Haibe-Kains B, Tian R, Li Y et al. Amplification of LAPTM4B and YWHAZ contributes to chemotherapy resistance and recurrence of breast cancer. Nat Med 2010; 16: 214-218.

6.  Oakman C, Moretti E, Di Leo A. Re-searching anthracycline therapy. Breast Cancer Res Treat 2010; 123: 171-175.

7.  Desmedt C, Di Leo A, de Azambuja E, Larsimont D, Haibe-Kains B, Selleslags J et al. Multifactorial approach to predicting resistance to anthracyclines. J Clin Oncol 2011; 29: 1578-1586.

8.  Haibe-Kains B, Desmedt C, Loi S, Culhane AC, Bontempi G, Quackenbush J et al. A three-gene model to robustly identify breast cancer molecular subtypes. J Natl Cancer Inst 2012; 104: 311-325.

9.  Adorno M, Cordenonsi M, Montagner M, Dupont S, Wong C, Hann B et al. A Mutant-p53/Smad complex opposes p63 to empower TGFbeta-induced metastasis. Cell 2009; 137: 87-98.

10.  Mattarollo SR, Loi S, Duret H, Ma Y, Zitvogel L, Smyth MJ. Pivotal role of innate and adaptive immunity in anthracycline chemotherapy of established tumors. Cancer Res 2011; 71: 4809-4820.

11.  Zitvogel L, Apetoh L, Ghiringhelli F, André F, Tesniere A, Kroemer G. The anticancer immune response: indispensable for therapeutic success? J Clin Invest 2008; 118: 1991-2001.

12.  Zitvogel L, Apetoh L, Ghiringhelli F, Kroemer G. Immunological aspects of cancer chemotherapy. Nat Rev Immunol 2008; 8: 59-73.

13.  Denkert C, Loibl S, Noske A, Roller M, Müller BM, Komor M et al. Tumor-associated lymphocytes as an independent predictor of response to neoadjuvant chemotherapy in breast cancer. J Clin Oncol 2010; 28: 105-113.

14.  Teschendorff AE, Miremadi A, Pinder SE, Ellis IO, Caldas C. An immune response gene expression module identifies a good prognosis subtype in estrogen receptor negative breast cancer. Genome Biol 2007; 8: R157.

15.  Desmedt C, Haibe-Kains B, Wirapati P, Buyse M, Larsimont D, Bontempi G et al. Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin Cancer Res 2008; 14: 5158-5165.

16.  Ignatiadis M, Singhal SK, Desmedt C, Haibe-Kains B, Criscitiello C, Andre F et al. Gene modules and response to neoadjuvant chemotherapy in breast cancer subtypes: a pooled analysis. J Clin Oncol 2012; 30: 1996-2004.

17.  Farmer P, Bonnefoi H, Anderle P, Cameron D, Wirapati P, Becette V et al. A stroma-related gene signature predicts resistance to neoadjuvant chemotherapy in breast cancer. Nat Med 2009; 15: 68-74.

Supplementary Methods

Dataset

We collected 27 datasets comprising microarray data from breast cancer samples together with annotations on patients’ clinical outcome. All data were measured on Affymetrix arrays and were downloaded from Gene Expression Omnibus (GEO) and ArrayExpress (http://www.ebi.ac.uk/arrayexpress/). Prior to analysis, we reorganized the datasets by eliminating duplicate samples and samples without outcome information, and renamed each original study after the medical center where the patients were recruited. This reorganization resulted in a meta-dataset comprising 3661 unique samples from 25 independent cohorts (Supplementary Table 1). Following Cordenonsi et al. (2011), we standardized the clinical information across datasets by redefining the outcome descriptions on the basis of the clinical annotations of each individual study.
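The reorganization step can be pictured with a minimal Python sketch. It is not the authors' script: the record fields (`sample_id`, `center`, `outcome`) and the rule that two records sharing a `sample_id` are duplicates are assumptions made for illustration, since the text does not state how duplicates were detected.

```python
from collections import Counter

def reorganize(records):
    """Drop records without outcome information and repeated sample IDs.

    The first occurrence of each sample ID is kept (an assumption).
    """
    seen = set()
    kept = []
    for rec in records:
        if rec["outcome"] is None:       # no outcome annotation: discard
            continue
        if rec["sample_id"] in seen:     # same sample deposited twice: discard
            continue
        seen.add(rec["sample_id"])
        kept.append(rec)
    return kept

# Toy records; identifiers, centers, and outcomes are invented.
records = [
    {"sample_id": "S1", "dataset": "D1", "center": "Center A", "outcome": "relapse"},
    {"sample_id": "S1", "dataset": "D2", "center": "Center A", "outcome": "relapse"},
    {"sample_id": "S2", "dataset": "D2", "center": "Center B", "outcome": None},
    {"sample_id": "S3", "dataset": "D2", "center": "Center B", "outcome": "no relapse"},
]

unique = reorganize(records)
# Cohorts are named after the recruiting center, so counting by center gives cohort sizes.
cohort_sizes = Counter(rec["center"] for rec in unique)
```

Grouping by `center` rather than by `dataset` mirrors the renaming described above, where a cohort may draw samples from more than one deposited dataset.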

Raw expression data (i.e., CEL files) obtained from different platforms were integrated using an approach inspired by the geometry and probe content of HG-U133 Affymetrix arrays (Fallarino et al., 2010). Briefly, probes with the same oligonucleotide sequence, but located at different coordinates on different types of arrays, can be arranged in a virtual platform grid. As for any other microarray geometry, this virtual grid can be used as a reference to create a virtual Chip Definition File (virtual-CDF), containing the probes shared among the various HG-U133 platforms and their coordinates on the virtual platform, and virtual-CEL files, containing the fluorescence intensities of the original CEL files re-mapped onto the virtual grid. Once the virtual platform has been defined through the virtual-CDF and the CEL files have been transformed into virtual-CELs, raw data originally obtained from different HG-U133 arrays are homogeneous in terms of platform and can be preprocessed and normalized with standard approaches such as RMA (Irizarry et al., 2003). Specifically, expression values were generated from intensity signals using the virtual-CDF, obtained by merging the original HG-U133A, HG-U133AAofAV2, and HG-U133 Plus 2.0 CDFs, and the transformed virtual-CEL files. Intensity values for a total of 21981 meta-probe sets were background adjusted and normalized using quantile normalization, and gene expression levels were calculated using median polish summarization (RMA). The meta-dataset was corrected for batch effects using ComBat (Johnson et al., 2007). The entire procedure was implemented as an R script.
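The virtual-grid idea and the quantile normalization step can be sketched in a few lines of Python. This is an illustrative toy, not the authors' R script: the probe sequences, coordinates, and intensities are invented, the row-by-row layout of the virtual grid is an assumption, and background adjustment, median polish summarization, and ComBat are not shown.

```python
from statistics import mean

def build_virtual_cdf(platform_probes, n_cols):
    """platform_probes: {platform: {probe_sequence: (x, y)}}.

    Returns {probe_sequence: (vx, vy)} for sequences present on every platform.
    Sorting the sequences only makes the layout deterministic (an assumption).
    """
    shared = set.intersection(*(set(p) for p in platform_probes.values()))
    return {seq: (i % n_cols, i // n_cols) for i, seq in enumerate(sorted(shared))}

def to_virtual_cel(cel, probes, virtual_cdf):
    """Re-map one array's intensities {(x, y): value} onto the virtual grid."""
    return {vxy: cel[probes[seq]] for seq, vxy in virtual_cdf.items()}

def quantile_normalize(columns):
    """columns: one equal-length intensity list per array.

    Each value is replaced by the mean, across arrays, of the values holding
    the same rank (ties are broken by position, a simplification).
    """
    n = len(columns[0])
    orders = [sorted(range(n), key=col.__getitem__) for col in columns]
    rank_means = [mean(col[order[r]] for col, order in zip(columns, orders))
                  for r in range(n)]
    out = [[0.0] * n for _ in columns]
    for j, order in enumerate(orders):
        for r, i in enumerate(order):
            out[j][i] = rank_means[r]
    return out

# Two toy platforms sharing two probe sequences at different coordinates.
platforms = {
    "HG-U133A": {"ACGT": (0, 0), "TTGA": (1, 0), "GGCC": (2, 0)},
    "HG-U133 Plus 2.0": {"TTGA": (5, 5), "ACGT": (7, 1), "CCAA": (0, 3)},
}
virtual_cdf = build_virtual_cdf(platforms, n_cols=2)

cel_a = {(0, 0): 100.0, (1, 0): 300.0, (2, 0): 50.0}
cel_plus2 = {(5, 5): 200.0, (7, 1): 400.0, (0, 3): 10.0}
vcel_a = to_virtual_cel(cel_a, platforms["HG-U133A"], virtual_cdf)
vcel_plus2 = to_virtual_cel(cel_plus2, platforms["HG-U133 Plus 2.0"], virtual_cdf)

# Once on the same grid, arrays from both platforms can be normalized together.
cells = sorted(virtual_cdf.values())
normalized = quantile_normalize([[v[c] for c in cells] for v in (vcel_a, vcel_plus2)])
```

After re-mapping, both toy arrays index the same two virtual cells, which is what allows a single normalization pass over arrays that originally came from different chip layouts.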

Samples of the meta-dataset and of the EORTC cohort were classified as triple-negative breast cancer (TNBC) by the SCMOD2 classifier (Subtype Clustering Model; Wirapati et al., 2008) implemented in the genefu R package (Haibe-Kains et al., 2012). This model is a mixture of three Gaussians with equal shape, volume and variance that uses the ESR1, ERBB2 and AURKA dimensions to identify the molecular subtypes: ER-/HER2- (Basal), HER2+, and ER+/HER2- (Low and High Proliferative) tumors.
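How a mixture of three Gaussians with a shared variance assigns a sample to a subtype can be illustrated with a short Python sketch. It does not reproduce SCMOD2 or the genefu implementation: the component means, mixing weights, and variance below are invented, the covariance is assumed spherical, and only the ESR1 and ERBB2 dimensions are used (the AURKA dimension is omitted for brevity).

```python
import math

# Hypothetical parameters for illustration only; not the fitted SCMOD2 values.
COMPONENTS = {
    "ER-/HER2-": {"mean": (-1.0, -0.5), "weight": 0.20},
    "HER2+":     {"mean": (-0.5, 1.5),  "weight": 0.15},
    "ER+/HER2-": {"mean": (1.0, -0.5),  "weight": 0.65},
}
SHARED_VARIANCE = 0.5  # one variance for all components and both dimensions (assumption)

def posteriors(esr1, erbb2):
    """Posterior probability of each subtype given two module scores.

    With a shared variance the Gaussian normalizing constant is identical
    for all components and cancels, so only weights and distances matter.
    """
    log_scores = {}
    for name, comp in COMPONENTS.items():
        d2 = (esr1 - comp["mean"][0]) ** 2 + (erbb2 - comp["mean"][1]) ** 2
        log_scores[name] = math.log(comp["weight"]) - d2 / (2 * SHARED_VARIANCE)
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    unnorm = {name: math.exp(s - top) for name, s in log_scores.items()}
    total = sum(unnorm.values())
    return {name: u / total for name, u in unnorm.items()}

def classify(esr1, erbb2):
    """Return the subtype with the highest posterior probability."""
    post = posteriors(esr1, erbb2)
    return max(post, key=post.get)

low_er_low_her2 = classify(-1.2, -0.6)
high_her2 = classify(-0.4, 1.6)
high_er = classify(1.1, -0.4)
```

In this toy setting a sample with low ESR1 and low ERBB2 scores falls in the ER-/HER2- component, the group the text treats as TNBC.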


Supplementary Table 1. Reorganized breast cancer cohorts included in the meta-dataset analyzed in this study

Cohort / Affymetrix platform / Samples / Data source / References
KI Stockholm / HG-U133A / 159 / GSE1456 / Pawitan et al., 2005
EMC-344 / HG-U133A / 344 / GSE2034; GSE5327 / Wang et al., 2005; Minn et al., 2007
MSKCC / HG-U133A / 82 / GSE2603 / Minn et al., 2005
KI Uppsala / HG-U133A / 253 / GSE3494; GSE4922; GSE6532 / Loi et al., 2008; Ivshina et al., 2006; Miller et al., 2005
OXF / HG-U133A / 178 / GSE6532 / Ivshina et al., 2006
TransBIG / HG-U133A / 198 / GSE7390 / Desmedt et al., 2007
Mainz / HG-U133A / 200 / GSE11121 / Schmidt et al., 2008
Veridex / HG-U133A / 136 / GSE12093 / Zhang et al., 2009
GUYS / HG-U133 Plus 2.0 / 164 / GSE6532; GSE9195 / Loi et al., 2007; Loi et al., 2008; Loi et al., 2010
UCSF / HG-U133AAofAV2 / 166 / E-TABM-158; GSE7378 / Chin et al., 2006; Zhou et al., 2007; Yau et al., 2008
IJB TOP / HG-U133 Plus 2.0 / 114 / GSE16446 / Desmedt et al., 2011; Li et al., 2010; Juul et al., 2010
US NCI / HG-U133 Plus 2.0 / 115 / GSE19615 / Li et al., 2010
CRCM / HG-U133 Plus 2.0 / 252 / GSE21653 / Sabatier et al., 2011a; Sabatier et al., 2011b
KOOF / HG-U133 Plus 2.0 / 327 / GSE20685 / Kao et al., 2011
Goethe / HG-U133A / 64 / GSE31519 / Rody et al., 2011; Karn et al., 2011; Karn et al., 2012
MDACC_GSE25066 / HG-U133A / 313 / GSE25066; GSE20194 / Hatzis et al., 2011; Popovici et al., 2010; Shi et al., 2010
I-SPY-1 / HG-U133A / 83 / GSE25066 / Hatzis et al., 2011
LBJ INEN GEICAM / HG-U133A / 58 / GSE25066 / Hatzis et al., 2011
USO-02103 / HG-U133A / 95 / GSE25066; GSE23988 / Hatzis et al., 2011; Iwamoto et al., 2011
MDACC IGR / HG-U133A / 61 / GSE22093 / Iwamoto et al., 2011
MDACC GSE20271 / HG-U133A / 100 / GSE20271 / Tabchy et al., 2010
MDACC MAQC-II / HG-U133A / 39 / GSE20194 / Popovici et al., 2010; Shi et al., 2010
Osaka / HG-U133 Plus 2.0 / 115 / GSE32646 / Miyake et al., 2012
St. Louis / HG-U133 Plus 2.0 / 24 / GSE19697 / Lin et al., 2010
UW / HG-U133 Plus 2.0 / 21 / GSE18728 / Korde et al., 2010


Supplementary Table 2. Gene lists for signatures evaluated as ConSig components

Signature / Genes
Hypoxia / ADM, ALDOC, BHLHE40, BIRC2, BNIP3, CENPF, DDIT4, ENO2, GLRX, HK2, INSIG1, MAFF, NDRG1, PDK1, PDPK1, PFKP, PGK1, SAP30, SLC2A1, SLC2A3, VEGFA, WSB1
SHARP1 / AGTBP1, CHN2, COBL, DSC2, EPS8L2, F2RL1, FSCN1, GBE1, GPR56, HSF2, IFIT3, IGF2BP3, IMPA2, ITGB2, LPIN1, LYZ, ME1, NOX5, PLCE1, RAPGEF5, S100A3, SH2D3A, SLC5A3, SLCO4A1, SREBF1, TGFA, WWTR1
AURKA / ABCA6, ADRM1, AHSA1, AK5, ANG, APLP1, AQP1, ARHGAP11A, ARHGDIB, ATAD5, AURKA, AZIN1, BCKDHB, BIRC5, BLM, C11orf63, C17orf59, C1orf112, C1orf135, C1RL, C20orf20, CACNA2D3, CAV1, CCHCR1, CCNB1, CCNB2, CDC20, CDC6, CDC7, CDCA4, CDH5, CDKN3, CENPA, CENPN, CHCHD2, CHMP1A, CKAP5, CKS1B, CKS2, CNIH4, COIL, COQ2, CSTF2, CTDSP1, CYBRD1, DDX39A, DDX56, DHTKD1, DIRAS3, DKK2, DLX2, DOCK1, DSCR6, DST, DTYMK, DUSP6, EEF1B2, EIF4A3, EIF6, EMG1, ENOX1, ERCC6L, ERMAP, ESD, ESRP1, EXO1, EXOSC4, F3, FADD, FAM172A, FAM64A, FCER1A, FCGBP, FMOD, FOS, FOXM1, FRY, FRZB, FST, FZD4, GALC, GAS6, GCAT, GEM, GGH, GINS2, GINS3, GLRX2, GNAZ, GNG11, GNG12, GPR124, GTSE1, H2AFX, HMBOX1, HNRNPA2B1, HOXB6, HPRT1, HSD17B10, HSF1, ICT1, IGF1, INSIG1, ISOC2, ITGBL1, JMJD4, KERA, KIF20A, KIF2C, KIF4A, LACTB2, LAPTM4B, LAS1L, LETM1, LHFP, LIMCH1, LMNB2, LPAR6, LRRC17, LRRC59, MAD2L1, MCM10, MCM2, MCM3, MLF1IP, MYO9A, NCAPH, NDUFS1, NEIL3, NEK2, NKRF, NR2F2, NUDT1, NUP155, OGN, OLFML3, OS9, PELI2, PER1, PIR, PKMYT1, PLK1S1, PLSCR4, PLXDC1, POLD1, POLR2D, PRIM1, PUF60, RBMS3, RCE1, RFC4, RNASEH2A, RPP38, RPS6KB2, RRM2, RRS1, RUNX1T1, RUVBL1, S100P, SCAPER, SCNN1B, SEMA5A, SF3B2, SFRP4, SHCBP1, SLC26A3, SLC2A1, SLC39A4, SLIT2, SNRK, SNRPC, SNRPD3, SNX1, SORBS1, SOX17, SPAG5, SPARCL1, SPRY1, SPRY2, SS18L1, STARD13, STEAP1, STIL, STMN1, STRA13, STRAP, SUPV3L1, SUV39H1, TACO1, TENC1, TFPT, TGFB3, TGFBR3, TGIF2, THBD, THOP1, THYN1, TIMELESS, TIMM10, TK1, TLR3, TMC5, TMEM177, TMEM204, TOMM34, TPD52L1, TPX2, TRIM24, TRIP13, TRPM2, TSEN34, TTC38, TTF2, TUBG1, TYMS, UBE2C, UCKL1, VTCN1, VWA5A, WDR45L, WLS, WNT5A, XPOT, YBX2, YTHDF1, ZFHX4, ZNF165, ZNF250, ZNF706
Minimal / BHLHE41, CCNG2
STAT1 / ACP5, ADAMDEC1, APOC1, ARHGAP15, BIRC3, BST2, BTN3A2, C19orf66, CCL5, CCL8, CCRL2, CD2, CD3G, CD40LG, CD48, CD69, CECR1, CLEC4A, CTSC, CXCL10, CXCL11, CXCL9, CYTIP, DDAH2, DDX58, DDX60, DHX58, DNAL4, EBI3, EFNA1, ETV7, FASLG, FGL2, GLRX, GPR171, GPR18, GPR183, GZMK, HCLS1, HCP5, HERC5, HERC6, IDO1, IFI30, IFI44L, IFI6, IFIT3, IFIT5, IFITM1, IGSF6, IL18, IPCEF1, IRF1, IRF8, ISG15, ITGB2, KLRC3, KLRK1, LAG3, LAMP3, LAPTM5, LILRA4, LYZ, MX1, MX2, MZB1, NMI, P2RX5, PIM2, PLA2G7, PLAC8, PSME1, PTPN7, RAB8A, RASGRP1, REC8, RFX5, RSAD2, RTP4, SECTM1, SH2D1A, SNX10, SP140, SPOCK2, STAT1, STAT4, TAP1, TARP, TFEC, TRAF3, TRIM22, TYMP, UBD, VAV1, ZC3HAV1
PLAU / ACAN, ADAM12, ANGPTL2, ANKRD46, ATP6V1B2, ATPIF1, BASP1, BCL3, BICD2, BMP1, CADM1, CAP1, COL3A1, COL5A2, COL6A1, CPNE1, CTSA, DBN1, DDR2, DNASE1L1, ENC1, EPB41L2, EPHB4, EPYC, FAP, FKBP14, GFPT2, GLB1, GPR89A, GREM1, ISLR, JPH2, MEF2A, MFAP2, MMP14, MMP3, MXRA8, MYO1B, NBL1, NID1, NOL8, OFD1, OLFML2B, OSMR, PARVA, PDGFB, PDGFRB, PDLIM3, PDLIM7, PLAU, PPP1R15A, RPL18, RPS27A, SDS, SERPINH1, SNAI2, STAB1, TAGLIN, TGFB2, THY1, TNFRSF12A, TPST2, TRAM2, TRIM33, UBL5, ULK1, VDR, ZNF518


Supplementary Table 3. Association of single components with pathological complete response in the training set

Components / AUC / 95% CI / P value
STAT1 / 0.68 / 0.57 - 0.79 / 4.2 × 10^-4
HIF1 / 0.58 / 0.48 - 0.68 / 0.069
topoIIα / 0.58 / 0.47 - 0.70 / 0.074
LAPTM4B / 0.58 / 0.46 - 0.69 / 0.087
SHARP1 / 0.55 / 0.43 - 0.68 / 0.197
PLAU / 0.53 / 0.40 - 0.66 / 0.325
YWHAZ / 0.50 / 0.38 - 0.61 / 0.516
MS / 0.41 / 0.31 - 0.51 / 0.960

AUC: area under the ROC curve
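
As a reminder of what the AUC column measures, the following Python sketch computes the area under the ROC curve as the fraction of responder/non-responder pairs in which the responder has the higher score, counting ties as one half. The scores and response labels are invented, and the sketch does not reproduce how the confidence intervals or P values in the table were obtained, which the table does not state.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    labels: 1 for pathological complete response, 0 otherwise (toy coding).
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy signature scores for five patients, two of whom responded.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)                          # 5 of 6 pairs ranked correctly
auc_flipped = roc_auc([-s for s in scores], labels)    # reversing the score gives 1 - AUC
```

Under this reading, an AUC of 0.50 corresponds to ranking no better than chance, and a value below 0.50 means the score tends to rank responders below non-responders.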


Supplementary Figure 1. ROC diagrams in the training set for a) the individual components considered in the construction of the Consensus Signature and b) ConSig1 in patients treated with anthracyclines but not taxanes; c) ConSig1 and d) ConSig2 in the control group of patients treated with anthracyclines plus taxanes.
