Supplementary information Materials and Methods
Construction and packaging of lentiviral vectors. A 500 bp human genomic fragment encoding miR-335 (MIR335) was obtained by PCR from human genomic DNA (Novagen) using the primers MIR335_2-F (5’- CTT GGA TCC GTG TAA CTG TGA TTT AAG TC -3’) and MIR335_2-R (5’- CTT CTC GAG AGC CTA AGA GTT TTG TTC T -3’).
To construct the lentiviral vector encoding miR-335 (pLV-EmGFP-MIR335) and the control vector (pLV-EmGFP-miR-Mock), we cloned a synthetic polylinker sequence containing restriction sites for BamHI, XhoI, and ApaI into the XbaI-SalI restriction sites of the lentiviral vector pRRLsin18.PPT.CMV.GFP.Wpre (Manuscript reference 23), to obtain the pRRL-LINK2 plasmid. We then cloned the SnaBI-XhoI small fragment of plasmid pcDNA6.2-EmGFP-miR-mock (Invitrogen) into the corresponding restriction sites of pRRL-LINK2, to obtain pLV-EmGFP-miR-Mock. Finally, we cloned MIR335 into the BamHI-XhoI restriction sites of pLV-EmGFP-miR-Mock, to render pLV-EmGFP-MIR335.
Lentiviral supernatants were obtained by transient transfection of 293T cells as previously described (1). The supernatants were filtered (0.22 mm) to remove cellular debris, and 1-ml aliquots were stored at -80ºC. Lentiviral stocks were titrated in HeLa cells as described (1). For lentiviral transduction, bone marrow-derived hMSCs (5x105) were seeded in 10 cm plates and, after one day in culture, incubated for 6 h with the corresponding lentiviral supernatant at a multiplicity of infection (MOI) of 5, in the presence of 8 mg/ml Polybrene (Sigma-Aldrich). The medium was then replaced with regular growth medium and cells were cultured for an additional 48 h, when gfp+ hMSC populations were isolated with a FACS Aria cell sorter (BD Biosciences). Only cells with medium levels of gfp fluorescence intensity were selected (Fig. S3A). A cell sorting purity of >97% was consistently achieved.
Microarray experiments. Agilent Human microRNA Microarray v2.0 (G4470B, Agilent Technologies) was used to identify miRNAs expressed at relatively high level in undifferentiated hMSCs. Total RNA (100 ng per sample) was hybridized to the microarrays. We compared the miRNA expression profiles of undifferentiated hMSCs with the same primary cell lines after 9 days of adipogenic or osteogenic induction, as well as with skin fibroblasts. A total of four independent samples were used for each condition, but in the case of adipogenic and osteogenic differentiation, the RNA samples were combined into two pools (two samples/pool) before labeling. MicroRNA labeling, hybridization and washing were carried out according to the manufacturer’s instructions. Hybridized microarrays were scanned with a DNA microarray scanner (Agilent G2565BA) and features were extracted using the Agilent Feature Extraction (AFE) image analysis tool (version A.9.5.3) with default protocols and settings. Data pre-processing and differential expression analysis of the microRNA and gene expression data were done in R (2) using available Bioconductor packages (http://www.bioconductor.com) (3).
The Agilent Whole Human Genome Microarray Kit (G4112F, Agilent Technologies) was used to identify genes downregulated in hMSCs overexpressing miR-335. Total RNA (1 mg) from each sample was hybridized to the microarrays. We compared the mRNA expression profiles of hMSCs transduced with the lentiviral vector pLV-EmGFP-MIR335 with hMSCs transduced with the control vector pLV-EmGFP-Mock. A total of three independent samples were used for each condition. RNA labeling, hybridization and washing were carried out according to the manufacturer’s instructions. Image acquisition, feature extraction, and signal normalization were as described for miRNA microarrays.
MicroRNA microarray data analysis. Data pre-processing and differential expression analysis were done using the Bioconductor AgiMicroRna package (4). The Total Gene Signal provided by the Agilent Feature Extraction image analysis software was used for data analysis. Data were normalized between arrays using the quantile method (5). The image analysis software attached a flag to each feature that identifies different quantification errors of the signal and can be used to filter out microRNAs data that do not reach a minimum quality. This filtering was done after normalization of the Total Gene Signal. The gIsGeneDetected filtering removes microRNAs not expressed in any experimental condition, so that analysis was limited to those microRNA genes flagged by Agilent Feature Extraction software as detected in at least one experimental condition (gIsGeneDetected=1). For differential expression analysis, the AgiMicroRna package incorporates the linear modeling features of the Bioconductor limma package (6). Limma fits a linear model to the expression value for each gene, to assess the significance of differential expression between different experimental conditions. In addition, limma uses empirical Bayes methods (7) to construct moderated statistics and incorporates statistical tools to adjust for the multiplicity of the tests. The Benjamini and Hochberg’s method (8) was used to control for false discovery, and a false discovery rate (fdr) of 0.15 was selected.
Gene expression microarray data analysis. Data were pre-processed with the Bioconductor Agi4x44PreProcess package (9). The data were background corrected and normalized between arrays in order to compensate for systematic technical differences between chips. We selected the MeanSignal and the BGMedianSignal for the foreground and background signals, respectively, from the collection of signals provided by the Agilent Feature Extraction image analysis software. These signals were used for background correction and normalization. First, we produced a Background Subtracted Signal using the half option in Agi4x44Preprocess. According to this method, background signal is subtracted from the foreground signal and any intensity below 0.5 is reset to 0.5 to produce positive corrected intensities. After background correction, data were normalized between arrays using the quantile method (5). An offset equal to 50 was added to the intensities before log-transforming, so that the log ratios are shrunk towards zero at the lower intensities. The AFE software attaches a flag to each feature that identifies different quantification errors of the signal. These quantification flags can be used to filter out signals that don’t reach a minimum arbitrary criterion of quality. The data were filtered to 1) keep features within the dynamic range of the scanner and 2) retain features of good quality. To keep features within the dynamic range, we demanded that, for every replicated spot across the whole set of samples, at least 75% of the replicated probes in at least one experimental condition had a quantification flag denoting the signal as within the dynamic range. To retain good quality features for the analysis, we filtered out, for each replicated spot across the whole set of samples, those probes for which more than 25% of the replicates in at least one experimental condition had a flag indicating the presence of outliers. The Bioconductor annotation package hgug4112a.db (10) was used to assign the corresponding gene accession number code to each Agilent probe ID. Differential expression was analyzed using the limma package (7) in the same way as described for microRNA microarrays. In this case, a fdr of 0.05 was selected.
Supplementary Information Table and Figure Legends
Table S1. miRNAs regulated in differentiated human mesenchymal cells and skin fibroblasts in comparison with undifferentiated hMSCs. miRNA genes that are differentially expressed (fdr<0.15) in the three conditions tested (adipogenically differentiated hMSCs, osteogenically differentiated hMSCs, and skin fibroblasts) in comparison with undifferentiated hMSCs were selected as described in Supplementary Methods. Gene names correspond to features present in the Agilent V2 Human miRNA Microarray Kit (Sanger database v.10.1).
Table S2. List of genes downregulated in hMSCs exogenously overexpressing miR-335 that are also predicted by MiRanda, TargetScan, or PicTar algorithms as targets of this miRNA.
Figure S1. Signal normalization in miRNA microarrays. Smooth curves fitted to the scatter plots of standard deviation (SD) replicates as a function of the average expression for each gene (log2 scale), using natural cubic splines with 5 knots. nor75, normalization by 75% percentile; norQ, quantile normalization. (a) Adipogenically and osteogenically differentiated hMSCs vs. undifferentiated hMSCs. (b) Skin fibroblasts vs. undifferentiated hMSCs.
Figure S2. Global analysis of miRNA microarray results. (a) Heatmap of differentially expressed miRNAs (fdr<15%) in differentiated human mesenchymal cells. For each miRNA gene the mean intensity (log2 scale) of each experimental group was obtained. Then, the intensity measures for each gene were standardized by substracting the mean and dividing by standard deviation, which allows a comparison of the intensities across genes. The experimental groups are A, adipogenically differentiated hMSCs; O, osteogenically differentiated hMSCs; F, primary skin fibroblasts. (b) A gene ontology analysis using the PANTHER software was performed for the predicted targets (using TargetScan release 5.1) of all the miRNAs up- or downregulated in at least two of the conditions tested. Enriched (p<0.001) categories in the Pathways Ontology are shown.
Figure S3. Exogenous regulation of miR-335 expression in hMSCs. (a) Bone marrow-derived hMSCs were transduced with the pLV-EmGFP-MIR335 or pLV-EmGFP-mock lentiviral vectors. The figure shows a representative FACS analysis of transduced (blue) and non-transduced (gray) cells. Transduced cells (gfp+), only those demonstrating intermediate signal of gfp (indicated by shading), were purified by cell sorting. (b) Relative expression of miR-335 was quantified in the purified transduced cells by means of real-time RT-PCR. (c) Primary bone marrow-derived hMSCs were transfected (Lipofectamine) with an inhibitor of miR-335 (anti-miRTM-335, Ambion) or with a negative control oligo (anti-miRTM negative control #2, Ambion). Relative expression of miR-335 was quantified in the transfected cells by real-time RT-PCR. RNU-48 was used as endogenous control in all real-time RT-PCR experiments. All PCR determinations were performed in triplicate. Error bars represent standard error * p<0.01 (two-tailed t test).
Figure S4. MiR-335 is downregulated in osteoblasts upon osteogenic differentiation. Relative expression of miR-335 was quantified in Saos-2 cells before and after osteogenic differentiation. PCR determinations were performed in triplicate. Error bars represent standard error * p<0.05 (two-tailed t test).
Figure S5. The expression levels of miR-335 in hMSCs correlate with those of its host coding-gene MEST. Relative expression levels of miR-335 (endogenous control RNU-48) and MEST (endogenous control GAPDH) were measured by real-time RT-PCR in bone marrow-derived hMSCs after adipogenic differentiation, osteogenic differentiation, 48 h incubation with 20% Wnt-3a-containing conditioned medium, and 48 h incubation with 3 ng/ml IFNg. Data are means ± standard error. N = 3.
Figure S6. Bioinformatic analysis of the 5’ upstream region of the human MEST locus. (a) Human, mouse, and dog MEST loci were aligned, and the extent of DNA sequence homology was computed with the web-based program MULAN (http://mulan.dcode.org). Using MULAN and multiTF (http://www.multitf.dcode.org) with the TRANSFAC professional V10.2 library database (http://www.biobase.de) database, a LEF1 and a TCF4-binding site were predicted with 0.95 matrix similarity in the 5 kb upstream region of the MEST locus. Three STAT1 binding sites were also predicted with 0.9 matrix similarity in the same region. (b) Sequence of the mapped conserved LEF1 (blue), TCF4 (cyan), and STAT1 (pink)-binding sites.
Figure S7. Gene enrichment analysis of the predicted miR-335 targets. A gene ontology analysis using the PANTHER software was performed for the non-redundant gene list of predicted miR-335 targets obtained by combining miRanda, TargetScan and PicTar computer algorythms. Upper panel, enriched (p<1E-06) Biological Processes. Lower panel, enriched (p<1E-03) Molecular Functions.
Supplementary Information References
1. Naldini L, Blomer U, Gage FH, Trono D, Verma IM. Efficient transfer, integration, and sustained long-term expression of the transgene in adult rat brains injected with a lentiviral vector. Proc Natl Acad Sci U S A 1996 Oct 15; 93(21): 11382-11388.
2. R: A language and environment for statistical computing. R Foundation for Statistical Computing: Vienna, Austria, 2005.
3. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004; 5(10): R80.
4. Lopez-Romero P. AgiMicroRna: Processing and Differential Expression Analysis of Agilent microRNA chips. R package version 1.0.0. Bioconductor, Open Source Software for Bioinformatics http://bioconductororg/packages/25/bioc/ html/AgiMicroRnahtml.
5. Bolstad BM, Irizarry RA, Astrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003 Jan 22; 19(2): 185-193.
6. Smyth GK. Limma: linear models for microarray data. Springer: New York, 2005.
7. Smyth GK. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 2004; 3: Article3.
8. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B 1995; 57: 289-300.
9. Lopez-Romero P. Agi4x44PreProcess: PreProcessing of Agilent 4x44 array data. R package version 1.6.0. Bioconductor, Open Source Software for Bioinformatics http://bioconductororg/packages/25/bioc/html/Agi4x44PreProcesshtml
10. Carlson M, Falcon S, Pages H, Li N. hgug4112a.db: Agilent "Human Genome, Whole" annotation data (chip hgug4112a). R package version 2.2.5. Bioconductor, Open Source Software for Bioinformatics http://wwwbioconductororg/packages/25/data/annotation/html/hgug4112adbhtml.