Electronic Supplementary Material
Materials and methods
Correlation analysis
The profile search program of Spotfire DecisionSite 7.2 (Somerville, MA, USA) was used for correlation analysis between gene expression patterns and blood glucose levels. For every probe set, Pearson’s correlation coefficient was calculated across animals between the signal intensity of that probe set and the blood glucose level.
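The correlation step can be illustrated with a minimal sketch. It is not the Spotfire profile search itself; the function names, probe set identifiers and numbers below are hypothetical and only show how one coefficient per probe set is obtained and ranked.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def rank_probe_sets(signals, glucose):
    """Return (probe_set, r) pairs sorted from most positive to most negative r.

    signals: dict mapping a probe set id to its signal intensities,
             one value per animal, in the same animal order as `glucose`.
    """
    scores = {pid: pearson(values, glucose) for pid, values in signals.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical example: four animals, three probe sets.
glucose = [10.0, 15.0, 20.0, 25.0]
signals = {
    "probe_a": [1.0, 2.0, 3.0, 4.0],
    "probe_b": [8.0, 6.0, 4.0, 2.0],
    "probe_c": [1.0, 3.0, 2.0, 4.0],
}
ranked = rank_probe_sets(signals, glucose)
```

Probe sets at either end of the ranked list are those whose expression tracks blood glucose most closely, positively or negatively.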
Unsupervised hierarchical clustering analysis
Unsupervised hierarchical clustering was carried out in Spotfire DecisionSite 7.2 using expression data derived from the same probe sets in every sample. The signal intensity derived from each probe set was transformed to a Z-score and used for unsupervised hierarchical clustering with cosine correlation as the similarity measure.
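The two ingredients of this step, the Z-score transform and the cosine measure, can be sketched as follows. This is an illustration under stated assumptions: the Z-score is taken per probe set across samples using the sample standard deviation, and distance is defined as one minus cosine similarity. The linkage rule used by Spotfire is not described here, so the sketch stops at the sample-to-sample distance matrix that a hierarchical clustering routine would consume. All names and values are hypothetical.

```python
import math

def z_scores(values):
    """Scale one probe set's intensities to mean 0 and unit sample SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("Z-score is undefined for constant input")
    return [(v - mean) / sd for v in values]

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def sample_distance_matrix(expression):
    """expression: list of probe set rows, each holding one value per sample."""
    scaled = [z_scores(row) for row in expression]
    samples = list(zip(*scaled))  # transpose: one vector per sample
    return [[cosine_distance(a, b) for b in samples] for a in samples]

# Hypothetical example: three probe sets measured in four samples.
expression = [
    [1.0, 1.2, 3.0, 3.2],
    [5.0, 5.1, 2.0, 2.1],
    [2.0, 2.2, 4.0, 4.1],
]
distances = sample_distance_matrix(expression)
```

In this toy matrix the first two samples end up much closer to each other than to the last two, which is the grouping a clustering dendrogram would display.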
Principal component analysis
Principal component analysis was carried out in Spotfire DecisionSite 7.2 on the signal intensity values of the given probe sets in every sample, following conversion to Z-scores.
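As an illustration of what the principal component scores represent, the sketch below extracts the first principal component of a small sample-by-probe-set matrix by power iteration on the covariance matrix. Power iteration is a stand-in chosen for brevity, not a description of how Spotfire computes components; the function name, starting vector and iteration count are assumptions, and the input is assumed to be Z-scored already.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loading vector, per-sample scores) for the first component.

    rows: list of samples, each a list of feature values (one per probe set).
    """
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Deterministic start vector; it must not be orthogonal to the answer.
    vec = [1.0 + 0.1 * j for j in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            raise ValueError("data have no variance along the start vector")
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores

# Hypothetical example: three samples lying on the line y = x.
loadings, scores = first_principal_component([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])
```

Further components could be obtained by removing the variance explained by the first component and repeating the iteration; a full eigendecomposition gives all components at once.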
Selection of genes in each metabolic pathway
A group of probe sets in each metabolic pathway was obtained from the NetAffx database (http://www.affymetrix.com/index.affx), which includes information on the GeneChip probe sets assigned to each pathway within the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database (http://www.genome.ad.jp/kegg/). Orthologous probe set information in the NetAffx database (December 2003, Affymetrix) was used to determine mouse probe sets for each KEGG pathway. Metabolic pathways containing more than four probe sets and having a ‘present call’ in at least one animal were chosen for pathway analysis.
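The selection rule can be sketched as a simple filter. The sketch rests on one reading of the criterion, labelled here as an assumption: a pathway is kept when more than four of its probe sets have a present call ("P") in at least one animal. The function name, data layout and identifiers are hypothetical.

```python
def select_pathways(pathway_probes, detection_calls, min_probe_sets=5):
    """Keep pathways with enough probe sets called present in any animal.

    pathway_probes:  dict mapping pathway name -> list of probe set ids.
    detection_calls: dict mapping probe set id -> list of detection calls,
                     one per animal ("P" present, "M" marginal, "A" absent).
    min_probe_sets:  5 encodes "more than four" (assumed interpretation).
    """
    selected = {}
    for pathway, probes in pathway_probes.items():
        expressed = [p for p in probes if "P" in detection_calls.get(p, [])]
        if len(expressed) >= min_probe_sets:
            selected[pathway] = expressed
    return selected

# Hypothetical example with two animals per probe set.
pathway_probes = {
    "glycolysis": ["p1", "p2", "p3", "p4", "p5", "p6"],
    "small_pathway": ["q1", "q2", "q3"],
}
detection_calls = {
    "p1": ["P", "A"], "p2": ["A", "P"], "p3": ["P", "P"],
    "p4": ["M", "P"], "p5": ["P", "A"], "p6": ["A", "A"],
    "q1": ["P", "P"], "q2": ["P", "P"], "q3": ["P", "P"],
}
selected = select_pathways(pathway_probes, detection_calls)
```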
Statistical significance test and discrimination analysis
Signal intensity values of the genes for each metabolic pathway were subjected to principal component analysis, and the scores of each sample on the first, second and third principal components were used for the following calculations. To identify metabolic pathways affected by the action of metformin, significance tests (p<0.01) were performed between the animals given 400 mg/kg metformin (n=5) and the 15 animals that received no treatment, vehicle alone or 50 mg/kg metformin (n=5 each). The Wilks’ lambda [1, 2] program in R version 1.5.1 (http://www.r-project.org/) was used for this statistical significance test. For the selected pathways, discrimination analysis using a Support Vector Machine (SVM) [3, 4] approach was also performed to examine whether the signal intensity of the genes in each pathway could distinguish the animals treated with 400 mg/kg metformin from the others without error. The SVM program in Visual Mining Studio version 1.4 (Mathematical Systems, Tokyo, Japan) was used for this analysis.
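The Wilks’ lambda statistic for a two-group comparison can be sketched as the ratio of the determinant of the within-group sums-of-squares-and-cross-products matrix to that of the total matrix. The code below is an illustration, not the R program used for the analysis; the function names are hypothetical. For two groups the statistic converts exactly to an F value with (p, N-p-1) degrees of freedom, where p is the number of variables (three principal component scores in the text) and N the number of animals; obtaining the p-value from F requires an F-distribution routine that is left out here.

```python
def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def sscp(rows, center):
    """Sums of squares and cross-products of rows about `center`."""
    p = len(center)
    m = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - center[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                m[i][j] += d[i] * d[j]
    return m

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def wilks_lambda(group_a, group_b):
    """det(W) / det(T); values near 0 indicate well separated groups."""
    p = len(group_a[0])
    w_a = sscp(group_a, mean_vector(group_a))
    w_b = sscp(group_b, mean_vector(group_b))
    within = [[w_a[i][j] + w_b[i][j] for j in range(p)] for i in range(p)]
    pooled = group_a + group_b
    total = sscp(pooled, mean_vector(pooled))
    return determinant(within) / determinant(total)

def f_statistic(lam, n_total, n_vars):
    """Exact F for two groups, with (n_vars, n_total - n_vars - 1) d.f."""
    return (1.0 - lam) / lam * (n_total - n_vars - 1) / n_vars

# Hypothetical example: two groups of four samples, two variables each.
group_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
group_b = [[10.0, 10.0], [11.0, 10.0], [10.0, 11.0], [11.0, 11.0]]
lam = wilks_lambda(group_a, group_b)
f_value = f_statistic(lam, n_total=8, n_vars=2)
```

With the design described above (5 animals against 15, three principal component scores), the corresponding call would use n_total=20 and n_vars=3.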
References
1. Thanasoulias NC, Parisis NA, Evmiridis NP (2003) Multivariate chemometrics for the forensic discrimination of blue ball-point pen inks based on their Vis spectra. Forensic Sci Int 138:75-84
2. Hwang D, Alevizos I, Schmitt WA et al (2003) Genomic dissection for characterization of cancerous oral epithelium tissues using transcription profiling. Oral Oncol 39:259-268
3. Furey TS, Cristianini N, Duffy N, Bednarski DW, Schummer M, Haussler D (2000) Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics 16:906-914
4. Brown MP, Grundy WN, Lin D et al (2000) Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci USA 97:262-267