Supplementary information guide:

All supplementary figures are combined in one PDF file, as listed below:

Supplementary Figure 1:

The workflow of the entire experimental process. Briefly, whole saliva was collected and processed to obtain the cell-free supernatant fraction. After extraction, total RNA was amplified with the ExpressArt kit. Amplified cRNA was then reverse-transcribed and fragmented for Exon array hybridization. Raw data were processed and analyzed to derive candidates depending on the purpose, and candidates were then validated using multiplex pre-amplification followed by qPCR quantification in the remaining fraction of the originally extracted RNA samples. Validation in an independent cohort was performed only with candidates that had been validated in the initial discovery panel.

Supplementary Figure 2:

RNA profiles of representative samples, measured with Pico RNA chips on Agilent 2100 BioAnalyzer: RNA preparations obtained without (blue; 0.98 ng/µl) and with (red; 3.65 ng/µl) addition of 1% NucleoGuard in RLT lysis buffer. Arrows indicate rRNA peaks. This figure shows the improvement of RNA yield and recovery of long transcripts by using the RNase inhibitor NucleoGuard. nt: nucleotide. FU: arbitrary fluorescent unit. Note that the algorithm of the software may lead to slight differences between the calculated and the actual fragment lengths.

Supplementary Figure 3:

RNA profiles of 3 representative samples, measured with Pico RNA chips on Agilent 2100 BioAnalyzer. nt: nucleotide. FU: arbitrary fluorescent unit.

A. RNA profiles before amplification. RNA Integrity Numbers (RIN)(1), which are measures of RNA quality, are shown for each sample;

B. RNA profiles after amplification.

Supplementary Figure 4:

RNA profiles of representative samples, measured with Pico (A) and Nano (B) RNA chips on Agilent 2100 BioAnalyzer. nt: nucleotide. FU: arbitrary fluorescent unit. Note that the FU scales for Pico and Nano RNA chips are different.

A. RNA profile examples after first amplification round: sample 1 (red; 636 ng), sample 2 (blue; 360 ng), sample 3 (green; 180 ng);

B. RNA profile examples after second amplification round: sample 1 (red; 59 µg), sample 2 (blue; 65 µg), sample 3 (green; 54 µg).

Supplementary Figure 5:

Reference RNA was diluted in four-fold steps from 25 ng to 0.1 ng, reverse-transcribed and pre-amplified for 15 PCR cycles in a 10-plex reaction.

A. Melting curve analysis of the pre-amplified sample, assayed for S100A8 without clean-up, reveals the formation of secondary products, visible as additional peaks.

B. The amplification plots are not evenly spaced and two of the curves have flat slopes, a clear sign of competing non-specific reactions. The cycle numbers at the indicated threshold also do not follow the order of the concentrations in the dilution series. Colored numbers represent the corresponding concentrations, with colors matching the melting curves in panel A.

C. After clean-up, only the specific peaks are visible in the melting curve analysis.

D. The amplification plots are parallel and evenly spaced, indicating quantitative reaction conditions. Colored numbers represent the corresponding concentrations, with colors matching the melting curves in panel C.
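
The "evenly spaced" criterion in panels B and D can be illustrated with a minimal sketch. It assumes the standard qPCR relation that an f-fold dilution shifts the CT by log(f)/log(1 + E) cycles, i.e. 2 cycles per four-fold step at 100% efficiency (E = 1). The function names, the 0.5-cycle tolerance and the CT values are hypothetical and are not taken from the study.

```python
import math

def dilution_series(start_ng, fold, n_points):
    """Input amounts for a serial dilution, highest amount first."""
    return [start_ng / fold ** i for i in range(n_points)]

def expected_ct_spacing(fold, efficiency=1.0):
    """Expected CT shift per dilution step for a given PCR efficiency."""
    return math.log(fold) / math.log(1.0 + efficiency)

def is_evenly_spaced(cts, fold, tolerance=0.5, efficiency=1.0):
    """True if consecutive CT gaps match the expected spacing.

    cts must be ordered from highest to lowest input amount.
    The tolerance (in cycles) is an illustrative assumption.
    """
    expected = expected_ct_spacing(fold, efficiency)
    gaps = [later - earlier for earlier, later in zip(cts, cts[1:])]
    return all(abs(gap - expected) <= tolerance for gap in gaps)

# Five four-fold steps starting at 25 ng end near 0.1 ng.
amounts = dilution_series(25.0, 4, 5)

# Hypothetical CT values, not measured data.
cts_after_cleanup = [18.0, 20.1, 22.0, 23.9, 26.1]
cts_without_cleanup = [18.0, 21.5, 20.9, 24.0, 25.0]

print(amounts)
print(is_evenly_spaced(cts_after_cleanup, fold=4))
print(is_evenly_spaced(cts_without_cleanup, fold=4))
```

A series in which the CT order does not follow the concentration order, as described for panel B, produces a negative gap and therefore fails the check.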

Supplementary Figure 6:

Distribution of the position of exon array probe sets along the corresponding transcripts. The SECT-specific probe sets belong to the 726 expanded genes, and the NSCT-common probe sets belong to the 125 genes that were defined previously in NSCT. Transcript lengths were normalized to 100 for convenience of display. Probe sets that aligned to multiple bins were counted in each bin. Relative frequencies were calculated so that the total from all bins equaled 1.
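
The binning rule described above can be sketched as follows. This is an illustration only: the number of bins (10 here), the coordinate convention (0-based start, exclusive end, in nucleotides) and the example probe sets are assumptions, since the legend does not state them.

```python
def bin_relative_frequencies(probe_sets, n_bins=10):
    """Relative frequency of probe sets per bin along normalized transcripts.

    probe_sets: iterable of (start, end, transcript_length) tuples, with
    0-based start and exclusive end in nucleotides (assumed convention).
    Dividing positions by the transcript length is equivalent to scaling
    every transcript to a common length of 100.
    """
    counts = [0] * n_bins
    for start, end, length in probe_sets:
        first_bin = start * n_bins // length
        last_bin = (end - 1) * n_bins // length
        # A probe set spanning several bins is counted once in each of them.
        for b in range(first_bin, last_bin + 1):
            counts[b] += 1
    total = sum(counts)
    # Normalize so that the frequencies over all bins sum to 1.
    return [c / total for c in counts]

# Hypothetical probe sets on a 1000-nt transcript.
example = [(0, 50, 1000), (90, 110, 1000), (950, 1000, 1000)]
freqs = bin_relative_frequencies(example)
print(freqs)
```

Because multi-bin probe sets are counted more than once, the denominator is the total number of bin assignments rather than the number of probe sets.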

Supplementary Table 1:

Sequences of all pre-amplification and qPCR primers. F and R are the forward and reverse qPCR primers, which are nested between OF and OR, the outer primers used for pre-amplification. In semi-nested assays, F or R, respectively, is used for the pre-amplification together with one outer primer. T7F and T7R are the primers used to generate the IVT RNA standards.

Supplementary Table 2:

Raw data supporting Figure 3 A, C & D.

Supplementary Table 3:

Data collected from 26-plex pre-amplification of the reference RNA dilution series (see also Fig. 4). Slopes and median standard deviations (SD) of triplicate pre-amplifications are shown.

(A) Uncorrected CT values.

(B) CT values were corrected by the arithmetic average of the three SIR transcripts. The slopes were close to 0.0. The median SD of triplicates of the same concentration was reduced for all assays except OAZ1. In addition, the SDs of the ∆CT values determined between 100 and 0.1 ng show that there is no variability due to total RNA concentration. These results suggest that part of the variability is shared by the marker and the SIR genes, and is therefore minimized by the normalization approach.
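
The correction in (B) can be illustrated with a minimal sketch. It assumes that the corrected value is the marker CT minus the arithmetic average of the three SIR CT values, and that slopes are fitted against log10 of the input amount; the legend does not state the regression variable, and all CT values below are hypothetical.

```python
import math
from statistics import mean

def corrected_ct(marker_ct, sir_cts):
    """Marker CT minus the arithmetic average of the SIR CT values."""
    return marker_ct - mean(sir_cts)

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical dilution series and CT values, not data from the study.
input_ng = [100.0, 10.0, 1.0, 0.1]
marker_cts = [20.0, 23.0, 26.0, 29.0]
sir_cts = [
    (18.0, 19.0, 20.0),
    (21.0, 22.0, 23.0),
    (24.0, 25.0, 26.0),
    (27.0, 28.0, 29.0),
]

log_input = [math.log10(ng) for ng in input_ng]
delta_cts = [corrected_ct(m, s) for m, s in zip(marker_cts, sir_cts)]

uncorrected_slope = slope(log_input, marker_cts)
corrected_slope = slope(log_input, delta_cts)
print(uncorrected_slope, corrected_slope)
```

In this constructed example the marker and the SIR transcripts shift by the same number of cycles at every input amount, so the corrected values are constant and the fitted slope is zero.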

Supplementary Table 4:

List of 1370 probe sets that belong to the Saliva Exon Core Transcriptome. Transcript annotation was obtained from SOURCE (

Supplementary Table 5:

Gene Ontology term search results from the DAVID database. Terms and the assigned salivary genes are listed.

Supplementary Methods:

Detailed descriptions of: in vitro transcription of RNA standards, exon array probe set position analysis, gender-specific biomarker selection and analysis, validation of normalizer genes in saliva, and analysis of 10 SIR candidates.

References:

1. Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, Gassmann M, et al. The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol 2006;7:3.