Supplementary Information

Towards a molecular characterization of Autism Spectrum Disorders: An exome sequencing and systems approach

JoonYong An1, Alexandre S. Cristino1, Qiongyi Zhao1, Janette Edson1, Sarah M. Williams1, David Ravine2, John Wray3, Vikki M. Marshall1, Anna Hunt4, Andrew J.O. Whitehouse4,5, Charles Claudianos1,5*

  1. Queensland Brain Institute, University of Queensland, Building 79, St Lucia, 4072, Brisbane, QLD, Australia
  2. School of Pathology and Laboratory Medicine, Perth, WA, Australia
  3. State Child Development Centre, Child and Adolescent Health Service, Princess Margaret Hospital for Children, Perth, WA, Australia.
  4. Telethon Institute for Child Health Research, Centre for Child Health Research, University of Western Australia, 100 Roberts Rd, Subiaco, WA, Australia.
  5. Cooperative Research Centre for Living with Autism Spectrum Disorders (Autism CRC), Long Pocket, Brisbane, Queensland, Australia.

* Correspondence to C. Claudianos

Short title: A molecular systems approach of autism

Supplementary figure 1. Pedigree of 40 ASD families used in this study

Supplementary figure 2. Validation of de novo variants (DNVs).

Ten stringently filtered de novo variants (9 single nucleotide replacements and 1 frameshift variant) found by our whole-exome-sequencing (WES) pipeline were confirmed by Sanger sequencing, which supports the high fidelity of the pipeline. Among the variant-calling pipelines compared by Liu et al. (2013)1, namely SAMtools, GATK, glftools, and Atlas2, GATK generated the greatest percentage of positive variant calls when validated against microarray and Sanger sequencing. This is attributed to additional steps in GATK, such as local realignment around indels, Base Quality Score Recalibration (BQSR), and Variant Quality Score Recalibration (VQSR). Moreover, a multi-sample variant calling strategy and restricting variant calls to sites with ≥20x coverage were shown to increase accuracy.

Supplementary figure 3. Overview of workflows

Supplementary figure 4. Filtering SNVs based on public variant databases.

Supplementary figure 5. Relationship of paternal age with occurrence of DNVs

We grouped the fathers of probands carrying DNVs by their age at the birth of the ASD proband into three categories: 21-30, 31-40 and 41-50 years. For each group, we calculated the DNV frequency as the number of DNVs divided by the number of fathers in that group. Paternal age and DNV frequency were positively correlated (R² = 0.934).
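The normalization and correlation described for this figure can be sketched as follows. This is a minimal illustration, not the authors' script: the per-group counts are hypothetical placeholders (the real counts are not given in this document), the use of age-group midpoints as the x-values is an assumption, and the function names `dnv_frequency` and `r_squared` are invented for the example.

```python
from math import fsum


def dnv_frequency(dnv_counts, father_counts):
    """Number of DNVs divided by number of fathers, per age group."""
    return {group: dnv_counts[group] / father_counts[group] for group in dnv_counts}


def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# HYPOTHETICAL counts, for illustration only (not the study data).
dnv_counts = {"21-30": 2, "31-40": 5, "41-50": 3}
father_counts = {"21-30": 10, "31-40": 20, "41-50": 8}

# ASSUMPTION: each age group is represented by its midpoint.
midpoints = {"21-30": 25.5, "31-40": 35.5, "41-50": 45.5}

freq = dnv_frequency(dnv_counts, father_counts)
groups = list(freq)
r2 = r_squared([midpoints[g] for g in groups], [freq[g] for g in groups])
print(freq)
print("R^2 =", round(r2, 3))
```

With only three age groups, an R² value rests on three points, so it is best read as a descriptive summary of the trend rather than a formal test.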

Supplementary table 1. AXAS scores of risk prediction

1) Causal genetic variants in ASD cases

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 1754 / 538 / 505.14 / 18.96 / 1.73 / 0.04
XLID / 1754 / 178 / 189.74 / 13.00 / -0.9 / 0.82
ADHD / 1754 / 201 / 219.64 / 13.86 / -1.34 / 0.91
SZ / 1754 / 366 / 361.89 / 16.94 / 0.24 / 0.41
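
The Z-score and P-value columns appear consistent with Z = (observed - expected) / Stdev and an upper-tail probability under a standard normal distribution. The sketch below checks this for two rows of the table above. It is an assumption about how the columns relate, not a statement of the authors' procedure (the P-values may equally come from an empirical permutation test), and the function names are invented for the example.

```python
from math import erfc, sqrt


def z_score(observed, expected, stdev):
    """Standardized difference between observed and expected counts."""
    return (observed - expected) / stdev


def upper_tail_p(z):
    """One-sided P(Z >= z) under a standard normal distribution."""
    return 0.5 * erfc(z / sqrt(2))


# Rows copied from sub-table 1: (observed, expected, stdev).
rows = {
    "ASD": (538, 505.14, 18.96),
    "XLID": (178, 189.74, 13.00),
}

for network, (observed, expected, stdev) in rows.items():
    z = z_score(observed, expected, stdev)
    print(network, round(z, 2), round(upper_tail_p(z), 2))
# prints:
# ASD 1.73 0.04
# XLID -0.9 0.82
```

Both rows reproduce the tabulated values, which suggests that a negative Z-score with a P-value near 1 indicates depletion relative to expectation rather than enrichment.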

2) Causal genetic variants in control parents

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 2607 / 783 / 750.80 / 23.12 / 1.39 / 0.08
XLID / 2607 / 255 / 282.01 / 15.86 / -1.70 / 0.96
ADHD / 2607 / 300 / 326.46 / 16.90 / -1.57 / 0.94
SZ / 2607 / 526 / 537.88 / 20.66 / -0.58 / 0.72

3) Causal genetic variants in control parents with BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 1053 / 333 / 303.26 / 14.69 / 2.02 / 0.02
XLID / 1053 / 117 / 113.91 / 10.07 / 0.31 / 0.38
ADHD / 1053 / 115 / 131.86 / 10.73 / -1.57 / 0.94
SZ / 1053 / 231 / 217.26 / 13.13 / 1.05 / 0.15

4) Causal genetic variants in control parents without BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 2174 / 654 / 626.1 / 21.11 / 1.32 / 0.09
XLID / 2174 / 213 / 235.17 / 14.48 / -1.53 / 0.94
ADHD / 2174 / 260 / 272.24 / 15.43 / -0.79 / 0.79
SZ / 2174 / 439 / 448.54 / 18.86 / -0.51 / 0.69

5) Inherited variants from control parents with BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 764 / 257 / 220.03 / 12.51 / 2.96 / 0
XLID / 764 / 90 / 82.64 / 8.58 / 0.86 / 0.19
ADHD / 764 / 94 / 95.67 / 9.14 / -0.18 / 0.57
SZ / 764 / 180 / 157.63 / 11.18 / 2.00 / 0.02

6) Inherited variants from control parents without BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 1271 / 373 / 366.04 / 16.14 / 0.43 / 0.33
XLID / 1271 / 126 / 137.49 / 11.07 / -1.04 / 0.85
ADHD / 1271 / 147 / 159.16 / 11.79 / -1.03 / 0.85
SZ / 1271 / 251 / 262.24 / 14.42 / -0.78 / 0.78

7) Inherited variants from mother with BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 316 / 109 / 91.01 / 8.04 / 2.24 / 0.01
XLID / 316 / 42 / 34.18 / 5.51 / 1.42 / 0.08
ADHD / 316 / 42 / 39.57 / 5.87 / 0.41 / 0.34
SZ / 316 / 80 / 65.20 / 7.18 / 2.06 / 0.02

8) Inherited variants from mother without BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 883 / 259 / 254.3 / 13.45 / 0.35 / 0.36
XLID / 883 / 90 / 95.52 / 9.22 / -0.60 / 0.73
ADHD / 883 / 101 / 110.57 / 9.83 / -0.97 / 0.83
SZ / 883 / 179 / 182.18 / 12.02 / -0.26 / 0.60

9) Inherited variants from father with BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 552 / 184 / 158.97 / 10.63 / 2.35 / 0.01
XLID / 552 / 66 / 59.71 / 7.29 / 0.86 / 0.19
ADHD / 552 / 66 / 69.12 / 7.77 / -0.4 / 0.66
SZ / 552 / 120 / 113.89 / 9.50 / 0.64 / 0.26

10) Inherited variants from father without BAP

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 570 / 178 / 164.16 / 10.8 / 1.28 / 0.10
XLID / 570 / 63 / 61.66 / 7.41 / 0.18 / 0.43
ADHD / 570 / 72 / 71.38 / 7.89 / 0.08 / 0.47
SZ / 570 / 112 / 117.60 / 9.65 / -0.58 / 0.72

11) De novo variants

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 10 / 6 / 2.88 / 1.36 / 2.30 / 0.01
XLID / 10 / 3 / 1.08 / 0.93 / 2.06 / 0.02
ADHD / 10 / 3 / 1.25 / 0.99 / 1.76 / 0.04
SZ / 10 / 4 / 2.06 / 1.60 / 1.60 / 0.06

12) Common variants in ASD cases (without any filtering process)

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 8730 / 2408 / 2514.2 / 42.31 / -2.51 / 0.99
XLID / 8730 / 806 / 944.36 / 29.02 / -4.77 / 1.00
ADHD / 8730 / 959 / 1093.20 / 30.92 / -4.34 / 1.00
SZ / 8730 / 1690 / 1801.19 / 37.81 / -2.94 / 1.00

13) Common variants of controls (without any filtering process)

PPI network / Number of proteins / Observed number of proteins / Expected number of proteins / Stdev / Z-score / P-value
ASD / 9713 / 2733 / 2797.29 / 44.63 / -1.44 / 0.93
XLID / 9713 / 929 / 1050.69 / 30.61 / -3.98 / 1.00
ADHD / 9713 / 1105 / 1216.30 / 32.62 / -3.41 / 1.00
SZ / 9713 / 1923 / 2004.00 / 39.88 / -2.03 / 0.98

Description of Supplementary Tables 2 to 6 (attached Excel spreadsheets).

Supplementary Table 2. Sample information and statistics of exome sequencing

This table includes clinical and behavioral profiles of ASD cases and control parents, and statistics of exome sequencing, including sequencing coverage, on-target rates and the number of DNA variants.

Supplementary Table 3. The list of de novo variants identified in 48 ASD cases.

Supplementary Table 4. Lists of genes for variant classification

This table includes the list of putative causal variants, including total variants found in cases and controls, inherited variants, and de novo variants.

Supplementary Table 5. Convergent pathway analysis in ASD cases

This table lists the convergent pathways in ASD cases identified using the GO, KEGG and Reactome databases.

Supplementary Table 6. Details of genes that cluster DNA variations in the L1CAM pathway (Figure 4).

References

1. Liu, X., Han, S., Wang, Z., Gelernter, J. & Yang, B. Z. Variant Callers for Next-Generation Sequencing Data: A Comparison Study. PLoS ONE 8, e75619, doi:10.1371/journal.pone.0075619 (2013).

2. Yu, T. W. et al. Using whole-exome sequencing to identify inherited causes of autism. Neuron 77, 259-273, doi:10.1016/j.neuron.2012.11.002 (2013).
