Online Materials for

Mutation in ST6GALNAC5 identified in family with coronary artery disease

Kolsoum InanlooRahatloo1,2α, Amir Farhang Zand Parsa3, Klaus Huse2, Paniz Rasooli1, Saeid Davaran4, Matthias Platzer2, Marcel Kramer5, Jian-Bing Fan6, Casey Turk6, Sasan Amini6, Frank Steemers6, Kevin Gunderson6, Mostafa Ronaghi6, Elahe Elahi1,7*

correspondence to: ;

This file includes:

Materials and Methods

Figures S1 to S7

Tables S1 to S10

α, Present address: Dept. of Cardiology & Radiology, Stanford School of Medicine, Stanford, CA 94305-5454

Methods in detail

Genome-wide linkage analysis

Genome-wide SNP genotyping was carried out on DNA samples of eight individuals of the CAD-105 pedigree using HumanCytoSNP-12v1-0_D BeadChips and the iScan reader (Illumina; GEO accession no. GSE42137). The individuals included six CAD-affected and two CAD-unaffected individuals (Fig. 1). SNPs that had not been genotyped in one or more individuals were removed from the analysis with the appropriate options in GenomeStudio_Genotyping_Module_V1.0 (Illumina). MERLIN was used to remove SNPs that exhibited Mendelian errors and subsequently to obtain parametric and nonparametric logarithm of odds (LOD) scores under two sets of criteria: (i) disease allele frequency of 0.001, penetrance of 90%, and 10% phenocopies; and (ii) disease allele frequency of 0.0001, penetrance of 99%, and 1% phenocopies1.
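The completeness filter described above (dropping any SNP that lacks a call in at least one individual) can be sketched as follows. This is only an illustration: in the study the filter was applied within GenomeStudio, and the function name, the dictionary layout, and the use of None for a no-call are assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical layout: SNP id -> one genotype call per individual,
# with None marking a no-call. This is not the GenomeStudio export format.
Genotypes = Dict[str, List[Optional[str]]]


def keep_fully_genotyped(calls: Genotypes) -> Genotypes:
    """Keep only SNPs that have a genotype call in every individual."""
    return {
        snp: gts
        for snp, gts in calls.items()
        if all(gt is not None for gt in gts)
    }


# Toy input with invented SNP names and calls.
example = {
    "snp_a": ["AA", "AB", "BB"],
    "snp_b": ["AA", None, "AB"],  # one missing call, so this SNP is dropped
}
filtered = keep_fully_genotyped(example)
print(sorted(filtered))
```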

Genome wide exome sequencing

CAD-affected individuals III-1 and III-2 were selected for exome sequencing. Genomic DNA was isolated from blood samples by standard methods. DNA libraries were enriched using the TruSeq® Exome Enrichment kit (Illumina, San Diego, CA, USA) and subsequently sequenced on an Illumina HiSeq® 2000 system (Illumina). The TruSeq Exome assay targets 62 Mb of protein coding and regulatory untranslated regions of the genome. Base calling was performed by the Illumina pipeline with default parameters. Over eight gigabases of high quality sequence were generated for each subject. Sequence reads were mapped to the human reference genome (UCSC NCBI37/hg19) using ELANDv2 software (Illumina). Variant detection was performed with CASAVA software (version 1.8.1; Illumina), and candidate variants were filtered using a CASAVA quality threshold of 10. CASAVA filtered out duplicate reads and reads without matched pairs. In addition to CASAVA, variants were analyzed using Enlis Genomics and NextBio analysis software, again with reference to human genome reference sequence NCBI37/hg19. Absence of the variants was also verified in 60 whole-exome sequences available within the Enlis Genomics data set and in 15 other exomes that were sequenced along with the CAD-105 patients but derived from healthy Iranians or Iranians affected with unrelated disorders. Finally, the variants were systematically filtered to identify those that were positioned within the linked loci, that affected splicing or caused amino acid changes, and that were present in both patients. Development protocols and features of the TruSeq® Exome Enrichment kit are described for the first time below.
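The final filtering logic (within a linked locus, affecting splicing or the amino acid sequence, present in both sequenced patients, and absent from the reference exomes) can be expressed as a short sketch. The record layout, the effect labels, and the toy coordinates below are assumptions chosen for illustration; they are not the CASAVA, Enlis, or NextBio data model, and the toy variants are invented.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple

Locus = Tuple[str, int, int]  # (chromosome, start, end), inclusive

# Illustrative effect labels; the actual annotation vocabulary is an assumption.
PROTEIN_OR_SPLICE = frozenset(
    {"missense", "nonsense", "stop-loss", "frameshift", "splice"}
)
PATIENTS = frozenset({"III-1", "III-2"})


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    effect: str
    carriers: FrozenSet[str]
    in_reference_exomes: bool


def in_linked_locus(v: Variant, loci: Iterable[Locus]) -> bool:
    """True if the variant lies inside any linked interval."""
    return any(c == v.chrom and start <= v.pos <= end for c, start, end in loci)


def filter_candidates(variants: Iterable[Variant], loci: List[Locus]) -> List[Variant]:
    """Apply the four filters described in the text, in one pass."""
    return [
        v
        for v in variants
        if in_linked_locus(v, loci)
        and v.effect in PROTEIN_OR_SPLICE
        and PATIENTS <= v.carriers
        and not v.in_reference_exomes
    ]


toy_loci = [("chr1", 100, 200)]
toy_variants = [
    Variant("chr1", 150, "missense", PATIENTS, False),    # passes every filter
    Variant("chr1", 160, "synonymous", PATIENTS, False),  # no protein or splice effect
    Variant("chr1", 170, "missense", frozenset({"III-1"}), False),  # one patient only
    Variant("chr2", 150, "missense", PATIENTS, False),    # outside the linked loci
    Variant("chr1", 180, "splice", PATIENTS, True),       # seen in reference exomes
]
candidates = filter_candidates(toy_variants, toy_loci)
print([v.pos for v in candidates])
```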

TruSeq exome content

The TruSeq Exome enrichment kit includes 340,427 probes, each constructed against the human NCBI37/hg19 reference genome. The probe set was designed to enrich 201,121 exons spanning 20,794 genes, targeting a total complexity of around 62 Mb. For exons larger than 150 bases, the probes are spaced uniformly, roughly every 150 bases. Each 95-mer probe targets libraries of 300-400 bp (insert size 180-280 bp), enriching 265-465 bases centered symmetrically on the midpoint of the probe. This means that, in addition to comprehensive coverage of the major exon databases, the kit also provides broad coverage of non-coding DNA in exon flanking regions, including promoters and UTRs. Databases covered by the kit are CCDS coding exons (31.3 Mb, hg19, 97.2% covered), RefSeq (33.2 Mb, hg19, 96.4% covered), RefSeq (refGene) exons plus (67.8 Mb, hg19, 88.3% covered), Encode/Gencode coding exons (25.6 Mb, hg19, 93.2% covered), and predicted microRNA targets (9.0 Mb, hg19, 77.6% covered).
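As a worked illustration of these figures, the sketch below converts the quoted database sizes and coverage percentages into covered megabases and computes the total length of probe sequence. The variable and function names are ours. The calculation also makes plain why the summed probe sequence (about 32 Mb) is smaller than the roughly 62 Mb target: each probe enriches library fragments that extend beyond the probe itself.

```python
# (database, size in Mb, percent covered), as quoted in the text
DATABASES = [
    ("CCDS coding exons", 31.3, 97.2),
    ("RefSeq", 33.2, 96.4),
    ("RefSeq exons plus", 67.8, 88.3),
    ("Encode/Gencode coding exons", 25.6, 93.2),
    ("Predicted microRNA targets", 9.0, 77.6),
]

PROBE_COUNT = 340_427
PROBE_LENGTH = 95  # bases per probe


def covered_mb(size_mb: float, percent_covered: float) -> float:
    """Megabases of a database covered by the kit."""
    return size_mb * percent_covered / 100.0


for name, size_mb, pct in DATABASES:
    print(f"{name}: {covered_mb(size_mb, pct):.1f} of {size_mb} Mb covered")

# Total probe sequence, ignoring any overlap between probes.
probe_sequence_mb = PROBE_COUNT * PROBE_LENGTH / 1e6
print(f"Total probe sequence: {probe_sequence_mb:.2f} Mb")
```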

TruSeq exome enrichment

Protocols, workflows, sequencing library preparation and pooling, and details regarding the TruSeq Exome enrichment assay can be found in the “TruSeq Exome guide” and “TruSeq Exome enrichment kit data sheet”.

The method consists of a 2.5-3 day workflow. An indexing solution is supplied for each DNA sample, which makes pooling of up to twelve samples feasible in a single enrichment reaction. This is a key feature of the technology: it enables an automation-friendly workflow and allows many samples to be processed simultaneously with minimal hands-on time. An overview of the enrichment scheme is shown in Figure S2. In summary, the enrichment workflow steps are: (A) preparation, quantitation, and pooling of indexed libraries (see Preparation of sequencing libraries below); (B) denaturing of libraries; (C) solution-phase hybridization of biotinylated oligonucleotide probes; (D) affinity pull-down of targeted regions using magnetic streptavidin beads and high stringency washing; (E) elution of captured regions of interest and repeat of the pull-down step starting at (C); and (F) PCR amplification of the final eluted targeted libraries using universal primers (P5 and P7 in Figure S2; amplification step not shown).

The enrichment method is unique in the sense that standard biotin-oligonucleotides are used, and that non-stringent high concentration capture probe annealing is combined with two rounds of affinity purification using Tm-normalizing stringency washes. The method is enabled by the high quality, relatively uniform representation, and low preparation cost of the biotin-oligonucleotides, making this method economically attractive for fixed sets with large sample volumes2.

Implementation of two rounds of biotin-oligonucleotide capture and release resulted in increased enrichment specificity (>80%) and overall assay robustness compared to a single round (40-50%). Two key reagents in step C (Figure S2), Cot-1 DNA (KREAcot DNA, Kreatech) and sequencing primer blockers (SBS3 and SBS12), have a significant effect on the enrichment specificity. For example, 100, 50, 10, 5, 1, and 0 ng/uL Cot-1 DNA yield 82.5%, 70.7%, 33.2%, 20.3%, 6.3%, and 2.1% enrichment specificity, respectively. In the second enrichment step, the effect of cross-hybridizing repeat elements is effectively blocked by Cot-1 DNA since fewer non-targeted libraries are present after the first enrichment step. Typical enrichment specificities are >80% and are relatively uniform across library insert sizes ranging from 150 to 500 bp. The protocol results in a relatively uniform read count distribution across the targeted regions. For example, at 0.2X of the mean coverage, >90% of the targeted bases are covered. This is accomplished by a relatively non-stringent overnight hybridization capture at sub-nM concentrations of capture probes to drive hybridization capture, and a highly optimized Tm-normalizing stringency wash to reduce off-target enrichment.
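The uniformity statement above can be made concrete with a small sketch that computes the fraction of targeted bases whose depth is at least a given fraction of the mean depth. We assume here that "covered at 0.2X of the mean coverage" means a per-base depth of at least 0.2 times the mean; the function name and the toy depths are invented for illustration and are not data from the study.

```python
from typing import Sequence


def fraction_at_or_above(depths: Sequence[float], fraction_of_mean: float = 0.2) -> float:
    """Fraction of bases with depth >= fraction_of_mean * mean depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    threshold = fraction_of_mean * mean_depth
    return sum(1 for d in depths if d >= threshold) / len(depths)


# Invented per-base depths over ten targeted bases (mean depth is 40).
toy_depths = [0, 5, 40, 50, 60, 45, 50, 55, 48, 47]
uniformity = fraction_at_or_above(toy_depths, 0.2)
print(f"{uniformity:.0%} of bases at or above 0.2X mean coverage")
```

In this toy case the threshold is 8 reads, and the two bases with depths 0 and 5 fall below it.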

Preparation of sequencing libraries

Sequencing libraries were prepared using the TruSeq® DNA sample preparation kit v2 with 1 ug of gDNA input. We also validated the Nextera® Exome enrichment kit with a few selected samples using only 50 ng of gDNA input.

Preparation of biotinylated capture probes

The synthesis and purification of the biotinylated oligonucleotides for the Exome Capture Target Oligo (CTO) pool is described elsewhere2.

Effect of probe-target DNA mismatches on enrichment efficiency

We designed a set of oligonucleotide probes that enabled evaluation of the efficiency of target DNA enrichment with various types and degrees of mismatches with respect to the probes. Capture probes were designed to span a range of variant lesions consisting of deletions, insertions, consecutive substitutions, or staggered substitutions relative to the hg19 reference genome. Each “variant” category was covered by 100 capture probes chosen from the TruSeq® Exome capture probe set with the following selection rules: (1) random selection from regions with an average number of reads in a typical exome enrichment experiment, (2) consistently good performance, and (3) position within regions covered by only a single probe. The variant probes carried alterations of 0, 1, 2, 3, 4, 5, 7, 9, 12, or 15 bp, where the alterations consisted of insertions, deletions, consecutive substitutions, or staggered substitutions. For insertions, a randomly generated sequence of the specified length was inserted into the middle of the probe. Truncation was performed symmetrically so that probe length was maintained (e.g. for 2 base insertions: ATAT->ATGGAT->TGGA). For deletions, a deletion of the specified length was introduced in the middle of the probe, and flanking sequence was added symmetrically to maintain probe length (e.g. GGATATGG->NGGATGGN). For substitutions, homo-mismatches were used as a worst case scenario. Half of the designs had consecutive mismatches (substitutions), and the other half had uniformly spaced mismatches (staggered substitutions) (e.g. ATGATGAC->ATGTAGAC or ATGTTCAC). We also assessed the effect of probe tiling in cases where more than one probe covers a given genomic region. This was accomplished by designing a second set of probes such that three probes, rather than one probe, annealed to each target region. This trio of probes included the original probe and two flanking probes of equal length. The latter set is useful in determining how flanking probes can help recover the targeted region in cases of inefficient affinity enrichment with a single central probe.
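The three variant-probe constructions can be sketched in a few lines. The functions below reproduce the toy examples given in the text (ATAT->TGGA, GGATATGG->NGGATGGN, and ATGATGAC->ATGTAGAC or ATGTTCAC). They are not the design software used in the study. The function names, the handling of odd lengths, the choice of the complementary base to produce a homo-mismatch, and a spacing of one unaltered base between staggered substitutions are all assumptions inferred from those examples.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def insertion_probe(probe: str, insert: str) -> str:
    """Insert a sequence at the probe midpoint, then trim both ends to keep the length."""
    mid = len(probe) // 2
    extended = probe[:mid] + insert + probe[mid:]
    trim_left = len(insert) // 2
    trim_right = len(insert) - trim_left
    return extended[trim_left:len(extended) - trim_right]


def deletion_probe(probe: str, k: int, left_flank: str, right_flank: str) -> str:
    """Delete k central bases, then add flanking bases on both sides to keep the length."""
    start = (len(probe) - k) // 2
    core = probe[:start] + probe[start + k:]
    add_left = k // 2
    add_right = k - add_left
    if add_left > len(left_flank) or add_right > len(right_flank):
        raise ValueError("flanking sequence too short")
    return left_flank[len(left_flank) - add_left:] + core + right_flank[:add_right]


def substitution_probe(probe: str, k: int, staggered: bool = False) -> str:
    """Replace k bases with their complements, either consecutive or every other base."""
    step = 2 if staggered else 1
    start = (len(probe) - k) // 2
    positions = [start + i * step for i in range(k)]
    if positions and positions[-1] >= len(probe):
        raise ValueError("probe too short for the requested substitutions")
    bases = list(probe)
    for p in positions:
        bases[p] = bases[p].translate(COMPLEMENT)
    return "".join(bases)


print(insertion_probe("ATAT", "GG"))
print(deletion_probe("GGATATGG", 2, "N", "N"))
print(substitution_probe("ATGATGAC", 2))
print(substitution_probe("ATGATGAC", 2, staggered=True))
```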

Variant capture probe preparation

Capture probes were prepared by PCR amplification of a 90K pool of “in situ” array-synthesized oligonucleotides from CustomArray (Bothell, WA). Common amplification primers (Primer 1: AGTCCGCGCAATCAG, and Primer 2: TGCAAGGATCACTCG) were included in the probe sequence and flanked the 80-base capture probe sequence, for a final array probe length of 110 bases. The PCR reaction contained 1X Titanium Taq buffer (Clontech, Mountain View, CA), 1 uM Biotin-Primer 1 (Integrated DNA Technologies (IDT), Coralville, IA), 1 uM Primer 2 (IDT), 200 uM dNTPs (Roche, Indianapolis, IN), 1 uL Titanium Taq (Clontech), 0.1 ng template 90K oligonucleotide pool (CustomArray, 110 bp), and H2O to 100 uL. PCR cycling conditions were: 95 °C (5 min); 30 cycles of 95 °C (30 s), 55 °C (30 s), and 72 °C (60 s); 72 °C (5 min); and a hold at 10 °C. Sera-Mag Magnetic Streptavidin beads (100 uL; MPB, Thermo Scientific, IN) were pre-washed with hybridization buffer (HB: 1 M NaCl, 0.5 M phosphate buffer, 0.1% Tween-20). The biotinylated PCR products (8 uL, 60 uM) were incubated in hybridization buffer with the pre-washed MPB beads for 30 min at RT. The beads were subsequently washed first with 1X HB, then with 0.2X HB, NaOH (0.1 N), and finally with 0.2X HB. The biotin-oligonucleotides were eluted from the MPB using water (100 uL) and heat (95 °C for 10 min) and obtained at a final concentration of ~2 uM3, 4.
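The layout of the array oligonucleotides (two 15-base primer sites flanking an 80-base capture sequence, 110 bases in total) can be checked with a short sketch. The orientation is an assumption made for the example: we place Primer 1 at the 5′ end and the reverse complement of Primer 2 at the 3′ end, which the text does not state explicitly. The function names and the placeholder capture sequence are also ours.

```python
PRIMER_1 = "AGTCCGCGCAATCAG"
PRIMER_2 = "TGCAAGGATCACTCG"
CAPTURE_LENGTH = 80

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def array_oligo(capture: str) -> str:
    """Assemble primer site 1, the capture sequence, and primer site 2.

    Assumed orientation: Primer 1 as written at the 5' end, and the
    reverse complement of Primer 2 at the 3' end.
    """
    if len(capture) != CAPTURE_LENGTH:
        raise ValueError(f"capture sequence must be {CAPTURE_LENGTH} bases")
    return PRIMER_1 + capture + reverse_complement(PRIMER_2)


placeholder_capture = "ACGT" * 20  # invented 80-base sequence, for length checking only
oligo = array_oligo(placeholder_capture)
print(len(oligo))
```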

Variant probe enrichment assays

Enrichment assays were tested with various wash temperatures (32, 42, and 52 °C) in step D (Figure S2) in order to evaluate the effect of wash stringency on specificity, uniformity, and coverage of the targeted regions across the probe variant categories. Not unexpectedly, higher stringency wash temperatures generated higher enrichment efficiencies and lowered the uniformity across all probes. In Figures S3-S5, the average read count per category is shown for the single probe and three probe designs at 32, 42, and 52 °C, respectively. The current protocol requires a 42 °C stringency wash temperature. At this stringency, targeted libraries with up to 11 bp staggered substitutions, 15 bp consecutive substitutions, and 15 bp indels compared to the probes are efficiently enriched without much loss of coverage in the targeted regions. With the longer 95-mer probes in the current TruSeq® Exome kit, it should be possible to detect even larger variations. Larger variations can still be detected without loss in coverage using the three probe design. This strongly indicates that flanking probes can mitigate losses of capture efficiency of the mismatched probe centered in the middle.

Sanger sequencing

Genomic DNA fragments containing each of the 12 candidate exomic variations, which were distributed in 11 genes and considered to be candidate CAD causing variations based on results of linkage analysis and exome sequence data, were amplified by PCR and sequenced. Reference sequences used for design of primers were as follows: VPS13D: NC_000001.10, NM_015378.2; CRYZ: NC_000001.10, NM_001130042; ST6GALNAC5: NC_000001.10, NM_030965; LPHN2: NC_000001.10, NM_012302; TTN: NC_000002.11, NM_001256850; HSPD1: NC_000002.11, NM_002156; IRS1: NC_000002.11, NM_005544; GPR35: NC_000002.11, NM_001195381; IL7R: NC_000005.9, NM_002185; LILRA2: NC_000019.9, NM_001130917; NLRP2: NC_000019.9, NM_001174081. DNAs from the seven affected and three unaffected members of pedigree CAD-105 were analyzed with respect to these variations. The amplicon containing the p.Val99Met causing variation in ST6GALNAC5 was also sequenced in 800 ethnically matched control Iranians not affected with cardiac disorders, and the amplicon containing the p.*337Qext*20 variant was also sequenced in 800 controls. Finally, all exons and adjacent intronic sequences of ST6GALNAC5 were amplified and sequenced in 100 of the control individuals and in 160 Iranian CAD patients unrelated to each other and to pedigree CAD-105. Amplified DNA fragments were sequenced using ABI Big Dye terminator chemistry and an ABI 3730XL genetic analyzer instrument (Applied Biosystems, Foster City, CA). Sequences were analyzed with the Sequencher 4.8 software (Gene Codes Corporation, Ann Arbor, MI). ST6GALNAC5 sequences derived from whole genome sequencing performed in the United States on 150 CAD affected individuals and 800 individuals diagnosed not to be affected with CAD were kindly provided by Dr. Leslie G. Biesecker (Genetic Diseases Research Branch, National Human Genome Research Institute, USA).
In this study, CAD diagnosis was based on having experienced MI, having had a stent placed, having received a bypass, or having presented with more than 50% occlusion upon computed tomography angiography (CTA) or cardiac catheterization. In addition to the reference sequences described above, effects of the nucleotide sequence variations observed were analyzed using protein reference sequences NP_056193.2 (VPS13D), NP_001123514.1 (CRYZ), NP_112227.1 (ST6GALNAC5), NP_036434.1 (LPHN2), NP_001243779.1 (TTN), NP_002147.2 (HSPD1), NP_005535.1 (IRS1), NP_001182310.1 (GPR35), NP_002176.2 (IL7R), NP_001124389.1 (LILRA2), and NP_001167552.1 (NLRP2). The sequences of all primers used are presented in Table S5.
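The diagnostic rule stated at the start of the preceding paragraph is a simple disjunction of four criteria, which can be written out as a sketch. The record structure and field names below are hypothetical and serve only to make the rule explicit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClinicalRecord:
    """Hypothetical per-individual record; field names are illustrative."""
    had_mi: bool = False
    has_stent: bool = False
    had_bypass: bool = False
    # Maximum occlusion (%) seen on CTA or cardiac catheterization, if measured.
    max_occlusion_percent: Optional[float] = None


def meets_cad_definition(record: ClinicalRecord) -> bool:
    """True if any one of the four criteria in the text is met."""
    occluded = (
        record.max_occlusion_percent is not None
        and record.max_occlusion_percent > 50
    )
    return record.had_mi or record.has_stent or record.had_bypass or occluded


print(meets_cad_definition(ClinicalRecord(max_occlusion_percent=65.0)))
print(meets_cad_definition(ClinicalRecord(max_occlusion_percent=50.0)))
```

Note that the threshold is strict: an occlusion of exactly 50% does not satisfy "more than 50%".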

Creation of plasmids pcDNA3.3-ST6GALNAC5, pcDNA3.3-ST6GALNAC5-p.Val99Met, and pcDNA3.3-ST6GALNAC5-p.*337Qext*20

COOH-terminal FLAG-tagged ST6GALNAC5 cDNA was PCR amplified from a human heart cDNA panel (Clontech, Mountain View, CA, USA) using forward primer 5′-AAAATGAAGACCCTGATGCGC-3′ and reverse primer 5′-TTACTTATCATCATCATCCTTATAATCGAACACAGGTTTATTCTCAGGA-3′. The amplicon was cloned into pcDNA3.3 (Invitrogen, Karlsruhe, Germany) using the Topo TA Cloning system (Invitrogen), creating pcDNA3.3-ST6GALNAC5. The plasmid was transformed into One Shot® TOP10 Chemically Competent E. coli cells (Invitrogen, Karlsruhe, Germany), and the sequence of the insert in plasmids isolated from ampicillin resistant cells was confirmed by direct sequencing. Subsequently, the c.G295A mutation that causes p.Val99Met was introduced using the QuikChange site-directed mutagenesis kit (Agilent Technologies, Karlsruhe, Germany) according to the manufacturer’s instructions, creating pcDNA3.3-ST6GALNAC5-p.Val99Met. Primers used contained the sequence 5′-GGGACTGTGCCCTGATGACCAGCTCAG-3′ (nucleotide causing the mutation is underlined) and the reverse complement of this sequence. Briefly, these primers, which are complementary to opposite DNA strands of pcDNA3.3-ST6GALNAC5, are extended during a cycling reaction to create mutated plasmids with staggered nicks. DNA strands in plasmids without nicks are removed by digestion with DpnI. Nicks in surviving plasmids are repaired in vivo after transformation into TOP10 E. coli cells. pcDNA3.3-ST6GALNAC5-p.*337Qext*20 was created by performing overlap extension PCR using forward primer 5′-AAAATGAAGACCCTGATGCGC-3′ and three reverse primers

5′-TGGGATTACAGTCTGGCATGCTCATTCCTTGGAACACAG-3′,

5′-GTGTCTCGGTGTCTGATGCAGTGAATACCTGGGATTACAG-3′, and

5′-TTACTTATCATCATCATCCTTATAATCGTGTCTCGGTGTCTGA-3′. The third reverse primer contained the FLAG sequence. Presence of the mutations was confirmed in isolated plasmids by sequencing.
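Two details of the constructs can be verified computationally, as sketched below. First, cDNA position c.295 is the first base of codon 99, so a G-to-A change there converts GTG (Val) to ATG (Met). Second, the 27-base 5′ tail shared by the FLAG-containing reverse primers, when reverse complemented, encodes the FLAG peptide DYKDDDDK followed by a stop codon. The helper names are ours, and the assumption that the valine codon is GTG is inferred from the mutagenic primer sequence together with the stated c.G295A change.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def codon_number(cdna_position: int) -> int:
    """1-based codon index for a 1-based coding-sequence position."""
    return (cdna_position - 1) // 3 + 1


# c.295 falls on the first base of codon 99.
print(codon_number(295), (295 - 1) % 3)
print(translate("GTG"), "->", translate("ATG"))

# 5' tail of the FLAG-containing reverse primers, as given in the text.
FLAG_TAIL = "TTACTTATCATCATCATCCTTATAATC"
print(translate(reverse_complement(FLAG_TAIL)))
```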

ST6GALNAC5 expression in COS-7 cells

Transfection was performed using Lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions. African green monkey kidney COS-7 cells (ATCC, Rockville, MD, USA) were seeded at a density of 2 × 10^4 cells/100 μl in a 12-well plate (for RT-PCR and Western blotting analyses) or onto poly-L-lysine coated coverslips (for immunofluorescence microscopy) and cultured in Dulbecco’s MEM medium GlutaMAX™ (Invitrogen) supplemented with 10% fetal calf serum and 1% antibiotic (Penicillin-Streptomycin; Sigma, Hamburg, Germany). Cells were grown in an atmosphere with 5% CO2 and 95% humidity at 37 °C. After 24 hours, cells were transfected with pcDNA3.3-ST6GALNAC5, pcDNA3.3-ST6GALNAC5-p.Val99Met, or pcDNA3.3-ST6GALNAC5-p.*337Qext*20. ST6GALNAC5 sequences in plasmids isolated from COS-7 cells were confirmed by sequencing. Expression analyses were performed 24 hours post transfection.

For RT-PCR, RNA was isolated using the RNeasy Mini Kit (Qiagen, Hilden, Germany) and cDNA synthesis was performed with Sprint RT Complete-Random Hexamer first-strand cDNA synthesis kit (Clontech-Takara Bio Europe, Saint-Germain-en-Laye, France). Five micrograms of total RNA was used for reverse transcription.

For Western blotting, PBS washed COS-7 cells were lysed in 1% NP-40 (v/v), 20 mM Tris-HCl, pH 7.6, 150 mM NaCl, 1:1000 protease inhibitor cocktail, and 1 mM EDTA5. Total protein content in supernatants recovered after centrifugation was determined using the BCA assay (Bio-Rad, Munich, Germany). Aliquots containing 20 micrograms of protein were size-fractionated in NuPAGE 10% Bis-Tris gels (Invitrogen) using a buffer that contained 50 mM MOPS, 50 mM Tris Base, 0.1% SDS, 1 mM EDTA, pH 7.7. Proteins were then transferred onto nitrocellulose membranes and probed with mouse monoclonal M2-anti-FLAG (1:1000; F3165, Sigma-Aldrich, Munich, Germany), rabbit anti-human sialyltransferase 7e (1:1000; ab69855, abcam, Cambridge, MA, USA), or goat anti-human lamin B (1:6000; sc-6216, Santa Cruz Biotechnology, Santa Cruz, CA, USA) primary antibody, followed by the appropriate secondary anti-IgG antibody coupled to horseradish peroxidase (anti-mouse IgG, 1:2500, W4021, Promega, Mannheim, Germany; anti-rabbit IgG, 1:4500, 65-6120, Invitrogen; or anti-goat IgG, 1:50000, sc-2768, Santa Cruz Biotechnology). Lamin B served as internal control. Detection was performed using the enhanced chemiluminescence (ECL) Western blotting detection system (Invitrogen). Exposure times were 5 to 15 seconds.

Sialyltransferase enzyme assay

Sialyltransferase enzyme activity in protein extracts of untransfected COS-7 cells and COS-7 cells transfected with pcDNA3.3-ST6GALNAC5, pcDNA3.3-ST6GALNAC5-p.Val99Met, or pcDNA3.3-ST6GALNAC5-p.*337Qext*20 was assayed using the Sialyltransferase Activity Kit (R&D Systems, Wiesbaden-Nordenstadt, Germany), according to the manufacturer’s instructions. This kit utilizes the 5’-nucleotidase CD73 as a coupling phosphatase to remove inorganic phosphate quantitatively from the leaving nucleotide cytidine 5’-monophosphate that is generated during sialyltransferase reactions7. Assays were performed on protein extracts isolated from cells 24 hours after transfection; three independent transfections were performed with each vector, and protein extractions and enzyme assays were done on cells of each transfection experiment. Varying amounts of protein were incubated with 25 nmol of CMP-Neu5Ac (C8271; Sigma), 1 mg of asialofetuin (A4781, Sigma), and 50 ng of Coupling Phosphatase 2 in 1X Assay Buffer for 20 minutes at 37 °C. Released inorganic phosphate was measured by spectrophotometry.
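Because the coupling phosphatase releases phosphate quantitatively from the CMP leaving group, the amount of phosphate measured can be converted into a specific activity. The sketch below shows one way to do this, assuming one phosphate is released per sialyl transfer. The function names and the toy reading are invented for illustration; the kit's own calculation procedure (including any standard curve or background correction) is not reproduced here.

```python
DONOR_NMOL = 25.0     # CMP-Neu5Ac per reaction, from the text
ASSAY_MINUTES = 20.0  # incubation time, from the text


def specific_activity(phosphate_pmol: float, protein_ug: float,
                      minutes: float = ASSAY_MINUTES) -> float:
    """Specific activity in pmol phosphate per minute per microgram protein.

    Assumes one phosphate released per sialic acid transferred.
    """
    if minutes <= 0 or protein_ug <= 0:
        raise ValueError("minutes and protein_ug must be positive")
    return phosphate_pmol / minutes / protein_ug


def donor_fraction_used(phosphate_pmol: float, donor_nmol: float = DONOR_NMOL) -> float:
    """Fraction of the CMP-Neu5Ac donor consumed, as a check for substrate depletion."""
    return phosphate_pmol / (donor_nmol * 1000.0)


# Invented example reading: 400 pmol phosphate from 2 ug of protein extract.
activity = specific_activity(400.0, 2.0)
print(f"{activity:.1f} pmol/min/ug")
print(f"{donor_fraction_used(400.0):.1%} of donor consumed")
```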