MSeqDR/GEM.app Tutorial

Presentation: Marni Falk, MD, MSeqDR Organizer; Member, SIMD

Assistants: Xiaowu Gai, PhD

Lishuang Shen

Colleen Clarke Muraresku, MS, LCGC

Elizabeth McCormick, MS, LCGC

Date: Saturday, June 20, 2015

Time: 11:45 am-1:15 pm

Location: Hyatt Dulles Hotel, 2015 UMDF Meeting, Herndon, VA

Welcome! Today we are going to walk you through a hands-on tutorial for using MSeqDR, GEM.app, and GeneMatcher. These databases give researchers and clinicians tools for real-time queries, data organization, and more, all FREE to academic users! After this tutorial you will be able to log in, navigate within the database, and submit, explore, and share data.

MSeqDR is described in more detail in the March edition of Mol Genet Metab: Falk MJ, et al. Mitochondrial Disease Sequence Data Resource (MSeqDR): A global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities.

*Please keep in mind that both MSeqDR and GEM.app (the new version will be known as “The Genesis Project”) are continuously being updated, and new versions will be released.*

MSeqDR WEB PORTAL:

PERKS OF SIGNING UP!

ACCESS ON HOME PAGE; CLICK ON LARGE BOX ICONS

PART 1. MSeqDR EXERCISES

I. HOW TO GET STARTED:

  • Log in at the MSeqDR web portal if you have already registered, OR sign in using the single login for this session
  • Username: UMDF15, password: Mito15

*You must log in to use some tools, such as submission, and to see ALL data you are authorized to access by MSeqDR and from collaborators who share data with you.*

-You can “Manage Account and Data Access” - request access, grant access, and create your lab/collaborator group.

*Exercise 1: Click on “About”. Hover the cursor over specific items in the toolbar at the top of the screen AND/OR over the large boxes at the bottom.

(5 minutes)

NEWS – includes publication, tutorial, and a place to give feedback and comments

GBrowse – helps to mine data

TOOLS – resources that link out to other sites

GEM.app link - **you can use the same login for GEM.app (to mine your exome-level datasets) that you have for MSeqDR

II. LOOK UP DATA

  • Click on “TOOLS” on the home page header or at the large box icon. A wide variety of distinct resources are accessible from here: GEM.app, MitoMap, POLG database, MT.AT, HBCR, Transcriptome data, etc

*Exercise 2: click on the large box ‘TOOLS’ icon (5 minutes)

1. Enter SURF1 in the search box

-How many entries are listed? Scroll down and see what else is there.

2. Enter MT-ND5 in the search box

-How many entries are listed? Scroll down and see what else is there.

III. USING “MSeqDR GBROWSE”, A CUSTOM GENOME BROWSER

OR click on the large box icon ‘VISUALIZATION’

  • Allows users to access and share data as “community track” input with the public or only collaborators, at your discretion.

*Exercise 3: (15 minutes)

A) MITOCHONDRIAL DNA DISEASE EXAMPLE

1. Click on the ‘M’ under the search bar next to the word examples (this views all the mtDNA, or Chromosome M)

2. Click on MT-ND1 under the search bar next to the word examples

  • Click on ‘Select Tracks’ – next to each track select ‘all on’ – click back to browser
  • On the ruler, zoom in on a small area over region 5 (drag the mouse over the ruler to highlight the region to zoom in on)
  • Is there an Ensembl track?
  • What variant is present in MitoMap and ClinVar?
  • Are there any haplogroup-defining SNPs within this gene?

B) NUCLEAR GENE-BASED MITOCHONDRIAL DISEASE EXERCISE

Click on POLG under the search bar next to the word examples

  • Scroll down and select the second POLG example under the chromosome
  • Click on ‘Select Tracks’ – next to each track select ‘all on’ – click back to browser
  • Scroll down and locate the POLG database
  • Hover over the icons to the left of the database track
  • Click on ‘share track’ – see various options to share track
  • Click on ‘download this track’ – see the multiple file types in which the data can be shared
  • Click ‘question mark’ on each header to find information about this track

IV. USING “Phenome”, A LOCUS-SPECIFIC DISEASE PORTAL

  • The central system to obtain disease specific information by disease or Human Phenotype Ontology (HPO) term.
  • Gene View – Views ALL data that is made public in the database for >1350 genes! All known mitochondrial disease genes and MitoCarta-based nuclear genes that encode known mitochondrial proteins are supported.
  • Mitochondrial disease – View ALL nuclear-encoded gene information and related phenotypic features (Human Phenotype Ontology or HPO terms supported) as they relate to mitochondrial diseases caused by both nuclear and mtDNA genes.

*Exercise 4: (15 minutes)

1. Click on the icon ‘Tools’ and then click on ‘Phenome’, OR click on ‘Phenome’ on the top tool bar and then ‘Disease Portal.’

2. Use the drop-down menu under MSeqDR Mitochondrial Disease Portal and click #18 Kearns-Sayre Syndrome

  • What are the Parent Nodes?
  • How many Child Nodes are present?
  • How many HPO terms are associated with KSS?
  • How many LSD variants are present?

3. Scroll down and click on 1st variant in the box below transcript variant

  • Scroll down and click on “add annotation” – here you can add information about the variant and comment
  • Click ClinVar style
  • Show/Hide community add-on annotation and revise/delete your own annotation

4. Click on ‘Leigh’ under dropdown menu

  • What are the Parent Nodes?
  • How many Child Nodes are present?
  • How many HPO terms are associated with Leigh?
  • How many genes are associated with Leigh Disease?

V. ONE STOP VARIANT MAPPING AND ANNOTATION, mtDNA:

Multiple or single variant annotation and mapping engine for any variant using HGVS input, and also provides a variant name converter for individual genomic variants:

mvTool: Mapping engine exclusively for mtDNA format conversion and annotation:
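To make "format conversion" concrete, the sketch below reduces several mtDNA shorthand notations (the kinds used in the Exercise 5 sample input) to one common record. It is an illustration only and not mvTool's implementation: the names `MtVariant` and `parse_mt_variant` are hypothetical, and the readings of the shorthands ("d" as a deletion, ".1T" or ".XA" as an insertion after the given position) are assumptions based on common mtDNA usage.

```python
import re
from typing import NamedTuple, Optional


class MtVariant(NamedTuple):
    kind: str             # "substitution", "deletion", "insertion" or "position"
    start: int            # first affected mtDNA position
    end: int              # last affected mtDNA position
    ref: Optional[str]    # reference base, if the notation states it
    alt: Optional[str]    # new or inserted base(s), if the notation states them


# Optional leading label such as "m." or "chrMT." (assumed spellings).
_PREFIX = re.compile(r"^(?:chrMT|chrM|MT|m)[.:]", re.IGNORECASE)


def parse_mt_variant(text: str) -> MtVariant:
    """Parse one mtDNA variant string; raise ValueError if it is not recognized."""
    body = _PREFIX.sub("", text.strip())

    m = re.fullmatch(r"(\d+)([ACGT])>([ACGT])", body)      # 8993T>G
    if m:
        pos = int(m.group(1))
        return MtVariant("substitution", pos, pos, m.group(2), m.group(3))

    m = re.fullmatch(r"([ACGT])(\d+)([ACGT])", body)       # T8993G
    if m:
        pos = int(m.group(2))
        return MtVariant("substitution", pos, pos, m.group(1), m.group(3))

    m = re.fullmatch(r"(\d+)([ACGT])", body)               # 8993G (ref not stated)
    if m:
        pos = int(m.group(1))
        return MtVariant("substitution", pos, pos, None, m.group(2))

    m = re.fullmatch(r"(\d+)(?:_(\d+))?d(?:el)?", body)    # 8993d, 8042_8043d
    if m:
        start = int(m.group(1))
        end = int(m.group(2)) if m.group(2) else start
        return MtVariant("deletion", start, end, None, None)

    m = re.fullmatch(r"(\d+)\.(?:\d+|X)([ACGT]+)", body)   # 1494.1T, 7472.XA
    if m:
        pos = int(m.group(1))
        return MtVariant("insertion", pos, pos, None, m.group(2))

    if re.fullmatch(r"\d+", body):                         # 8527 (position only)
        pos = int(body)
        return MtVariant("position", pos, pos, None, None)

    raise ValueError(f"unrecognized mtDNA variant notation: {text!r}")
```

With this sketch, `parse_mt_variant("T8993G")` and `parse_mt_variant("m.8993T>G")` yield the same record, which is the point of normalizing before looking anything up.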

*Exercise 5:

  1. Use mvTool to convert a mixture of mtDNA variant formats and access all associated MSeqDR data, then click “save” to save the results as a text file:

#Sample input of a mixture of multiple mtDNA variant nomenclatures:

m.8993T>G

8993G

T8993G

8993d

8527

8527A>G

chrMT.6328C>T

8042_8043d

1494.1T

7472.XA

2. Enter 1:g.215821999G>A (nuclear-encoded) and click ‘ANNOTATE’

  • What is the population frequency?
  • Is there a dbSNP associated with this position?
  • Click on NCBI and Ensembl links under ‘Disease and phenotype in-house data, ClinVar and more’

3. Enter AARS2:c.1774C>T (nuclear-encoded) and click ‘ANNOTATE’

  • What is the genomic position provided by mutalyzer?

4. Additional example set to enter and annotate:

  • 8:g.10480578C>T
  • NM_178857.5:c.134G>A
  • m.13513G>A
  • 19:g.54621653_54621662del
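The inputs in this exercise share a common shape: an optional reference (chromosome, gene symbol, or transcript), a coordinate type (g, c, or m), a position or range, and a change. The sketch below splits such strings into those parts. It covers only the simple forms used in this exercise, not the full HGVS nomenclature, and it is not the parser used by MSeqDR or Mutalyzer; `HgvsVariant` and `parse_hgvs` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified pattern: [reference:]type.start[_end]change
# Only substitutions, "del" and "dup" are handled in this sketch.
_HGVS = re.compile(
    r"^(?:(?P<reference>[^:\s]+):)?"
    r"(?P<coord_type>[gcm])\."
    r"(?P<start>\d+)(?:_(?P<end>\d+))?"
    r"(?P<change>[ACGT]>[ACGT]|del|dup)$"
)


@dataclass(frozen=True)
class HgvsVariant:
    reference: Optional[str]   # e.g. "1", "AARS2", "NM_178857.5"; None for "m." input
    coord_type: str            # "g" genomic, "c" coding, "m" mitochondrial
    start: int
    end: int
    change: str                # e.g. "G>A" or "del"

    @property
    def length(self) -> int:
        """Number of reference positions covered by the variant."""
        return self.end - self.start + 1


def parse_hgvs(text: str) -> HgvsVariant:
    """Split a simple HGVS-style string into its parts; raise ValueError otherwise."""
    m = _HGVS.match(text.strip())
    if m is None:
        raise ValueError(f"not a supported HGVS-style string: {text!r}")
    start = int(m.group("start"))
    end = int(m.group("end")) if m.group("end") else start
    return HgvsVariant(m.group("reference"), m.group("coord_type"),
                       start, end, m.group("change"))


if __name__ == "__main__":
    for example in ["1:g.215821999G>A", "AARS2:c.1774C>T",
                    "NM_178857.5:c.134G>A", "m.13513G>A",
                    "19:g.54621653_54621662del"]:
        print(parse_hgvs(example))
```

Splitting the string this way shows why the same variant can be named several ways: the reference and coordinate type change, while the underlying change does not.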

VI. DATA SUBMISSION

New single variant submission to MSeqDR LOVD: minimal to no typing is required. Instead, input variant, and click “ANNOTATE” to pre-populate known information about the variant (HGVS format).

Click “Submission” Menu, go to Single Variant:

LOVD style:

ClinVar Style: (up to 24 fields can be Auto Filled).

***For the next 3 exercises, use the attached abstracts for the PubMed ID, variant, disease, and phenotype information, and practice entering variant data into MSeqDR.

  • *Exercise 6. Submit a Single New Variant (information) to MSeqDR LOVD:

After the new variant is added with LOVD style, try entering additional information from within its specific variant description page:

a) Quick Comment blog,

b) Full ClinVar annotation.

**You can show or hide these variants after commenting. They will be attributed to your name and email, and can be viewed by all other signed-in MSeqDR users.

  • *Exercise 7. Full Study (Data Set) Submission to MSeqDR:
  • Select ‘Full Study: Create’ under submission tab.
  • Select an example beginning with “Template”
  • Click use as a template underneath the selection
  • REVISE study name for the template
  • Click “Save/Continue”
  • Upload page – upload a VCF file for the variants; other files can be added if you find them helpful.
  • Then click “FINALIZE” for the submission; a study accession number will be generated.

  • *Exercise 8. Determine the haplogroup of your mtDNA genome data using Phy-Mer

  • Click on example #2 phymer_example.fastq or phymer_example.fasta and annotate
  • OR download the sample files from this page and try uploading, or copy-paste: SAMPLE INPUT: fasta, fastq, bam, csv
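For the VCF upload step in Exercise 7, it can help to check the basic layout of a file before uploading it. The sketch below is a minimal pre-upload sanity check based on the general VCF layout (a `##fileformat` line, a `#CHROM` header line, tab-separated records). It does not reproduce whatever validation MSeqDR performs, which is not described here; the function name `check_vcf_text` and the contig name `MT` in the sample are assumptions for illustration.

```python
from typing import List

# The eight fixed columns that open the VCF header line.
REQUIRED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf_text(text: str) -> List[str]:
    """Return a list of layout problems; an empty list means none were found."""
    problems: List[str] = []
    lines = [ln for ln in text.splitlines() if ln.strip()]

    if not lines or not lines[0].startswith("##fileformat=VCF"):
        problems.append("first line should start with ##fileformat=VCF")

    header = next((ln for ln in lines if ln.startswith("#CHROM")), None)
    if header is None:
        problems.append("missing #CHROM header line")
        return problems

    columns = header.split("\t")
    if columns[:8] != REQUIRED_COLUMNS:
        problems.append("header does not start with the 8 fixed VCF columns")

    # Check each data record against the header width and a numeric POS.
    records = [ln for ln in lines if not ln.startswith("#")]
    for number, record in enumerate(records, start=1):
        fields = record.split("\t")
        if len(fields) != len(columns):
            problems.append(
                f"record {number}: expected {len(columns)} fields, found {len(fields)}"
            )
        elif not fields[1].isdigit():
            problems.append(f"record {number}: POS is not a whole number")
    return problems


if __name__ == "__main__":
    sample = "\n".join([
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "MT\t8993\t.\tT\tG\t.\tPASS\t.",   # m.8993T>G written as a VCF record
    ])
    print(check_vcf_text(sample))
```

A check like this only catches layout mistakes such as missing headers or space-separated columns; it says nothing about whether the variants themselves are correct.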

VII. DATA SHARING MECHANISMS

LSDB links out in real-time through gene variant tracks from the LSDB GENE page to the GBrowse or other public Genome browsers:

Graphical displays and utilities
Graphs / Graphs displaying summary information of all variants in the database »
MSeqDRGenomeBrowser / Show variants in the MSeqDR Genome Browser (full view, compact view)
UCSCGenomeBrowser / Show variants in the UCSC Genome Browser (full view, compact view)
EnsemblGenomeBrowser / Show variants in the Ensembl Genome Browser (full view, compact view)

* ACCOUNT MANAGEMENT. You manage your own account and submissions and determine who can have access to some data, certain projects, or ALL of your data, at your discretion.

* SINGLE GENE PAGES enable you to:

  • Explore related WEB links accessible from this page (diseases, GeneCard, etc)
  • View transcript isoforms, variants, and other users’ blog annotation
  • Share data and push variants from the current gene to external genome browsers. UCSC is a good example where MSeqDR database variants are shown. We will also be pushing curated, annotated variants from MSeqDR to ClinVar.

CONTACT AND FEEDBACK:

Interested in joining the MSeqDR consortium and collaboration, including depositing aggregate data sets from your clinical or research laboratory, adding central access to additional data resources, or contributing to curation of specific gene(s)?

Please contact Dr. Marni Falk or Dr. Xiaowu Gai.

Technical questions about the tutorial, data, tools and website? Please contact Dr. Lishuang Shen, or leave feedback.

Part II. GEM.app Exercises

I. Getting Started in GEM.app

  1. Log in to GEM.app
  2. Username UMDF15 password Mito15
  3. Enter “UMDF2015” and submit request

II. Navigate Dashboard

  1. There are two tabs “Phenotypes” and “Samples”
  2. How many families with HSP (Hereditary Spastic Paraplegia) do you have access to?
  3. How many samples (data sets) do you have access to?
  4. Click on the link for family 20065 to view the pedigree
  5. What is the expected inheritance for this family?
  6. Click on the link for family 1781 to view the pedigree
  7. What is the expected inheritance for this family?
  8. What is the number of sequence reads for the proband in family 1781?

III. Analyze NGS data for families 20065 and 1781

  1. On the left panel click “Variants Within Families”
  2. Displayed are all the annotations accessible to you while filtering genomic data
  3. GEM.app has an option to use “predefined filters”, which automatically populates specific filter options for ease of use.
     • Relaxed – least stringent filters (non-synonymous variants with MAF (minor allele frequency) < 2%)
     • Moderate – moderately stringent filters (non-synonymous variants with MAF < 0.5% and predicted to be under evolutionary constraint)
     • Strict – most stringent filters (non-synonymous variants with MAF < 0.05%, predicted to be under evolutionary constraint, and predicted to have a functional impact)
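The three predefined filters can be read as progressively tighter conditions on each variant. The sketch below encodes the MAF thresholds stated above (2%, 0.5%, 0.05%) as fractions. How GEM.app decides that a variant is "under evolutionary constraint" or has "functional impact" is not described in this tutorial, so those are represented as plain true/false fields; the `Variant` class, the `PRESETS` table, and the gene names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    gene: str
    nonsynonymous: bool
    maf: float               # minor allele frequency as a fraction (0.004 = 0.4%)
    conserved: bool          # assumed flag: predicted to be under evolutionary constraint
    functional_impact: bool  # assumed flag: predicted to have a functional impact


# Thresholds taken from the filter descriptions above, converted from percent.
PRESETS = {
    "relaxed":  {"max_maf": 0.02,   "need_conserved": False, "need_impact": False},
    "moderate": {"max_maf": 0.005,  "need_conserved": True,  "need_impact": False},
    "strict":   {"max_maf": 0.0005, "need_conserved": True,  "need_impact": True},
}


def passes(variant: Variant, preset: str) -> bool:
    """Return True if the variant satisfies every condition of the named preset."""
    rules = PRESETS[preset]
    return (
        variant.nonsynonymous
        and variant.maf < rules["max_maf"]
        and (variant.conserved or not rules["need_conserved"])
        and (variant.functional_impact or not rules["need_impact"])
    )


def passing_presets(variant: Variant) -> List[str]:
    """List the presets (least to most stringent) that the variant passes."""
    return [name for name in PRESETS if passes(variant, name)]
```

For example, a made-up variant `Variant("GENE_A", True, 0.004, True, True)` passes the Relaxed and Moderate presets but not Strict, because 0.4% is above the 0.05% cutoff. This is why switching from Relaxed to Moderate or Strict shortens the candidate list.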

Now we will analyze the exomes of family 20065:

  1. Select the suspected mode of inheritance
  2. Enter 20065 in the family selection field
  3. Using Relaxed filters, how many variants do you observe?
  4. Using Relaxed filters, do you see any interesting candidates?
  5. Now analyze this family using Moderate filters. If you click on “Modify filters” at the top left, it will take you back to the filter options.
  6. Using Moderate filters how many variants do you observe?
  7. Do you see any clinically significant variants? (eg, any variants listed in ClinVar or PubMed?)
  8. Click the link for PubMed to view the paper associated with a particular variant.
  9. Are there any variants that appear to be associated with Charcot-Marie-Tooth disease?
  10. Once you find the suspected gene mutation for this family, what are the values for the following fields:
     • MutationTaster
     • MutationAssessor
     • Minor Allele Frequency in NHLBI EVS
     • GERP Conservation Score

Now we will analyze the exomes of family 1781

  1. Select the suspected mode of inheritance
  2. Enter 1781 in the family selection field
  3. Using Relaxed filters, how many variants do you observe?
  4. Are there too many variants to make a decision?
  5. Try adding a filter for mutations in known HSP genes. This option will be in the “Clinical Significance” filter option.
  6. Does this help narrow down the list?
  7. Does this gene have an OMIM entry?

***GEM.app also provides a MatchMaker function, “GVDHD”, where you can identify other individuals in the database with the same mutation or other mutations in the same gene, and then be put in contact with the submitter of those datasets. It is your choice whether and when to make your exome data set discoverable to others, and then only at the individual-variant level.

Technical questions about the tutorial, data, tools and website?
Please contact Dr. Michael Gonzalez.

Part III. GeneMatcher

GeneMatcher enables you to connect with other individuals who have a patient with mutations in a novel gene of interest. You match only on generic GENE-based information, with an email sent to notify other individuals who have entered that GENE into the database to determine if you have a “match” of patients with similar phenotypes or inheritance patterns.

  1. Create an account
  2. Login
  3. Enter a GENE NAME
  4. Enter a GENOMIC POSITION
  5. Enter a DISEASE NAME and an OMIM number
  6. Enter Features
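The gene-level matching idea described above can be pictured as grouping submissions by gene and reporting any gene entered by more than one submitter. The sketch below illustrates that idea only. GeneMatcher's actual matching logic and notification mechanism are not described in this tutorial, and the function name `find_matches` and the submitter addresses in the test are invented placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_matches(submissions: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (submitter, gene) pairs by gene.

    Returns only genes entered by two or more different submitters,
    mapped to the sorted list of those submitters.
    """
    by_gene = defaultdict(set)
    for submitter, gene in submissions:
        # Upper-case and trim so "aars2 " and "AARS2" are compared as the same entry.
        by_gene[gene.strip().upper()].add(submitter)
    return {
        gene: sorted(submitters)
        for gene, submitters in by_gene.items()
        if len(submitters) > 1
    }
```

Because only the gene name is compared, a reported match is a starting point: the submitters still need to compare phenotypes and inheritance patterns by email to decide whether it is meaningful.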

Technical questions about the tutorial, data, tools and website?
Please contact Dr. Nara Sobreira.
