Class 8

Essentials of Next Generation Sequencing 2013

Integrative Genomics Viewer (IGV)

Background

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

Helga Thorvaldsdóttir, James T. Robinson, Jill P. Mesirov. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 2012.

James T. Robinson, Helga Thorvaldsdóttir, Wendy Winckler, Mitchell Guttman, Eric S. Lander, Gad Getz, Jill P. Mesirov. Integrative Genomics Viewer. Nature Biotechnology 29, 24–26 (2011).

8.1 Installing IGV

Go to

Scroll down the page to “Downloads” and click on the “register” link.

Type your information and click “Agree”.

On the left-hand side click on “Downloads”

For this workshop we will use the 1.2 GB memory build of IGV, which allows IGV to use up to 1.2 GB of memory. If IGV runs out of memory (shown at the bottom right of the IGV window) and does not load files properly, switch to a build with a larger memory allowance.

Click the appropriate “Launch” button, open the IGV .jnlp file you downloaded (e.g. igv_mm.jnlp), and allow it to run. You may want to click “Always trust content from this publisher.”

8.2 Adding a Genome and Annotations

Input(s): magnaporthe_oryzae_70-15_8_single_contig.fasta

rnaseq/70-15_RNA_sample_1_thout/accepted_hits.bam

genes/maker/maker-preview.gff or maker-annotations.gff

Output(s): accepted_hits.bam.bai

Tracks and annotations visible in IGV

8.2.1 Adding a Genome

IGV can keep track of multiple genome assemblies. To load our assembly, we need to copy it from the server and then load it into IGV as a new genome. First, copy the genome from the server:

For PC:

Open WinSCP.

Connect to our server (csurs11.csr.uky.edu) with your username, password, and port.

In the left pane, Browse to your USB drive.

In the right pane, browse to your assembly directory.

Drag the magnaporthe_oryzae_70-15_8_single_contig.fasta file from the right pane to the left pane to transfer it to the local computer.

For Mac OSX:

Use scp in the terminal, specifying the port number with -P (uppercase), and substituting your own port, username, and paths:

  • scp -P PORT USERNAME@csurs11.csr.uky.edu:path/filename destination
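As an illustration of how the pieces of the scp command fit together, the sketch below assembles a complete command and prints it for review. The port (22), the username (student01), and the remote path are hypothetical placeholders, not values from this workshop; only the server name comes from the WinSCP instructions above.

```shell
# Print an scp command for copying one file from the class server.
# Arguments: 1 = port, 2 = username, 3 = remote path, 4 = local destination.
# All example values used below are hypothetical placeholders.
scp_command() {
  printf 'scp -P %s %s@csurs11.csr.uky.edu:%s %s\n' "$1" "$2" "$3" "$4"
}

# Show the command for review; copy and run it once the values are correct.
scp_command 22 student01 assembly/magnaporthe_oryzae_70-15_8_single_contig.fasta .
```

Note that scp uses an uppercase -P for the port; lowercase -p instead preserves file modification times and modes.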

Now, we will load the file into IGV.

Open IGV

In the menu, select Genomes > Load Genome from File...

Browse to the folder on your USB drive where you copied files from the server, and select magnaporthe_oryzae_70-15_8_single_contig.fasta.

8.2.2 Adding Annotation Tracks

To import BAM and SAM files into IGV, we will first have to index them with SAMtools.

Open a shell and navigate to the folder containing your mycelial sample 1 TopHat run data. There, run samtools:

  • samtools index accepted_hits.bam

This will create the index file accepted_hits.bam.bai in the same folder.
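Before copying the files, it can help to confirm that the index really exists. The following is a minimal sketch, not part of the original workshop: the function name ensure_bam_index is hypothetical, and it assumes the index is written next to the BAM file with a .bai suffix, as described above.

```shell
# ensure_bam_index BAM
# Succeeds if BAM.bai already exists and is non-empty; otherwise tries to
# build it with samtools (if samtools is on the PATH).
ensure_bam_index() {
  bam="$1"
  bai="${bam}.bai"
  if [ -s "$bai" ]; then
    echo "index present: $bai"
    return 0
  fi
  if command -v samtools >/dev/null 2>&1; then
    samtools index "$bam" && echo "index created: $bai"
  else
    echo "samtools not found on PATH" >&2
    return 1
  fi
}

# Usage, from the TopHat output folder:
#   ensure_bam_index accepted_hits.bam
```

IGV looks for the .bai file alongside the BAM file, so keep the two together when you copy them.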

Now follow the WinSCP or scp instructions in section 8.2.1 again, this time to retrieve the following files:

rnaseq/70-15_mycelial_RNA_sample_1_thout/accepted_hits.bam
rnaseq/70-15_mycelial_RNA_sample_1_thout/accepted_hits.bam.bai
genes/maker/maker-preview.gff

Then, return to IGV.

In the menu, select File > Load from File...

Open accepted_hits.bam

Repeat this process to load maker-preview.gff; if your MAKER run has already completed, you can use your maker-annotations.gff instead.

Results may not appear immediately, or if they do they may be difficult to read.

  • If you cannot see the features, zoom in using the zoom tool at the top right. If you still cannot see features, use the bar at the top to scroll around.
  • If the features are too clumped together to see, right-click on the name of the track to display a context menu. You should see three options: collapsed, expanded, and squished. Click on expanded and you should find your data easier to see.

Within the right-click context menu you can also do several other things such as: rename tracks, adjust heights and colors, etc. Different file types will have different options that alter the tracks. Descriptions of these options can be found at

8.3 Creating Coverage Tracks

Input(s): accepted_hits.bam

Output(s): accepted_hits.bam.tdf

Correct coverage track displayed in IGV

When you load the track, IGV will show a quickly generated coverage track. However, this will not be completely accurate. To load a more accurate coverage map we need to count the entries in the .bam file with igvtools.

In the menu, select Tools > Run igvtools

Select these options:

  • Command: Count (should be selected by default)
  • Input file: accepted_hits.bam
  • Output file: accepted_hits.bam.tdf (will be automatically filled when you select your input file)
  • Genome: magnaporthe_oryzae_70-15_8_single_contig.fasta (should be selected by default)

Click Run. Wait for ‘Done’ to appear in the box at the bottom of the screen, and then close the window.

Back in the browser, right-click on the “accepted_hits.bam Coverage” track and select “Load Coverage Data”.

Select the accepted_hits.bam.tdf file which was just created.

In the right-click context menu, you can also change the scaling for the coverage map.

Log scale: Toggles logarithmic scaling for that track.

Auto scale: Toggles the auto scaling function for a given track.
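The count step earlier in this section can also be run from a shell, since igvtools ships with a command-line "count" command that takes an input file, an output file, and a genome. The sketch below is an assumption-laden illustration rather than the workshop's method: the helper names are hypothetical, and the genome argument is assumed to be a genome ID or chromosome sizes file (genome.chrom.sizes is a made-up name), so check what your igvtools version accepts.

```shell
# tdf_name INPUT: print the default output name (INPUT plus ".tdf"),
# mirroring what the igvtools dialog fills in automatically.
tdf_name() {
  printf '%s.tdf\n' "$1"
}

# count_coverage BAM GENOME: run "igvtools count" if igvtools is installed,
# otherwise print the command that would be run.
# GENOME is an assumption here: a genome ID or chromosome sizes file.
count_coverage() {
  bam="$1"
  genome="$2"
  out="$(tdf_name "$bam")"
  if command -v igvtools >/dev/null 2>&1; then
    igvtools count "$bam" "$out" "$genome"
  else
    echo "would run: igvtools count $bam $out $genome"
  fi
}

# Usage (genome.chrom.sizes is a hypothetical file name):
#   count_coverage accepted_hits.bam genome.chrom.sizes
```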

8.4 Getting Heat maps from a .bam file

Input(s): accepted_hits.bam

magnaporthe_oryzae_70-15_8_single_contig.fasta

Output(s): accepted_hits.bedgraph

Heat map displayed in IGV

We are going to use the program genomeCoverageBed, which is part of the BEDTools suite, to create a .bedgraph file that we can load into IGV and display as a heat map.

Type:

  • genomeCoverageBed -ibam ~/rnaseq/70-15_RNA_sample_1_thout/accepted_hits.bam -bg -g ~/magnaporthe_oryzae_70-15_8_single_contig.fasta > ~/accepted_hits.bedgraph
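Before copying the result, a quick sanity check of the file can catch an empty or malformed output. With -bg, each line has four tab-separated columns (sequence name, 0-based start, end, depth), and only intervals with non-zero depth are reported. The sketch below uses a hypothetical helper, bedgraph_summary, to total the covered bases and the length-weighted mean depth.

```shell
# bedgraph_summary FILE: print the number of covered bases and the mean
# depth over covered intervals for a four-column bedGraph file.
# Lines with fewer than four tab-separated fields are skipped.
bedgraph_summary() {
  awk -F '\t' '
    NF >= 4 { len = $3 - $2; bases += len; depth += len * $4 }
    END {
      if (bases > 0) printf "covered_bases=%d mean_depth=%.2f\n", bases, depth / bases
      else print "covered_bases=0"
    }' "$1"
}

# Usage:
#   bedgraph_summary ~/accepted_hits.bedgraph
```

A result of covered_bases=0 suggests the command produced no coverage lines and should be rerun before the file is loaded into IGV.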

Now use WinSCP, or scp in the terminal, to copy the accepted_hits.bedgraph file to the local machine you're working on (see section 8.2.1 above).

Load the accepted_hits.bedgraph file into IGV using File > Load from File... (see section 8.2.2 above).

Right-click the “accepted_hits.bedgraph” track and change the type of graph to Heatmap.

Citations

Quinlan AR and Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 6, pp. 841–842.
