EXPERIMENT

PHYLOGENETIC ANALYSIS

Traditional phylogeny is based on morphological and anatomical comparison:

·  Presence / absence of true tissues

·  Diploblastic (2 germ layers) / triploblastic (3 germ layers)

·  Type of body symmetry

·  Presence / absence of body cavity (coelom)

·  Body cavity enclosed/not enclosed in mesoderm

·  Pattern of coelom development (acoelomate, pseudocoelomate or true coelomate)

Modern phylogeny is based on genetic data and DNA sequence comparison

Advances in DNA sequencing techniques have made large-scale sequencing practical and more affordable, allowing a huge accumulation of sequence data for any organism of interest. Sequences of genes that are highly conserved across all organisms are used in such studies. The most widely used sequences for phylogenetic and evolutionary studies have been those of ribosomal RNA (rRNA), which change extremely slowly over time. The comparison of ribosomal RNA sequences from many organisms led to a new classification into three domains, Bacteria (eubacteria), Archaea and Eukarya, and to a new, much more precise phylogenetic tree of life.

In this exercise we will determine the evolutionary relationship between the elongation factors EF-Tu and EF-1α of a number of organisms based on sequence homology, and construct a phylogenetic tree that shows the evolutionary distances between the various organisms.

GOAL:

The goal of this exercise is to familiarize you with the basic bioinformatics tools available free on the internet to search, retrieve and analyze DNA and protein sequences held in public databases such as GenBank or Swiss-Prot.

You will need to retrieve the protein sequences from public databases, make sequence alignments, determine the % homology between the sequences and create a phylogenetic tree based on the sequence alignment result. All of those steps are done on the computer and require no more than two hours. As you become familiar with this type of analysis, the time required to construct a phylogenetic tree for any gene available in the public sequence databases will decrease.


EXERCISE:

1) Retrieve the DNA or protein sequences from GenBank and save them in a Word file.

2) Align the protein sequences using ClustalW and find the conserved motifs and pairwise homology

3) Present a table of pairwise % homology between the various sequences

4) Create a phylogenetic tree based on the results of the protein or DNA alignment

5) Analyze the results

1) Retrieve protein sequences from the public database

Go to the NCBI site: http://www.ncbi.nlm.nih.gov/

Click on PubMed at the top left and click on Go on the top right

Click on Protein on the black row. You will get to the Protein search page.

You want to Search PubMed for elongation factor protein sequences (top of the screen).

You will find over 1000 sequences for various organisms. You will retrieve the sequences of EF-Tu and EF-1α for specific organisms:

·  Type EF-Tu E. coli (or any protein of interest) in the search box and click Go

·  Scroll down the page until you see the first mention of Elongation factor Tu (EF-Tu) (P0A6N1)

·  Click on the accession number

·  Scroll to the very end of the page, where you will see the amino acid sequence of the protein (in one-letter code). This is not the right format for alignments; you need the FASTA format.

At the top left of the screen, use Display to select the format you want.

Select Display: FASTA and press Return.

You now have the protein sequence displayed in capital letters, in the one-letter amino acid code.

·  Copy and paste the sequences into a Word document one after the other, including the header line at the beginning (>gi…..)

·  Select the FASTA report for each of the following (feel free to add more) and paste it into the Word document.

Two Eubacteria: E. coli (Gram-) P0A6N1 and Bacillus subtilis (Gram+) P33166

Two archaebacteria: Methanocaldococcus jannaschii AAB98308

Pyrobaculum calidifontis YP_001056002

Chloroplast Pisum sativum (pea) CAA74893

Unicellular eukaryotes

Candida albicans XP_717581

Tetrahymena XP_001032213

Many multicellular eukaryotes

Porifera Ephydatia cooperensis sponge AAT06177

Cnidaria Hydra magnipapillata BAA11471

Platyhelminthes Girardia tigrina CAB89808, Dugesia japonica BAA08663

Nematodes C. elegans AAA81688

Mollusks Mytilus edulis (bivalve) AAD21859

Annelids Eunice yamamotoi BAA25733

Arthropoda

Drosophila melanogaster NP_996316

Raphia abrupta (yellowmarked caterpillar) AAC47605

Australobius scabrior AAQ77068

Echinodermata Eucidaris tribuloides AAT06181

Chordata (frog, Zebrafish, Chicken, cow, Human)

Xenopus laevis (African clawed frog) CAA39027

Danio rerio (zebrafish) AAY85516

Pelodiscus sinensis (Chinese softshell turtle) AB124568.1

Gallus gallus (chicken) NP_989488

Bos taurus (cattle) BAB60846

Rattus norvegicus (Norway rat) AAI11708

Sus scrofa (pig) ABG65696

Canis lupus familiaris (dog) XP_850819

Homo sapiens (human) NP_001393
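The copy-and-paste retrieval above can also be scripted. The following is a minimal Python sketch, not part of the original protocol. It assumes that the NCBI E-utilities efetch service will return FASTA text for the accession numbers listed above. The function names efetch_url, parse_fasta and fetch_fasta are hypothetical helpers introduced only for this illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accessions):
    """Build a URL asking for protein records as plain FASTA text."""
    query = urlencode({
        "db": "protein",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined later
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_fasta(accessions):
    """Download the records (needs network access) and parse them."""
    with urlopen(efetch_url(accessions)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

For example, fetch_fasta(["P0A6N1", "P33166"]) would request the two bacterial EF-Tu records. The parse_fasta function can also be run on text copied by hand from the Word document, as a check that every sequence still has its header line.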

2) Alignment:

Once you have retrieved all of the protein sequences and stored them as FASTA in a Word document, you need to compare the sequences to each other by aligning them one under the other. To do that you need a program that can do sequence alignments, such as Multiple Sequence Alignment by CLUSTALW.

·  Go to: http://align.genome.jp/

·  Paste the entire document with the various sequences of elongation factor into the ClustalW window. The sequences must be in FASTA format, with the description line at the beginning of each protein sequence.

·  Make a multiple alignment for all the proteins. The ClustalW alignment will let you choose some features. You can also use the ClustalW help to learn about parameter settings and other things.
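Once the alignment is available, conserved motifs can be read off by eye or located with a few lines of code. The sketch below is only an illustration. It assumes the aligned sequences have already been copied out of the ClustalW output as equal-length strings with "-" for gaps, and it treats a "motif" simply as a run of columns in which every sequence has the same residue. The function name conserved_runs and the min_len threshold are hypothetical choices.

```python
def conserved_runs(aligned, min_len=3):
    """Return (start, motif) for each run of identical, gap-free columns.

    `aligned` is a list of equal-length strings from a multiple alignment;
    `start` is the 0-based alignment column where the run begins.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must all have the same length")

    runs, start = [], None
    # Loop one column past the end so that a run touching the end is closed.
    for col in range(length + 1):
        conserved = (
            col < length
            and aligned[0][col] != "-"
            and all(s[col] == aligned[0][col] for s in aligned)
        )
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, aligned[0][start:col]))
            start = None
    return runs
```

Lowering min_len reports shorter runs. Requiring strict identity in every sequence is a deliberately simple criterion, so positions that ClustalW marks as conservative substitutions are not counted here.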

3) Table of % homology

ClustalW can calculate the pairwise homology between any two sequences as a percentage.

Collect those numbers and present them as a table with each sequence listed both down the rows and across the columns. Fill each entry with the % homology for that pair.
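As a cross-check on the numbers reported by ClustalW, the table can be rebuilt from the aligned sequences. This is a minimal sketch under stated assumptions: "% homology" is taken to mean the percentage of identical residues, counted only over alignment columns where neither sequence has a gap. ClustalW's own scores may be computed differently, so small differences are to be expected. The function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent identical residues over columns where neither has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_table(aligned):
    """Square matrix of pairwise percent identities."""
    n = len(aligned)
    return [[percent_identity(aligned[i], aligned[j]) for j in range(n)]
            for i in range(n)]


def format_table(names, table):
    """Plain-text table with the names as both row and column labels."""
    width = max(len(name) for name in names) + 2
    lines = ["".ljust(width) + "".join(name.rjust(width) for name in names)]
    for name, row in zip(names, table):
        cells = "".join(f"{value:.1f}".rjust(width) for value in row)
        lines.append(name.ljust(width) + cells)
    return "\n".join(lines)
```

Calling print(format_table(names, identity_table(aligned))) gives a layout that can be pasted into the report. The table is symmetric, so only the half above or below the diagonal needs to be discussed.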

4) Phylogenetic tree

The ClustalW program can also generate a phylogenetic tree.

Enter the protein alignment to obtain the pairwise distance between each pair of EF-Tu or EF-1α sequences. You then have to produce your own tree, either by hand or with a phylogenetic tree program.
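To make the step from pairwise distances to a tree concrete, here is a minimal Python sketch of UPGMA clustering that writes the result in Newick format, which most tree viewers can read. UPGMA is used here only because it is short to write down. It is not necessarily the method ClustalW applies, and it assumes a roughly constant rate of change along all branches. The function name upgma and the choice of distance (for example 1 - identity/100) are assumptions made for this illustration.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA and return the tree as a Newick string.

    `dist` is a symmetric matrix (list of lists) of pairwise distances,
    indexed in the same order as `names`.
    """
    if not names:
        raise ValueError("at least one taxon is required")
    # cluster id -> (Newick text, number of taxa, height of the node)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # pairwise distances keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)

    while len(clusters) > 1:
        # Join the two closest clusters.
        (a, b), dab = min(d.items(), key=lambda item: item[1])
        text_a, size_a, height_a = clusters.pop(a)
        text_b, size_b, height_b = clusters.pop(b)
        height = dab / 2.0
        text = (f"({text_a}:{height - height_a:.4f},"
                f"{text_b}:{height - height_b:.4f})")
        # Distance to every other cluster: size-weighted average.
        new_d = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[c] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        # Drop distances involving the merged clusters, add the new ones.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        for c, value in new_d.items():
            d[(c, next_id)] = value
        clusters[next_id] = (text, size_a + size_b, height)
        next_id += 1

    (text, _, _), = clusters.values()
    return text + ";"
```

Working through a three-taxon example by hand and comparing it with the output of this function is a useful way to check a hand-drawn tree before moving on to the full data set.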

There are other programs that can generate phylogenetic trees:

http://www.genebee.msu.su/services/phtree_reduced.html (Phylogenetic tree)

Phylip

5) DATA ANALYSIS:

·  What does the tree tell you about the evolution of prokaryotes, eukaryotes and archaebacteria?

·  Where do flatworms stand in the evolution of animals?

·  What is the closest phylum to Platyhelminthes?

·  The most divergent?

·  Does the tree fit with the current classification of animals?


APPENDIX A: LAB SET UP

At each bench

·  A dissecting microscope (for two people)

·  A phase contrast compound microscope (for 4 people)

·  One heat block with holes for large (1.5 ml) Eppendorf tubes at 37 ºC, for everyone

·  One heat block with holes for large (1.5 ml) Eppendorf tubes at 65 ºC, for everyone

·  Plastic tupperware to hold planaria colonies

·  One sleeve of small petri dishes per team (to hold individual planaria and cut pieces)

·  Scissors and Saran wrap (to put over ice)

·  Gloves

·  Ice bucket

·  Microcentrifuge (one for 4 people-2 teams)

·  Sterile Pasteur pipettes (to spool DNA)

·  20 µl, 200 µl and 1000 µl Pipetman pipettes (for DNA work)

·  Sterile tips for P20, P200 and P1000 (for DNA work)

·  Labeling tape (color coded for each bench)

·  Marker (color coded for each bench)

·  Tweezers (to move filter discs)

·  One rack for microfuge tubes (color coded for each bench)

·  One multitube rack

·  Kimwipes

·  70% EtOH in squirt bottle

·  Disposable plastic transfer pipettes (to suck up planaria and add/remove water)

·  Razor blades (to cut planaria)

·  Scintillation vials (to hold planaria in the process of regeneration)

·  Timer

·  A flask containing Poland Spring water

·  One pipette aid

·  10 ml disposable pipettes

·  60 cc syringe (for conditioning experiment)

·  Plastic cutting board with training trough

·  Waste bins for tips and sharps

·  Waste beakers for water

·  1.5 ml microfuge tubes

At the back of the room:

·  Two gel boxes and power supplies for DNA agarose gel

·  Film to take gel pictures, UV box and hand held camera

·  Sterile 250 ml and 1 liter flasks

Reagents:

Sterile water

Poland Spring water

TBE for agarose gel, DNA sample buffer

Phenol-chloroform-isoamyl alcohol (25:24:1)

RNase A

Proteinase K

Ethidium bromide

Agarose

1 kb MW markers

6X DNA loading dye

Reagents for immunohistochemistry

Reagents for whole mount and staining

2007 MIT Teacher Workshop
