Staffan Bensch 2014-03-18
Dry Lab – Tree Analyses
The file “mammals.meg” contains cytochrome b sequences (1140 bp, almost full length) from 23 species of mammals. The goal of this Dry Lab is to show that the phylogenetic trees obtained are sensitive to taxon sampling, the selection of outgroups and the model of molecular evolution.
Neighbour-joining
Open the file mammals.meg in MEGA 5.
[Open a File/Session]
- "Nucleotide sequences" OK
- "Protein-coding DNA" Yes
- Select genetic code "Vertebrate mitochondrial".
- Make a neighbour joining tree
- Choose the following settings (Kimura 2-parameter model / Complete deletions):
- Now you will see a tree with all the names of the taxa. The “true” phylogeny of these species is based largely on full mitochondrial genomes and several nuclear genes (see Figure 1 at the end of this document). The data contain a clade of carnivores, a clade of primates, a clade of rodents and two representatives of the African clade “Afrotheria”. In the true phylogeny of primates, Homo should be the sister taxon to Chimpanzee, followed by Gorilla, Orang-utan, Baboon and Howler Monkey, with the lemurs being basal. Among the carnivores, we should see the cats forming a separate clade, and the seals clustering together with polar bear, with wolf outside. A particular phylogenetic problem has been the placement of rodents relative to primates and carnivores. The present consensus based on many genes is that rodents are a sister group to primates, with carnivores outside.
Question 1: How does this first K2P tree differ from the expected “true tree”?
Documentation and saving trees. The most convenient way is to copy and paste trees into PowerPoint. Open a blank document in PowerPoint. In the “Tree Explorer” window of MEGA, select “Image/Copy to Clipboard”. Jump to PowerPoint and “Edit/Paste”.
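The Kimura 2-parameter (K2P) distance selected for the neighbour-joining tree above can be illustrated with a short Python sketch. It is not MEGA's code; the function name and the toy sequences are made up for illustration. It counts transitions (P) and transversions (Q) between two aligned sequences and applies d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. Note that this sketch skips gapped sites pair by pair, whereas the "Complete deletion" setting removes such columns from all sequences before any distance is computed.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Because transitions and transversions are corrected separately, the K2P distance is slightly larger than the raw proportion of differing sites.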
Check the quality of the data and select OTUs (Operational Taxonomic Units)
Before continuing with more advanced analyses, it is best to inspect the quality of the data (go to the window “Sequence Data Explorer” by selecting “Data/Explore Active Data”). Does the alignment look OK?
- Rooting. A tree will not show the phylogenetic relationships unless it is rooted. An unrooted tree is more adequately represented by a radiation diagram. In the Tree Explorer window, select “Tree/Branch Style/Radiation”.
- Go back to the default representation “Tree/Branch Style/Traditional/Rectangular”. To root a tree, click on a branch in the Tree Explorer window (it is now highlighted in green) and then click on the icon with the green triangle at the left-hand side of the window. (Try out different branches as roots.)
Question 2a: What does the tree look like if you root it with Homo?
b: Identify the possible outgroups for phylogenetic testing of the relationships between carnivores, primates and rodents.
c: Rerun the trees with one outgroup at a time and compare the results.
- Bootstrap analysis is a way to evaluate the accuracy of a tree, or rather, how well the obtained tree is supported by the data. Make a new neighbour-joining tree (as above) but now change "Test of Phylogeny" to "Bootstrap method". Press “Compute”. Root the tree with the outgroup.
Question 3. Which clades have high support? Are there any inconsistencies relative to the “true tree” that have good support?
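What the bootstrap does with the alignment can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not MEGA's implementation: the alignment is held as a dictionary of equal-length strings, and the function names and toy data are hypothetical. Each replicate draws alignment columns at random with replacement; a tree is then built from every replicate, and the support for a clade is the percentage of replicate trees in which that clade appears.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}


def support_percent(clade, replicate_trees):
    """Percentage of replicate trees that contain the clade.

    Each tree is represented here as a set of frozensets of taxon names,
    one frozenset per clade (a representation chosen for this sketch).
    """
    clade = frozenset(clade)
    hits = sum(clade in tree for tree in replicate_trees)
    return 100.0 * hits / len(replicate_trees)


# Toy alignment (hypothetical sequences).
toy = {"Homo": "ACGTAC", "Pan": "ACGTAT", "Gorilla": "ACCTAT"}
replicate = bootstrap_alignment(toy, random.Random(1))
print(replicate)
```

Columns are resampled as whole units, so the same site is drawn for every taxon and the sequences stay aligned within each replicate.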
- Models of Molecular Evolution. Try for example the Tamura 3-parameter model with rate variation between sites (see below for the settings). Try different values (high >2, low <0.2) of the gamma-parameter.
Question 4. How do the trees differ (topology, support values, branch length)?
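To get a feeling for what a high or a low gamma parameter means, the following Python sketch draws relative substitution rates for sites from a gamma distribution with mean 1. It is an illustration only: the function name, the number of sites and the two shape values are arbitrary choices, and MEGA itself works with a discretised gamma distribution rather than random draws.

```python
import random
import statistics


def sample_site_rates(alpha, n_sites, rng):
    """Draw relative rates from a gamma distribution with shape alpha and mean 1."""
    # random.gammavariate(shape, scale) has mean shape * scale,
    # so scale = 1 / alpha keeps the mean rate at 1.
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


rng = random.Random(42)
for alpha in (0.2, 2.0):
    rates = sample_site_rates(alpha, 20000, rng)
    nearly_invariant = sum(r < 0.1 for r in rates) / len(rates)
    print(
        f"alpha={alpha}: mean rate={statistics.mean(rates):.2f}, "
        f"share of sites with rate < 0.1: {nearly_invariant:.2f}"
    )
```

With a low gamma parameter most sites evolve very slowly and a few evolve very fast; with a high value the rates are much more even across sites.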
Maximum parsimony
One of the traditional and still very popular discrete methods is Maximum Parsimony. Go to “Phylogeny/ Test Maximum Parsimony Trees”. Use the default settings [CNI (level=1) with initial tree by Random addition (10 reps)] and select bootstrapping. Compare how much longer it takes to do bootstraps with this method than with neighbour joining. Root the tree with the outgroup.
Question 5. How does the tree look compared to the NJ trees (branch length)?
Question 6. Which of the two trees looks best compared to the true tree? How can this be explained?
Character mapping. A nice feature of maximum parsimony is that the nucleotides can be mapped directly onto the tree. In the Tree Explorer window, click on the upper-right symbol and choose “Show all”. Walk through the sequence by changing the “Site Index”.
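The idea behind maximum parsimony and character mapping can be sketched with Fitch's algorithm, which counts the minimum number of changes a given tree requires at each site. The Python code below is a minimal illustration and not what MEGA runs internally; the nested-tuple tree format, the function names and the toy sequences are assumptions made for this example.

```python
def fitch(tree, states):
    """Fitch's algorithm for one site on a rooted binary tree.

    tree   : a taxon name (leaf) or a tuple (left_subtree, right_subtree)
    states : dict mapping taxon name -> nucleotide at this site
    Returns (set of possible states at this node, number of changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The two subtrees can agree on a state: no extra change needed.
        return shared, left_changes + right_changes
    # No shared state: one change is required on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, alignment):
    """Total number of changes over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Toy data (hypothetical sequences) and two alternative trees.
toy = {"Homo": "AAC", "Pan": "AAC", "Gorilla": "GAT", "Pongo": "GAT"}
tree_a = ((("Homo", "Pan"), "Gorilla"), "Pongo")
tree_b = (("Homo", "Gorilla"), ("Pan", "Pongo"))
print(parsimony_length(tree_a, toy), parsimony_length(tree_b, toy))
```

The tree that needs the fewest changes is the most parsimonious one; a parsimony search evaluates many alternative trees in this way.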
Maximum Likelihood
Go to “Phylogeny/ Test Maximum Likelihood Tree” and use the settings as below (DO NOT BOOTSTRAP).
Question 7. Compare this tree with the best NJ tree and the true phylogeny.
Find the most appropriate model of molecular evolution for your data. Go to Model/ Find the best DNA / Protein Model (ML)
Question 8. Study the output and try to interpret the parameters
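Model selection output of this kind typically lists, for each model, the log-likelihood (lnL), the number of parameters and information criteria such as AIC and BIC. The Python sketch below shows how these criteria are computed from lnL. All numbers in it are invented for illustration and are not results from mammals.meg; using the number of alignment sites (1140) as the sample size is likewise an assumption of this sketch.

```python
import math


def aic(ln_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * ln_likelihood


def bic(ln_likelihood, n_params, n_sites):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_sites) - 2 * ln_likelihood


# Hypothetical candidates: model name -> (lnL, number of parameters).
candidates = {
    "K2": (-10250.0, 45),
    "T92+G": (-9900.0, 47),
    "GTR+G+I": (-9890.0, 53),
}
n_sites = 1140  # alignment length, used here as the sample size (assumption)

for name, (lnl, k) in candidates.items():
    print(f"{name}: AIC={aic(lnl, k):.1f}  BIC={bic(lnl, k, n_sites):.1f}")

best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
print("Lowest BIC:", best_by_bic)
```

Both criteria reward a high likelihood and penalise extra parameters, but BIC penalises them more strongly, so the two criteria do not always prefer the same model.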
Go to “Phylogeny/ Test Maximum Likelihood Tree” again, this time using the settings of the model that was found to be the best (DO NOT BOOTSTRAP).
Question 9. Are there any differences between this tree and the previous ML tree?
Now, make a final ML tree with bootstrapping (select 100 replications).
Question 10. Compare the bootstrap values with those of the NJ tree.
Amino Acids
Neighbour Joining, Maximum Parsimony and Maximum Likelihood trees can all be made from amino acid sequences (provided the sequences are from a protein-coding gene). You can do this by changing the “Substitution type” to amino acids.
Make amino acid trees using NJ and ML.
Question 11. Compare these trees with those done from DNA sequences
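Why the genetic code had to be set to "Vertebrate mitochondrial" when the file was opened can be seen in a small translation sketch. The Python code below builds the vertebrate mitochondrial codon table (in which TGA codes for tryptophan, ATA for methionine, and AGA/AGG are stop codons) and translates a DNA string. The function name and the example sequence are made up for illustration, and the sketch assumes that the sequence starts in reading frame 1.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...),
# vertebrate mitochondrial genetic code; "*" marks a stop codon.
VERT_MITO_AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: VERT_MITO_AA[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA codon by codon; unknown or gapped codons become 'X'."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[pos:pos + 3], "X"))
    return "".join(protein)


# Made-up example containing codons that differ from the standard code.
print(translate("ATGATATGAAGA"))
```

With the standard nuclear code, the same example sequence would be read differently (for example TGA as a stop codon), which would give wrong amino acid sequences for a mitochondrial gene such as cytochrome b.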
Figure 1. Consensus phylogenetic tree of mammalian orders. Species included in this exercise are indicated on the right side of the figure
RAxML is a popular program for maximum likelihood tree construction. It can be run on a remote server from the webpage:
Open your data in MEGA / Explore Active Data and go to “Exporting Sequence data” and change format to PHYLIP 3.0 (you may need to do some manual adjustments in the text file). Save the text file, open it and copy and paste into the RAxML sequence box.
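If the exported file needs manual adjustment, it may help to know what a simple sequential PHYLIP file looks like: a first line with the number of sequences and the number of sites, followed by one line per taxon with the name and the sequence. The Python sketch below writes this layout. It is an illustration under stated assumptions, not part of MEGA or RAxML: the function name and the toy data are hypothetical, and names are cut to 10 characters without spaces as in the strict PHYLIP convention.

```python
def to_phylip(alignment):
    """Format a dict of name -> aligned sequence as sequential PHYLIP text."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same length")

    # Names: no spaces, at most 10 characters (strict PHYLIP convention).
    labels = [name.replace(" ", "_")[:10] for name in names]
    if len(set(labels)) != len(labels):
        raise ValueError("taxon names are not unique after shortening")

    lines = [f" {len(names)} {length}"]
    for name, label in zip(names, labels):
        # Pad the name to 10 characters and separate it from the sequence.
        lines.append(f"{label.ljust(10)} {alignment[name]}")
    return "\n".join(lines) + "\n"


# Toy data (hypothetical sequences).
print(to_phylip({"Homo sapiens": "ACGT", "Pan": "ACGA"}))
```

Typical manual adjustments are exactly the points checked here: taxon names containing spaces, names that are no longer unique after shortening, and sequences of unequal length.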