Downloading and Using MEGA

My Contact info:

Paul Chafe

204A Lumbers

Downloading and Using MEGA:

MEGA is a new and easy-to-use program for phylogenetic analysis. It is available as a free download; however, you will need to provide an email address in order to receive a download link. I’ve chosen MEGA because it is both fast and very easy to use.

Head to the MEGA website:

This website has information on the program, how to complete various analyses, etc.

Click on DOWNLOAD for whichever version you’re going to use (Windows, Linux, Mac). I’ve only used the Windows version, so the instructions below apply to Windows; I cannot confirm that they will work on a Mac.

Fill in the information requested (name and email). MEGA will send you an email with a link to download the program. Click the link in the email and the program will download. Once it has downloaded, start the MEGA5 setup program. Click NEXT to install and choose the desired program folder (e.g. MEGA5), then select the Start menu folder name (e.g. MEGA5). Next, choose whether to add a desktop item, and finally click Install. Once the installation is complete you can choose to start the program.

The program website has a tutorial, which may help familiarize you with the software:

Now we will complete a sample analysis of the AUSTROBAILEYALES, using Nymphaea caerulea (NYMPHAEALES) as an outgroup. Note that the NYMPHAEALES sequence is first among those listed below.

Copy the sequences below into a .txt file (either open Notepad and save a new file, or open a blank Word document and save it in text-only format). Each sequence name line must begin with >. You will also want to change the first 10 characters of each name line to something useful; for instance, I renamed Nymphaea caerulea to >Nymphaea in my analysis. Once you’ve done that, save your file under an informative name (like AUSTROByourname).

>gi|298379483|gb|GQ468660.1| Nymphaea caerulea isolate NycW1 ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit (rbcL) gene, partial cds; chloroplast

AAGTGTTGGATTCAAAGCTGGTGTTAAAGATTACAGATTGACTTATTACACTCCTGATTATGAAACCCTT

GCTACTGATATCTTGGCAGCATTCCGAGTAACTCCTCAACCTGGAGTTCCGCCTGAGGAAGCAGGAGCTG

CGGTGGCTGCCGAATCTTCCACTGGTACATGGACAACTGTGTGGACCGATGGACTTACCAGCCTTGATCG

TTACAAAGGACGATGCTACCACATCGAGCCTGTTGCTGGGGAGGAAAATCAATATATTGCTTATGTAGCT

TATCCTTTGGACCTTTTTGAAGAAGGTTCTGTTACTAACATGTTTACTTCCATTGTGGGTAATGTATTTG

GGTTCAAAGCCCTACGAGCTCTACGTCTGGAGGATCTGAGAATTCCTCCTGCTTATTCTAAAACTTTCCA

GGGCCCACCTCATGGAATCCAAGTTGAGAGAGATAAATTGAACAAGTATGGTCGTCCCCTATTGGGATGT

ACTATTAAACCAAAATTGGGGTTATCCGCAAAGAACTATGGGAGAGCGGTTTATGAGTGTCTCCGTGGTG

GACTTGATTTTACCAAGGATGATGAAAACGTGAACTCCCAACCGTTTATGCGTTGGAGAGACCGTTTCTT

ATTTTGCGCCGAAGCTATTTATAAAGCGCAGGCCGAAACAGGTGAAATTAAAGGACATTACTTGAATGCT

ACTGCAGGTACATCCGAAGAAATGATCAAAAGGGCGGTATGTGCCCGAGAGTTGGGAGTTCCTATCGTAA

TGCATGACTACTTAACAGGGGGATTCACCGCAAATACTAGCTTGGCTCATTATTGCCGAGACAATGGCCT

ACTTCTTCACATCCACCGCGCAATGCATGCAGTTATTGATAGACAGAGGAATCATGGTATTCACTTCCGT

GTACTAGCTAAAGCGTTGCGTATGTCTGGGGGGGATCATATTCACTCTGGTACCGTAGTAGGTAAACTGG

AAGGGGAACGAGATGTCACTTTGGGCTTTGTTGATTTACTACGTGATGATTTTATTGAAAAAGACCGGAG

TCGCGGTATTTATTTCACTCAAGATTGGGTATCTATGCCAGGTGTTCTGCCCGTGGCTTCAGGGGGTATT

CACGTTTGGCATATGCCTGCCCTGACCGAGATATTTGGGGATGATTCCGTGCTACAGTTCGGTGGAGGAA

CTTTGGGACACCCTTGGGGGAATGCACCTGGTGCAGTAGCTAATAGGGTAGCTTTAGAAGCGTGTGTACA

AGCTCGTAATGAGGGACGTGATCTTGCTCGTGAAGGTAATGAAATTATTCGTGAAGCTAGCAAATGGAGT

CCTGAACTGGCTGCTGCTTGTGAGGTATGGAAAGAGATCAAATTTGAATTCGAAGCAATGGATGTCTTGT

AA

>gi|37194768|gb|L12632.2|AUBCPRBCLA Austrobaileya scandens ribulose 1,5-bisphosphate carboxylase large subunit (rbcL) gene, partial cds; chloroplast gene for chloroplast product

GTGTTGGATTCAAGGCTGGTGTTAAAGATTACAGATTGACTTATTATACTCCTGACTATGAAACTAAAAT

GACTGATATCTTGGCAGCATTCCGAGTAACTCCTCAACCCGGAGTTCCACCTGAGGAAGCGGGGGCTGCG

GTAGCTGCAGAATCTTCTACTGGTACATGGACAACTGTGTGGACCGATGGACTTACCAGCCTCGATCGTT

ACAAAGGTCGATGCTACCACATCGAGCCTGTTGCTGGGGAGGAAAATCAATATATTGCTTATGTAGCTTA

CCCTTTAGACCTTTTTGAAGAAGGTTCTGTTACTAACATGTTTACTTCCATTGTGGGTAATGTATTTGGG

TTCAAAGCCCTACGAGCTCTGCGTCTGGAAGATCTGCGAATTCCTCCTGCTTATTCCAAAACTTTCCAAG

GCCCGCCTCATGGCATCCAAGTTGAGAGAGATAAATTGAACAAGTATGGGCGTCCCCTATTGGGATGTAC

TATTAAACCAAAATTAGGTTTATCTGCCAAGAACTACGGTAGAGCGGTTTATGAATGTCTCCGCGGTGGA

CTTGATTTTACCAAGGATGATGAGAACGTGAACTCCCAACCGTTTATGCGTTGGAGGGACCGTTTCGTAT

TTTGTGCCGAAGAAGTTTATAAAGCGCAGGCAGAAACAGGTGAAATCAAAGGACATTACTTGAATGCTAC

CGCAGGTACATGCGAAGAAATGATCAAAAGGGCCGTATTTGCCAGAGAATTGGGAGTTCCTATCGTAACG

CATGACTACTTAACAGGGGGATTCACTGCAAATACTAGCTTGGCTCATTATTGCCGAGACAACGGCCTAC

TTCTTCACATCCATCGCGCAATGCATGCAGTTATTGATAGACAGAGGAATCATGGTATACACTTTCGTGT

ACTAGCTAAAGCGTTGCGTATGTCTGGTGGAGATCATGTTCACTCTGGTACCGTAGTAGGCAAACTGGAA

GGGGAACGGGACGTCACTTTGGGTTTTGTTGATTTACTACGTGATGATTTTATTGAAAAAGACCGAAGTC

GCGGTATTTATTTTACTCAAGATTGGGTATCTATGCCAGGTGTTTTACCCGTGGCTTCAGGAGGTATTCA

CGTTTGGCATATGCCTGCCCTGACCGAGATCTTTGGGGATGATTCCGTACTACAGTTCGGTGGAGGAACT

TTAGGGCACCCTTGGGGAAATGCACCTGATGCAGTAGCCAATCGGGTGGCTTTAGAAGCGTGTGTACAAG

CTCGGAATGAGGGACGTGATCTTGCTCGTGAAGGTAATGAGGTTATCCGTGAAGCGAGCAAATGGAGCCC

TGAACTAGCTGCTGCTTGTGAGGTATGGAAGGAGATCAAATTCGAATTCGAAGCAATGGATGTCTTGTAA

>gi|37194806|gb|L12652.2|ILLCPRBCLA Illicium parviflorum ribulose 1,5-bisphosphate carboxylase large subunit (rbcL) gene, partial cds; chloroplast gene for chloroplast product

GTGTTGGATTCAAGGCTGGTGTTAAAGATTACAGATTGACTTATTATACTCCTGAATATGAAACGAAAGA

GACTGATATCTTGGCAGCATTCCGAGTAACTCCTCAACCCGGAGTTCCACCTGAGGAAGCGGGAGCTGCG

GTAGCTGCGGAATCCTCTACTGGTACCTGGACCACTGTGTGGACTGATGGACTTACCAGCCTCGATCGTT

ACAAAGGGCGATGCTACCACATTGAGCCCGTTGCTGGGGAGGAAAATCAATATATTGCTTATGTAGCTTA

TCCTTTAGACCTTTTTGAAGAAGGTTCTGTTACTAACATGTTTACTTCCATTGTGGGTAATGTATTTGGG

TTCAAAGCCCTACGAGCTCTGCGTCTGGAAGATTTGCGAATTCCTCCTGCTTATTCCAAAACTTTCCAAG

GCCCACCTCATGGCATCCAAGTTGAGAGAGATAAATTGAACAAGTATGGTCGTCCTCTATTGGGATGTAC

TATTAAACCAAAATTAGGATTATCTGCCAAGAACTACGGTAGAGCGGTTTATGAATGCCTCCGCGGTGGA

CTTGATTTTACCAAGGATGATGAGAACGTGAACTCCCAACCATTTATGCGTTGGAGGGACCGTTTCGTAT

TTTGTGCCGAAGCAGTTTATAAAGCGCAGGCCGAAACAGGTGAAATTAAAGGACATTACTTGAATGCTAC

TGCAGGTACATGCGAAGAAATGATCAAAAGGGCTGTATTTGCCAGAGAATTGGGAGTTCCTATCGTAATG

CATGACTACTTAACAGGGGGATTCACTGCAAATACTAGCTTGGCTCATTATTGCCGAGACAACGGCTTAC

TTCTTCACATCCATCGCGCAATGCATGCAGTTATTGATAGACAGAGGAATCATGGTATGCACTTTCGTGT

ACTAGCTAAAGCGTTGCGTATGTCTGGTGGAGATCATATTCACGCTGGTACTGTAGTAGGTAAACTGGAA

GGGGAACGGGATGTCACTTTGGGTTTTGTTGATTTACTACGTGATGATTTTATTGAAAAAGACCGAAGTC

GCGGCATTTATTTCACTCAAGATTGGGTATCTATGCCAGGTGTTCTGCCCGTGGCTTCAGGGGGTATTCA

CGTTTGGCATATGCCTGCCTTGACCGAGATCTTTGGGGATGATTCCGTACTACAGTTCGGTGGAGGAACT

TTAGGACACCCTTGGGGAAATGCGCCTGGTGCAGTAGCTAATCGAGAGGCTTTAGAGGCGTGTGTACAAG

CTCGTAATGAGGGACGTGATCTTGCTCGTGAAGGTAATGAAGTTATCCGTGAAGCTAGCAAATGGAGCCC

TGAACTAGCTGCTGCTTGTGAGGTATGGAGGGAGATCAAATTCGAATTCGAAGCAATGGATGTCTTATAA

>gi|37194836|gb|L12665.2|SDRCPRBCLA Schisandra sphenanthera ribulose 1,5-bisphosphate carboxylase large subunit (rbcL) gene, partial cds; chloroplast gene for chloroplast product

GTGTTGGATTCAAGGCTGGTGTTAAAGATTACAGATTGACTTATTATACTCCTGAATATGAAACGAAAGA

TACTGATATCTTGGCAGCATTCCGAGTAACTCCTCAACCCGGAGTTCCGCCCGAGGAAGCGGGAGCTGCG

GTAGCTGCGGAATCTTCTACTGGTACCTGGACTACTGTGTGGACTGATGGACTTACCAGCCTCGATCGTT

ATAAAGGGCGATGCTACCACATTGAGCCCGTTGCTGGGGAGGAAAATCAATATATTGCTTATGTAGCTTA

CCCTTTAGACCTTTTTGAAGAAGGCTCTGTTACTAACATGTTTACTTCTATTGTGGGTAATGTATTTGGG

TTCAAAGCCCTACGAGCTCTGCGTCTGGAAGATTTGCGAATTCCTCCTGCTTATTCCAAAACTTTCCAAG

GCCCACCTCATGGCATCCAAGTTGAGAGAGATAAATTGAACAAGTATGGTCGTCCCCTATTGGGATGTAC

TATTAAACCAAAATTAGGGTTATCTGCCAAGAACTACGGTAGAGCGGTTTATGAATGTCTCCGCGGTGGA

CTTGATTTTACCAAGGATGATGAGAACGTGAACTCCCAACCGTTTATGCGTTGGAGGGACCGTTTCTTAT

TTTGTGCCGAAGCTCTTTATAAAGCGCAGGCCGAAACAGGTGAAATTAAAGGACATTACTTGAATGCTAC

TGCAGGTACATGCGAAGAAATGATGAAAAGGGCTGTATTTGCCAGAGAATTGGGAGTTCCTATCGTAATG

CATGACTACTTAACAGGGGGATTCACTGCAAATACTAGCTTGGCTCATTATTGCCGAGACAACGGCCTAC

TTCTTCACATCCATCGCGCAATGCATGCAGTTATTGATAGACAGAGGAATCATGGTATCCACTTTCGTGT

ACTAGCTAAAGCGTTGCGTATGTCTGGTGGAGATCATATTCACTCTGGTACCGTAGTAGGTAAACTGGAA

GGGGAACGGGACGTCACTTTGGGTTTTGTTGATTTACTACGTGATGATTTTATTGAAAAAGACCGAAGTC

GCGGCATTTATTTCACTCAAGATTGGGTATCTATGCCAGGTGTTCTGCCCGTGGCTTCAGGGGGTATTCA

CGTTTGGCATATGCCTGCCCTGACCGAGATCTTTGGGGATGATTCCGTACTACAGTTCGGTGGAGGAACT

TTAGGACACCCTTGGGGAAATGCGCCTGGTGCAGTAGCTAATCGTGTGGCTTTAGAGGCGTGTGTACAAG

CTCGTAATGAGGGGCGTGATCTTGCTCGTGAAGGTAATGAAGTTATCCGTGAAGCTAGCAAATGGAGCCC

TGAACTAGCTGCTGCTTGTGAGGTCTGGAAGGAGATCAAATTCGAATTCGAAGCAATGGATGTCTTGTAA

>gi|37544966|gb|AY116658.1| Trimenia moorei 1,5-bisphosphate carboxylase large subunit (rbcL) gene, partial cds; chloroplast gene for chloroplast product

TGGATTCAAGGCTGGTGTAAAAGATTACCGTTTGACTTATTATACTCCTGAATATGATACGAAAGAGACT

GATATCTTGGCAGCATTCCGAGTAACTCCTCAACCCGGAGTTCCACCGGAGGAAGCAGGGGCTGCGGTAG

CTGCGGAATCTTCTACTGGTACATGGACCACTGTGTGGACGGATGGGCTTACCAGCCTCGATCGTTACAA

AGGGCGATGCTACCACATTGAACCAGTTCCTGGGGAGGATAATCAATTTATTGCTTATGTAGCTTATCCT

TTAGACCTTTTTGAAGAAGGTTCTGTTACTAACATGTTTACTTCCATTGTTGGGAATGTATTTGGGTTTA

AAGCCCTACGAGCTCTGCGTCTGGAAGATCTGCGAATTCCTACTGCTTATATCAAAACTTTCCAAGGTCC

GCCTCATGGCATCCAAGTTGAGAGAGATAAATTGAACAAGTATGGTCGTCCCCTATTGGGATGTACTATT

AAACCAAAATTAGGGTTATCCGCCAAGAACTACGGTAGAGCGGTTTATGAATGTCTCCGTGGTGGACTTG

ATTTTACTAAGGATGATGAGAATGTGAACTCCCAACCATTTATGCGCTGGAGGGACCGTTTCTTATTTTG

TGCCGAGGCCCTTTATAAAGCGCAGGCCGAAACCGGTGAAATCAAAGGACATTACTTGAATGCTACTGCA

GGTACATGCGAAGAAATGATCAAAAGGGCTGTATTTGCCAGAGAATTGGGAGTTCCTATCGTAATGCATG

ACTACTTAACAGGGGGATTCACTGCAAATACTAGCTTGGCTCATTATTGCCGAGACAACGGCCTACTTCT

TCACATCCATCGCGCAATGCATGCAGTTATTGATAGACAGAAGAATCATGGTATGCACTTTCGTGTACTA

GCTAAAGCCTTGCGTATGTCTGGTGGAGATCATATTCACTCTGGTACCGTAGTGGGGAAACTGGAAGGGG

AACGGGATATCACTTTGGGTTTTGTTGATTTATTACGCGATGATTTTATTGAAAAAGACCGAAGTCGCGG

CATTTATTTTACTCAAGATTGGGTATCTCTGCCAGGTGTTCTGCCCGTGGCTTCCGGGGGTATTCACGTT

TGGCATATGCCTGCCCTGACTGAGATCTTTGGGGATGATTCCGTACTACAGTTCGGCGGAGGAACTTTAG

GGCACCCTTGGGGAAATGCACCAGGTGCAGTAGCTAATCGGGTGGCTTTAGAGGCGTGTGTACGAGCTCG

TAATGAGGGACGTGATCTTGCTCGCGAAGGGAATGAAATTATCCGCGAAGCTTCCAAATGGAGTAAGGAA

CTATATGCTGCT
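If you would rather not edit the name lines by hand, the renaming step described before the sequences can be scripted. The following is only a minimal sketch: the function name `rename_headers` and the accession-to-name table are my own illustration, not part of MEGA or ClustalX, and it assumes every name line starts with > and contains its GenBank accession number.

```python
def rename_headers(fasta_text, short_names):
    """Replace long '>' name lines with short names (at most 10 characters).

    short_names maps a GenBank accession (e.g. "L12632.2") to the name you
    want to see in your trees. Name lines with no matching accession are
    left unchanged. Blank lines between sequence rows are dropped.
    """
    out = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            for accession, name in short_names.items():
                if accession in line:
                    line = ">" + name[:10]
                    break
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical usage with two of the sequences from this handout
example = (
    ">gi|37194768|gb|L12632.2|AUBCPRBCLA Austrobaileya scandens\n"
    "GTGT\n\nTTAA\n"
    ">gi|37544966|gb|AY116658.1| Trimenia moorei\n"
    "TGGA\n"
)
names = {"L12632.2": "Austrobaileya", "AY116658.1": "Trimenia"}
renamed = rename_headers(example, names)
```

Names longer than 10 characters are cut off, so check that the shortened names are still different from one another.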

Once you’ve saved the file, open ClustalX (or access it online: ) and import the text file you’ve created. To do this, click File, then Load Sequences, and browse to your text file. You can now align the sequences: click Alignment, then Do Complete Alignment. Depending on the number of sequences, the alignment may take a few seconds to complete.

When the alignment is complete, you’ll need to save it in a format that MEGA can work with. In ClustalX, click File, then ‘Save Sequences As’, and select the format ‘Nexus’. Make sure you give the file an informative name! You can now close ClustalX and open MEGA.

In MEGA you need to load, convert, and analyze your sequence alignment.

The first step is to convert your sequence alignment file. Click File, then ‘Convert File Format to MEGA’. A pop-up window will appear in which you can select your sequence alignment. First select the format: it is important to choose ‘Nexus’ (PAUP, MacClade) rather than .aln (Clustal), since MEGA has a difficult time dealing with files in Clustal format. Then browse to your Nexus file (it will be called, for instance, AUSTROB.nxs; the .nxs extension denotes a Nexus file). Click OPEN (you may need to change the file format option back to Nexus at this point), then click OK. You now have the option to save your alignment as a MEGA file (.meg). Again, give this file an informative name. MEGA will then open the converted file in an editor so that you can review the conversion; you can just close the editor.

Now, back in the main MEGA program, you can open the file that you’ve just converted. Go to File, click Open a File/Session, and select your converted MEGA file (e.g. Austrob.meg). A screen asking for the type of data will appear; select Nucleotide data. Next you will be asked whether your data is protein coding. It is, so click ‘Yes’ (this just means that you’ll have options for base substitution models later on).

It is a good idea to recheck that your alignment has converted properly. If it has, it will look like the sequence data below (you can click the button that shows TA with dots below it to show/hide sites that are identical):
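For the curious, a .meg file is plain text. The sketch below writes aligned sequences in what I understand to be the basic layout (a `#mega` line, a `!Title ...;` line, then each sequence name prefixed with `#`). The function name `to_mega` is my own, and the optional `!Format` line is left out, so treat this as an illustration and compare it with a file that MEGA itself has converted.

```python
def to_mega(aligned, title="Austrobaileyales rbcL"):
    """Return aligned sequences as text in a minimal MEGA-style layout.

    aligned maps sequence names to aligned sequences of equal length.
    The layout is an assumption based on my reading of the format.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences are not aligned to the same length")
    lines = ["#mega", "!Title " + title + ";", ""]
    for name, seq in aligned.items():
        lines.append("#" + name)
        # wrap long sequences at 60 characters per row
        for start in range(0, len(seq), 60):
            lines.append(seq[start:start + 60])
        lines.append("")
    return "\n".join(lines)


mega_text = to_mega({"Nymphaea": "AAGT-TTG", "Austrobail": "--GTGTTG"}, title="demo")
```

The length check is the useful part in practice: if the sequences differ in length, the file was not saved from a finished alignment.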
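The dot display can be imitated outside MEGA if you want to see what it is doing. This is a minimal sketch, assuming the alignment is held as a dictionary of equal-length strings and that the first sequence is the reference; `dot_view` is a made-up helper name.

```python
def dot_view(aligned):
    """Show sites identical to the first sequence as dots."""
    names = list(aligned)
    reference = aligned[names[0]]
    rows = [(names[0], reference)]
    for name in names[1:]:
        seq = aligned[name]
        masked = "".join("." if a == b else a for a, b in zip(seq, reference))
        rows.append((name, masked))
    return rows


for name, row in dot_view({"Nymphaea": "AAGTGTTG", "Austrobail": "AGGTGTTG"}):
    print(name.ljust(12), row)
```

Long runs of dots with only scattered letters are what you expect for a conserved gene such as rbcL; rows made up almost entirely of letters suggest the sequences were shifted during conversion.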


Now that we know that the data has imported properly, we can move on to performing some phylogenetic analyses!

Start by clicking on the ‘Phylogeny’ tab. In this tab there are several options for phylogenetic analysis. First we’re going to construct a maximum parsimony tree. To do this, click on ‘Construct/test maximum parsimony tree(s)’, which brings up a pop-up window in which we can enter the criteria for the test. Set the following options in the menu (it should look like the one below):

Test of Phylogeny: None

Substitutions model: Nucleotide

Gaps/Missing Data Treatment: Complete Deletion

MP Search Method: Max-mini Branch and Bound
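To make the ‘Complete Deletion’ setting concrete: it discards every alignment column in which any sequence has a gap or missing data before the tree is built. The sketch below illustrates the idea; the function name and the choice of `-` and `?` as the gap and missing-data symbols are my assumptions, not MEGA’s code.

```python
def complete_deletion(aligned, missing="-?"):
    """Drop every column that contains a gap or missing-data symbol."""
    names = list(aligned)
    columns = zip(*(aligned[name].upper() for name in names))
    kept = [col for col in columns if not any(ch in missing for ch in col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


trimmed = complete_deletion({"Nymphaea": "AC-T", "Trimenia": "A?GT"})
```

Because a single gap removes the column for all taxa, one short or poorly aligned sequence can noticeably reduce the number of sites used in the analysis.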


Now click ‘Compute’. Since this analysis has a relatively small number of taxa, the search is fairly fast. However, if you’re analyzing more than about 15 taxa, a branch-and-bound search may take far too long to complete. If this is the case, change the MP Search Method to Close-Neighbor-Interchange and click ‘Compute’ again. The program will produce a tree that appears in a new window. If your outgroup appears inside the ingroup, you can tell MEGA to root the tree on the branch leading to the outgroup: click on that branch, then click the ‘Place root on branch’ button. Now that the tree is ready, you can save it for use in your report.
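The reason exact searches become slow is that the number of possible trees grows extremely fast. For n taxa there are (2n-5)!! = 3 × 5 × ... × (2n-5) unrooted bifurcating trees. Branch and bound does not have to score every one of them, but this is the space it must rule out. A short sketch (the function name is mine):

```python
def unrooted_tree_count(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa labelled taxa."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # multiply the odd numbers 3, 5, ..., 2n-5
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (5, 10, 15):
    print(n, unrooted_tree_count(n))
```

With the 5 taxa in this example there are only 15 trees to consider, whereas 15 taxa already give about 7.9 trillion.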

My example MP tree is below:

Now that we have an MP tree, we can test it by bootstrapping.

To do this, again click on ‘Phylogeny’, then on ‘Construct/test maximum parsimony tree(s)’. Keeping the settings as before, select ‘Test of Phylogeny’ and change the test method to ‘Bootstrap’. Now change the number of bootstrap replications to 1000 (if this takes more than 5 minutes to compute, you can lower the number of bootstrap replications to 500).
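In case it helps to see what a bootstrap replicate is: each replicate is a new alignment of the same length, made by drawing columns from your alignment at random with replacement. A tree is built from every replicate, and the bootstrap value on a branch is the percentage of replicate trees that contain it. The sketch below shows only the resampling step, under my own naming; it is not how MEGA is implemented.

```python
import random


def bootstrap_replicate(aligned, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    names = list(aligned)
    n_sites = len(aligned[names[0]])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(aligned[name][i] for i in picks) for name in names}


rng = random.Random(1)  # fixed seed so the example is repeatable
original = {"Nymphaea": "AAGTGT", "Austrobail": "AGGTGA"}
replicate = bootstrap_replicate(original, rng)
```

Some columns appear several times in a replicate and others not at all, which is why branches supported by only a few sites receive low bootstrap values.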


Now click ‘Compute’ and wait for the program to give you a new tree file. Note that there will be both an original tree and a bootstrap consensus tree. In the tree viewer, make sure you view the bootstrap consensus tree and copy it into your write-up. My example bootstrap tree is below:

Next, we are going to construct a maximum likelihood tree. First you will run a test to determine which substitution model is most appropriate for your data.

To run a model test, first click on ‘Analysis’, then ‘Find Best DNA/Protein Models (ML)’. An options screen will appear; click ‘Compute’. The program will then evaluate the different substitution models available for maximum likelihood analyses. When the analysis is finished you will get a table listing the substitution models, ordered by their BIC (Bayesian Information Criterion) scores. The model with the lowest BIC is considered the best descriptor of the observed substitution pattern. The abbreviations used in the table are explained below the table (e.g. TN93 is the ‘Tamura-Nei’ model). Make sure that you copy the top 5 models listed in your output and include this information in your write-up.
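The BIC score balances fit against complexity: BIC = -2 ln L + k ln(n), where ln L is the log-likelihood of the model, k is the number of estimated parameters and n is the sample size (here taken as the number of sites, which is my simplifying assumption). The sketch below ranks models the same way the table does. The log-likelihoods and parameter counts are invented for illustration and are not results from this data set.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion; lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def rank_models(results, n_sites):
    """Return model names ordered from lowest to highest BIC."""
    scored = [(bic(lnl, k, n_sites), name) for name, (lnl, k) in results.items()]
    return [name for score, name in sorted(scored)]


# Made-up values: model name -> (log-likelihood, number of parameters)
made_up = {"JC": (-3100.0, 7), "TN93+G": (-3050.0, 10), "GTR+G": (-3045.0, 16)}
ranking = rank_models(made_up, n_sites=1300)
```

In this invented example GTR+G has the highest likelihood but is penalised for its extra parameters, so the simpler TN93+G is ranked first.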

Once your ‘best’ model has been determined, write down its parameters, then proceed with the analysis. For my analysis of the AUSTROBAILEYALES the ‘best’ model was TN93+G. The information for this model is given below the output table produced by the model test.

With this information, I can now run a maximum likelihood analysis using this model. Go to ‘Analysis’, ‘Phylogeny’, then select ‘Construct/test Maximum likelihood tree’. You can then enter the information obtained above (note that your settings will vary depending on the results of your own model test; however, keep the No. of discrete gamma categories, Gaps/Missing data treatment, ML Heuristic method, and Initial tree for ML as listed below):

Test of phylogeny: none

Substitutions model: nucleotide

Model/Method: Tamura-Nei

Rates among sites: Gamma Distributed (G)

No of discrete gamma categories: 5

Gaps/Missing data treatment: Complete deletion

ML Heuristic method: Close-neighbor-interchange

Initial tree for ML: Make initial tree automatically

Now click ‘Compute’. My example tree is below:

Keeping the other options the same, now perform a bootstrap test of your maximum likelihood phylogeny. To do this, in the maximum likelihood options window, change ‘Test of phylogeny’ to Bootstrap and the ‘No. of replications’ to 1000. Maximum likelihood takes longer to compute than parsimony; if the analysis takes longer than about 1 hour, you can reduce the number of replications to 500. Once the computation is complete, view your consensus tree and copy it into your write-up. My example is below:


This is the phylogenetic tree I copied from the Angiosperm phylogeny website (

REMEMBER to give your figures appropriate titles, indicating the family, the method used to construct the phylogeny, and any tests that were performed on the data (e.g. bootstrapping).