S9

1. Phylogenetic analyses of the Homeobox domain of Cnidarians

Datasets. Phylogenetic analyses were conducted on alignments of the 60 amino acid sites of the complete Homeobox domain, using six different datasets.

1. HomBil82: 82 Hox and ParaHox sequences from slow-evolving bilaterians, comprising three protostomians, Drosophila melanogaster (Dm), Nereis virens (Nev), and Cupiennius salei (Cs), and three deuterostomians, Mus musculus (Mm), Branchiostoma floridae (Bf), and Ptychodera flava (Pf).

2. HomBilNv92: The 82 Hox and ParaHox bilaterian sequences plus 10 Hox-like genes from the anthozoan Nematostella vectensis (Nv).

3. HomBilHm89: The 82 Hox and ParaHox bilaterian sequences plus 7 Hox-like genes from the hydrozoan Hydra magnipapillata (Hm).

4. HomBilCx87: The 82 Hox and ParaHox bilaterian sequences plus 5 Hox-like genes from the scyphozoan Cassiopea xamachana (Cx).

5. HomBilEd85: The 82 Hox and ParaHox bilaterian sequences plus 3 Hox-like genes from the hydrozoan Eleutheria dichotoma (Ed).

6. HomBilCnid107: The 82 Hox and ParaHox bilaterian sequences plus the 25 Hox-like genes from the four cnidarian species.
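The dataset names encode their sizes: each is the 82 bilaterian sequences plus the cnidarian sequences added. The short check below rebuilds the names from the counts listed above; the variable names are illustrative only.

```python
# Sequence counts taken from the dataset list above.
bilaterian = 82
cnidarian = {"Nv": 10, "Hm": 7, "Cx": 5, "Ed": 3}

# Rebuild each dataset name from its composition.
datasets = {f"HomBil{bilaterian}": bilaterian}
for tag, n in cnidarian.items():
    datasets[f"HomBil{tag}{bilaterian + n}"] = bilaterian + n

# The combined dataset holds all cnidarian Hox-like genes.
total_cnidarian = sum(cnidarian.values())
datasets[f"HomBilCnid{bilaterian + total_cnidarian}"] = bilaterian + total_cnidarian

for name, size in datasets.items():
    print(name, size)
```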

Model selection. The best-fitting model of protein sequence evolution was selected using ProtTest 1.2.7 (Abascal et al. 2005) among a set of 40 candidate models comprising all combinations of the Dayhoff, Blosum62, JTT, WAG, and VT empirical matrices of amino acid substitution with a gamma distribution with eight categories (+G8) and a proportion of invariable sites (+I). All statistical criteria unanimously selected the JTT+G8 model as the best-fitting model for all six sequence alignments.
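Model-selection criteria such as AIC and BIC trade the maximized log-likelihood against the number of free parameters. The sketch below illustrates that comparison under stated assumptions: the log-likelihoods and parameter counts are invented placeholders, not values from this study, the alignment length (60 sites) is used as the sample size for BIC, and the function names are hypothetical.

```python
import math

def aic(ln_l, k):
    """Akaike information criterion for log-likelihood ln_l and k free parameters."""
    return 2 * k - 2 * ln_l

def bic(ln_l, k, n):
    """Bayesian information criterion; n is the sample size (here, alignment sites)."""
    return k * math.log(n) - 2 * ln_l

n_sites = 60  # length of the homeodomain alignment

# Placeholder (log-likelihood, free model parameters) pairs; NOT real results.
candidates = {
    "JTT+G": (-5000.0, 1),
    "JTT+I+G": (-4999.8, 2),
    "WAG+G": (-5050.0, 1),
}

# The preferred model is the one with the lowest criterion value.
best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
print(best_by_aic, best_by_bic)
```

With these placeholder numbers, the small likelihood gain from adding +I does not offset the penalty for the extra parameter, so both criteria prefer the simpler model.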

Phylogenetic analyses. Distance-based phylogenetic trees were inferred by applying the BioNJ algorithm (Gascuel 1997) in SplitsTree 4.2 (Huson and Bryant 2006) to ML JTT+G8 distances computed using the ML estimate of the α parameter previously obtained with ProtTest. Neighbor-Net networks (Bryant and Moulton 2004) were constructed from the same distance estimates. Bootstrap proportions were obtained from 100 replicates using the same distance correction. Bootstrap networks were then constructed from all splits that occurred in any of the 100 bootstrap replicates.

Maximum Likelihood (ML) analyses were performed using TreeFinder (Jobb et al. 2004) under the JTT+G8 model. ML bootstrap proportions were obtained from the 50% majority rule consensus of the 100 ML trees inferred under the same model from pseudo-replicates generated with the program SeqBoot of the Phylip package (Felsenstein 2001).
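Nonparametric bootstrap pseudo-replicates are built by resampling alignment columns with replacement until the original alignment length is reached. The following is a minimal sketch of that resampling step, not the SeqBoot implementation; the toy alignment, sequence names, and function name are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the alignment length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same sampled columns are applied to every sequence so sites stay aligned.
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}

# Toy alignment (hypothetical residues, far shorter than the real 60 sites).
toy = {"seqA": "RRRKRTAYTR", "seqB": "RKRGRQTYTR", "seqC": "KRRKRTAFTK"}

rng = random.Random(42)  # fixed seed for reproducibility
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(len(replicates), replicates[0])
```

Each pseudo-replicate would then be analysed with the same model as the original alignment, and the resulting trees summarized in a consensus.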

Bayesian phylogenetic analyses were conducted using MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003). Two independent runs of four incrementally heated Metropolis-coupled Markov chain Monte Carlo (MCMCMC) chains were simultaneously run for 2,500,000 generations under the JTT+G8 model using the program default priors as starting values for model parameters and branch lengths. The convergence of MCMCMC was monitored by examining the values of the marginal likelihood, the rate heterogeneity parameter (α), and clade posterior probabilities through generations using the AWTY web server (Wilgenbusch et al. 2004). Bayesian clade Posterior Probabilities (PP) were obtained from the 50% majority rule consensus of 12,500 trees sampled every 100 generations from both independent runs after removing the first 12,500 trees as a conservative "burn-in".
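The sampling arithmetic and the majority-rule summary can be illustrated as follows. This is a sketch under assumptions: the tree counts are read as per-run figures, each tree is reduced to a set of clades (sets of taxon labels), and the toy trees and function names are hypothetical. It is not how MrBayes computes its consensus internally.

```python
from collections import Counter

def retained_samples(n_generations, sample_every, burn_in):
    """Return (trees sampled, trees kept after burn-in) for one run."""
    total = n_generations // sample_every
    return total, total - burn_in

def clade_support(trees, threshold=0.5):
    """Frequency of each clade across trees, keeping clades above the threshold.

    Each tree is a set of clades; each clade is a frozenset of taxon labels.
    """
    counts = Counter(clade for tree in trees for clade in tree)
    n = len(trees)
    return {clade: c / n for clade, c in counts.items() if c / n > threshold}

# 2,500,000 generations sampled every 100, first 12,500 trees discarded.
total, kept = retained_samples(2_500_000, 100, 12_500)
print(total, kept)

# Four toy trees on taxa A-E (hypothetical).
toy_trees = [
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB"), frozenset("CE")},
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AC"), frozenset("DE")},
]
support = clade_support(toy_trees)
print(support)
```

In the toy example only the clade (A, B) is found in more than half of the trees, so it is the only one retained in the 50% majority rule summary. The same counting applies to bootstrap trees, where the frequencies are read as bootstrap proportions rather than posterior probabilities.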

Statistical tests of alternative topologies. Likelihood-based tests of alternative topologies were performed in two steps. First, ML branch lengths and site-wise log-likelihood values of alternative topologies were computed assuming the JTT+G8 model using Tree-Puzzle 5.2 (Schmidt et al. 2002). Second, p-values of the SH (Shimodaira and Hasegawa 1999) and AU (Shimodaira 2002) likelihood-based tests were calculated with Consel 0.1i (Shimodaira and Hasegawa 2001) using a multiple bootstrap procedure with 1,000,000 replicates.
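The second step starts from the site-wise log-likelihoods: sites are resampled with replacement and the per-topology totals are recomputed without re-optimizing the trees (the RELL approach). The sketch below shows only that resampling step and reports how often each topology scores best; it does not compute SH or AU p-values, which require further centering or multiscale procedures as implemented in Consel. The site log-likelihood values and function name are hypothetical.

```python
import random

def rell_proportions(site_lnl, n_replicates, rng):
    """Fraction of RELL replicates in which each topology has the highest total.

    site_lnl maps a topology name to its list of per-site log-likelihoods;
    all lists must have the same length.
    """
    names = list(site_lnl)
    n_sites = len(site_lnl[names[0]])
    wins = dict.fromkeys(names, 0)
    for _ in range(n_replicates):
        # Resample site indices with replacement, as in a nonparametric bootstrap.
        sites = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = {t: sum(site_lnl[t][s] for s in sites) for t in names}
        wins[max(names, key=totals.get)] += 1
    return {t: w / n_replicates for t, w in wins.items()}

# Hypothetical per-site log-likelihoods for two alternative topologies.
site_lnl = {
    "T1": [-2.0, -3.1, -1.5, -4.0, -2.2, -2.8],
    "T2": [-2.3, -2.9, -1.6, -3.7, -2.6, -2.7],
}

proportions = rell_proportions(site_lnl, 1000, random.Random(1))
print(proportions)
```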

References

Abascal, F., R. Zardoya, and D. Posada. 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21:2104-2105.

Bryant, D., and V. Moulton. 2004. Neighbor-Net: An agglomerative method for the construction of phylogenetic networks. Mol. Biol. Evol. 21:255-265.

Felsenstein, J. 2001. PHYLIP (PHYLogeny Inference Package), version 3.06b. Department of Genome Sciences, University of Washington.

Gascuel, O. 1997. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14:685-695.

Huson, D. H., and D. Bryant. 2006. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23:254-257.

Jobb, G., A. von Haeseler, and K. Strimmer. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol. Biol. 4:18.

Ronquist, F., and J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572-1574.

Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502-504.

Shimodaira, H. 2002. An approximately unbiased test of phylogenetic tree selection. Syst. Biol. 51:492-508.

Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114-1116.

Shimodaira, H., and M. Hasegawa. 2001. CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17:1246-1247.

Wilgenbusch, J. C., D. L. Warren, and D. L. Swofford. 2004. AWTY: A system for graphical exploration of MCMC convergence in Bayesian phylogenetic inference. http://ceb.csit.fsu.edu/awty.

2. Other methods

Culture of Nematostella polyps and induction of gametogenesis were carried out as described (Hand and Uhlinger, 1992; Fritzenwanker and Technau, 2002). cDNA clones of the Hox genes were isolated by PCR with gene-specific primers, using first-strand cDNA from mixed embryonic stages or adult polyps as template. In selected cases, genomic DNA was used as template in PCR reactions to confirm the bioinformatic predictions. All cDNA clones were confirmed by sequencing.

To produce a BAC library from Nematostella, approximately 5 x 10^5 primary polyps were harvested and dissociated into a single-cell suspension by Pronase E (Sigma) digestion, and 6.7 x 10^7 cells were embedded in an agarose block. The generation of the library was carried out as described (Osoegawa et al., 1998). Approximately 27,000 clones with an average insert size of 168 kb, representing a 14x genome coverage, were arrayed into 72 384-well microtiter dishes and then gridded onto nylon filters for screening by probe hybridization. Filter hybridization of the BAC library with Digoxigenin-labeled cDNA probes was carried out using standard protocols.
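The library figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers given above; the implied genome size is derived from them and is not a value stated in the text, and the variable names are illustrative.

```python
# Figures quoted in the library description.
clones = 27_000          # approximate number of BAC clones
mean_insert_kb = 168     # average insert size in kb
coverage = 14            # stated genome coverage
plates = 72              # number of 384-well dishes
wells_per_plate = 384

# Total cloned DNA and the genome size implied by the stated coverage.
total_insert_mb = clones * mean_insert_kb / 1000
implied_genome_mb = total_insert_mb / coverage

# Number of wells available for arraying the clones.
array_capacity = plates * wells_per_plate

print(total_insert_mb, implied_genome_mb, array_capacity)
```

The 72 dishes provide 27,648 wells, which matches the approximate clone count, and the stated coverage corresponds to a genome of roughly 324 Mb under these figures.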

Hand, C. & Uhlinger, K. The culture, sexual and asexual reproduction and growth of the sea anemone Nematostella vectensis. Biol. Bull. 182, 169-176 (1992).

Fritzenwanker, J.H. & Technau, U. Induction of gametogenesis in the basal cnidarian Nematostella vectensis (Anthozoa). Dev. Genes Evol. 212, 99-103 (2002).

Osoegawa, K., Woon, P.Y., Zhao, B., Frengen, E., Tateno, M., Catanese, J.J. & de Jong, P.J. An improved approach for construction of bacterial artificial chromosome libraries. Genomics 52, 1-8 (1998).