
Supplemental Text for Polytene Chromosomal Maps of 11 Drosophila species: The order of genomic scaffolds inferred from genetic and physical maps

Table of Contents

1. Supplemental Materials and Methods
2. Supplemental Results
3. Notes on Chimeric Assembly Scaffolds
4. File Formats for Supplemental Tables
5. List of Supplemental Tables
6. Supplemental Literature Cited

1. Supplemental Materials and Methods.

General Strategy for Mapping Scaffolds to Polytene Chromosomes. A variety of approaches were used to map genome scaffolds to polytene chromosomes. The simplest approach involved the four members of the melanogaster subgroup (D. simulans, D. sechellia, D. erecta and D. yakuba). Prior investigation of these species revealed that the banding pattern of the polytene chromosomes was sufficiently conserved that the inversion complexes that characterize differences in gene order among these species were apparent even when it was not possible to recover F1 hybrids (Ashburner and Lemeunier 1976; Lemeunier and Ashburner 1976; Lemeunier et al. 1978). Thus, a simple alignment of the D. melanogaster orthologs in the scaffolds of these species allows one not only to assign a majority of scaffolds unequivocally to a chromosome arm but also to order and orient the scaffolds within the arm. Interestingly, these alignments were largely congruent with the older, purely cytological observations, confirming the basic accuracy of the earlier work (ibid).

For the more distantly related species, this was not possible because, while the syntenic relationships of the Muller elements could be used to assign scaffolds to arms, their order and orientation could not be as easily determined due to the number of overlapping inversions that rearranged genes within the arms. In some cases, this difficulty was overcome by prior mapping of clones to the polytene chromosomes of the non-melanogaster group species. Examples of this are the mapping of P1 clones in D. virilis (Lozovskaya et al. 1993; Vieira et al. 1997) and the position of transposon-induced mutations in D. ananassae (Matsubayashi et al. 1992). When this was not possible, for example in D. willistoni and D. mojavensis, probes were synthesized based on the sequence of the scaffolds and in situ hybridizations were performed. A unique approach was adopted for the Hawaiian species D. grimshawi. In this case, there were several prior localizations of individual genes, but these were done on Hawaiian species other than D. grimshawi (Davis et al. 1998). However, the phylogeny of the Drosophila species endemic to the Hawaiian chain is known and well established (see Powell 1997). Moreover, one of the data sets used to establish the phylogeny is the set of inversion polymorphisms within and between species that are associated with their evolution (Carson 1992; Carson et al. 1992). Using the various localizations in non-D. grimshawi species and the inversion genealogy from those to D. grimshawi, it was possible to deduce the positions of the scaffolds in this species. These varied approaches have informed the organization of the major scaffolds on the chromosome maps of all 11 species and these analyses of the assembled genome sequences have provided novel insights into possible underlying causes and mechanisms of chromosomal evolution in the genus.

D. melanogaster Species Group Chromosome Map Preparation. All of the analyses reported used the CAF1 assemblies of D. simulans, D. sechellia, D. erecta and D. yakuba and version 4.3 of the D. melanogaster assembled and annotated genome. The annotated versions of the four melanogaster group species were retrieved from FlyBase. These were derived from the community assemblies and annotations posted on the AAA web site (http://rana.lbl.gov/drosophila/). The orthology calls made by V. Iyer and M. Eisen, which were posted at the same site, were also used in this analysis. The FlyBase-inferred cytological map locations were assigned to all of the orthologs called in the four species. These associations were then sorted by scaffold assignment and molecular coordinate for each species. These simple alignments, based on a correlation of D. melanogaster cytology, scaffold linkage and gene order by molecular coordinate, proved remarkably congruent and allowed ready alignment of the major scaffolds to the polytene chromosome maps. An added benefit of the alignments was that the known inversion constellations that differentiate the four species from D. melanogaster were easily discerned and mapped to the sequence of the assembled scaffolds.
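The sorting step described above can be illustrated with a minimal Python sketch. It is not the procedure actually used for the maps: the function name `order_scaffolds`, the input tuple layout, and the numeric `mel_position` proxy for a D. melanogaster cytological location are all assumptions made for illustration. The sketch groups orthologs by scaffold, sorts them by molecular coordinate, assigns each scaffold to its majority arm, and orders scaffolds along the arm by the median map position of their orthologs.

```python
from collections import Counter, defaultdict
from statistics import median


def order_scaffolds(orthologs):
    """Order and orient scaffolds from ortholog positions.

    orthologs: iterable of (scaffold, start_bp, mel_arm, mel_position).
    mel_position is a hypothetical numeric proxy for the D. melanogaster
    cytological location (for example, a band index along the arm).
    Returns {arm: [(scaffold, orientation), ...]} ordered along each arm.
    """
    by_scaffold = defaultdict(list)
    for scaffold, start_bp, arm, pos in orthologs:
        by_scaffold[scaffold].append((start_bp, arm, pos))

    placed = defaultdict(list)
    for scaffold, genes in by_scaffold.items():
        genes.sort()  # order genes by molecular coordinate
        # Assign the scaffold to the arm carrying most of its orthologs.
        arm = Counter(a for _, a, _ in genes).most_common(1)[0][0]
        positions = [pos for _, a, pos in genes if a == arm]
        # Orientation: does the map position rise or fall along the scaffold?
        orientation = "+" if positions[-1] >= positions[0] else "-"
        placed[arm].append((median(positions), scaffold, orientation))

    return {
        arm: [(scaffold, orient) for _, scaffold, orient in sorted(items)]
        for arm, items in placed.items()
    }
```

A scaffold broken by an inversion would show a discontinuity in map position rather than a simple rise or fall, so a real analysis would need to inspect the sorted positions rather than rely on the two end points as this sketch does.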

D. ananassae Chromosome Map Preparation. The stock of D. ananassae (AABBg1) was maintained at 23 °C in a corn meal-yeast-glucose and agar medium. Well-fed larvae ready for pupation were dissected in a solution of lactic acid, distilled water and acetic acid (1:2:3). Salivary glands were immediately transferred to the same solution and were squashed after 10 minutes for chromosome preparation. The photographic maps (Tobari et al. 1993) were used to anchor assembled scaffolds to the cytological map. The positions of the genetic and physical markers within the assembled scaffolds were obtained from the Synpipe output (Bhutkar et al. 2006). The cytological position of each molecular marker was determined by in situ hybridization to the polytene chromosomes, with minor modification of the procedures described in Biemont et al. (2004). PCR-amplified DNA fragments were labeled with digoxigenin-11-dUTP (PCR DIG Labeling Mix, Roche) for use as hybridization probes. The linkage maps of morphological mutants constructed by Hinton (unpublished, 1991; in Tobari 1993) were used for the analysis.

D. pseudoobscura Species Group Chromosome Map Preparation. The stocks of D. pseudoobscura (MV2-25, 14011-0121.94) and D. persimilis (MSH3, 14011-0111.49) were maintained at 18 °C in a corn meal-molasses-agar culture medium. Third instar larvae were collected and placed in Drosophila Ringer's solution for 5 minutes. Salivary glands were dissected from larvae and squashed according to the procedure developed by Harshman (1977). This technique helped to obtain chromosomes that tended to be linearized. A 700 gram weight was set on the coverslip to aid in flattening the chromosomes (Ballard and Bedo 1991). The chromosomes were viewed at 1,000x and digital images of linear chromosomes were collected for the six chromosomal arms.

Adobe Photoshop was used to build mosaic images of each of the six chromosomes. We minimized the number of different polytene chromosomes that were used to build the mosaic images, and in all cases we made sure that the different chromosomal sections blended in the mosaic were of similar scale. Section designations were available for each of the six chromosomes except for XR (Muller A•D). Sub-sections were assigned using the approach of Bridges (1935). We made every effort to begin each sub-section at an easily recognizable landmark such as a band or the boundary of a puff. New ideograms for the six chromosomes of the two species were drawn by tracing the original maps developed by Tan (1936; 1937) in Adobe Illustrator. These new ideograms improve on the old reproductions because they are drawn as vector graphics and can be scaled without loss of resolution in web-based applications. No ideogram was available for Muller A•D, so the photomicrograph of this chromosome was used to develop the representation for this map.

We used the locations of previously mapped genetic markers (Anderson 1993; Beckenbach 1981; Donald 1936; Kovacevic and Schaeffer 2000; Lancefield 1922; Levine and Levine 1955; Noor et al. 2000; Orr 1995; Ortiz-Barrientos et al. 2006; Prakash 1974; Sturtevant and Novitski 1941; Sturtevant and Tan 1937; Tan 1936; Yardley 1974) and physical markers (Aquadro et al. 1991; Babcock and Anderson 1996; Bondinas et al. 2002; Dobzhansky and Sturtevant 1938; Hamblin and Aquadro 1999; Machado et al. 2002; Moore and Taylor 1986; Papaceit et al. 2006; Schaeffer and Aquadro 1987; Schaeffer et al. 2003; Segarra and Aguade 1992; Segarra et al. 1996) to anchor the assembled scaffolds to the cytological map. The positions of the genetic and physical markers within the assembled scaffolds were obtained from the Synpipe output (Bhutkar et al. 2006).

D. willistoni Chromosome Map Preparation. The stocks of D. willistoni were maintained in 25 °C incubators on the culture medium of Marques et al. (1966). Salivary gland cells of well-nourished third instar larvae were prepared using a modification of the technique of Ashburner (1967); the glands were fixed with acetic acid (45%) and stained with acetic orcein (2%).

The Gd-H4-1 strain of D. willistoni from Guadeloupe (16° 15’ N 61° 35’ W) was chosen for genomic sequencing from a set of six strains because it was chromosomally monomorphic. Gd-H4-1 contains the standard chromosomal order, except for two small fixed inversions in arm IIL and one inversion in XL.

The genetic maps of morphological mutant alleles (Spassky and Dobzhansky 1950) and allozyme variants (Lakovaara and Saura 1972) were used to anchor assembled scaffolds to the cytological map. These maps were developed using a strain of D. willistoni that was chromosomally monomorphic for standard gene arrangements. The method of Engels et al. (1986) was used for the in situ hybridization assays to physically map genes to the polytene chromosomes of D. willistoni. Probes were labeled with biotin-7-dATP by nick translation with the GIBCO BRL kit; hybridizations were detected using BCIP, SAP and NBT, and chromosomes were stained with 0.1% lacto-aceto-orcein. The positions of the genetic and physical markers within the assembled scaffolds were obtained from the Synpipe output (Bhutkar et al. 2006).

To construct the new photomap, we used as the standard X-chromosome order those patterns of the XL and XR arms with the widest geographical distribution within all the populations analyzed (according to Rohde 2000). These standard gene arrangements (designated XL-A and XR-A) are fixed in an old laboratory population, WIP4, collected by A. Cordeiro and H. Winge in the 1960s in Ipitanga, Bahia State, northeastern Brazil, and have the closest phylogenetic relationship with the remaining arrangements. Microscopic analysis of X chromosomes in offspring from crosses between the WIP4 population and wild populations from southern Brazil was used to confirm the gene arrangement on the X chromosome. The X-chromosome pattern of the WIP4 population probably corresponds to the Standard arrangement from the Belém population described by Dobzhansky (1950).

Until recently, the status of the dot chromosome was unclear in D. willistoni (Sturtevant and Novitski 1941). Many authors considered that this small chromosome might be fused to another chromosome. Papaceit and Juan (1998) solved this mystery when they used probes for genes on the fourth chromosome of D. melanogaster and found that they hybridized to the most basal section of the third chromosome. Thus, the dot chromosome or Muller element F has apparently fused to the E element in D. willistoni.

D. virilis Chromosome Map Preparation. The sequenced strain of D. virilis contains visible mutations at loci on each of the large autosomal elements. The original strain appears to have been constructed at The Institute for Developmental Biology (Moscow, USSR), and it was subsequently placed in the species collection currently maintained at the Tucson Drosophila Stock Center. A derivative of this original strain was inbred more than 14 generations by single pair sib-mating in the laboratories of Brian Charlesworth and Bryant McAllister, and it underwent two additional generations of sib-mating immediately prior to expansion for isolating genomic DNA for sequencing.

Physically mapped positions in the genome of D. virilis were identified from FlyBase records, published literature, and unpublished data. Cytological map positions determined by in situ hybridization are relative to the nomenclature of the standard photographic chromosome map (Gubenko and Evgen'ev 1984) and the corresponding graphic map (Kress 1993). Loci on the linkage map, where a clear relationship exists between the locus and a reference DNA sequence of either D. virilis or D. melanogaster, were also identified (Alexander 1976; Gubenko and Evgen'ev 1984). Reported positions of these markers within the physical and/or linkage maps, coupled with an associated DNA sequence, provided reference points to anchor the assembled genome sequence along the chromosomal arms. Most of the reported mapped positions within the genome of D. virilis were obtained through a series of analyses that used in situ hybridization to localize large-insert P1 clones along the chromosomes (Lozovskaya et al. 1993; Vieira et al. 1997). End sequences from 593 of these P1 clones that map to unique sites within the genome were generated to anchor the assembly onto the polytene chromosome map.

In cases where a reference sequence of D. virilis was available for the in situ localized probe, the position of the sequence in the CAF1 assembly was determined using local alignment. Large sequences were localized using MEGABLAST (e < 1E-50), and small microsatellite loci were localized using the best score obtained from BLASTN searches. Otherwise, the transcript of the putative ortholog of D. melanogaster was used as a query in a BLASTN search (e < 1E-10). The identity of each mapped position, its associated reference sequence, and its position in the CAF1 assembly are included in Supplemental Table 24.
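As an illustration of how such thresholds might be applied, the following Python sketch keeps, for each query, the highest-scoring hit whose e-value falls below a cutoff. It assumes the standard 12-column tabular BLAST output (as produced by BLAST+ with `-outfmt 6`); the function name `best_hits` and the returned tuple layout are hypothetical and are not part of the original analysis.

```python
def best_hits(tabular_lines, max_evalue):
    """Return the best-scoring hit per query below an e-value cutoff.

    tabular_lines: lines of 12-column tabular BLAST output (assumed):
      qseqid sseqid pident length mismatch gapopen
      qstart qend sstart send evalue bitscore
    Returns {query: (subject, start, end, strand)} with start <= end.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # the text specifies a strict threshold (e < cutoff)
        if query in best and bitscore <= best[query][0]:
            continue  # an equal or better hit is already recorded
        # Subject coordinates run backwards for minus-strand hits.
        strand = "+" if sstart <= send else "-"
        best[query] = (bitscore, subject, min(sstart, send),
                       max(sstart, send), strand)
    return {query: hit[1:] for query, hit in best.items()}
```

Under these assumptions, calling `best_hits(lines, 1e-50)` would mirror the MEGABLAST cutoff quoted above and `best_hits(lines, 1e-10)` the BLASTN cutoff for ortholog transcripts.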

Overall organization of the assembled genome sequence on the chromosomal arms of D. virilis was inferred through a comparison of the order and orientation of scaffolds indicated by the anchored positions in the physical and linkage maps and scaffold joins identified from the Synpipe analysis of conserved syntenic blocks of orthologous genes. The position and orientation of scaffolds were mostly supported by both approaches, thus providing a high level of confidence in the inferred order and orientation of these scaffolds. A greater degree of uncertainty exists for the placement of the sequence in regions where only one of the approaches supports a particular arrangement, and this uncertainty is clearly demarcated.
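The distinction between joins supported by both approaches and joins supported by only one can be expressed as a small Python sketch. The function name `classify_joins` and the representation of a join as an unordered pair of scaffold names are assumptions for illustration; orientation is ignored here for brevity, although it was part of the comparison described above.

```python
def classify_joins(marker_joins, synteny_joins):
    """Label each scaffold join by the evidence that supports it.

    marker_joins:  pairs of adjacent scaffolds implied by mapped markers.
    synteny_joins: pairs of adjacent scaffolds implied by syntenic blocks.
    Joins are treated as unordered pairs (a simplifying assumption).
    """
    markers = {frozenset(join) for join in marker_joins}
    synteny = {frozenset(join) for join in synteny_joins}
    support = {}
    for join in markers | synteny:
        if join in markers and join in synteny:
            support[join] = "both"  # higher-confidence placement
        elif join in markers:
            support[join] = "markers only"
        else:
            support[join] = "synteny only"
    return support
```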

A set of “orphan” scaffolds was identified as being present on particular chromosomal elements based on their content of putative orthologs. However, these scaffolds could not be placed within the polytene chromosome map using marker data. These orphan scaffolds potentially represent “islands” of assembled genome sequence localized within the pericentromeric heterochromatin of D. virilis. Heterochromatic regions of the Drosophila genome are known to be enriched with repetitive sequences (Smith et al. 2007). To determine the proportion of each orphan scaffold that is represented by repetitive elements, each sequence was analyzed with RepeatMasker to identify interspersed repeats using the “Drosophila fruit fly genus” repeat library (http://www.repeatmasker.org). Three scaffolds of similar size (~1 Mb each) that mapped to medial positions of the X chromosome were used for comparison of element content.
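The proportion of a scaffold represented by repeats can be computed from annotated repeat intervals, as in the following Python sketch. It assumes the intervals have already been extracted from the repeat annotation as 1-based, inclusive (start, end) pairs; the function name `repeat_fraction` is hypothetical. Overlapping or abutting annotations are merged first so that no base is counted twice.

```python
def repeat_fraction(intervals, scaffold_length):
    """Fraction of a scaffold covered by repeat annotations.

    intervals: iterable of (start, end), 1-based inclusive (assumed).
    scaffold_length: total scaffold length in base pairs.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # A gap separates this interval from the previous run: close it.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or abutting interval: extend the current run.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / scaffold_length
```

Comparing this fraction between an orphan scaffold and a euchromatic scaffold of similar size would then correspond to the comparison of element content described above.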