Supplemental Table 1. Oligonucleotide primers used for amplifying Colias PGI introns. Primers are all in the 5’ to 3’ direction. Primers designated “S” are sense-going, and primers designated “A” are antisense-going. Numbers represent positions of the 3'-terminal primer nucleotides in the coding DNA. Standard ambiguity codes for bases are used. The listed primers were also used for sequencing. A few primers not listed here were made for primer-walking in specific cases of long intron sequences.

Intron 1 / PGI S115 / CAACTATTCCAACAAGATGCAGACC
PGI A156 / GAGTAATCCAGGAGGATCTCACCA
Intron 2 / PGI S264 / TGTTGAGAAAGCTAGAGATGCTATG
PGI A291 / CGTGGAGCACTGCTCGGTCTTC
Intron 3 / PGI S437 / AAGTTATTAGTGGAGCGTGGAAG
PGI A459 / GATTCCAATRTTGATGACGTCAGTG
Intron 4 / PGI S533 / GTCACAGAGGCTTTGCCTTA
PGI A595 / GTTTCTGGGTTCAATCGCTTTAG
Intron 5 / PGI S700 / GTCAGCGAAGACATGGTTCTTTG
PGI A716 / AAAGTGCTTTGATACTGCTGCTG
Intron 6 / PGI S825 / TTCTGGGACTGGGTTGGAGG
PGI A840 / AGATGGATAGACCAATAGCGGAC
Intron 7 / PGI S943 / GAYAACCACTTCTGTACTGCACCTC
PGI A965 / ACCACACTCCTAACAGAGCTAGTA
Intron 8 / PGI S1028 / GCTGAGACTCACGCTCTATTGCC
PGI A1074 / ACTTGCCGTTACTCTCCATGTCC
Intron 9 / PGI S1198 / CGTTCTACCAGCTCGTGCATC
PGI A1214 / CGCTAGGAAGTCGCATGGGA
Intron 10 / PGI S1324 / CAAACTGAGGCGCTTATGAAGG
PGI A1350 / CCATTCCTGATTTCTCTAATTCC
Intron 11 / PGI S1471 / GAAGTTCACGCCGTTCACACTTG
PGI A1489 / CYTGYGTGAAGATTTTGTGTTCGTA
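
Two of the listed primers contain ambiguity codes (R in PGI A459, Y in PGI A1489). As a minimal sketch, assuming the standard IUPAC nucleotide code table, the following Python snippet enumerates the unambiguous sequences that such a degenerate primer stands for. The function name `expand_primer` is hypothetical and is not taken from the original methods.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences copied from Supplemental Table 1.
a459 = "GATTCCAATRTTGATGACGTCAGTG"    # PGI A459, one R (A or G)
a1489 = "CYTGYGTGAAGATTTTGTGTTCGTA"   # PGI A1489, two Y (C or T)

for seq in expand_primer(a459):
    print(seq)
print(len(expand_primer(a1489)))
```

Under these assumptions, PGI A459 corresponds to two sequences and PGI A1489 to four.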


Supplemental Table 2. Ratio of nucleotide substitutions (S) to indels (I) in Colias introns

Intron number / S/I
1 / 3.47
2 / 6.88
3 / 3.70
4 / 7.93
5 / 6.43
6 / 24.4
7 / 3.71
8 / 7.89
9 / 4.64
10 / 3.4
11 / 7.02
Average (n = 31) / 7.22
Comparison data
Species / S/I / Reference
Drosophila pseudoobscura / 5.0 / Schaeffer 2002
Drosophila melanogaster / 5.2 / Berger et al. 2001
Arabidopsis / 6.6 / Zhang et al. 2008
Zea mays / 8.7 / Batley et al. 2003
Gallus gallus / 12.5 / Brandström and Ellegren 2007
Homo sapiens / 4.7 / Chen et al. 2009
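
The reported average of 7.22 agrees with the unweighted mean of the eleven per-intron S/I ratios in the table. The short Python check below is a sketch of that arithmetic only. It assumes a simple arithmetic mean over introns and does not use the n = 31 given in the table, whose meaning is not stated here.

```python
from statistics import mean

# Per-intron S/I ratios copied from Supplemental Table 2 (introns 1 to 11).
s_to_i = [3.47, 6.88, 3.70, 7.93, 6.43, 24.4, 3.71, 7.89, 4.64, 3.4, 7.02]

# Unweighted mean across the eleven introns (an assumption about how the
# tabulated average was formed).
average = mean(s_to_i)
print(round(average, 2))
```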


Supplemental Table 3. Intron % GC vs. intron length, intron by intron. Regression of the length of Colias PGI introns on their % GC content. For introns 6 and 9, GC content and length are not significantly correlated.

Intron number / Slope / F / r² / P
1 / 155.7106 / 27.0347 / 0.5297 / 0.000025
2 / 33.1021 / 6.3961 / 0.1999 / 0.01706
3 / 14.0865 / 173.98 / 0.8657 / < 1 × 10⁻⁶
4 / 48.4757 / 21.323 / 0.4323 / 0.000079
5 / 30.89427 / 21.791 / 0.4376 / 0.000069
6 / 34.35492 / 2.4558 / 0.0834 / 0.1287
7 / 18.2136 / 170.74 / 0.8591 / < 1 × 10⁻⁶
8 / 74.2525 / 7.1156 / 0.2026 / 0.01256
9 / 2.3479 / 0.03485 / 0.0013 / 0.8534
10 / 62.0078 / 24.995 / 0.4901 / 0.000034
11 / 16.6144 / 32.174 / 0.5347 / 0.000004
All / 21.3622 / 59.873 / 0.1593 / < 1 × 10⁻⁶

Note: The slopes of the relationships between %GC and length of introns are all positive. Lack of significance in introns 6 and 9 results from lower %GC in the longest alleles of those introns. Joint significance of all relationships is testable by “Fisher’s Method” of combining probabilities: χ² = −2 ∑ ln P = 181.3 with 22 degrees of freedom (2 per test, 11 tests), overall P < 0.001. This intron by intron analysis reaches the same conclusion as the overall plot of Figure 6.
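
The combined statistic can be reproduced from the per-intron P values of Supplemental Table 3. The Python sketch below uses Fisher's method in its standard form, where the statistic is −2 ∑ ln P and is referred to a chi-square distribution with 2k degrees of freedom for k tests. Two assumptions are made: the two P values reported as "< 1 × 10⁻⁶" are set to exactly 1e-6, and the helper names `fisher_statistic` and `chi2_sf_even_df` are illustrative, not from the original analysis.

```python
import math

# P values for introns 1 to 11, copied from Supplemental Table 3.
# Entries reported only as "< 1 x 10^-6" are set to 1e-6 (assumption).
p_values = [0.000025, 0.01706, 1e-6, 0.000079, 0.000069, 0.1287,
            1e-6, 0.01256, 0.8534, 0.000034, 0.000004]

def fisher_statistic(ps):
    """Fisher's combined statistic: -2 times the sum of ln(p)."""
    return -2.0 * sum(math.log(p) for p in ps)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total

stat = fisher_statistic(p_values)
df = 2 * len(p_values)            # two degrees of freedom per combined test
p_combined = chi2_sf_even_df(stat, df)
print(round(stat, 1), df, p_combined < 0.001)
```

With these inputs the statistic rounds to 181.3, matching the value in the note, and the combined P is far below 0.001.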


Supplemental Table 4. List of insect sequences producing significant alignments to the intron sequences of the Colias PGI gene, found through NCBI BLAST searches. The program BLASTN (v. 2.2.25) was used to search against the database “nr” (all GenBank+EMBL+DDBJ+PDB sequences, but no EST, STS, GSS, environmental samples or phase 0, 1 or 2 HTGS sequences). E-values of the listed matches are all ≤ 1 × 10⁻¹².

Colias PGI intron allele / Position of the sequence / Description of the sequences producing significant alignments
Intron 1 of 4-A / 327-477 / Bombyx mori DNA, multiple clones from multiple chromosomes
Intron 1 of 4-A / 629-810 / Bicyclus anynana clone BA_Ba48N20;
Biston betularia lethal giant larvae gene;
Bombyx mori DNA, multiple clones from multiple chromosomes;
Spodoptera frugiperda BAC, clone 87A24_SfBAC_fin from egg DNA
Intron 1 of 2-37 / 1466-1590 / Spodoptera frugiperda BAC, clone 72F1_SfBAC_fin from egg DNA
Intron 2 of 3-277 / 103-360 / Choristoneura fumiferana antifreeze protein 2.7a and antifreeze protein Lu1 genes;
Cydia pomonella odorant receptor 11a (Or11a) mRNA, clone Cp3.37 microsatellite sequence;
Plutella xylostella microsatellite pxy038 sequence
Intron 2 of 4-2312 / 124-256 / Bombyx mori genomic DNA, multiple clones from multiple chromosomes; gustatory receptor 16 (Gr16), trehalase, BmBR-C gene, fibroin heavy chain Fib-H (fib-H) gene, acetylcholinesterase (ace) gene
Intron 2 of 4-2521 / 9-192, 294-444 / Spodoptera frugiperda BAC, clone 67K19_SfBAC_fin from egg DNA
Intron 2 of 1-53 / 714-783 / Colias behrii microsatellite DNA, locus A104
Intron 2 of 1-53 / 470-547 / Colias behrii microsatellite DNA, locus A104
Intron 2 of 5-54 / 323-556 / Cydia pomonella odorant receptor 11a (Or11a) mRNA, clone Cp3.37 microsatellite sequence
Intron 3 of 3-5510 / 112-207 / Bicyclus anynana clone BA_Ba85J10;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; gustatory receptor 16 (Gr16), trehalase, BmBR-C gene, fibroin heavy chain Fib-H (fib-H) gene, acetylcholinesterase (ace) gene
Intron 3 of 5-54 / 194-398 / Antheraea yamamai vg gene for vitellogenin;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; male-specific lethal 3 (Msl3) protein
Intron 4 of 6-38 / 1-90, 622-776 / Bicyclus anynana multiple clones;
Bombyx mandarina lipase gene;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; putative cuticle protein CPR151 (CPR151) gene, Bmdsx gene, fibroin light chain gene, lipase gene, male-specific lethal 3 (msl3) gene, endoplasmic reticulum oxidoreduction 1-like protein, intron 1 of sericin gene, fibroin heavy chain Fib-H (fib-H) gene, inorganic pyrophosphatase gene, antitrypsin gene, BmBR-C gene, cadherin-like protein gene, methylated DNA-protein cysteine methyltransferase, vitellogenin receptor (VgR) gene, Bmtitin1, Bmtitin2, Bmmiple, Bmsyx6, BmPM-Scl, Bmtkz, BmubcD4 genes;
Colias behrii microsatellite DNA, loci D3, D12, D120, D125;
Drupadia theda clone DTH008 and DTH044 microsatellite sequence;
Helicoverpa armigera BAC, multiple clones from pupae DNA;
Hyalophora cecropia cuticle protein 66 gene;
Limenitis arthemis nuclear locus 10;
Limenitis weidemeyerii nuclear locus 10;
Ostrinia nubilalis strain Z fatty-acyl CoA reductase 4 mRNA;
Papilio dardanus DNA sequence from clone AEPD-10A24;
Spodoptera frugiperda BAC, multiple clones from egg DNA, hemicentin-like protein;
Spodoptera littoralis CL1Contig220.Spli mRNA sequence
Intron 4 of 4-518 / 185-252 / Spodoptera frugiperda BAC, multiple clones from egg DNA
Intron 4 of 4-518 / 333-547 / Bicyclus anynana multiple clones;
Helicoverpa armigera BAC, pupae DNA clone 94B11_HaBAC_fin;
Heliothis subflexa clone 89F08 ABC transporter family C protein;
Manduca sexta arylphorin beta subunit gene
Intron 4 of 4-4510 / 150-318 / Autographa californica nucleopolyhedrovirus mutant vsk-1d1;
Bicyclus anynana multiple clones;
Bombyx mori ivd gene for isovaleryl Coenzyme A dehydrogenase, BmXDH2, BmXDH1 genes for xanthine dehydrogenase;
Cnaphalocrocis medinalis mRNA for CYP6AE30 protein;
Cotesia sesamiae multiple bracovirus clones;
Drupadia theda clone DTH040 microsatellite sequence;
Heliconius melpomene DNA sequence from multiple clones;
Heliconius numata DNA sequence from multiple clones;
Helicoverpa armigera BAC, multiple clones from pupae DNA;
Helicoverpa armigera microsatellite Ham4 sequence, voltage-gated sodium channel alpha subunit;
Helicoverpa zea microsatellite locus HzMS3-48, HzMS4-23 sequence;
Ostrinia nubilalis mannose phosphate isomerase (MPI) gene, OnubOR5c, OnubOR4 genes, pheromone binding protein (PBP) gene;
Plutella xylostella clone TGA1B04 and clone DBM01 microsatellite sequence, microsatellite pxy sequences, strain BCS3-Pearl nicotinic acetylcholine receptor alpha 6 gene;
Spodoptera exigua trehalose-phosphate synthase (TPS) gene;
Striacosta albicosta stalb_01_00005175 mRNA sequence;
Utetheisa ornatrix microsatellite Utor7 sequence;
Zeiraphera diniana microsatellite DNA clone ZdC03
Intron 4 of 5-46 / 183-289 / Bicyclus anynana clone BA_Ba18H03;
Heliothis subflexa clone 89F08 ABC transporter family C protein ABCC3 (ABCC3) gene;
Spodoptera frugiperda BAC, multiple clones from egg DNA
Intron 4 of 5-46 / 396-557 / Autographa californica nucleopolyhedrovirus mutant vsk-1d1, genomic DNA;
Bicyclus anynana multiple clones;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; BmXDH2, BmXDH1 genes for xanthine dehydrogenase, Bmfkbp13, Bmhepa, Bmhig, Bm6922, Bmlap, Bmprojectin, Bmkettin, Bmtitin1 genes, fibroin heavy chain Fib-H (fib-H) gene, gene for putative cholesterol transporter BmStart1;
Cnaphalocrocis medinalis mRNA for CYP6AE30 protein;
Colias behrii microsatellite DNA, loci D3, D12, D120, D125;
Cotesia sesamiae multiple bracovirus clones;
Drupadia theda clones DTH035, DTH037, DTH040 microsatellite sequences;
Helicoverpa armigera BAC, multiple clones from pupae DNA, voltage-gated sodium channel alpha subunit;
Helicoverpa zea microsatellite locus HzMS3-48 sequence;
Heliothis subflexa clone 89F08 ABC transporter family C protein ABCC3 (ABCC3) gene;
Plutella xylostella strain BCS3-Pearl nicotinic acetylcholine receptor alpha 6 gene, clone DBM01 microsatellite sequence, microsatellite pxy sequences;
Spodoptera frugiperda BAC, multiple clones from egg DNA
Intron 4 of 5-46 / 707-856 / Acyrthosiphon pisum clone BAC VMRC38-10-A9;
Bicyclus anynana multiple clones;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; acetylcholinesterase (ace) gene, Bm6922 gene for hypothetical protein, Bmfkbp13, Bmhepa, Bmhig, Bm6922, Bmlap, Bmprojectin, Bmkettin, Bmtitin1 genes, BMWCP5, BMWCP4, BMWCP3, BMWCP2 genes for cuticle proteins, CBP gene for carotenoid-binding protein, fibroin heavy chain Fib-H (fib-H) gene, gene for putative cholesterol transporter BmStart1, gustatory receptor 15 (Gr15) gene, og gene for molybdenum cofactor sulfurase, phosphoribosyl pyrophosphate synthetase, putative cuticle protein CPG41 (CPG41) gene;
Cotesia sesamiae Kitale bracovirus clone BAC 4C14;
Drosophila melanogaster chromosome 2R, clone BACR03I06, genomic scaffold 211000022280741, 211000022280732;
Drosophila willistoni GK23104 (Dwil\GK23104), GK23665 (Dwil\GK23665), GK20272 (Dwil\GK20272), GK23694 (Dwil\GK23694), mRNA;
Manduca sexta clone BB-IML-1 IML1 (IML1) gene, pro-lebocin gene;
Plutella xylostella microsatellite DNA, clone TGA1B04;
Sarcophaga crassipalpis HAHN.FLY.12787.C1 mRNA sequence;
Spodoptera frugiperda BAC, multiple clones from egg DNA
Intron 4 of 5-54 / 2-90 / Anastrepha suspensa clone 1-5E microsatellite sequence;
Bicyclus anynana microsatellite sequences, multiple clones;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; fibroin heavy chain Fib-H (fib-H) gene, voltage-gated sodium channel alpha subunit, Bmtitin1, Bmtitin2, Bmmiple, Bmsyx6, BmPM-Scl, Bmtkz, BmubcD4 genes;
Cnaphalocrocis medinalis mRNA for CYP6AE30 protein;
Colias behrii microsatellite DNA, loci D3, D12, D120, D125;
Drupadia theda clones DTH035, DTH037, DTH040 microsatellite sequences;
Helicoverpa armigera BAC, multiple clones from pupae DNA, microsomal cytochrome P450 (CYP9A12) gene;
Spodoptera frugiperda BAC, multiple clones from egg DNA
Intron 5 of 4-4510 / 432-568 / Bombyx mori genomic DNA, multiple chromosomes; acetylcholinesterase (ace) gene, fibroin light chain gene, gustatory receptor 57 (Gr57) gene;
Cotesia sesamiae Mombasa bracovirus clone BAC 6L7;
Papilio dardanus BAC sequence, clone BAC 19F6;
Cotesia plutellae polydnavirus segment S22;
Cotesia vestalis bracovirus segment c3
Intron 5 of 4-A / 685-830 / Bombyx mandarina lipase gene;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; fibroin light chain gene, heavy chain Fib-H (fib-H) gene, intron 1 of sericin gene, putative cuticle protein CPR151 (CPR151) gene, Bmdsx gene, lipase gene, antitrypsin gene, endoplasmic reticulum oxidoreduction 1-like protein, Bmfkbp13, Bmhepa, Bmhig, Bm6922, Bmlap, Bmprojectin, Bmkettin, Bmtitin1 genes, methylated DNA-protein cysteine methyltransferase, BmBR-C gene, male-specific lethal 3 (msl3) gene, cadherin-like protein gene, DNAJ13 gene, vitellogenin receptor (VgR) gene;
Chilo suppressalis microsatellite Cs138 sequence;
Colias behrii microsatellite DNA, locus D120;
Drupadia theda clone DTH008, DTH044 microsatellite sequence;
Helicoverpa armigera BAC, multiple clones from pupae DNA;
Limenitis arthemis nuclear locus 10;
Limenitis weidemeyerii nuclear locus 10;
Papilio dardanus BAC sequence, clone BAC 19F6, clone AEPD-10A24, putative wing patterning locus;
Spodoptera frugiperda BAC, multiple clones from egg DNA, hemicentin-like protein;
Spodoptera littoralis CL1Contig220.Spli mRNA sequence
Intron 5 of 4-A / 99-164, 373-825 / Bicyclus anynana clone BA_Ba69H15, BA_Ba19O01;
Hyalophora cecropia cuticle protein 66 gene;
Spodoptera frugiperda BAC, egg DNA, clone 41I04_SfBAC_fin
Intron 5 of 4-2312 / 433-561 / Bombyx mori genomic DNA, multiple clones from multiple chromosomes; gustatory receptor 57 (Gr57) gene, fibroin light chain gene, acetylcholinesterase (ace) gene;
Cotesia plutellae polydnavirus segment S22;
Cotesia sesamiae multiple bracovirus clones;
Cotesia vestalis bracovirus segment c3;
Papilio dardanus DNA sequence from clone AEPD-10A24, clone BAC 19F6
Intron 5 of 4-197 / 490-600 / Colias behrii microsatellite DNA, locus A104;
Geoica utricularia microsatellite Gu10 sequence
Intron 5 of 3-277 / 323-486 / Antheraea yamamai vg gene for vitellogenin;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes
Intron 5 of 3-277 / 546-759 / Bombyx mori genomic DNA, multiple clones from multiple chromosomes; Bmtitin1, Bmtitin2, Bmmiple, Bmsyx6, BmPM-Scl, Bmtkz, BmubcD4 genes, BmBR-C gene, EH gene for eclosion hormone;
Colias eurytheme single domain major allergen 1 (SDMA1) gene
Intron 6 of 4-518 / 485-672 / Anastrepha suspensa clone 1-5E microsatellite sequence;
Bicyclus anynana multiple clones;
Biston betularia hephaestus gene, microsatellite BBCAA2A9-B1 sequence, orthopedia gene;
Bombyx mandarina cytochrome P450 Cyp305b1v1 gene;
Bombyx mori genomic DNA, multiple clones from multiple chromosomes; gustatory receptor 58 (Gr58) gene, origin recognition complex subunit 6 (ORC6) gene, P25 gene for fibroin P25, putative cuticle protein CPR149 (CPR149) gene, LZM gene for lysozyme, fibroin heavy chain Fib-H (fib-H) gene, BMWCP5, BMWCP4, BMWCP3, BMWCP2 genes for cuticle proteins, Bmtitin1, Bmtitin2, Bmmiple, Bmsyx6, BmPM-Scl, Bmtkz, BmubcD4 genes, w-3 gene for ABC transporter, gypsy-Ty3-like retrotransposon Kabuki gene, voltage-gated sodium channel alpha subunit, non-LTR retrotransposon Kendo;
Cotesia sesamiae Kitale bracovirus clone BAC 20O4;
Helicoverpa armigera BAC, multiple clones from pupae DNA, cytochrome P450 (CYP9A17) gene;
Helicoverpa zea pheromone biosynthesis activating neuropeptide;
Ostrinia nubilalis fatty acyl-coA reductase (pgFAR) mRNA;
Ostrinia scapulalis FAR-like protein XIII mRNA;
Papilio dardanus DNA sequence from clones AEPD-10A24, 19F6, AEPD-9M9, putative wing patterning locus;
Spodoptera litura pheromone binding protein 2 (PBP2) gene
Intron 6 of 4-37 / 298-894 / Bactrocera dorsalis clone Bdor_pnrfos1.scaffold_0, Bdor_pnrfos2.scaffold_0, Bdor_pnrfos3.scaffold_0 pnr gene;
Cotesia sesamiae Mombasa bracovirus clone BAC 4B19
Intron 6 of 3-2121 / 346-467 / Bombyx mori genomic DNA, multiple chromosomes; acetylcholinesterase (ace) gene, fibroin light chain gene, gustatory receptor 57 (Gr57) gene;
Cotesia sesamiae bracovirus clone;
Papilio dardanus BAC sequence, clone BAC 19F6 and clone AEPD-10A24
Intron 6 of 6-38 / 101-306 / Acyrthosiphon pisum BAC VMRC38-30-A5;
Bicyclus anynana multiple clones;
Spodoptera frugiperda BAC, multiple clones from egg DNA
Intron 6 of 6-38 / 961-1044 / Antheraea yamamai vg gene for vitellogenin;