Electronic supplementary material to Wagner, A.

“The role of robustness in phenotypic adaptation and innovation”

Three classes of systems involved in most adaptation and innovation. In addition to proteins and RNA, discussed in the main text, two other classes of biological systems play central roles in evolutionary adaptation and innovation.

Regulatory circuits are systems of one or more regulatory molecules that influence each other’s activity. Especially important are transcriptional regulation circuits, which consist of transcriptional regulators that influence each other’s expression. Each regulatory circuit has a regulatory genotype that specifies how its member molecules mutually regulate each other’s activities, and how they produce a gene expression phenotype that influences many processes in physiology and development. Genotypic change that alters regulatory interactions can bring forth novel gene expression phenotypes. These are involved in many evolutionary adaptations and innovations, such as the dissected leaves of some plants, the eyespots of some butterflies, the flowers of flowering plants, and the limbs of vertebrates (Bharathan et al. 2002; Brakefield et al. 1996; Burke et al. 1995; Carroll et al. 2001; Coen & Meyerowitz 1991; Davidson & Erwin 2006; Hay & Tsiantis 2006; Hughes & Kaufman 2002; Keys et al. 1999). Any one circuit genotype exists in a much larger genotype space. This space captures all biochemically feasible circuits involving a given set of molecules.

Metabolic networks, a third system class, comprise hundreds to thousands of chemical reactions that are catalyzed by enzymes, which are encoded by genes. These networks are responsible for providing cells with energy and multiple molecular building blocks -- amino acids, nucleotides, lipids, and others -- for cell growth. Innovations involving metabolic networks enable an organism to produce useful secondary metabolites, to detoxify waste products of its metabolism, or to use novel molecules as a source of energy or chemical elements. Heterotrophic bacteria, for example, have acquired the ability to use a broad spectrum of molecules as sole carbon sources, including crude oil and natural gas, but also man-made compounds such as antibiotics and industrial chemicals (Dantas et al. 2008; Rehmann & Daugulis 2008; van der Meer 1995; van der Meer et al. 1998). The necessary biochemical pathways often arise not through the evolution of novel enzymes, but through novel combinations of already existing, individually widespread enzymes, a process that may be facilitated by horizontal gene transfer (Copley 2000; Lerat et al. 2005; Ochman et al. 2005; Pal et al. 2005). Metabolic networks exist in a metabolic genotype space, a space of possible metabolic networks, where each network has a different metabolic genotype. This genotype can be compactly represented through information about the presence or absence of individual reactions (enzyme-coding genes) from a much larger universe of metabolic reactions (Rodrigues & Wagner 2009; Samal et al. 2010).

Macromolecules, regulatory circuits, and metabolism share features important for evolutionary adaptation and innovation. In all three major system classes, the neighbor of a genotype G in genotype space is an important concept. In the macromolecules discussed in the main text, this is a genotype that differs from G in exactly one amino acid or nucleotide. In regulatory circuits, a genotype’s neighbor differs from it in one regulatory interaction, and in metabolic networks, it differs in one metabolic reaction (one enzyme-coding gene). More generally, a genotype’s k-neighbor differs from it in k system parts (amino acids, regulatory interactions, metabolic reactions). A genotype’s (k-)neighborhood includes all its (k-)neighbors, and may comprise thousands of different genotypes. One can also define a distance between two genotypes as the fraction of system parts in which they differ.
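The notions of k-neighbor and genotype distance can be illustrated with a minimal sketch. It assumes that a genotype is encoded as a fixed-length sequence over a finite alphabet: (0, 1) for the presence or absence of metabolic reactions, or nucleotide letters for RNA. The function names `k_neighbors` and `distance`, and the example genotypes, are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def k_neighbors(genotype, alphabet, k=1):
    """Yield every genotype that differs from `genotype` in exactly k parts."""
    genotype = tuple(genotype)
    for positions in combinations(range(len(genotype)), k):
        # At each chosen position, allow only symbols that differ from the current one.
        choices = [[s for s in alphabet if s != genotype[i]] for i in positions]
        for replacement in product(*choices):
            mutant = list(genotype)
            for i, s in zip(positions, replacement):
                mutant[i] = s
            yield tuple(mutant)


def distance(g1, g2):
    """Fraction of system parts in which two equally long genotypes differ."""
    if len(g1) != len(g2):
        raise ValueError("genotypes must have the same number of parts")
    return sum(a != b for a, b in zip(g1, g2)) / len(g1)


# Hypothetical metabolic genotype: presence (1) or absence (0) of five reactions.
network = (1, 0, 1, 1, 0)
print(len(list(k_neighbors(network, (0, 1), k=1))))   # 5 one-neighbors
print(len(list(k_neighbors(network, (0, 1), k=2))))   # 10 two-neighbors
print(distance(network, (1, 1, 1, 0, 0)))             # 0.4

# Hypothetical RNA genotype: each of 4 positions can change to 3 other nucleotides.
print(len(list(k_neighbors("GCAU", "ACGU", k=1))))    # 12
```

The neighborhood size grows quickly with genotype length and alphabet size, which is why the neighborhoods of realistic genotypes can comprise thousands of genotypes.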

One can view mutational robustness as a property of a genotype G’s neighborhood, namely as the fraction of G’s neighbors that have the same phenotype P as G itself. Systems in all three classes are to some extent robust to mutations. This has been shown through engineered mutations that eliminate enzyme-coding genes from a genome, through rewiring of regulatory circuits, through large-scale mutagenesis studies in proteins, and through a variety of comparative and modeling approaches (Alon et al. 1999; Blank et al. 2005; Edwards & Palsson 2000; Giaever et al. 2002; Hafner et al. 2009; Huang et al. 1996; Isalan et al. 2008; Kleina & Miller 1990; Raman & Wagner 2011; Rennell et al. 1991; Segre et al. 2002; Soyer & Pfeiffer 2010; Stelling et al. 2002; Thompson et al. 1999; Wagner 2005a; Wagner 2005b; Wang & Zhang 2009; Weatherall & Clegg 1976). The fraction of neighbors with the same phenotype varies widely, and typically ranges from 10 percent to more than 50 percent, depending on system class, system size, and phenotype (Wagner 2011b).
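This definition of mutational robustness translates directly into a short computation. The sketch below assumes a binary (presence/absence) genotype and a deliberately artificial phenotype map in which a network counts as "viable" if at least two reactions are present. The map and the names `one_neighbors`, `toy_phenotype`, and `mutational_robustness` are hypothetical illustrations and are not taken from the studies cited above.

```python
def one_neighbors(genotype):
    """Yield all genotypes that differ from a binary genotype in exactly one part."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def toy_phenotype(genotype):
    """Artificial phenotype map (an assumption): viable if at least two reactions are present."""
    return "viable" if sum(genotype) >= 2 else "inviable"


def mutational_robustness(genotype, phenotype_of, neighbors_of):
    """Fraction of a genotype's neighbors that have the same phenotype as the genotype."""
    reference = phenotype_of(genotype)
    neighbors = list(neighbors_of(genotype))
    if not neighbors:
        raise ValueError("genotype has no neighbors")
    return sum(phenotype_of(n) == reference for n in neighbors) / len(neighbors)


print(mutational_robustness((1, 1, 1, 0), toy_phenotype, one_neighbors))  # 1.0
print(mutational_robustness((1, 1, 0, 0), toy_phenotype, one_neighbors))  # 0.5
```

The two genotypes have the same phenotype but different robustness, which mirrors the observation that robustness varies among genotypes with the same phenotype.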

Genotype networks exist in metabolic and regulatory networks, just as they exist in macromolecules (Ciliberti et al. 2007b; Giurumescu et al. 2009; MacCarthy et al. 2003; Ndifon et al. 2009; Rodrigues & Wagner 2009; Rodrigues & Wagner 2011; Samal et al. 2010). Genotype networks typically extend far -- between 70 and 100 percent -- through genotype space. This means that two genotypes can differ in more than 70 percent of their parts (amino acids, regulatory interactions, metabolic reactions) and still have the same phenotype.

Robustness is both necessary (Wagner 2011b) and sufficient (Reidys et al. 1997) for the existence of genotype networks with this property. I will briefly sketch how one can show that this assertion is correct, an argument that is presented in greater detail elsewhere (Wagner 2011a; Wagner 2011b). Consider a typical phenotype P in any of the three system classes I mentioned. It will be adopted by some very large number M of genotypes that, however, jointly constitute a very small fraction of a vast genotype space (Ciliberti et al. 2007a; Samal et al. 2010; Sumedha et al. 2007; Todd et al. 1999; Wagner 2011b). Let us assume that this set of genotypes consists of genotypes chosen at random from genotype space, without requiring that each genotype has neighbors with the same phenotype. One can then estimate the probability that any one of these genotypes has no neighbors with phenotype P, that is, that it completely lacks robustness. This probability is very close to one. Such a set would thus consist almost entirely of isolated genotypes that cannot form a connected network. In other words, robustness is necessary for the existence of genotype networks.
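The estimate in the preceding paragraph can be written out as a rough calculation. It rests on assumptions that are not spelled out in the text above: N denotes the total number of genotypes in genotype space, n the number of neighbors of each genotype, and the phenotypes of different neighbors are treated as independent, which is a good approximation when M is much smaller than N.

```latex
\[
\Pr\bigl(\text{no neighbor of } G \text{ has phenotype } P\bigr)
  \;\approx\; \left(1 - \frac{M}{N}\right)^{n}
  \;\approx\; e^{-nM/N}
  \;\approx\; 1 - \frac{nM}{N}
  \qquad \text{for } \frac{nM}{N} \ll 1 .
\]
% Purely illustrative numbers (not taken from the cited studies):
% binary genotypes with 100 parts, so N = 2^{100} \approx 1.27 \times 10^{30} and n = 100;
% a phenotype adopted by M = 10^{20} genotypes gives
\[
\frac{nM}{N} \;\approx\; \frac{100 \times 10^{20}}{1.27 \times 10^{30}}
  \;\approx\; 7.9 \times 10^{-9} ,
\]
% so a randomly placed genotype would lack any neighbor with phenotype P
% with probability of roughly 1 - 8 \times 10^{-9}.
```

Even a phenotype adopted by an astronomically large number of genotypes would thus be scattered as isolated points if its genotypes were placed at random.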

Robustness is also sufficient. To see this, it is useful to view genotype networks as graphs (mathematical objects that consist of nodes, and of edges that link these nodes), and to ask how genotype space would be organized if genotype networks were random networks that shared only the one feature that each genotype has some fraction ν of neighbors with the same phenotype as itself. I emphasize that such random networks may show little resemblance to actual genotype networks. However, they are useful in forming null-hypotheses about genotype space organization. One can show that a random graph constructed by connecting a genotype G to a fraction ν of its neighbors, and each of these neighbors to a fraction ν of their neighbors, and so on, without any further assumptions, would form a genotype network that would span genotype space or nearly so. In other words, random genotype networks would extend far through genotype space, as long as genotypes in them have many neighbors with the same phenotype (Wagner 2011b, Chapter 6).
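A small simulation can convey the flavor of this argument. The sketch below is not the construction analyzed in Wagner (2011b); it substitutes a simpler stand-in, site percolation on a binary genotype space, in which every genotype independently has the phenotype with probability ν, so that on average a fraction ν of any genotype's neighbors share it. The function name `grow_random_network`, the bitmask encoding, and all parameter values are assumptions made for illustration.

```python
import random
from collections import deque


def grow_random_network(length, nu, seed=0):
    """Grow a random genotype network from the all-zero genotype.

    Genotypes are integers read as bit strings with `length` parts. Each genotype
    other than the start is assigned the phenotype with probability `nu`, decided
    lazily the first time it is encountered. Returns the number of connected
    genotypes and the largest fractional distance from the start among them.
    """
    rng = random.Random(seed)
    start = 0
    has_phenotype = {start: True}
    visited = {start}
    queue = deque([start])
    max_differences = 0
    while queue:
        genotype = queue.popleft()
        for i in range(length):
            neighbor = genotype ^ (1 << i)  # change exactly one part
            if neighbor not in has_phenotype:
                has_phenotype[neighbor] = rng.random() < nu
            if has_phenotype[neighbor] and neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
                # Distance from the all-zero start equals the number of set bits.
                max_differences = max(max_differences, bin(neighbor).count("1"))
    return len(visited), max_differences / length


for nu in (0.0, 0.05, 0.2, 0.5, 1.0):
    size, reach = grow_random_network(length=12, nu=nu, seed=1)
    print(f"nu={nu:.2f}  connected genotypes={size}  reach={reach:.2f}")
```

With ν = 0 the network consists of the starting genotype alone, and with ν = 1 it covers the whole space. Intermediate values depend on the random seed and on the (here very small) genotype length, so the printed numbers should be read as qualitative illustrations only.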

A second common property of different system classes regards the neighborhoods of different genotypes G1 and G2 that have the same phenotype, and that lie on the same genotype network. One can ask whether any one phenotype that occurs in one of these neighborhoods occurs only in this neighborhood (and not in the other neighborhood), or whether it occurs in both neighborhoods. The answer is that the fraction of phenotypes unique to one neighborhood in this sense increases with the distance between two genotypes. But even if the two genotypes G1 and G2 have a modest distance and differ in as few as 25 percent of their parts, the majority of phenotypes in one genotype’s neighborhood typically does not occur in the other neighborhood (Ferrada & Wagner 2010). Pertinent evidence exists for proteins (Ferrada & Wagner 2010), RNA (Huynen 1996; Schuster et al. 1994; Sumedha et al. 2007), model regulatory circuits (Ciliberti et al. 2007c), and metabolic networks (Rodrigues & Wagner 2009; Rodrigues & Wagner 2011).
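The quantity discussed here, the fraction of phenotypes unique to one neighborhood, can be computed from two sets of phenotypes. The following sketch assumes that the phenotypes found in each neighborhood have already been determined and that the phenotype shared by G1 and G2 is excluded. The function name `fraction_unique` and the phenotype labels are hypothetical.

```python
def fraction_unique(phenotypes_1, phenotypes_2, shared_phenotype=None):
    """Fraction of the phenotypes in neighborhood 1 that do not occur in neighborhood 2.

    `shared_phenotype` is the phenotype of the two focal genotypes themselves;
    it is removed from both sets so that only new phenotypes are compared.
    """
    new_1 = set(phenotypes_1) - {shared_phenotype}
    new_2 = set(phenotypes_2) - {shared_phenotype}
    if not new_1:
        return 0.0  # neighborhood 1 contains no new phenotypes at all
    return len(new_1 - new_2) / len(new_1)


# Hypothetical phenotypes observed in the neighborhoods of G1 and G2,
# both of which have phenotype "P".
neighborhood_g1 = {"P", "A", "B", "C"}
neighborhood_g2 = {"P", "C", "D"}

print(fraction_unique(neighborhood_g1, neighborhood_g2, "P"))  # 2 of 3 new phenotypes are unique to G1
print(fraction_unique(neighborhood_g2, neighborhood_g1, "P"))  # 0.5
```

The measure is not symmetric, so the two directions are reported separately in this sketch.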

Literature Cited

Alon, U., Surette, M. G., Barkai, N. & Leibler, S. 1999 Robustness in bacterial chemotaxis. Nature 397, 168-171.

Bharathan, G., Goliber, T. E., Moore, C., Kessler, S., Pham, T. & Sinha, N. R. 2002 Homologies in leaf form inferred from KNOXI gene expression during development. Science 296, 1858-1860.

Blank, L. M., Kuepfer, L. & Sauer, U. 2005 Large-scale C-13-flux analysis reveals mechanistic principles of metabolic network robustness to null mutations in yeast. Genome Biology 6, R49.

Brakefield, P. M., Gates, J., Keys, D., Kesbeke, F., Wijngaarden, P. J., Monteiro, A., et al. 1996 Development, plasticity and evolution of butterfly eyespot patterns. Nature 384, 236-242.

Burke, A. C., Nelson, C. E., Morgan, B. A. & Tabin, C. 1995 Hox genes and the evolution of vertebrate axial morphology. Development 121, 333-346.

Carroll, S. B., Grenier, J. K. & Weatherbee, S. D. 2001 From DNA to diversity. Molecular genetics and the evolution of animal design. Malden, MA: Blackwell.

Ciliberti, S., Martin, O. C. & Wagner, A. 2007a Circuit topology and the evolution of robustness in complex regulatory gene networks. PLoS Computational Biology 3(2): e15.

Ciliberti, S., Martin, O. C. & Wagner, A. 2007b Innovation and robustness in complex regulatory gene networks. Proceedings of the National Academy of Sciences of the U.S.A. 104, 13591-13596.

Ciliberti, S., Martin, O. C. & Wagner, A. 2007c Innovation and robustness in complex regulatory gene networks. Proceedings of the National Academy of Sciences of the U.S.A. 104, 13591-13596.

Coen, E. S. & Meyerowitz, E. M. 1991 The war of the whorls: Genetic interactions controlling flower development. Nature 353, 31-37.

Copley, S. D. 2000 Evolution of a metabolic pathway for degradation of a toxic xenobiotic: the patchwork approach. Trends in Biochemical Sciences 25, 261-265.

Dantas, G., Sommer, M. O. A., Oluwasegun, R. D. & Church, G. M. 2008 Bacteria subsisting on antibiotics. Science 320, 100-103.

Davidson, E. H. & Erwin, D. H. 2006 Gene regulatory networks and the evolution of animal body plans. Science 311, 796-800.

Edwards, J. S. & Palsson, B. O. 2000 The Escherichia coli MG1655 in silico metabolic genotype: Its definition, characteristics, and capabilities. Proceedings of the National Academy of Sciences of the United States of America 97, 5528-5533.

Ferrada, E. & Wagner, A. 2010 Evolutionary innovation and the organization of protein functions in sequence space. PLoS ONE 5(11), e14172.

Giaever, G., Chu, A. M., Ni, L., Connelly, C., Riles, L., Veronneau, S., et al. 2002 Functional profiling of the Saccharomyces cerevisiae genome. Nature 418, 387-391.

Giurumescu, C. A., Sternberg, P. W. & Asthagiri, A. R. 2009 Predicting phenotypic diversity and the underlying quantitative molecular transitions. PLoS Computational Biology 5.

Hafner, M., Koeppl, H., Hasler, M. & Wagner, A. 2009 'Glocal' robustness in model discrimination for circadian oscillators. PLoS Computational Biology 5, e1000534.

Hay, A. & Tsiantis, M. 2006 The genetic basis for differences in leaf form between Arabidopsis thaliana and its wild relative Cardamine hirsuta. Nature Genetics 38, 942-947.

Huang, W., Petrosino, J., Hirsch, M., Shenkin, P. & Palzkill, T. 1996 Amino acid sequence determinants of beta-lactamase structure and activity. Journal of Molecular Biology 258, 688-703.

Hughes, C. L. & Kaufman, T. C. 2002 Hox genes and the evolution of the arthropod body plan. Evolution & Development 4, 459-499.

Huynen, M. A. 1996 Exploring phenotype space through neutral evolution. Journal of Molecular Evolution 43, 165-169.

Isalan, M., Lemerle, C., Michalodimitrakis, K., Beltrao, P., Horn, C., Garriga-Canut, M., et al. 2008 Evolvability and hierarchy in rewired bacterial gene networks. Nature 452, 840-845.

Keys, D., Lewis, D., Selegue, J., Pearson, B., Goodrich, L., Johnson, R., et al. 1999 Recruitment of a hedgehog regulatory circuit in butterfly eyespot evolution. Science 283, 532-534.

Kleina, L. & Miller, J. 1990 Genetic studies of the lac repressor. 13. Extensive amino-acid replacements generated by the use of natural and synthetic nonsense suppressors. Journal of Molecular Biology 212, 295-318.

Lerat, E., Daubin, V., Ochman, H. & Moran, N. A. 2005 Evolutionary origins of genomic repertoires in bacteria. PLoS Biology 3, e130.

MacCarthy, T., Seymour, R. & Pomiankowski, A. 2003 The evolutionary potential of the Drosophila sex determination gene network. Journal of Theoretical Biology 225, 461-468.

Ndifon, W., Plotkin, J. B. & Dushoff, J. 2009 On the accessibility of adaptive phenotypes of a bacterial metabolic network. PLoS Computational Biology 5.

Ochman, H., Lerat, E. & Daubin, V. 2005 Examining bacterial species under the specter of gene transfer and exchange. Proceedings of the National Academy of Sciences of the United States of America 102, 6595-6599.

Pal, C., Papp, B. & Lercher, M. J. 2005 Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nature Genetics 37, 1372-1375.

Raman, K. & Wagner, A. 2011 Evolvability and robustness in a complex signaling circuit. Molecular BioSystems 7, 1081-1092.

Rehmann, L. & Daugulis, A. J. 2008 Enhancement of PCB degradation by Burkholderia xenovorans LB400 in biphasic systems by manipulating culture conditions. Biotechnology and Bioengineering 99, 521-528.

Reidys, C., Stadler, P. & Schuster, P. 1997 Generic properties of combinatory maps: Neutral networks of RNA secondary structures. Bulletin of Mathematical Biology 59, 339-397.

Rennell, D., Bouvier, S., Hardy, L. & Poteete, A. 1991 Systematic mutation of bacteriophage T4 lysozyme. Journal of Molecular Biology 222, 67-87.

Rodrigues, J. F. & Wagner, A. 2009 Evolutionary plasticity and innovations in complex metabolic reaction networks. PLoS Computational Biology 5, e1000613.

Rodrigues, J. F. & Wagner, A. 2011 Genotype networks in sulfur metabolism. BMC Systems Biology 5, 39.

Samal, A., Rodrigues, J. F. M., Jost, J., Martin, O. C. & Wagner, A. 2010 Genotype networks in metabolic reaction spaces. BMC Systems Biology 4, 30.

Schuster, P., Fontana, W., Stadler, P. & Hofacker, I. 1994 From sequences to shapes and back - a case-study in RNA secondary structures. Proceedings of the Royal Society of London Series B 255, 279-284.