17.The branch of mathematics needed to prove this statement is graph theory, and specifically the theory of generalized hypercube graphs. For a glimpse at some of its arcana see Reidys, Stadler, and Schuster (1997). For a more accessible exposition see Wagner (2011), chapter 6.
18.See Darwin (1872), chapter 6, page 170. In the same chapter he expresses his faith in selection’s power to preserve small and useful improvements.
19.See Land and Nilsson (2002). Refraction is the change in the direction of a wave caused by a change in its speed, when the wave passes from one medium to another.
20.The kind of regulatory change that allows this high expression varies from crystallin to crystallin. Many crystallins have undergone gene duplication, but nonduplicated crystallins also exist. They include ε-crystallin, which is the same as lactate dehydrogenase, and τ-crystallin, which is the same as α-enolase. See Piatigorsky and Wistow (1989), as well as Tomarev and Piatigorsky (1996) and Piatigorsky (1998). In such nonduplicated crystallins changes in regulatory DNA regions allow enhanced gene expression in the lens. See Jornvall et al. (1993) for crystallins related to alcohol dehydrogenase. For other examples of co-option see True and Carroll (2002) and Keys et al. (1999).
21.For the extraordinary half-life of crystallins see Lynnerup et al. (2008).
22.See Graw (2009).
23.A useful source on the evolution of vision is Eldredge and Eldredge (2008).
24.See Gould (1993) and Gould and Lewontin (1979).
25.As quoted in Burr and Andrew (1992).
26.For the quote and a brief history of the technology see Stewart (2012), chapter 11. See http://pittsburgh.cbslocal.com/station/newsradio-1020-kdka/ for the KDKA radio station.
27.The numbers cited in this section refer to an analysis of the secondary structure of the molecule, which is not only computationally predictable but essential for ribozyme function. The number of secondary structures in the molecule’s neighborhood can be smaller than 129 for several reasons, the most important being that several neighbors can have the same shape.
28.How to compute the huge number of RNA molecules with this phenotype is described in Jörg et al. (2008).
29.The general principle is this: If all individuals in a population are confined to exploring new phenotypes from one place in genotype space, their chances of uncovering a superior phenotype are slim, incomparably slimmer than if they can explore many different neighborhoods, as permitted by the existence of genotype networks.
30.More precisely, I am referring here to the typical distance between two arbitrary genotype networks, not to the distance from a specific genotype to one with an arbitrary new phenotype. Furthermore, I note that statements like this always refer to typical cases. More precisely speaking, they hold, as mathematicians say, “with probability one” as a system grows very large in size. See, for example, Reidys et al. (1997) for some of the relevant mathematics. Phenotypes where the properties I discuss do not apply certainly exist, but they are exceptions to the rule defined by typical phenotypes—phenotypes with large genotype networks. And biologically important phenotypes are typical phenotypes. In a research project that speaks to this issue, we computed the sizes of genotype networks for some eighty different biologically important RNA phenotypes. These sizes were not smaller but even greater than those of random phenotypes. See Jörg et al. (2008). In hindsight, this is expected, because phenotypes associated with large genotype networks are easier to find in genotype space than other phenotypes.
31.This volume depends on the number of genes in a circuit, but it is tiny even for circuits with fewer than twenty genes, the example I use here. See Ciliberti, Martin, and Wagner (2007b) for a more detailed analysis of this and other properties of circuit libraries. In general a circle’s area and a ball’s volume calculate as πr² and (4/3)πr³, respectively, where r is the radius and π ≈ 3.14. In higher dimensions, analogous expressions exist, but they are more complex, e.g., the volume of a five-dimensional ball is given by 8π²r⁵/15. The numbers I discuss in the text can be calculated from formulae like these and a value of r = 0.15, if the squares and cubes at issue have sides of length 1.
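The formulae in this note can be checked numerically. The sketch below is an illustration only: it relies on the standard general expression for the volume of an n-dimensional ball, V = π^(n/2) r^n / Γ(n/2 + 1), which the note does not state but which reduces to the circle, ball, and five-dimensional cases quoted above. The function names are hypothetical.

```python
import math


def ball_volume(n: int, r: float) -> float:
    """Volume of an n-dimensional ball of radius r (standard gamma-function formula)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n


def fraction_of_unit_cube(n: int, r: float = 0.15) -> float:
    """Fraction of an n-dimensional cube with sides of length 1 filled by the ball."""
    cube_volume = 1.0 ** n  # a unit cube has volume 1 in every dimension
    return ball_volume(n, r) / cube_volume


if __name__ == "__main__":
    # The occupied fraction shrinks rapidly as the number of dimensions grows.
    for n in (2, 3, 5, 20):
        print(n, fraction_of_unit_cube(n))
```

With r = 0.15 the ball fills roughly 7 percent of the unit square, and the fraction falls steeply with each added dimension.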
32.My earlier note that these are statistical statements applies here as well: This is the typical outcome, with possible exceptions.
33.I note that the value of 0.75 is already close to the maximally possible ratio of “volumes” in two dimensions, because the largest circle that one can inscribe into a square of volume 1 has a radius of 1/2, and such a circle occupies a fraction 0.785 of the square’s area.
34.For relevant observations pertaining to RNA secondary structures, see Schuster et al. (1994). For RNA tertiary structures and their functions, it is not known whether the same holds. However, because secondary structures are prerequisites for tertiary structures, it may be possible to extrapolate these observations to tertiary structures. Proteins, in contrast, are typically more conserved over large genotype distances, thus suggesting that in a small neighborhood of a genotype, it may not be possible to find all possible protein folds. However, this statement applies to a fold as defined by a protein’s arrangement of major secondary structure elements, and many more subtle changes can lead to new functions. For example, some proteins with the same arrangement of secondary structure elements have evolved multiple different enzymatic functions. The innovability of protein functions is an area where substantial discoveries are still waiting to be made.
35.Whether a simple device would be more robust to parts failure, simply because it has fewer parts, or whether a more complex device might be more robust, because any one part is less important and because any one configuration has many neighbors that preserve function, may depend on details of a technology and of the design of a device.
36.See Lawrence (1992).
37.See Gierer and Meinhardt (1972) for a simple principle that is useful to make patterns such as segmented bodies. This principle may be realized in many systems, but it is often disguised by an enormously complex signaling circuit involved in pattern formation. For the complexity of the insect segmentation network see Akam (1989), as well as Jäger et al. (2004) and von Dassow et al. (2000). I also note that the relevant environment comprises more than just the world outside a fly’s body, and also includes the constantly fluctuating concentrations of molecules inside this body. At least some of the complexity of pattern formation exists to buffer development against these kinds of fluctuations. See, for example, Lopes et al. (2008) and Ochoa-Espinosa et al. (2009).
38.See Samal et al. (2010), as well as Gerdes et al. (2003), for an experimental analysis in a different environment.
39.They are also valuable to some species of ants, which milk aphids for the honeydew their body releases, and provide protection in return.
40.It turns out that the endosymbionts are especially heat-sensitive and thus limit the range of habitats that aphids can occupy. See Ohtaka and Ishikawa (1991).
41.A less charitable analogy, one that holds for many endoparasites, is that of a prisoner. Buchnera can no longer live on its own. It is completely dependent on what its host provides.
42.For the process of genome reduction in Buchnera see Moran, McLaughlin, and Sorek (2009), as well as Moran and Mira (2001), Tamas et al. (2002), and van Ham et al. (2003).
43.See Yus et al. (2009) and Razin, Yogev, and Naot (1998).
44.See Samal et al. (2010).
45.See Thomas et al. (2009).
46.See Rodrigues and Wagner (2011). There are many ways of defining complexity, but a simple definition—the number of reactions in a metabolism or, more generally, the number of parts in a system—suffices for my purpose.
47.See Samal et al. (2010) and Rodrigues and Wagner (2011).
48.Some aspects of an organism’s complexity, such as the size and organization of its genome, may also be augmented by deleterious changes whose effect is so weak that natural selection cannot eliminate them, at least in small populations. See, for example, Lynch (2007).
CHAPTER SEVEN: FROM NATURE TO TECHNOLOGY
1.See Sproewitz et al. (2008) and Moeckel et al. (2006).
2.I will here follow Arthur (2009), 27, in using two alternative and complementary definitions of “technology.” In the first, technology is a means to fulfill a human purpose. In the second definition, technology is an assemblage of practices and components. In either sense, biotechnology or digital electronics would be technologies.
3.As quoted in Alfred (2009).
4.As quoted in Lohr (2007).
5.Since the beginning of the twentieth century, objections have occasionally been raised to the notion that mutations are blind or “random” with respect to whether they improve or impair a protein’s function. But even the most carefully documented objections have not stood the test of time and have eventually been refuted by data. See Cairns, Overbaugh, and Miller (1988), as well as Foster (2000) and Hall (1998). For all we know, nature has no foresight into the effects of genetic changes. Given the many ramifications that such changes have for the short-term and long-term future of an organism, this is not so surprising. Even we humans with our cognitive abilities are notoriously poor at predicting the effects of interventions in complex systems, from proteins, cells, and organisms to ecosystems or financial markets.
6.See Burchfield (1990), 43. In 1897 he settled on somewhere around twenty million years.
7.This statement is sometimes attributed to Max Planck but may be apocryphal.
8.See Rosen (2010).
9.See Merton (1936) and Merton (1968), 477.
10.See Ogburn and Thomas (1922).
11.Merton called a historical predilection for focusing on single inventors the “Matthew effect,” from the passage in the Gospel of Matthew that reminds us that “for unto every one that hath shall be given.” It is related to Stigler’s law, propounded by the statistician Stephen Stigler, who wrote that “no scientific discovery is named after its original discoverer.” Stigler then, in a self-referential joke, credited Merton as the “real” discoverer of Stigler’s law. See Gieryn (1980), 147–57.
12.Nature’s solutions are discussed in Rothschild (2008). Some of them may be superior in certain environments, such as in an aerobic atmosphere, which helps explain why multiple solutions to carbon fixation persist to this day.
13.For many examples of multiple independent innovations in nature see Vermeij (2006). While life has discovered some innovations more than once, it may have discovered others only once, but the genotypes encoding them may have diversified later beyond recognition. In some systems, for example proteins where current genotypes are extremely diverse, it is difficult to distinguish multiple independent origins from a single origin followed by diversification.
14.See Johnson (2010), 153.
15.These and many other examples of combinatorial innovation and reuse of existing objects and technologies can be found in Kelley and Littman (2001) and Arthur (2009).
16.As quoted in Arkin (1998).
17.See Gould and Vrba (1982). Gould and Vrba used the term for changes that conferred a function different from the original function, and for changes that had no utility when they first appeared.
18.See Darwin (1872), chapter 6, page 175. The example Darwin had in mind was the transformation of the fish’s flotation bladder into the lungs of terrestrial animals.
19.See Sumida and Brochu (2000).
20.This and several other examples are discussed in True and Carroll (2002), who call the reuse of old parts co-option. I note that Sonic hedgehog is not a transcriptional regulator, but a molecule involved in signaling between cells. For the naming of Sonic hedgehog see Riddle et al. (1993).
21.More precisely, I am referring to an internal combustion air-breathing turbofan engine.
22.See Arthur (2009), 19. The book also discusses the jet engine in some detail.
23.To my knowledge, one of the first who developed this idea for engineering applications was the German Ingo Rechenberg. See Rechenberg (1973). He pointed out that mutations need to have effects on a system’s behavior or performance that must not be too large in order for an evolutionary algorithm to be able to improve it.
24.This algorithm of mutation and selection is actually a stochastic algorithm, where individuals can sometimes survive based on dumb luck. Such stochasticity gives rise to a process called genetic drift that is important in biological evolution. See, for example, Hartl and Clark (2007).
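A minimal sketch of such a stochastic mutation-selection loop follows. It is not taken from any of the cited works. The bit-string genotype, the toy fitness function (the number of 1-bits), and all parameter values are assumptions chosen for illustration.

```python
import random


def fitness(genome):
    """Toy fitness (an assumption): the number of 1-bits in the genome."""
    return sum(genome)


def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def next_generation(population, rate, rng):
    """Sample parents with probability proportional to fitness, then mutate them.

    Parents are drawn at random, so a fit genome can fail to reproduce and a
    less fit one can leave offspring by luck. This sampling noise is the
    source of drift in the sketch.
    """
    weights = [fitness(g) + 1 for g in population]  # +1 avoids all-zero weights
    parents = rng.choices(population, weights=weights, k=len(population))
    return [mutate(p, rate, rng) for p in parents]


def evolve(pop_size=50, length=20, generations=100, rate=0.01, seed=0):
    """Run the loop from a random starting population and return the final one."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = next_generation(population, rate, rng)
    return population
```

Running `evolve` with different seeds gives different final populations from the same parameters, which is the stochasticity the note describes.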
25.There are many flavors of evolutionary algorithms. Two especially prominent ones are genetic programming and genetic algorithms. See Koza (1992) and Mitchell (1998).
26.More commonly known as NP-hard. See Moore and Mertens (2011).
27.An analysis that suggests how bumblebees might solve this problem is given in Lihoreau, Chittka, and Raine (2010).
28.In contrast to traditional (nonevolutionary) algorithms that can solve instances of the traveling salesman problem involving millions of cities within a few percent of optimality, evolutionary algorithms are general-purpose algorithms that can provide good (if not perfect) solutions for less well studied problems, or for problems that are not as clearly posed mathematically.
29.See Dong and Vagners (2004).
30.For an example on engine design see Senecal, Montgomery, and Reitz (2000). One of the principal problems that evolutionary algorithms face is to find a good “genotype” representation of the features to be optimized, and to find mutation or recombination “operators”—the routines that modify the genotype—that work well. But the reason why such algorithms have not revolutionized engine design may lie deeper, in a technology that does not easily admit recombination through standardized linkage. For technologies with such linkage, choosing mutation and recombination operators may also be easier.
31.This does not mean that they do not use recombination, far from it, but they usually recombine abstract (bit-string) representations of candidate solutions to a problem, and not elements of the solution itself, like the amino acids of proteins. More generally, I note that I use the word “recombination” here in a broader sense than prescribed by its standard definition in genetics—the swapping of DNA molecules. This more general use applies to DNA but also to any other notion of genotype, and can even apply to human innovation that is combinatorial. In this sense, even a change in one or a few amino acids of a protein amounts to a recombination of amino acids.
32.More succinctly, new circuits consist of new combinations of interactions between regulators, interactions that may already have existed in other circuits.
33.See the Web site of the Centro Internazionale di Studi di Architettura Andrea Palladio (CISA) at http://www.cisapalladio.org/veneto/index.php?lingua=i&modo=nomi&ordine=alfa.
34.Their work is described in Hersey and Freedman (1992).
35.An algorithm could produce more than one outcome, more than one floor plan, because individual instructions in the algorithm may have a stochastic component, such as “subdivide a room into two, three, or four rooms with equal probability.” Such stochastic algorithms are widespread in computer science.
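The quoted instruction can be sketched as follows. This is an illustration, not the algorithm of Hersey and Freedman: the rectangle representation, the random choice of split direction, and the function names are all assumptions, and no further design rules are enforced.

```python
import random


def subdivide(room, rng):
    """Split a room (x, y, width, height) into 2, 3, or 4 equal strips."""
    x, y, w, h = room
    k = rng.choice([2, 3, 4])  # two, three, or four rooms with equal probability
    if rng.random() < 0.5:     # split direction chosen at random (an assumption)
        return [(x + i * w / k, y, w / k, h) for i in range(k)]
    return [(x, y + i * h / k, w, h / k) for i in range(k)]


def floor_plan(width, height, steps, seed=None):
    """Start from one room and apply `steps` random subdivisions."""
    rng = random.Random(seed)
    rooms = [(0.0, 0.0, float(width), float(height))]
    for _ in range(steps):
        target = rooms.pop(rng.randrange(len(rooms)))  # pick any existing room
        rooms.extend(subdivide(target, rng))
    return rooms
```

Calling `floor_plan(10, 8, 3)` repeatedly without a seed yields different floor plans from the same instructions, which is why one algorithm can produce more than one outcome.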
36.Other rules include that splits are executed such that buildings are generally bilaterally symmetric around a central axis, walls are aligned whenever possible, and no rooms are allowed to be as wide or long as the entire length or width of the building.
37.More than a century after Palladio, the Swedish inventor and industrialist Christopher Polhem invented a “mechanical alphabet” of machines that is described in Strandh (1987). The letters in this alphabet are machine components, such as levers, wedges, screws, and winches. Polhem believed that through combining these machine components, one could build any conceivable mechanical device. He intended the alphabet as a teaching tool, but some models of machines written in this language have been built. While the idea behind this alphabet is important for students of innovable technologies, it is worth pointing out that the links between machine parts are not standardized. In a similar vein, Sanchez and Mahoney point out that the automobile, aircraft, consumer electronics, and other industries build many different products by combining a limited number of “modular” components. See Sanchez and Mahoney (1996). But yet again, the links between these modules are often nonstandardized, custom-made, one-of-a-kind. This is an important shortcoming of the “design spaces” of many technologies and a major difference from the genotype spaces of living beings. For the notion of a design space see Stankiewicz (2000).
38.Even more precisely, in mathematics, a function f can be described as a set of ordered pairs (a,b), where a is a member of a set called the domain of the function that defines all admissible function arguments (inputs), and b is a member of the set of output values the function can take. One writes b = f(a). Digital logic circuits compute functions of bit strings whose outputs are again bit strings.
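As a small illustration of this definition, the sketch below writes out a logic circuit's function as an explicit set of ordered pairs (a, b) with b = f(a). The half adder is an example chosen here, not one taken from the text. Its domain is the set of all two-bit strings, and each output is again a two-bit string.

```python
from itertools import product


def half_adder(bits):
    """Map a two-bit input (a, b) to the two-bit output (carry, sum)."""
    a, b = bits
    return (a & b, a ^ b)


# The domain: every admissible argument, here all bit strings of length 2.
domain = list(product((0, 1), repeat=2))

# The function as a set of ordered pairs (input, output).
pairs = {(a, half_adder(a)) for a in domain}

# The same pairs as a lookup table, so that f[a] plays the role of f(a).
f = dict(pairs)

if __name__ == "__main__":
    for a in domain:
        print(a, "->", f[a])
```

Because the domain is finite, listing the pairs specifies the function completely, with no formula required.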
Arrival of the Fittest: Solving Evolution's Greatest Puzzle Page 27