50.See Smil (2000). These kinds of metals have served as valuable catalysts to industrial chemists for a long time. The Haber-Bosch process that sustains a third of the world’s population, for example, uses iron to create five hundred million tons of ammonium fertilizer every year. See Holm and Andersson (1998), as well as Hsu-Kim et al. (2008).
51.The citric acid cycle is also called the tricarboxylic acid cycle or the Krebs cycle, after the German-born Nobel Prize–winning biochemist Hans Adolf Krebs. See Braakman and Smith (2013) for some variants on this theme as a possible origin of metabolism.
52.See Morowitz et al. (2000), as well as Braakman and Smith (2013).
53.See Stryer (1995), as well as Smith and Morowitz (2004).
54.What I have described first is the more primitive reductive TCA cycle, which uses energy from reduced inorganic molecules and carbon from CO₂ to synthesize precursors for other molecules. In contrast, the oxidative TCA cycle in heterotrophic organisms (like us) extracts energy from organic molecules to produce both energy—ultimately, ATP—and building blocks for biosyntheses, as well as the CO₂ waste product that we exhale.
55.See Hugler et al. (2007), as well as Smith and Morowitz (2004).
56.See Zhang and Martin (2006) and Cody et al. (2000).
57.Theoretical treatments of autocatalytic networks include those by Eigen and Schuster (1979) and Kauffman (1986). It is easy to see how the state of a metabolism can be inherited from parent to offspring, but such inheritance is unlikely to be very faithful, for example because it is subject to stochastic fluctuations in the concentrations of metabolites and catalysts among offspring from the same parent. Nucleic acids clearly provide a superior means of faithful inheritance.
58.See Williams et al. (2011), Huang and Ferris (2006), Ferris et al. (1996), and Holm (1992).
59.See Budin and Szostak (2010).
60.See Deamer (1998).
61.See Budin, Bruckner, and Szostak (2009).
62.Curiously, Pasteur rang the death knell for spontaneous creation, but he still believed that a vital force was necessary for fermentation, which Buchner later showed to require only inanimate enzymes.
63.The numbers I cite here are taken from well-studied cells, such as those of the bacterium E. coli. See Neidhardt (1996) and Feist et al. (2007). Although the chemical composition of biomass and thus its building blocks vary among organisms, some important principles hold broadly, such as that proteins, RNA, and DNA typically constitute the majority of biomass.
64.Our human metabolism is even more complex. It has more than two thousand reactions and more than two thousand small molecules. Current knowledge about the E. coli network is summarized in Feist et al. (2007), and about the human network by Duarte et al. (2007). Both bodies of knowledge will undoubtedly grow in the future.
65.More precisely, the microbes in our gut synthesize biotin.
66.See Wolfenden and Yuan (2008). I note that sucrase, like other enzymes, does not float through a cell’s interior, but is anchored to the membrane of intestinal cells.
67.Some reactions are catalyzed by more than one enzyme, and some enzymes catalyze more than one reaction.
68.To be precise, sucrase is a protein that consists of two identical polypeptides. See Sim et al. (2010).
69.This holds for metabolic enzymes. There are other enzymes, most notably protein kinases, that add phosphates to other proteins, which are large molecules.
70.See Tanenbaum (1988), 254.
71.To be precise, there are several related molecules that can also serve to store energy, such as GTP and deoxy CTP, but they are very similar in chemical structure to ATP, and they use the same kind of chemical bond for energy storage.
72.There are many different kinds of lipids, and membranes vary in their lipid content among organisms, but the principle that membrane molecules are amphiphilic remains unchanged.
73.Some organisms show minor variations from the genetic code, but these variants most likely originated after the most recent common ancestor of all extant life. See Knight, Freeland, and Landweber (2001).
74.One of the alternatives to ATP is its close relative GTP, and one alternative to DNA is PNA (peptide nucleic acid). See Nelson, Levy, and Miller (2000). Chapter 3 of Wagner (2005b) reviews some relevant literature on the genetic code. One of the alternatives may be superior, but that’s beside the point. Even if natural selection has caused the demise of the others, the current standards tell us that we descend from a single ancestor.
CHAPTER THREE: THE UNIVERSAL LIBRARY
1.This analogy is inspired by a famous short story of the Argentine author Jorge Luis Borges entitled “The Library of Babel” (Spanish original: “La biblioteca de Babel”), published in English translation in Borges (1962). The idea behind this short story, however, predates Borges. It has been used by many other authors, including Umberto Eco and Daniel Dennett.
2.The BioCyc database can be found at http://biocyc.org/ and is described in Caspi et al. (2012). For the KEGG database see Ogata et al. (1999). Yet another relevant database is described in Chang et al. (2009).
3.See McCarthy, Claude, and Copley (1997), Ederer et al. (1997), Nohynek et al. (1996), Copley (2000), and Copley et al. (2012).
4.See Copley (2000).
5.See Rehmann and Daugulis (2008).
6.See van der Meer et al. (1998) and van der Meer (1995).
7.See Dantas et al. (2008).
8.See Takiguchi et al. (1989).
9.See Mommsen and Walsh (1989) and Wright, Felskie, and Anderson (1995).
10.Plants themselves respire some of the oxygen they produce to build biomass.
11.Salt-loving bacteria also have other adaptations. See Postgate (1994).
12.See Steppuhn et al. (2004).
13.See Bennick (2002).
14.See McMahon, White, and Sayre (1995).
15.The reason is that genomes as similar as these, especially in higher organisms, usually encode metabolisms that are also very similar and do not contain very different sets of enzymes.
16.See Shrestha et al. (2011). They have a mutation that inactivates the enzyme.
17.See Redfield (1993) and Dubnau (1999).
18.The genetic material of some viruses is RNA and not DNA, but their life cycle usually involves a DNA intermediate to which similar principles apply.
19.Excessive DNA can also cause problems when replicated genes or chromosomes need to separate during reproduction.
20.See Bushman (2002), Loreto, Carareto, and Capy (2008), and Bergthorsson et al. (2003).
21.The analogy to human races must be taken with a grain of salt. Bacteria do not reproduce sexually like many animals and plants. The notion of a species is not clearly defined for them, and the same holds for even less precise categories such as race.
22.See Lawrence and Ochman (1998), Blattner et al. (1997), Ochman and Jones (2000), and Pal, Papp, and Lercher (2005).
23.See Lawrence and Ochman (1998).
24.Some relevant articles are Smillie et al. (2011), as well as Ochman, Lerat, and Daubin (2005) and Ma and Zeng (2004).
25.The actual percentage varies, being greater in bacteria and smaller in most multicellular organisms.
26.See Blattner et al. (1997) and Feist et al. (2007). These may also include enzymes that catalyze chemical reactions outside metabolism, such as enzymes that are involved in transmitting information between cells.
27.This simple description hides many technical complexities. For example, even similar genes can sometimes encode enzymes that catalyze different reactions, and vice versa. Also, some enzymes can catalyze more than one reaction, some reactions are catalyzed by multiple enzymes, and some enzymes are the products of not just one but multiple genes. In practice, annotating the metabolic reactions in a genome thus involves more than just computerized comparison of genes. See Feist et al. (2009).
28.This notion of distance is different from the pairwise Hamming distance of two bit strings, which designates the number or fraction of bits at which two strings differ. Specifically, it does not take into account all the reactions that are absent in both metabolisms. Most known metabolisms comprise only a small fraction of the total number of reactions in the known reaction universe. See Ogata et al. (1999). Even if two metabolisms differed in all their reactions, however, there would still exist many reactions that are absent in both metabolisms. For this reason, and because I focus on the proportion of reactions unique to one network, the fraction of shared reactions, D, is a more appropriate distance measure than the Hamming distance.
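The contrast drawn in note 28 can be illustrated with a small sketch. It is not the author's code. The reaction universe, the two toy metabolisms, the function names, and the choice to normalize the shared fraction by the union of both reaction sets are all assumptions made for illustration.

```python
def hamming_fraction(a, b, universe):
    """Fraction of all reactions in the universe whose presence or absence
    differs between metabolisms a and b (Hamming distance on bit strings)."""
    return len(a ^ b) / len(universe)


def shared_fraction(a, b):
    """Fraction of reactions shared by a and b, counted only among reactions
    present in at least one of them (assumed normalization: the union)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical universe of 1,000 reactions; each metabolism uses only 20.
universe = {f"R{i}" for i in range(1000)}
m1 = {f"R{i}" for i in range(0, 20)}
m2 = {f"R{i}" for i in range(20, 40)}  # no reaction in common with m1

print(hamming_fraction(m1, m2, universe))  # small, although nothing is shared
print(shared_fraction(m1, m2))             # zero shared reactions
```

In this toy case the two metabolisms have nothing in common, yet their Hamming distance is only 4 percent of the universe, because the 960 reactions absent from both count as agreement. The shared fraction ignores those jointly absent reactions.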
29.This becomes less surprising if one is aware that the DNA of two such strains may differ in more than one million nucleotides.
30.I performed this analysis for one bacterial species from each genus, to avoid overrepresenting highly similar species. See Wagner (2009a).
31.While many colors are caused by pigments, in others a finely textured surface brings forth colors through iridescence, such as in the wing coloration of butterflies. In some coloration phenotypes, such as that of the chameleon, both structural colors and pigment-based colors account for the phenotype.
32.Although the procedure I described is feasible, it turns out not to be the most efficient way to compute viability. In practice, an approach called flux balance analysis is more useful. It relies on a computational technique called linear programming. For an overview see Price, Reed, and Palsson (2004). Computations like this can determine more than just viability. They also tell us how fast a metabolism works—how speedily it manufactures biomass molecules. In other words, they can tell us whether an organism could go forth and multiply, or whether it would barely hang on to life.
33.Furthermore, flux balance analysis can also correctly predict growth and nutrient uptake rates under different growth conditions and environments. See Feist et al. (2007), Segre, Vitkup, and Church (2002), Edwards, Ibarra, and Palsson (2001), and Neidhardt (1996). Where predictions and experiment disagree, two principal causes are at work. The first is missing information about a metabolism. The second involves regulatory constraints, where genes for a particular enzyme-catalyzed reaction exist in a genome, but the enzyme is not produced, because the gene is not regulated appropriately. These kinds of constraints are quickly overcome, even in laboratory evolution experiments, and thus do not present a serious obstacle for metabolic innovation. See Fong and Palsson (2004), Fong et al. (2006), Forster et al. (2003), Segre et al. (2002), and Edwards and Palsson (2000).
34.A notable exception would be endosymbionts, organisms that live inside other organisms and benefit from the constant environment their hosts provide. An example of a long-standing endosymbiosis that has endured for many millions of years is found in the bacterial genus Buchnera, an endosymbiont of aphids. See Moran, McCutcheon, and Nakabachi (2008), as well as chapter 6.
35.See Feist et al. (2007).
36.A (hypothetical) metabolism viable on all possible fuels certainly has a phenotype, but it can no longer experience a fuel innovation, such that the number of possible innovations must be strictly smaller than that of phenotypes.
37.A classical work exploring spaces of many dimensions is Abbott (2002). A more contemporary exploration can be found in Stewart (2001).
38.In mathematical language, our three-dimensional space and the metabolic library—a space of metabolic genotypes—are both metric spaces, because a notion of distance exists in both of them. See Searcoid (2007). Mathematicians also study nonmetric spaces, but their properties are more difficult to understand intuitively, precisely because they lack a notion of distance.
39.(4 × 10⁹ [yr]) × (365 [d/yr]) × (8.64 × 10⁴ [s/d]) × (5 × 10³⁰) = 6.3 × 10⁴⁷ combinations.
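The arithmetic in note 39 can be checked in a few lines, reading its factors as powers of ten (10^9 years, 10^4 seconds per day, and a final factor of 5 × 10^30). The variable names are illustrative assumptions. The note does not label its last factor, so the sketch leaves it unlabeled too.

```python
years = 4e9                # time span in years
days_per_year = 365
seconds_per_day = 8.64e4   # 24 * 60 * 60
last_factor = 5e30         # fourth factor of the note, unlabeled there

total = years * days_per_year * seconds_per_day * last_factor
print(f"{total:.1e}")      # 6.3e+47
```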
40.By conventional measures of scientific output, such as citations of scientific publications per capita, it is arguably the world leader. See Cole and Phelan (1999). For what it’s worth, Switzerland has also produced more Nobel laureates per capita than even the United States. “List of Nobel laureates by country per capita,” Wikipedia, http://en.wikipedia.org/wiki/List_of_Nobel_laureates_by_country_per_capita.
41.More precisely, it is the ability to synthesize the carbon backbone of all these molecules from the carbon atoms and from the energy stored in this sugar.
42.Other researchers focused on a different question, namely whether all reactions in a metabolism are essential, and found that they are not, which leads to the same conclusion. See Edwards and Palsson (2000), as well as Fong and Palsson (2004).
43.This work was carried out in collaboration with Areejit Samal and Olivier Martin. See Samal et al. (2010).
44.The work I discuss here is summarized in Rodrigues and Wagner (2009), as well as in Samal et al. (2010) and Rodrigues and Wagner (2011).
45.See Rodrigues and Wagner (2009).
46.I note that in our analyses we started from viable networks of different numbers of reactions, and kept the number of reactions in a network approximately constant during a random walk. Each such walk thus explored a “slice” through the hypercube of the metabolic library.
47.See Rodrigues and Wagner (2009).
CHAPTER FOUR: SHAPELY BEAUTIES
1.Fletcher, Hew, and Davies (2001) present an overview of antifreeze proteins in fish.
2.Some enzymes can catalyze multiple reactions, and are often called “promiscuous” for that reason. See O’Brien and Herschlag (1999). Conversely, some reactions are catalyzed by multiple enzymes.
3.See Zhao et al. (2001). Charcot-Marie-Tooth disease can also be caused by mutations in other genes.
4.Other factors matter as well, such as the electric charge of amino acids, but because most of what I say holds for these factors as well, I will let shape stand for them. Binding of molecules with shape complementarity involves specific interactions and attractive forces between molecules, such as hydrogen bonding. See Branden and Tooze (1999).
5.The technically more precise term for a single amino acid chain is polypeptide. A protein can consist of one polypeptide or multiple polypeptides.
6.The words I use here to describe protein folding are anthropomorphic, but the process is purely physical, no less so than the alignment of iron filings in a magnetic field. It is more complicated than that, however, because multiple conflicting attractive and repulsive amino acid interactions are at work.
7.More precisely, these elements of a protein’s structure are called an α-helix and a pleated β sheet. A pleated β sheet forms from parts of the amino acid chain that are not necessarily contiguous in the chain. These parts are also called β-strands. In the figure, they correspond to the nearly straight ribbons terminated by arrowheads. See also Branden and Tooze (1999).
8.What you see is actually only about half of the entire amino acid string, also called the N-terminal domain. The entire sucrase molecule is a complex of two amino acid strings. See Sim et al. (2010). The large size of a protein may provide it with stability to thermal motion, specificity for its target molecule, high rate of catalysis, and the ability to regulate its activity. Although one can synthesize catalytic peptides, much smaller enzymes consisting of few amino acids, such peptides do not have these properties of more complex enzymes. See Tanaka, Fuller, and Barbas (2005).
9.This jiggling is also the basis for enzyme promiscuity, the phenomenon in which some enzymes can catalyze multiple chemical reactions. Some of the oscillating shapes they form can bind molecules other than their main targets, and help these molecules react. They may not be very good at these side jobs, but good enough to accelerate the rate at which these other reactions proceed. See, for example, O’Brien and Herschlag (1999). In the evolution of enzymes early in the history of life, some enzymes probably were highly promiscuous. They catalyzed multiple reactions, each at a low rate, and became specialized later for one reaction that they could catalyze efficiently. See Kacser and Beeby (1984). Because no one enzyme can catalyze all reactions necessary to sustain modern life, promiscuity does not eliminate the need to understand how enzymes with new catalytic abilities arose in the first place.
10.See Szegezdi et al. (2006).
11.From the closing paragraph of Darwin (1859).
12.See Cheng (1998) for a survey on the evolution of diverse antifreeze proteins. The ancestors of these proteins include enzymes and lectins, which have a variety of roles, including the promotion of cell adhesion.
13.The species in question are sculpins in the genus Myoxocephalus. See Cheng (1998).
14.See Fletcher et al. (2001). Different antifreeze proteins within the same organism might not have a different origin.
15.The ability to protect against freezing may not have arisen abruptly but gradually, where some amino acid changes increased a protein’s ability to protect against freezing to a small extent, until today’s antifreeze proteins had formed.