Axe knew that the huge population sizes of prokaryotes like bacteria dwarf the population sizes of all other organisms combined. Thus, estimates for the size of the bacterial population (plus a smidge for everything else) would approximate the number of organisms living at any given time. Based on the average length of a bacterial generation and the time since the first appearance of bacterial life on earth (3.8 billion years ago), scientists have estimated that a total of about 10^40 organisms have lived on earth since life first appeared.19 Axe made the assumption that each new organism received one new sequence of bases (one potential gene) capable of generating one of the possible amino-acid sequences in sequence space per generation.
This was an extremely generous assumption. Since mutations have to be quite rare for life to survive, most bacterial cells inherit an exact copy of their parent’s DNA. Furthermore, the ones that differ from their parents are likely to carry a mutation that has already occurred many times in other cells. For these reasons, the actual number of new sequences sampled in the history of life is much lower than the total number of bacterial cells that have existed. Nevertheless, Axe assumed that one new gene per organism has been transmitted to the next generation. Thus, he used 10^40 gene sequences as a liberal estimate of the total number of gene sequences (evolutionary trials) that have been generated to search sequence space in the history of life.
Even so, 10^40 represents only a tiny fraction (one ten trillion, trillion, trillionth) of 10^77. Thus, the conditional probability of generating a gene sequence capable of producing a novel protein fold and function is still only 1 in 10^37. This means that if every organism from the dawn of time had generated, by random mutation, one new base sequence in the sequence space of interest, that would amount to only one ten trillion, trillion, trillionth of the sequences in that space, the space that needs to be searched. And, since the conditional probability of a new gene arising in the manner envisioned by the classical model turns out to be almost unimaginably less than ½, the classical model turns out to be vastly more likely to be false than true. Thus, Axe concluded that a reasonable person should reject it. The probabilistic resources available to the classical model of gene evolution are simply far too small to tame 1 chance in 10^77 (see Fig. 10.4).
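The arithmetic behind these figures can be checked with a short sketch. It takes the passage's two numbers as given (about 10^40 trials and a sequence space of 10^77) and only reproduces the division; it does not evaluate the estimates themselves. The names TRIALS_EXP, SPACE_EXP, and fraction_searched are illustrative and do not come from Axe's paper.

```python
from fractions import Fraction

# Figures quoted in the passage, written as powers of ten (taken as given).
TRIALS_EXP = 40  # about 10^40 organisms, one new sequence each (assumed)
SPACE_EXP = 77   # sequence space of 10^77 candidates, as quoted in the text


def fraction_searched(trials_exp: int, space_exp: int) -> Fraction:
    """Return the exact fraction of the space sampled by 10^trials_exp trials."""
    return Fraction(10 ** trials_exp, 10 ** space_exp)


searched = fraction_searched(TRIALS_EXP, SPACE_EXP)

# "Ten trillion, trillion, trillion" is 10^13 * 10^12 * 10^12.
ten_trillion_trillion_trillion = 10 ** 13 * 10 ** 12 * 10 ** 12

print(searched == Fraction(1, 10 ** 37))
print(ten_trillion_trillion_trillion == 10 ** 37)
```

Exact rational arithmetic is used here because a ratio this small is easier to compare as a fraction than as a floating-point value.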
FIGURE 10.4
The top panel in this diagram represents the results of Axe’s mutagenesis experiments showing the extreme rarity of functional proteins in sequence space. Based on his experiments Axe estimated that there are 10^77 possible sequences corresponding to a specific functional sequence 150 amino acids long. The second panel shows that functional amino-acid sequences are extremely rare even in relation to the total number of opportunities the evolutionary process would have had to generate novel sequences (on the assumption that each organism that has ever lived during the history of life produced one such sequence per generation).
To appreciate why this model fails, consider the following illustration. After the 1975 Steven Spielberg film Jaws became a big hit, one small-town motel advertised “Shark-Free Pool.” Proponents of the second evolutionary scenario (gene duplication, followed by neutral evolution) envision an evolutionary pool where there are no consequences for mutational missteps: by analogy, a pool with no predators. But to extend the illustration, picture a predator-free pool the size of our galaxy. Now picture a blindfolded man dropped into the middle of it. He must swim to the far side, to the one spot on the edge of the pool where a ladder would give him a way out. He’s safe from predators, but it will do him no good. He needs direction, some way of gauging his progress, and an immense amount of time. But he has none of these and so he will arrive at the ladder in neither a hundred years, nor a hundred billion. Similarly, in the classical model of gene evolution, random mutations must thrash about aimlessly in immense combinatorial space, a space that could not be explored by this means in the entire history of life on earth, let alone in the few million years of the Cambrian explosion.
To Build an Animal
Yet Axe’s calculations only hint at the full problem for neo-Darwinian theory. By bending over backwards not to overstate the improbability of generating a new protein fold and by focusing narrowly on that one aspect of the challenge confronting evolutionary theory, his figures vastly understate the improbability of building a Cambrian animal. There are several reasons for this.
First, the Cambrian explosion as dated by fossil evidence took far less time than has elapsed since the origin of life on earth until the present (about 3.8 billion years).20 Less time available for a given evolutionary transition means fewer generations of new organisms and fewer opportunities to generate new genes by which to search relevant sequence space. This makes it even harder to generate a new protein fold by chance in the relevant time period.
Second, bacteria are by far the most common type of organisms included in Axe’s estimate of the total number of organisms that have lived on earth. Yet no one thinks that Cambrian animals evolved directly from bacteria. Nor does anyone think that the putative multicellular ancestors of the Cambrian forms would have been anywhere near as abundant as the bacterial populations that Axe used as the main basis of his estimate. A more realistic estimate for the number of possible animal ancestors would necessarily result in a much lower estimate for the number of gene sequences available for searching sequence space (corresponding to a single protein of modest length in a single Cambrian animal). Recall that, based on Axe’s estimates, the probability of generating just one gene (for a new functional protein fold) from all the bacteria (and other organisms) that have ever lived on earth is just 1 in 10 trillion, trillion, trillion. Consequently, the mechanism for searching sequence space that Axe determined to be extremely implausible must be judged far more implausible as a mechanism for producing the Cambrian information explosion, since there were far fewer multicellular organisms present in the Precambrian period than there have been total organisms present in the entire history of life.
Third, building new animal forms requires generating far more than just one protein of modest length. New Cambrian animals would have required proteins much longer than 150 amino acids to perform necessary, specialized functions.21 For example, as previously noted, many of these Cambrian animals needed the complex protein lysyl oxidase to support their stout body structures. In addition to a novel protein fold, these molecules (in living organisms) comprise over 400 precisely sequenced (nonrepeating) amino acids. Reasonable extrapolation from mutagenesis experiments done on shorter protein molecules suggests that the random production of functionally sequenced proteins of this length would be extremely unlikely given the probabilistic resources (and duration) of the entire universe.22
The mutation and selection mechanism faces a related obstacle. The Cambrian animals exhibit structures that would have required many new types of cells, each requiring many novel proteins to perform their specialized functions. But new cell types require not just one or two new proteins, but coordinated systems of proteins to perform their distinctive cellular functions. The unit of selection in such cases ascends to the system as a whole. Natural selection selects for functional advantage, but no advantage accrues from a new cell type until a system of servicing proteins is in place. But that means random mutations must, again, do the work of information generation without the help of natural selection—and, now, not simply for one protein, but for many proteins arising together. Yet the odds of this occurring by chance alone are, of course, far smaller than the odds of the chance origin of a single new gene or protein—so small as to render the chance origin of the information needed to build a new cell type fantastically improbable (and implausible) given even the most optimistic estimates for the length of the Cambrian explosion.
Richard Dawkins has noted that scientific theories can rely on only so much “luck” before they cease to be credible.23 But the second scenario, involving gene duplication and neutral evolution, by its own logic, precludes natural selection from playing a role in generating genetic information until after the fact. Thus, it relies entirely on “too much luck.” The sensitivity of proteins to functional loss, the rarity of proteins within combinatorial sequence space, the need for long proteins to build new cell types and animals, the need for whole new systems of proteins to service new cell types, and the brevity of the Cambrian explosion relative to rates of mutation all conspire to underscore the immense implausibility of any scenario for the origin of Cambrian genetic information that relies upon random variation alone, unassisted by natural selection.
Yet the classical model of gene evolution—which relies on neutral evolution—requires novel genes and proteins to arise, precisely, by random mutation alone. Adaptive advantage accrues after the generation of new functional genes and proteins. Natural selection cannot play a role until new functional information-bearing molecules have independently arisen. Thus, to return to Dawkins’s imagery, evolutionary theorists envisioned the need to scale the steep face of a precipice of which there is effectively no gradually sloping back side, since the smallest increment of structural innovation in the history of life—a new protein fold—itself presents a formidable Mount Improbable.
By the way, Axe’s later experiments establishing the extreme rarity of protein folds in sequence space also show why random changes to existing genes inevitably efface or destroy function before they generate fundamentally new folds or functions (scenario one). If only one out of every 10^77 of the alternate sequences is functional, an evolving gene will inevitably wander down an evolutionary dead-end long before it can ever become a gene capable of producing a new protein fold. The extreme rarity of protein folds also entails their isolation from each other in sequence space.
A Catch-22
Douglas Axe’s results highlight an acute dilemma for neo-Darwinism, a “catch-22.” On the one hand, if natural selection plays no role in generating new genes, as the idea of neutral evolution implies, then mutations alone must climb a Mount Improbable in a single leap—a situation that, given Axe’s results and Dawkins’s own logic, is probabilistically untenable. On the other hand, any model for the origin of genetic information that envisions a significant role for natural selection, by assuming a preexisting gene or protein under selective pressure, encounters other equally intractable difficulties. The evolving genes and proteins will range over a series of disadvantageous or nonfunctional intermediates that natural selection will not favor or preserve, but will, instead, eliminate. At that point, selection-driven evolution will cease, locking existing genes and proteins into place.
Thus, whether one envisions the evolutionary process beginning with a preexisting functional gene or a duplicated noncoding region of the genome, the results of mutagenesis experiments present a precise quantitative challenge to the efficacy of the neo-Darwinian mechanism. Indeed, our growing knowledge about the rarity and isolation of proteins and functional genes in sequence space implies that neither neo-Darwinian scenario for producing new genes is at all plausible. Thus, neo-Darwinism does not explain the Cambrian information explosion.
11
Assume a Gene
When I first heard that Douglas Axe had succeeded in making a rigorous estimate of the rarity of proteins in sequence space, I wondered what neo-Darwinists would say in response. Given the experimental rigor and mathematical precision of the work he reported in the Journal of Molecular Biology in 2004, and the long odds against mutation and selection ever finding a novel gene or functional protein, what could they say? That the probability of a successful search for new genes and proteins was higher than Axe’s experiments suggested? That his methods or calculations were flawed? That no one else had gotten similar results? Since Axe’s work confirmed other analyses and experiments, and since his paper had passed through the careful scrutiny of peer review, none of those responses seemed plausible. Yet defenders of the adequacy of the neo-Darwinian mechanism were far from admitting defeat, as I would soon find out.
The same year, I published a peer-reviewed scientific article about the Cambrian explosion and the problem of the origin of the biological information needed to explain it.1 In the paper, I cited Axe’s results and explained why the rarity of functional proteins in sequence space posed such a severe challenge to the adequacy of the neo-Darwinian mechanism. The article appeared in a biology journal, Proceedings of the Biological Society of Washington, published out of the Smithsonian Institution by scientists working for the Smithsonian’s National Museum of Natural History (NMNH). Because the article also argued that the theory of intelligent design could help explain the origin of biological information (see Chapter 18), its publication created a firestorm of controversy.
Museum scientists and evolutionary biologists from around the country were furious with the journal and its editor, Richard Sternberg, for allowing the article to be peer-reviewed and then published. Recriminations followed. Museum officials took away Sternberg’s keys, his office, and his access to scientific samples. He was transferred from a friendly to a hostile supervisor. A congressional subcommittee staff later investigated and found that museum officials initiated an intentional disinformation campaign against Sternberg in an attempt to get him to resign. His detractors circulated false rumors: “Sternberg has no degrees in biology” (actually he has two Ph.D.’s, one in evolutionary biology and one in systems biology); “He is a priest, not a scientist” (Sternberg is not a priest, but a research scientist); “He is a Republican operative working for the Bush campaign” (he was far too busy doing scientific research to be involved in political campaigns, Republican or otherwise); “He’s taken money to publish the article” (not true); and so on. Eventually, despite the demonstrable falsehood of the charges, he was demoted.2
Major news stories about the controversy appeared in Science, Nature, The Scientist, and the Chronicle of Higher Education.3 Then articles appeared in the mainstream press, including the Washington Post and the Wall Street Journal.4 A major story aired on National Public Radio.5 Sternberg himself even appeared on The O’Reilly Factor.
Despite the intense furor, there was no formal scientific response to my article: neither the Proceedings nor any other scientific journal published a scientific refutation. The members of the Council of the Biological Society of Washington who oversaw the publication of the journal insisted that they didn’t want to dignify it by responding.
Eventually two scientists and a science education policy advocate—each associated with the National Center for Science Education, a group that lobbies for teaching evolution in the public schools—stepped forward. The three authors—geologist Alan Gishlick, education policy advocate Nicholas Matzke, and wildlife biologist Wesley R. Elsberry—published a response to my article on TalkReason.org, a prominent atheistic website.6 Although the website’s guidelines prohibit “ad hominem arguments,” the rule was somewhat loosely enforced in the case of Gishlick, Matzke, and Elsberry’s response, which they titled “Meyer’s Hopeless Monster.”
Gishlick, Matzke, and Elsberry attempted to refute my central argument by citing a scientific paper that they said solved the problem of the origin of genetic information. The paper, a scientific review essay titled “The Origin of New Genes: Glimpses from the Young and Old,” had appeared in Nature Reviews Genetics in 2003. Gishlick, Matzke, and Elsberry asserted that this paper—coauthored by Manyuan Long, an evolutionary biologist at the University of Chicago, and several colleagues—was representative of an extensive “scientific literature documenting the origin of new genes.”7
Other biologists echoed Gishlick, Matzke, and Elsberry’s claim in the context of another public controversy. During the 2005 Kitzmiller v. Dover trial about an ill-advised attempt to require teachers in a Pennsylvania school district to read a statement about intelligent design, Brown University biologist Kenneth Miller cited Long’s paper in his testimony. He said that it shows how new genetic information evolves. The judge in the case, John E. Jones, then cited Miller’s testimony about Long’s article in his own decision. Judge Jones asserted there are “more than three dozen peer-reviewed scientific publications showing the origin of new genetic information by evolutionary processes.”8 Elsewhere Matzke, along with biologist Paul Gross, stated that the paper by Long “reviews all the mutational processes involved in the origin of new genes and then lists dozens of examples in which research groups have reconstructed the genes’ origins.”9 In their view, “Competent scientists know how new genetic information arises.”10
But do evolutionary biologists really know this?
Let’s take a closer look at the article that allegedly shows “how new genetic information arises.”11
Once Upon a Gene
The oft-cited Long paper points to a variety of studies that purport to explain the evolution of various genes. These studies typically begin by taking a gene and then seeking to find other genes that are similar (or homologous) to it. They then seek to trace the history of slightly different homologous genes back to a hypothetical common ancestor gene (or genes). To do this, the studies survey databases of gene sequences looking for similar sequences in representatives of different taxonomic groups—often in closely related species. Some studies also attempt to establish the existence of a common ancestor gene on the basis of similar genes within the very same organism. They then typically propose evolutionary scenarios in which an ancestral gene duplicates itself,12 and then the duplicate and the original evolve differently as the result of subsequent mutations in each gene.
Next, these scenarios invoke various kinds of mutations—duplication events, exon shuffling, retropositioning, lateral gene transfer, and subsequent point mutations—as well as the activity of natural selection (see Fig. 11.1). The evolutionary biologists conducting these studies postulate that modern genes arose as the result of these various mutational processes—processes that they envision as having shaped genes during a long evolutionary history. Since the information in modern genes is presumably different from the information in the hypothetical ancestor genes, they regard the mutational mechanisms that are allegedly responsible for these differences as the explanation for the origin of genetic information.