Darwin's Doubt


by Stephen C. Meyer


  The classical Mendelian genetics that replaced Darwin’s blending theory of inheritance also suggested limitations on the amount of genetic variability available to natural selection. If plant reproduction produced either green or yellow peas but never some intermediate form, and if the signals for producing the green traits and yellow traits persisted unchanged from generation to generation, it was difficult to see how sexual reproduction and genetic recombination could produce anything more than unique combinations of already existing traits.

  In the decades immediately after Mendel’s work, geneticists came to understand genes as discrete units or packets of hereditary information that could be independently sorted and shuffled within the chromosome. This too suggested that a significant, but still strictly limited, amount of genetic variation could arise by genetic recombination during sexual reproduction. Thus, Mendelian genetics raised significant questions about whether the process of natural selection has access to enough variation (which, after Mendel, was conceived as genetic variation) to allow it to produce any significant morphological novelty. For a time, Darwin’s theory was in retreat.

  Darwinism Mutates

  During the 1920s, 1930s, and 1940s, however, developments in genetics revived natural selection as the main engine of evolutionary change. Experiments performed by Hermann Muller in 1927 showed that X-rays could alter the genetic composition of fruit flies, resulting in unusual variations.4 Muller called these X-ray-induced changes “mutations.” Other scientists soon reported that they had produced mutations in the genes of other organisms, including humans. Whatever genes were made of—and biologists then still did not know—these developments suggested they could vary more than either Darwin or classical Mendelian genetics had assumed. Geneticists at the time also discovered that these small-scale changes in genes were potentially heritable.5 But if variant versions of genes were heritable, then presumably natural selection could favor advantageous gene variants and eliminate the others. These mutations could then influence the future direction of evolution and, in theory at least, provide an unlimited supply of variation for natural selection’s workshop.

  The discovery of genetic mutations also suggested a way to reconcile Darwinian theory with insights from Mendelian genetics. During the 1930s and 1940s, a group of evolutionary biologists, including Sewall Wright, Ernst Mayr, Theodosius Dobzhansky, J. B. S. Haldane, and George Gaylord Simpson, attempted to demonstrate this possibility using mathematical models to show that small-scale variations and mutations could accumulate over time in whole populations, eventually producing large-scale morphological change.6 These mathematical models formed the basis of a subdiscipline of genetics known as population genetics. The overall synthesis of Mendelian genetics with Darwinian theory came to be called “neo-Darwinism” or simply the “New Synthesis.”

  According to this new synthetic theory, the mechanism of natural selection acting upon genetic mutations suffices to account for the origin of novel biological forms. Small-scale “microevolutionary” changes can accumulate to produce large-scale “macroevolutionary” innovations. The neo-Darwinists argued that they had revived natural selection by discovering a specific mechanism of variation that could generate new forms of life from simpler preexisting ones. By the centennial celebration of Darwin’s Origin of Species in 1959, it was widely assumed that natural selection and random mutations could indeed build new forms of life over the course of time with their distinctive body plans and novel anatomical structures. At the celebration, Julian Huxley, the grandson of T. H. Huxley, summarized this optimism in a grand proclamation:

  Future historians will perhaps take this Centennial Week as epitomizing an important critical period in the history of this earth of ours—the period when the process of evolution, in the person of inquiring man, began to truly be conscious of itself…. This is one of the first public occasions on which it has been frankly faced that all aspects of reality are subject to evolution, from atoms and stars to fish and flowers, from fish and flowers to human societies and values—indeed, that all reality is a single process of evolution…7

  In a television broadcast leading up to the Centennial celebration, Huxley captured the optimistic mood more succinctly: “Darwinism has come of age, so to speak. We are no longer having to bother about establishing the fact of evolution.”8

  Variation as Information

  Initially, the elucidation of the structure of DNA by James Watson and Francis Crick in 1953 contributed to this euphoria.9 Indeed, it seemed to lift the mechanism of genetic variation and mutation out of the mist and into the clear light of the emerging science of molecular biology. Watson and Crick’s elucidation of the double helix structure of DNA suggested that DNA stored genetic information in the form of a four-character digital and chemical code (see Fig. 8.1). Later, following the formulation of Francis Crick’s famed “sequence hypothesis,” molecular biologists confirmed that the chemical subunits along the spine of the DNA molecule called nucleotide bases function just like alphabetic characters in a written language or digital characters in a machine code. Biologists established that the precise arrangement of these nucleotide bases conveyed instructions for building proteins.10 (See Fig. 8.2.) Molecular biologists also determined that this store of genetic information in DNA is transmitted from one generation of cells and organisms to another. In short, it was established that DNA stores hereditary information for building proteins and thus, presumably, for building higher-order anatomical traits and structures as well.

  The elucidation of the double helix seemed to resolve some long-standing issues in evolutionary biology. Darwinists had long maintained that natural selection produced new forms by separating the proverbial wheat from the chaff of genetic variation, but they didn’t know where the raw material for all of the competing variations resided. Neither did they know how genes stored information for producing the traits associated with them. Further, even after geneticists discovered that stable genetic traits could be altered by mutations, they remained uncertain about what exactly it was that was being “mutated.” Consequently, biologists were uncertain about exactly where variations and mutations occurred.

  FIGURE 8.1

  James Watson (left) and Francis Crick (right) presenting their model of the structure of the DNA molecule in 1953. Courtesy A. Barrington Brown/Science Source.

  Watson and Crick’s model suggested an answer to that question: genes correspond to long sequences of bases on a strand of DNA. Building on that insight, evolutionary biologists proposed that new variations arose, first, from the genetic recombination of different sections of DNA (different genes) during sexual reproduction and, second, from a special kind of variation called mutations that occur from random changes in the arrangement of nucleotide bases in DNA. Just as a few typographical errors in an English sentence might alter the meaning of a few words or even the whole sentence, so too might a change in the sequential arrangement of the bases in the genetic “text” in DNA produce new proteins or morphological traits.

  FIGURE 8.2

  The model (or structural formula) of the DNA molecule showing the digital or alphabetic character of the nucleotide bases stored along the sugar-phosphate backbone of the molecule.

  Watson and Crick’s discovery also raised new questions—in particular, questions about the information necessary to build completely new forms of life during the course of biological evolution. True, mutations play a role in this process, but could they generate enough information to produce novel forms of animal life such as those that arose in the Cambrian period? That event was, after all, not only an explosion of new animal forms but also an explosion, a vast proliferation, of new biological information.

  The Cambrian Information Explosion

  Consider choanoflagellates, a group of single-celled eukaryotic organisms with a flagellum. What separates such organisms from a trilobite or a mollusk or even a lowly sponge? Clearly, all three of these higher forms of life are more complex than any one-celled organism. But exactly how much more complex?

  James Valentine has noted that one useful way of comparing degrees of complexity is to assess the number of cell types in different organisms (see Fig. 8.3).11 Though a single-celled eukaryote has many specialized internal structures such as a nucleus and various organelles, it still, obviously, represents just a single type of cell. Functionally more complex animals require more cell types to perform their more diverse functions. Arthropods and mollusks, for example, have dozens of specific tissues and organs, each of which requires “functionally dedicated,” or specialized, cell types.

  These new cell types, in turn, require many new and specialized proteins. An epithelial cell lining a gut or intestine, for example, secretes a specific digestive enzyme. Such a cell requires structural proteins to modify its shape and regulatory enzymes to control the secretion of the digestive enzyme itself. Thus, building novel cell types typically requires building novel proteins, which requires assembly instructions for building proteins—that is, genetic information. Thus, an increase in the number of cell types implies an increase in the amount of genetic information.

  FIGURE 8.3

  Biological complexity scale as measured in number of cell types of different organisms.

  Applying this insight to ancient life-forms underscores just how dramatic the Cambrian explosion was. For over 3 billion years, the living world included little more than one-celled organisms such as bacteria and algae.12 Then, beginning in the late Ediacaran period (about 555–570 million years ago), the first complex multicellular organisms appeared in the rock strata, including sponges and the peculiar Ediacaran biota discussed in Chapter 4.13 This represented a large increase in complexity. Studies of modern animals suggest that the sponges that appeared in the late Precambrian, for example, probably required about ten cell types.14

  Then 40 million years later, the Cambrian explosion occurred.15 Suddenly the oceans swarmed with animals such as trilobites and anomalocaridids that probably required fifty or more cell types—an even greater jump in complexity. Moreover, as Valentine notes, measuring complexity differences by measuring differences in the number of cell types probably “greatly underestimate[s] the complexity differentials between bodyplans.”16

  One way to estimate the amount of new genetic information that appeared with the Cambrian animals is to measure the size of the genomes of modern representatives of the Cambrian groups and compare that to the amount of information in simpler forms of life. Molecular biologists have estimated that a minimally complex single-celled organism would require between 318,000 and 562,000 base pairs of DNA to produce the proteins necessary to maintain life.17 More complex single cells might require upwards of a million base pairs of DNA. Yet assembling the proteins necessary to sustain a complex arthropod such as a trilobite would require orders of magnitude more protein-coding instructions. By way of comparison, the genome size of a modern arthropod, the fruit fly Drosophila melanogaster, is approximately 140 million base pairs.18 Thus, transitions from a single cell to colonies of cells to complex animals represent significant—and in principle measurable—increases in genetic information.
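  The arithmetic behind the “orders of magnitude” claim can be sketched directly from the figures just cited; this short Python snippet (the ratio calculation is illustrative, not from the book) compares the genome sizes:

```python
# Genome-size figures cited in the text (base pairs of DNA)
minimal_cell_low = 318_000       # low estimate, minimally complex single cell
minimal_cell_high = 562_000      # high estimate
complex_single_cell = 1_000_000  # "upwards of a million base pairs"
fruit_fly = 140_000_000          # Drosophila melanogaster, a modern arthropod

# The fruit-fly genome is two to three orders of magnitude larger
print(fruit_fly / minimal_cell_low)     # roughly 440-fold
print(fruit_fly / minimal_cell_high)    # roughly 249-fold
print(fruit_fly / complex_single_cell)  # 140-fold
```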

  During the Cambrian period a veritable carnival of novel biological forms arose. But because new biological form requires new cell types, proteins, and genetic information, the Cambrian explosion of animal life also generated an explosion of genetic information unparalleled in the previous history of life.19 (In Chapter 14, we’ll see that building a new animal body plan also requires another type of information, not stored in genes, called epigenetic information.)

  So can the neo-Darwinian mechanism explain the dramatic increase in genetic information that appears in the Cambrian explosion? Before addressing that question, it will help to define the concept of information and identify the kind of information that DNA contains.

  Biological Information: Shannon or Otherwise?

  Scientists typically recognize at least two basic types of information, functional (or meaningful) information and so-called Shannon information, which is not necessarily meaningful or functional. The distinction has arisen in part because of developments in a branch of applied mathematics known as information theory. During the late 1940s, mathematician Claude Shannon, working at the Bell Laboratories, developed a mathematical theory of information. Shannon equated the amount of information transmitted by a sequence of symbols or characters with the amount of uncertainty reduced or eliminated by the transmission of that sequence.20

  Shannon thought that an event or communication that didn’t eliminate much uncertainty was also not very informative. Consider an illustration. In the 1970s, when I was a teenager, if someone made a completely obvious statement, we would say, “Tell me something else I didn’t know.” Imagine that one of my classmates on the baseball team has just rushed up to breathlessly “inform” me that our team’s star pitcher is planning to throw the ball to the catcher in the next game. Such a statement would earn the scornful reply, “Tell me something else I didn’t know.”

  The obvious statement about the pitcher’s intentions also illustrates why Shannon equated the elimination of uncertainty with the transmission of information. Since star pitchers who want to go on being star pitchers have no choice but to throw the ball across the plate to the catcher, the statement of my overwrought friend eliminated no previous uncertainty. It was not informative in the least. If, on the other hand, after days of speculation on campus about which one of our team’s four pitchers the baseball coach would choose to pitch in the championship game, my friend had rushed up to me and revealed the identity of the starting pitcher, that would be different. In that case, he would have eliminated some significant uncertainty on my part with a decidedly informative statement.

  Shannon’s theory quantified the intuitive connection between reduced uncertainty and information by asserting that the more uncertainty an event or communication eliminated, the more information it conveyed. Imagine that after revealing to me the identity of the starting pitcher before the championship baseball game in the spring, my friend then also revealed to me the identity of the starting quarterback before the upcoming football season. Imagine as well that our baseball team had four equally competent pitchers and the football team had just two equally competent quarterbacks. Given these facts, my friend’s decision to inform me of the identity of the starting pitcher eliminated more uncertainty than his decision to reveal to me the identity of the starting quarterback.

  To assign precise quantitative measures of information, Shannon further linked the reduction of uncertainty and information to quantitative measures of probability (or improbability). Notice that in my illustration, the more informative communication reduced more uncertainty and also described a more improbable event. The probability of any one of the four pitchers being selected was 1 in 4. The probability of one of the quarterbacks being selected, given the same assumption, was only 1 in 2. The more improbable event eliminated more possibilities and more uncertainty. Thus, it conveyed more information.

  Shannon applied these intuitions to quantify the amount of information present in sequences of symbols or characters stored in texts or codes or transmitted across communications channels. Thus, in his theory the presence of an English letter in a sequence of other such letters transmits more information than a single binary digit (a zero or one) in a section of computer code. Why? Again, the letter in the English alphabet reduces uncertainty among twenty-six possibilities, whereas a single binary digit reduces uncertainty among only two. The probability of any one character from the English alphabet occurring in a sequence of other such letters (disregarding the need for spaces and punctuation) is 1 in 26. The probability of either a zero or one arising in a sequence of binary characters is 1 in 2. In Shannon’s theory the presence of the more improbable character conveys more information.

  Yet even a binary alphabet can convey an unlimited amount of information because, in Shannon’s theory, additional information is conveyed as improbabilities multiply. Imagine a grab bag of tiles with either zero or one etched onto each. Imagine someone producing a series of zeros and ones by reaching into the bag and placing them one by one on a game board. The probability of choosing a zero on the first pick is just 1 in 2. But the probability of choosing two consecutive zeros after placing the first back in the grab bag (and shaking the tiles) is 1 chance in 2 × 2, or 1 chance in 4. This is because there are four possible combinations of digits that could have been chosen—00, 01, 10, or 11. Similarly, the probability of producing any three-letter sequence as a result of choosing consecutively in this manner is 1 in 2 × 2 × 2, or 1 in 2³ (or 1 in 8). The improbability of any specific sequence of characters increases exponentially with the number of characters in the sequence. Thus, longer and longer sequences can generate larger and larger amounts of information even using a simple binary alphabet.
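  The grab-bag arithmetic above can be checked by brute force; this Python sketch (illustrative only, not from the book) enumerates every equally likely tile sequence of a given length:

```python
from itertools import product

def sequence_probability(length):
    """Probability of drawing one specific sequence of 0s and 1s of this
    length, with the tile replaced and the bag shaken between draws."""
    outcomes = list(product("01", repeat=length))  # all equally likely sequences
    return 1 / len(outcomes)

print(sequence_probability(1))  # 0.5    -> 1 chance in 2
print(sequence_probability(2))  # 0.25   -> 1 chance in 4 (00, 01, 10, 11)
print(sequence_probability(3))  # 0.125  -> 1 chance in 8, i.e. 1 in 2 ** 3
```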

  Information scientists measure such informational increases through a unit they call a bit. A bit represents the minimum amount of information that can be conveyed (or uncertainty reduced) by a single digit in a two-character alphabet.21
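  The bit measure just described can be applied to the earlier examples. In a minimal Python sketch (mine, not the author’s), an event of probability p conveys log₂(1/p) bits:

```python
import math

def bits(probability):
    """Shannon information of an event with the given probability, in bits."""
    return -math.log2(probability)

print(bits(1 / 2))   # a single binary digit: 1.0 bit
print(bits(1 / 4))   # one of four equally likely pitchers: 2.0 bits
print(bits(1 / 26))  # one English letter (ignoring spaces): about 4.7 bits
```

As the theory predicts, the less probable English letter carries more information than the binary digit.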

  Biologists can readily apply Shannon’s information theory to measure the amount of Shannon information in a sequence of DNA bases (or the sequence of amino acids in a protein) by assessing the probability of the sequence occurring and then converting that probability to an information measure in bits.22 DNA conveys information, in Shannon’s sense, in virtue of its containing long improbable arrangements of four chemicals—the four bases that fascinated Watson and Crick—adenine, thymine, guanine, and cytosine (A, T, G, and C). As Crick realized in formulating his sequence hypothesis, these nucleotide bases function as alphabetic or digital characters in a linear array. Since each of the four bases has an equal 1 in 4 chance of occurring at each site along the spine of the DNA molecule, biologists can calculate the probability, and thus the Shannon information, or what is technically known as the “information-carrying capacity,” of any particular sequence n bases long. For instance, any particular sequence three bases long has a probability of 1 chance in 4 × 4 × 4, or 1 chance in 64, of occurring—which corresponds to 6 bits of Shannon information. (Indeed, each base in a DNA sequence conveys 2 bits of information, since 1 in 4 is equal to 1 chance in 2 × 2.)
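  The DNA calculation in this paragraph follows the same pattern: assuming each of the four bases is equally probable at every site, a sequence n bases long has probability (1/4)ⁿ, which works out to 2 bits per base. A Python sketch under that equiprobability assumption:

```python
import math

def dna_capacity_bits(sequence):
    """Information-carrying capacity of a DNA sequence, assuming each of the
    four bases (A, T, G, C) is equally probable at every site."""
    probability = (1 / 4) ** len(sequence)  # e.g. 1 chance in 64 for three bases
    return -math.log2(probability)          # equivalent to 2 bits per base

print(dna_capacity_bits("ATG"))   # 6.0 bits, matching the 1-in-64 example
print(dna_capacity_bits("ATGC"))  # 8.0 bits
```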

 
