The Tangled Tree
That was mild but firm, a dismissive shrug. Hitchcock would ignore Charles Darwin and encourage his readers to do likewise. More telling, more defensive, was his other response: he removed the trees figure from his own book. No more Paleontological Chart. It seems never to have appeared in another edition of Elementary Geology.
Darwin and Darwin’s followers owned the tree image now. It would remain the best graphic representation of life’s history, evolution through time, the origins of diversity and adaptation, until the late twentieth century. And then rather suddenly a small group of scientists would discover: oops, no, it’s wrong.
PART II
A Separate Form of Life
9
Molecular phylogenetics, the study of evolutionary relatedness using molecules as evidence, began with a suggestion by Francis Crick, in 1958, offered passingly in an important paper devoted to something else. That was characteristic of Crick—so brilliant and recklessly imaginative that he sometimes influenced the course of biology even with his elbows.
You know Crick’s name from the most famous triumph of his life: solving the structure of the DNA molecule, with his young American partner James Watson, in 1953, for which he and Watson and one other scientist would eventually, in 1962, receive the Nobel Prize. Crick wasn’t wasting his time, in 1958, mooning about dreams of glory in Stockholm. He was still interested in DNA, but he had moved on from the sheer structural question to other big problems. He had bent his mind intensely, but with his usual sense of merry play, to the challenge of deciphering the genetic code.
The code, as you’ve heard many times but might need reminding, is written in an alphabet of four letters, each letter representing a component—a nucleotide base, in chemistry lingo—of the DNA double helix. The four letters are: A (for adenine), C (cytosine), G (guanine), and T (thymine). DNA’s full moniker is deoxyribonucleic acid, of course, and it’s worth understanding why. The two helical strands of the double helix, twining around a central axis in parallel with each other, are composed of units called nucleotides, linked in a chain, each nucleotide containing a base (that’s the A, C, G, or T), a sugar (that’s the deoxyribose), and a phosphate group (that’s the acidic part). The sugar end of one nucleotide bonds to the phosphate end of the next, forming the two long helical strands. I just called them parallel, but to be more precise, those strands are antiparallel to each other, since the sugar-phosphate binding gives them directionality—a front end and a back end—and the front end of one strand aligns with the back end of the other. The nucleotide bases, linked crossways by hydrogen bonds, hold the strands together. The base A pairs with T, the base C pairs with G, forming a stable structure, like the steps in a spiral staircase. This is the nifty arrangement that Watson and Crick deduced.
It’s not just a stable structure, though. It’s a wondrously efficient one for storing, copying, and applying heritable data. When the two strands are peeled apart, the sequence of bases along one of the strands (the template strand) represents genetic information ready to be duplicated or used. Watson and Crick noted that capacity with exquisite coyness in their 1953 paper. The paper was lapidary, only a page long, as published in the journal Nature, and included a sketch. Near the end, having proposed their double helix structure and the matchup of bases, always A with T and C with G, they wrote: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”
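The pairing rule and the antiparallel arrangement can be sketched in a few lines of Python. This is only an illustration: the function name and the sample sequence are invented here, and real copying in a cell involves enzymes and much else that the sketch ignores.

```python
# Watson-Crick pairing: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(template: str) -> str:
    """Return the partner strand, read in its own front-to-back direction.

    The two strands are antiparallel, so the partner is built by walking
    the template from its back end and pairing each base as we go.
    """
    return "".join(PAIRS[base] for base in reversed(template.upper()))


template = "ATGCGTTC"  # made-up sequence, for illustration only
partner = complementary_strand(template)
print(partner)                        # GAACGCAT
print(complementary_strand(partner))  # ATGCGTTC, the template again
```

Pairing the partner strand a second time gives back the template, which is the sense in which either strand holds enough information to rebuild the other.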
But copying that material, for hereditary continuity, was one thing. Translating it into living organisms was another. Translated how? By what steps does the information in DNA become physically animate?
This mystery leads first to proteins. There are four kinds of molecule essential to living processes (carbohydrates, lipids, nucleic acids, and proteins), often collectively called the molecules of life. Proteins might be the most versatile, serving a wide range of structural, catalyzing, and transporting functions. Their piecemeal production, and the controls on the process of building and using them, are encoded in DNA. Every protein consists of a linear chain of amino acids, folded upon itself into an elaborate three-dimensional structure. Although about five hundred amino acids are known to chemistry, only twenty of those serve as the fundamental components of life, from which virtually all proteins are assembled. But what sequences of the four bases determine which amino acids shall be added to a chain? What combination of letters specifies leucine? What combination produces cysteine? What arrangement of A, C, G, and T delivers its meaning as glutamine? What spells tyrosine? This fundamental matter (how do bases designate aminos?) became known as “the coding problem,” to which Francis Crick addressed himself in the late 1950s. Solving it was a crucial step toward understanding how organisms grow, live, and replicate.
There were questions within questions. Do the bases work in combinations? If so, how many? Two-base clusters, selected variously from the group of four and in specified order (CT, CG, AA, and so on) would allow only sixteen combinations, not enough to code twenty amino acids. Then maybe clusters of three or more? If three (such as CTC, CGA, AAA), do those triplets overlap one another, or do they function separately, like three-letter words divided by commas? If there are commas, are there periods too? Four letters, in every possible combination of three, yield sixty-four variants. Are all sixty-four possible triplets used? If so, that implies some redundancy; different triplets coding for the same amino acid. Does the code include a way of saying “Stop”? If not, where does one gene end and another begin? Crick and others were keen to know.
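The arithmetic behind those questions is easy to check by brute force. The Python sketch below simply enumerates every ordered cluster of bases of a given length; the function and constant names are invented for illustration, and the sketch says nothing about which triplets the real code actually uses.

```python
from itertools import product

BASES = "ACGT"
AMINO_ACIDS_NEEDED = 20


def clusters(length: int) -> list[str]:
    """List every ordered cluster of the four bases with the given length."""
    return ["".join(combo) for combo in product(BASES, repeat=length)]


for length in (1, 2, 3):
    count = len(clusters(length))
    verdict = "enough" if count >= AMINO_ACIDS_NEEDED else "not enough"
    print(f"{length}-base clusters: {count} ({verdict} for {AMINO_ACIDS_NEEDED} amino acids)")
```

Two-base clusters come to 4 × 4 = 16, short of twenty; three-base clusters come to 4 × 4 × 4 = 64, which is more than twenty and so leaves room for redundancy.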
Crick himself had also started thinking beyond that problem, to the question of how proteins are physically assembled from the coded information, with one amino acid brought into line after another. How does the template strand find or attract its amino acids? How do those units become linked? He wanted to learn not just the language of life—its letters, words, grammar—but also the mechanics of how it gets spoken: its equivalent of lungs, larynx, lips, and tongue.
Crick was back in England by the mid-1950s, after a sojourn in the United States, and based again at the Cavendish Laboratory in Cambridge, where he had worked with Jim Watson. He had a contract with the Medical Research Council (MRC), a government agency with some mandate for fundamental as well as medical research. Solving the DNA structure, though it had brought scientific fame to Crick and Watson and would eventually bring the Nobel Prize, provided no immediate cure for Crick’s dicey financial situation, all the more acute since the birth of his and his wife Odile’s third child. He had to work for pay: a modest salary from the MRC and whatever small change the occasional radio broadcast or popular article might bring. Now he was sharing his office, his pub lunches, his fevered conversations, and his blackboard with another scientist, Sydney Brenner, rather than with Watson. One colleague at the Cavendish, upon early acquaintance with Crick, concluded that “his method of working was to talk loudly all the time.” When not talking, or listening to Brenner, he spent his time reading scientific papers, rethinking the results of other researchers, combing through such bodies of knowledge for clues to the mysteries that engaged him. He was not an experimentalist, generating data. He was a theoretician—probably the century’s best and most intuitive in the biological sciences.
Sometime in 1957 Crick gathered his thoughts and his informed guesses on this problem, about how DNA gets translated into proteins, and in September he addressed the annual symposium of the Society for Experimental Biology, convened that year at University College London. His talk “commanded the meeting,” according to one historian, and “permanently altered the logic of biology.” The published version appeared a year later, in the society’s journal, under the simple title “On Protein Synthesis.” Another historian, Matt Ridley, in his short biography of Crick, called it “probably his most remarkable paper,” comparable to Isaac Newton’s Principia and Ludwig Wittgenstein’s Tractatus. It was a commanding presentation of insights and speculations about how proteins are built from DNA instructions. It noted the important but still-fuzzy hypothesis that RNA (ribonucleic acid), the other nucleic acid, which seemed to exist in DNA’s shadow, is somehow involved. Might RNA play a role in manufacturing proteins, possibly by helping express the order (coded by DNA) in which amino acids are linked one to another? Amid such ruminations, Crick threw off another idea, almost parenthetically: ah, by the way, these long molecules could also provide evidence for evolutionary trees.
As published in the paper: “Biologists should realize that before long we shall have a subject which might be called ‘protein taxonomy’—the study of the amino acid sequences of the proteins of an organism and the comparison of them between species.”
He didn’t use the words “molecular phylogenetics,” but that’s what he was getting at: deducing evolutionary histories from the evidence of long molecules. Comparing slightly different versions of essentially the same protein (such as hemoglobin, which transports oxygen through the blood of vertebrates), as found in one creature and another, could allow you to draw inferences about degrees of relatedness between them. Those inferences would be based on assuming that the variant hemoglobins had evolved from a common ancestral molecule and that, over time, in divergent lineages, small differences in the amino sequences would have crept in, by accident if not by selective advantage. The degree of such differences between one hemoglobin and another should correlate with the amount of time elapsed since those lineages diverged. From such data, Crick suggested, you might draw phylogenetic trees. Humans have one variant of hemoglobin, horses have another. How different? How long since we shared an ancestor with horses? It could be argued, Crick added, that protein sequences also represent the most precise observable register of the physical identity of an organism, and that “vast amounts of evolutionary information may be hidden away within them.”
Having tossed off this fertile suggestion, Crick returned in the rest of the paper to his real subject: how proteins are manufactured in cells. That was his way. A passing thought, with the heft of a beer truck. Essentially he had said: Look, I’m not pursuing this protein taxonomy business, but somebody should.
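The core of the “protein taxonomy” idea, counting the differences between versions of the same protein, can be sketched in Python. Everything specific here is an assumption made for illustration: the three sequences are invented strings of one-letter amino acid codes, not real hemoglobin data, and they are taken to be already aligned and of equal length, which real sequences often are not.

```python
from itertools import combinations


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Invented sequences for three hypothetical species (not real proteins).
TOY_PROTEINS = {
    "species_1": "MKTLSPAEKGNVRAAW",
    "species_2": "MKTLSPADKGNVRAAW",
    "species_3": "MKSLSPADKTNVKAAW",
}


def pairwise_differences(proteins: dict[str, str]) -> dict[tuple[str, str], int]:
    """Compare every pair of species and record how many positions differ."""
    return {
        (a, b): count_differences(proteins[a], proteins[b])
        for a, b in combinations(proteins, 2)
    }


for pair, diffs in pairwise_differences(TOY_PROTEINS).items():
    print(pair, diffs)
```

In this toy data, species_1 and species_2 differ at one position, while species_3 differs from them at four and three positions respectively. Read the way Crick proposed, that would suggest species_1 and species_2 are the more closely related pair.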
10
Somebody did, though not immediately. Seven years passed, during which several other scientists began noodling along various routes that would lead to a similar idea. Two of them were Linus Pauling and Emile Zuckerkandl, who gave their own fancy name to the enterprise—they called it “chemical paleogenetics”—and they converged on it by very different trajectories.
Zuckerkandl was a young Viennese biologist whose family had escaped Nazi Europe via Paris and Algiers. He got to America, did a master’s degree at the University of Illinois (long before Carl Woese would arrive there), then returned to Paris after the war for a doctorate. He found work at a marine laboratory on the west coast of France and studied the molting cycles of crabs, which involve a molecule analogous to hemoglobin. His interest drifted from crustacean physiology to questions at the molecular level, and he hankered to return to America. In 1957 Zuckerkandl finagled a chance to meet Pauling, who by then was a celebrated chemist with the first of his two Nobel Prizes already won. The prize had given Pauling some latitude to expand his own range of concerns, from lab chemistry at the California Institute of Technology to the wider world, and some leverage in pursuing those concerns. He had two in particular: genetic diseases such as sickle cell anemia and the threats posed by thermonuclear weapons, including radioactive fallout from testing. By the late 1950s, Pauling was raising his voice. He initiated a petition against atmospheric nuclear testing that more than eleven thousand scientists signed. He had become, along with Bertrand Russell, the provocative British philosopher, also a Nobel winner, one of the world’s most august peaceniks.
Pauling’s initial encounter with Zuckerkandl coincided with his increasing interest in genetics, evolution, and mutation—most pointedly, the mutations that might be caused by radiation released in weapons tests. His interest in disease led in the same direction, because sickle cell anemia is a problem that results from mutations in one of the genes for hemoglobin. Pauling found Zuckerkandl impressive enough that he offered the younger man a postdoctoral fellowship in chemistry at Caltech. Then, when Zuckerkandl arrived in Pasadena, intending to continue work on the crab-molting molecule, Pauling discouraged that project and said, “Why don’t you work on hemoglobin?”
Pauling suggested further that he take up a newly invented technique—still primitive but promising—that employed electrophoresis (separating molecules by their sizes, using electrical charge) and other methods to “fingerprint” such proteins, distinguishing one variant from another. Comparing protein molecules that way, Pauling figured, might allow researchers to draw some evolutionary conclusions. So Zuckerkandl went to work, learning the technique and applying it to hemoglobin in variant forms. Before long, he could see the close similarity between human hemoglobin and chimpanzee hemoglobin, and that human hemoglobin was less similar to hemoglobin found in orangutans. He could also tell a pig from a shark just by looking at the molecular fingerprints. Of course, there were easier ways to tell a pig from a shark, but never mind. Although it wasn’t such a precise methodology as he might have wished, this sort of molecular comparison was a start.
Over the next half dozen years, Zuckerkandl’s work thrived, and he published a series of papers with Pauling. Some of those were invited contributions to celebratory volumes, Festschriften, in honor of eminent scientists, generally on some occasion such as retirement or a big, round birthday. Such invitations came often because of Pauling’s own eminence, and he recruited Zuckerkandl as coauthor to do much of the thinking and most of the writing. In the meantime, Pauling won his second Nobel, this time the Peace Prize in recognition of his efforts against nuclear weapons proliferation and testing. That one didn’t add to his scientific reputation (in fact, he resigned from his Caltech professorship because university administrators and trustees disapproved of his peace activism), but it certainly helped amplify his public voice. He was a busy man, much in demand. The invitations—to speak, to visit, to contribute scientific papers for ceremonial volumes—continued. Because such papers didn’t normally go through the peer-review filter, they could be a little more bold and speculative than a typical journal article. One of them, written in 1963 to honor a Russian scientist on his seventieth birthday, was titled “Molecules as Documents of Evolutionary History.” Two years later, it was reprinted in English in the Journal of Theoretical Biology, giving it much broader reach and influence. Pauling and Zuckerkandl were wading into the same pond where Francis Crick had dipped his toe.
Their 1963 paper made an important distinction between molecules that carry genetic information—such as DNA or the proteins it encodes—and other molecules, such as vitamins, that cycle through a living creature and out the other end. Information molecules have histories that can be deduced; they have ancestors from which the variant forms, in this creature or that, have descended. Scrutiny of such molecules, wrote Zuckerkandl and Pauling, can tell us three things: how much time has passed since the lineages split, what the ancestral molecules must have looked like, and what were the lines of descent. The first of those three kinds of information became known as the molecular clock, although Zuckerkandl and Pauling hadn’t yet named it. The third kind implied trees.
Zuckerkandl continued reworking and developing these ideas, with Pauling as his coauthor and sponsor. In September 1964, before a distinguished and argumentative symposium audience at Rutgers University, he delivered a long paper that became the definitive version of their shared ideas and that, despite Zuckerkandl having done most of the writing, has been called the “most influential of Pauling’s later career.” In this paper, the two authors offered their memorable metaphor: if the minor changes in molecular variants are proportional to elapsed time over the eons, they said, what you have is “a molecular evolutionary clock.”
It was tentative, a hypothesis. The hypothesis was disputed at the Rutgers symposium and would be controversial in coming years, but it captured attention, it focused thought, and it promised a whole new way of measuring life’s history, if it was right. The molecular clock has since been called “one of the simplest and most powerful concepts in the field of evolution,” and also “one of the most contentious.” Crick himself later judged it “a very important idea” that turned out to be “much truer than people thought at the time.”
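Taken in its simplest form, the clock hypothesis is a statement of proportionality, and a short Python sketch shows how it would be used. This is a hedged illustration, not the authors' method: the function name is invented, the numbers are made up, and the sketch assumes a perfectly steady rate of change calibrated from one pair of lineages whose divergence time is taken as known.

```python
def estimate_divergence_time(
    diff_unknown: int, diff_calibration: int, time_calibration: float
) -> float:
    """Scale a known divergence time by the ratio of observed differences.

    Assumes differences accumulate in strict proportion to elapsed time,
    which is exactly the part of the hypothesis that was disputed.
    """
    if diff_calibration <= 0:
        raise ValueError("calibration pair must show at least one difference")
    return time_calibration * diff_unknown / diff_calibration


# Made-up numbers: a calibration pair with 18 differences, taken to have
# diverged 80 million years ago, and a second pair with 9 differences.
print(estimate_divergence_time(9, 18, 80.0))  # 40.0 (million years)
```

Half the differences implies half the elapsed time. Whether real molecules tick that steadily was the contentious question.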
Emile Zuckerkandl, meanwhile, moved back to France. Along with Pauling and just a few others, he had helped launch a new scientific enterprise, and when a Journal of Molecular Evolution came into being, in 1971, he was its first editor in chief. His name isn’t familiar to the wider world, as Pauling’s is, but if you say “Zuckerkandl and Pauling” to a molecular biologist today, he or she will think “molecular clock.” Fitting as that may be, it overlooks the other important point: the other metaphor embedded in the long Rutgers paper, where Zuckerkandl wrote that “branching of molecular phylogenetic trees should in principle be definable in terms of molecular information alone.” This was a whole new way of sketching those trees, which rose and spread their branches as the clock ticked.
11
Carl Woese came to the University of Illinois, in Urbana, in 1964, the same year Zuckerkandl delivered the paper at Rutgers. The enterprise that would become molecular phylogenetics—back then bruited under other names, such as Crick’s protein taxonomy, and Pauling and Zuckerkandl’s chemical paleogenetics—had begun to attract interest. Woese saw its deepest possibilities more clearly than anyone else. Molecular sequence information, he realized, could be used to read the shape of the past.