But if the protein chain was altered by exactly one amino acid, then its gene had to be different by precisely one triplet (“one triplet encodes one amino acid”). Indeed, as predicted, when the gene encoding the hemoglobin B chain was later identified and sequenced in sickle-cell patients, there was a single change: one triplet in DNA—GAG—had changed to another—GTG. This resulted in the substitution of one amino acid for another: glutamate was switched to valine. That switch altered the folding of the hemoglobin chain: rather than twisting into its neatly articulated, clasplike structure, the mutant hemoglobin protein accumulated in stringlike clumps within red cells. These clumps grew so large, particularly in the absence of oxygen, that they tugged the membrane of the red cell until the normal disk was warped into a crescent-shaped, dysmorphic “sickle cell.” Unable to glide smoothly through capillaries and veins, sickled red cells jammed into microscopic clots throughout the body, interrupting blood flow and precipitating the excruciating pain of a sickling crisis.
It was a Rube Goldberg disease. A change in the sequence of a gene caused the change in the sequence of a protein; that warped its shape; that shrank a cell; that clogged a vein; that jammed the flow; that racked the body (that genes built). Gene, protein, function, and fate were strung in a chain: one chemical alteration in one base pair in DNA was sufficient to “encode” a radical change in human fate.
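The first link in that chain, one triplet changing and one amino acid changing with it, can be sketched in a few lines. This is a minimal illustration, not a full genetic code: the lookup table holds only the two triplets named in the text, and the helper name `point_mutate` is a hypothetical choice made here.

```python
# Partial codon table: only the two DNA triplets named in the text.
CODON_TO_AMINO_ACID = {
    "GAG": "glutamate",
    "GTG": "valine",
}


def point_mutate(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


normal = "GAG"
mutant = point_mutate(normal, 1, "T")  # A -> T in the middle of the triplet

print(normal, "->", CODON_TO_AMINO_ACID[normal])
print(mutant, "->", CODON_TO_AMINO_ACID[mutant])
```

A single changed letter is enough to look up a different amino acid; everything downstream in the chain follows from that one substitution.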
* * *
I. A team led by James Watson and Walter Gilbert at Harvard also discovered the “RNA intermediate” in 1960. The Watson/Gilbert and Brenner/Jacob papers were published back to back in Nature.
II. This “triplet code” hypothesis was also supported by elementary mathematics. If a two-letter code was used—i.e., two bases in a sequence (AC or TC) encoded an amino acid in a protein—you could only achieve 16 combinations, obviously insufficient to specify all twenty amino acids. A triplet-based code had 64 combinations—enough for all twenty amino acids, with extra ones still left over to specify other coding functions, such as “stopping” or “starting” a protein chain. A quadruplet code would have 256 permutations—far more than needed to encode twenty amino acids. Nature was degenerate, but not that degenerate.
III. The alteration of the single amino acid was discovered by Vernon Ingram, a former student of Max Perutz’s.
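The arithmetic in note II can be checked by simply enumerating every possible "word" of a given length over the four bases. The sketch below assumes nothing beyond the four-letter alphabet and the count of twenty amino acids given in the note; the function name `code_words` is a hypothetical one.

```python
from itertools import product

BASES = "ACGT"
AMINO_ACIDS = 20


def code_words(length: int) -> list:
    """List every possible sequence of `length` bases."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


for length in (2, 3, 4):
    count = len(code_words(length))
    verdict = "enough" if count >= AMINO_ACIDS else "too few"
    print(f"{length}-letter code: {count} combinations ({verdict} for {AMINO_ACIDS} amino acids)")
```

The triplet is the shortest word length whose count reaches twenty, which is the point of the note.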
Regulation, Replication, Recombination
Nécessité absolue trouver origine de cet emmerdement [It is absolutely necessary to find the origin of this pain in the ass].
—Jacques Monod
Just as the formation of a giant crystal can be seeded by the formal arrangement of a few critical atoms at its core, the birth of a great body of science can be nucleated by the interlocking of a few crucial concepts. Before Newton, generations of physicists had thought about phenomena such as force, acceleration, mass, and velocity. But Newton’s genius involved defining these terms rigorously and linking them to each other via a nest of equations—thereby launching the science of mechanics.
By similar logic, the interlocking of just a few crucial concepts—
—relaunched the science of genetics. In time, as with Newtonian mechanics, the “central dogma” of genetics would be vastly refined, modified, and reformulated. But its effect on the nascent science was profound: it locked a system of thinking into place. In 1909, Johannsen, coining the word gene, had declared it “free of any hypothesis.” By the early 1960s, however, the gene had vastly exceeded a “hypothesis.” Genetics had found a means to describe the flow of information from organism to organism, and—within an organism—from encryption to form. A mechanism of heredity had emerged.
But how did this flow of biological information achieve the observed complexity of living systems? Take sickle-cell anemia as a case in point. Walter Noel had inherited two abnormal copies of the hemoglobin B gene. Every cell in his body carried the two abnormal copies (every cell in the body inherits the same genome). But only red blood cells were affected by the altered genes—not Noel’s neurons or kidneys or liver cells or muscle cells. What enabled the selective “action” of hemoglobin in red blood cells? Why was there no hemoglobin in his eye or his skin—even though eye cells and skin cells and, indeed, every cell in the human body possessed identical copies of the same gene? How, as Thomas Morgan had put it, did “the properties implicit in genes become explicit in [different] cells?”
In 1940, an experiment on the simplest of organisms—a microscopic, capsule-shaped, gut-dwelling bacterium named Escherichia coli—provided the first crucial clue to this question. E. coli can survive by feeding on two very different kinds of sugars—glucose and lactose. Grown on either sugar alone, the bacterium begins to divide rapidly, doubling in number every twenty minutes or so. The curve of growth can be plotted as an exponential curve—1-, 2-, 4-, 8-, 16-fold growth—until the culture turns turbid, and the sugar source has been exhausted.
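The doubling described above can be written as a one-line formula: fold growth equals 2 raised to the number of elapsed doubling times. The sketch below assumes an exact twenty-minute doubling time and unlimited sugar, which is an idealization of the "twenty minutes or so" in the text; `fold_growth` is a hypothetical name.

```python
def fold_growth(minutes: float, doubling_time: float = 20.0) -> float:
    """Fold increase in cell number after `minutes` of unrestricted doubling."""
    return 2 ** (minutes / doubling_time)


for minutes in (0, 20, 40, 60, 80):
    print(f"{minutes:3d} min: {fold_growth(minutes):.0f}-fold")
```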
The relentless ogive of growth fascinated Jacques Monod, the French biologist. Monod had returned to Paris in 1937, having spent a year studying flies with Thomas Morgan at Caltech. Monod’s visit to California had not been particularly fruitful—he had spent most of his time playing Bach with the local orchestra and learning Dixie and jazz—but Paris was utterly depressing, a city under siege. By the summer of 1940, Belgium and Poland had fallen to the Germans. In June 1940, France, having suffered devastating losses in battle, signed an armistice that allowed the German army to occupy much of Northern and Western France.
Paris was declared an “open city”—spared from bombs and ruin, but fully accessible to Nazi troops. The children were evacuated, the museums emptied of paintings, the storefronts shuttered. “Paris will always be Paris,” Maurice Chevalier sang, if pleadingly, in 1939—but the City of Lights was rarely illuminated. The streets were ghostly. The cafés were empty. At night, regular blackouts plunged it into an infernally bleak darkness.
In the fall of 1940, with red-and-black flags bearing swastikas hoisted on all government buildings, and German troops announcing nightly curfews on loudspeakers along the Champs-Élysées, Monod was working on E. coli in an overheated, underlit attic of the Sorbonne (he would secretly join the French resistance that year, although many of his colleagues would never know his political proclivities). That winter, with his lab now nearly frozen by the chill—he had to wait penitently until noon, listening to Nazi propaganda on the streets while waiting for some of the acetic acid to thaw—Monod repeated the bacterial growth experiment, but with a strategic twist. This time, he added both glucose and lactose—two different sugars—to the culture.
If sugar was sugar was sugar—if the metabolism of lactose was no different from that of glucose—then one might have expected bacteria fed on the glucose/lactose mix to exhibit the same smooth arc of growth. But Monod stumbled on a kink in his results—literally so. The bacteria grew exponentially at first, as expected, but then paused for a while before resuming growth again. When Monod investigated this pause, he discovered an unusual phenomenon. Rather than consuming both sugars equally, the E. coli cells had selectively consumed glucose first. Then the bacterial cells had stopped growing, as if reconsidering their diet, switched to lactose, and resumed growth again. Monod called this diauxie—“double growth.”
That bend in the growth curve, small though it was, perplexed Monod. It bothered him, like a sand grain in the eye of his scientific instinct. Bacteria feeding on sugars should grow in smooth arcs. Why should a switch in sugar consumption cause a pause in growth? How might a bacterium even “know,” or sense, that the sugar source had been switched? And why was one sugar consumed first, and only then the second, like a two-course bistro lunch?
By the late 1940s, Monod had discovered that the kink was the result of a metabolic readjustment. When bacteria switched from glucose to lactose consumption, they induced specific lactose-digesting enzymes. When they switched back to glucose, these enzymes disappeared and glucose-digesting enzymes reappeared. The induction of these enzymes during the switch—like changing cutlery between dinner courses (remove the fish knife; set the dessert fork)—took a few minutes, thereby resulting in the observed pause in growth.
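The shape of Monod's kinked curve can be reproduced with a toy model. The sketch below is only an illustration under loud assumptions: time moves in discrete doubling steps, each doubling consumes sugar equal to the current population, and the pause for enzyme induction is a fixed number of steps. None of these numbers come from Monod's data, and `simulate_diauxie` is a hypothetical name.

```python
def simulate_diauxie(glucose=7, lactose=56, lag_steps=3, steps=10):
    """Toy model of growth on a glucose/lactose mix (arbitrary units).

    Glucose is consumed first. Once it can no longer support a doubling,
    growth pauses for `lag_steps` while lactose-digesting enzymes are
    induced, and then resumes on lactose. Any glucose left over that is
    too little for a full doubling is ignored.
    """
    population = 1
    history = [population]
    lag_remaining = lag_steps
    for _ in range(steps):
        if glucose >= population:
            glucose -= population      # first course: glucose
            population *= 2
        elif lag_remaining > 0:
            lag_remaining -= 1         # the pause: enzymes being induced
        elif lactose >= population:
            lactose -= population      # second course: lactose
            population *= 2
        # otherwise both sugars are exhausted and the culture stops growing
        history.append(population)
    return history


print(simulate_diauxie())
```

The flat stretch in the middle of the returned list is the pause: the population holds steady between the two phases of doubling.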
To Monod, diauxie suggested that genes could be regulated by metabolic inputs. If enzymes—i.e., proteins—were being induced to appear and disappear in a cell, then genes must be being turned on and off, like molecular switches (enzymes, after all, are encoded by genes). In the early 1950s, Monod, joined by François Jacob in Paris, began to systematically explore the regulation of genes in E. coli by making mutants—the method used with such spectacular success by Morgan with fruit flies.I
As with flies, the bacterial mutants proved revealing. Monod and Jacob, working with Arthur Pardee, a microbial geneticist from America, discovered three cardinal principles that governed the regulation of genes. First, when a gene was turned on or off, the DNA master copy was always kept intact in a cell. The real action was in RNA: when a gene was turned on, it was induced to make more RNA messages and thereby produce more sugar-digesting enzymes. A cell’s metabolic identity—i.e., whether it was consuming lactose or glucose—could be ascertained not by the sequence of its genes, which was always constant, but by the amount of RNA that a gene was producing. During lactose metabolism, the RNAs for lactose-digesting enzymes were abundant. During glucose metabolism, those messages were repressed, and the RNAs for glucose-digesting enzymes became abundant.
Second, the production of RNA messages was coordinately regulated. When the sugar source was switched to lactose, the bacteria turned on an entire module of genes—several lactose-metabolizing genes—to digest lactose. One of the genes in the module specified a “transporter protein” that allowed lactose to enter the bacterial cell. Another gene encoded an enzyme that was needed to break down lactose into parts. Yet another specified an enzyme to break those chemical parts into subparts. Surprisingly, all the genes dedicated to a particular metabolic pathway were physically present next to each other on the bacterial chromosome—like library books stacked by subject—and they were induced simultaneously in cells. The metabolic alteration produced a profound genetic alteration in a cell. It wasn’t just a cutlery switch; the whole dinner service was altered in a single swoop. A functional circuit of genes was switched on and off, as if operated by a common spool or a master switch. Monod called one such gene module an operon.II
The genesis of proteins was thus perfectly synchronized with the requirements of the environment: supply the correct sugar, and a set of sugar-metabolizing genes would be turned on together. The terrifying economy of evolution had again produced the most elegant solution to gene regulation. No gene, no message, and no protein labored in vain.
How did a lactose-sensing protein recognize and regulate only a lactose-digesting gene—and not the thousands of other genes in a cell? The third cardinal feature of gene regulation, Monod and Jacob discovered, was that every gene had specific regulatory DNA sequences appended to it that acted like recognition tags. Once a sugar-sensing protein had detected sugar in the environment, it would recognize one such tag and turn the target genes on or off. That was a gene’s signal to make more RNA messages and thereby generate the relevant enzyme to digest the sugar.
A gene, in short, possessed not just information to encode a protein, but also information about when and where to make that protein. All that data was encrypted in DNA, typically appended to the front of every gene (although regulatory sequences can also be appended to the ends and middles of genes). The combination of regulatory sequences and the protein-encoding sequence defined a gene.
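The three principles can be gathered into one small data structure: a module of neighboring genes that shares a single regulatory tag, with the DNA left untouched and only the RNA output changing. The sketch below is a deliberate simplification of the passage, not a model of the real lactose operon; the gene names, the tag string, and the function `transcribe` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    product: str


@dataclass(frozen=True)  # frozen: the DNA "master copy" is never modified
class Operon:
    regulatory_tag: str        # stand-in for the regulatory DNA sequence
    genes: Tuple[Gene, ...]    # neighbors on the chromosome, switched as one unit


# Hypothetical lactose module, following the three roles named in the text.
LACTOSE_OPERON = Operon(
    regulatory_tag="lactose",
    genes=(
        Gene("lac_transporter", "lets lactose enter the cell"),
        Gene("lac_splitter", "breaks lactose into parts"),
        Gene("lac_subpart_enzyme", "breaks those parts into subparts"),
    ),
)


def transcribe(operon: Operon, sugars_sensed: set) -> list:
    """Return the RNA messages made: all genes in the module, or none."""
    if operon.regulatory_tag in sugars_sensed:
        return [gene.name for gene in operon.genes]
    return []


print(transcribe(LACTOSE_OPERON, {"lactose"}))  # whole module switched on
print(transcribe(LACTOSE_OPERON, {"glucose"}))  # module stays silent
```

The design choice worth noticing is that the switch lives on the module, not on the individual genes, so the three messages can only appear together.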
Once again, we might return to our analogy to an English sentence. When Morgan had discovered gene linkage in 1910, he had found no seeming logic to why one gene was physically strung with another on a chromosome: the sable-colored and the white-eyed genes seemed to have no common functional connection, yet sat, cheek by jowl, on the same chromosome. In Jacob and Monod’s model, in contrast, bacterial genes were strung together for a reason. Genes that operated on the same metabolic pathway were physically linked to each other: if you worked together, then you lived together in the genome. Specific sequences of DNA were appended to a gene that provided context for its activity—its “work.” These sequences, meant to turn genes on and off, might be likened to punctuation marks and annotations—inverted quotes, a comma, a capitalized letter—in a sentence: they provide context, emphasis, and meaning, informing a reader what parts are to be read together, and when to pause for the next sentence:
“This is the structure of your genome. It contains, among other things, independently regulated modules. Some words are gathered into sentences; others are separated by semicolons, commas, and dashes.”
Pardee, Jacob, and Monod published their monumental study on the lactose operon in 1959, six years after the Watson and Crick paper on the structure of DNA. Called the Pa-Ja-Mo—or, colloquially, the Pajama—paper, after the initials of the three authors, the study was instantly a classic, with vast implications for biology. Genes, the Pajama paper argued, were not just passive blueprints. Even though every cell contains the same set of genes—an identical genome—the selective activation or repression of particular subsets of genes allows an individual cell to respond to its environments. The genome was an active blueprint—capable of deploying selected parts of its code at different times and in different circumstances.
Proteins act as regulatory sensors, or master switches, in this process—turning on and turning off genes, or even combinations of genes, in a coordinate manner. Like the master score of a bewitchingly complex symphonic work, the genome contains the instructions for the development and maintenance of organisms. But the genomic “score” is inert without proteins. Proteins actualize this information. They conduct the genome, thereby playing out its music—activating the viola at the fourteenth minute, a crash of cymbals during the arpeggio, a roll of drums at the crescendo. Or conceptually:
The Pa-Ja-Mo paper laid a central question of genetics to bed: How can an organism have a fixed set of genes, yet respond so acutely to changes in the environment? But it also suggested a solution to the central question in embryogenesis: How can thousands of cell types arise from an embryo out of the same set of genes? The regulation of genes—the selective turning on and off of certain genes in certain cells, and at certain times—must interpose a crucial layer of complexity on the unblinking nature of biological information.
It was through gene regulation, Monod argued, that cells could achieve their unique functions in time and space. “The genome contains not only a series of blue-prints [i.e., genes], but a co-ordinated program . . . and a means of controlling its execution,” Monod and Jacob concluded. Walter Noel’s red blood cells and liver cells contained the same genetic information—but gene regulation ensured that the hemoglobin protein was only present in red blood cells, and not in the liver. The caterpillar and the butterfly carry precisely the same genome—but gene regulation enables the metamorphosis of one into the other.
Embryogenesis could be reimagined as the gradual unfurling of gene regulation from a single-celled embryo. This was the “movement” that Aristotle had so vividly imagined centuries before. In a famous story, a medieval cosmologist is asked what holds the earth up.
“Turtles,” he says.
“And what holds up the turtles?” he is asked.
“More turtles.”
“And those turtles?”
“You don’t understand.” The cosmologist stamps his foot. “It’s turtles all the way.”
To a geneticist, the development of an organism could be described as the sequential induction (or repression) of genes and genetic circuits. Genes specified proteins that switched on genes that specified proteins that switched on genes—and so forth, all the way to the very first embryological cell. It was genes, all the way.III
Gene regulation—the turning on and off of genes by proteins—described the mechanism by which combinatorial complexity could be generated from the one hard copy of genetic information in a cell. But it could not explain the copying of genes themselves: How are genes replicated when a cell divides into two cells, or when a sperm or egg is generated?
To Watson and Crick, the double-helix model of DNA—with two complementary “yin-yang” strands counterposed against each other—instantly suggested a mechanism for replication. In the last sentence of the 1953 paper, they noted: “It has not escaped our notice that the specific pairing [of DNA] we have postulated immediately suggests a possible copying mechanism for the genetic material.” Their model of DNA was not just a pretty picture; the structure predicted the most important features of the function. Watson and Crick proposed that each DNA strand was used to generate a copy of itself—thereby generating two double helices from the original double helix. During replication, the yin-yang strands of DNA were peeled apart. The yin was used as a template to create a yang, and the yang to make a yin—and this resulted in two yin-yang pairs.
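The copying scheme Watson and Crick proposed can be sketched with the base-pairing rule alone (A with T, G with C). This is a minimal illustration that assumes the two strands are written aligned position by position and ignores strand direction; the example sequence and the names `complement` and `replicate` are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Build the partner strand, base by base, from a template strand."""
    return "".join(PAIR[base] for base in strand)


def replicate(helix):
    """Peel the two strands apart and use each as a template for a new partner."""
    yin, yang = helix
    daughter_one = (yin, complement(yin))    # old yin, newly made yang
    daughter_two = (complement(yang), yang)  # newly made yin, old yang
    return daughter_one, daughter_two


original = ("GATTACA", complement("GATTACA"))
first, second = replicate(original)
print(original)
print(first)
print(second)
```

Each daughter helix keeps one strand from the original and gains one freshly templated strand, and both come out identical to the parent.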
But a DNA double helix cannot autonomously make a copy of itself; otherwise, it might replicate without self-control. An enzyme was likely dedicated to copying DNA—a replicator protein. In 1957, the biochemist Arthur Kornberg set out to isolate the DNA-copying enzyme. If such an enzyme existed, Kornberg reasoned, the easiest place to find it would be in an organism that was dividing rapidly—E. coli during its furious phase of growth.
By 1958, Kornberg had distilled and redistilled the bacterial sludge into a nearly pure enzyme preparation (“A geneticist counts; a biochemist cleans,” he once told me). He called it DNA polymerase (DNA is a polymer of A, C, G, and T, and this was the polymer-making enzyme). When he added the purified enzyme to DNA, supplied a source of energy and a reservoir of fresh nucleotide bases—A, T, G, and C—he could witness the formation of new strands of nucleic acid in a test tube: DNA made DNA in its own image.