Arrival of the Fittest: Solving Evolution's Greatest Puzzle


by Andreas Wagner


  This work was analogous to building one of those flight simulators that are indispensable for training military and commercial pilots—by reproducing not only all the complex machinery of a cockpit, but also disturbances like turbulence and instrument failures. Similarly, John’s fly simulator collected mountains of data about the regulators of early fly embryos and how they regulate each other, encapsulated this information in equations, and simulated them in a computer. And like a good flight simulator, this one worked—not a small achievement. It can mimic the early development of fruit flies, and does so at enormously accelerated speed. It can be run over and over again, to tease out patterns that might be missed in isolated examples. And more than reenacting the choreography of a normal embryo, it can also simulate the plane crashes of regulator malfunction, and explain how mutant genes lead to deformed embryos.26 As I write these lines, John has devoted decades of his life to building this simulator, often in the face of ignorant and condescending peers. His dedication often crosses my mind when I am about to swat a fly. (Then I swat.)

  Beyond sharing a backbone and a spinal cord, the more than sixty thousand species of vertebrates, which include fish, mammals, amphibians, reptiles, and birds, have incredibly diverse bodies. This diversity is built on similar structures, however, because all vertebrates can trace their lineage back to a common ancestor that appeared more than five hundred million years ago. For example, the paired fins—one pair in front, one in the back—that help fish push and steer themselves through water gave rise to the arms and legs of animals that crawl, walk, jump, and run on land. And the forearms of some of these animals—dinosaurs—changed into the wings of birds.

  Limbs are key innovations of land-living vertebrates. They have three familiar parts, the upper arm and upper leg, the lower arm and lower leg, and the hands and feet. The major bones in our arms and legs correspond to arm and leg bones in horses, dogs, eagles, bats, pigs, crocodiles, and many other animals. Transform their sizes, as evolution did, and many specialized functions become possible, such as the slender limbs of horses custom-made for running, and the light bones of wings perfected for flying.

  Limbs—old and new—owe their existence to a family of regulators that are used in building the bodies of thousands of organisms, from jellyfish to humans.27 Though these regulators are essential for normal body development, the name for the genes that encode them—homeobox, or Hox, genes—comes from their role in homeosis, a process that creates malformed organisms when these genes are mutated, such as flies with (useless) legs sprouting from their heads in place of antennae. For better or for worse, changing life’s recipes can have dramatic effects.28

  The homeobox encodes a protein segment of sixty amino acids, the homeodomain, that binds DNA and allows Hox regulators to control gene expression. In organisms as different as fruit flies and humans, these regulators are lords over hundreds of other genes that give texture to cells, tissues, and organs. Hox regulators also regulate one another’s expression. That is, they form a regulator circuit like that of figure 16, but much more complex, because animals can have forty or more of them. This circuit shapes important parts of many bodies, including ours. Among these parts are the thirty-three vertebrae in our spinal column and their unique identities: seven vertebrae in our neck with flexible joints, twelve vertebrae in our thorax with attached ribs, and so on.

  The Hox gene circuit expresses different combinations of genes in the neck, thorax, and abdomen as our backbone develops in the womb. Each of these combinations is a gene expression code, an on-off pattern of Hox genes that is specific to each body region. One on-off pattern specifies neck vertebrae, another specifies thoracic vertebrae, and so on.

  Hox genes mold not just the human body but also the bodies of vertebrates like pythons and other snakes whose body plan—another ancient innovation—allows them to slither, burrow, and swim. Some snakes have more than three hundred vertebrae, most of them identical and rib-carrying like our twelve thoracic vertebrae. Hox genes are responsible for these differences between snakes and other animals: In most vertebrates, the Hox code for the thorax is expressed in only a small region of the embryo, but this region stretched like a rubber band when snakes evolved from lizards more than a hundred million years ago. The thoracic Hox code became expressed along most of the main body axis, and allowed snakes to build the hundreds of vertebrae that define their new body plan.29

  Hox genes shape not only the main axis of an organism’s body—the axis defined by the backbone in vertebrates—but evolution also co-opted them to mold another ancient innovation, the fins of fish.30 And it did not stop there. Over millions of years, evolution transformed these fins into limbs, by altering, refining, and differentiating the fins’ Hox code. Eventually it created a three-part code in organisms that walk or fly, a specific combination of Hox genes for the upper arm, another for the lower arm, and a third for the hand. We know all this from the effects of mutations that garble this code and appear as horrific birth defects that are well studied in animals. When two genes called Hoxa11 and Hoxd11 are not expressed while limbs develop, the results can be no lower arms at all, or a hand sprouting from somewhere near the elbow. Likewise, missing fingers or palms in a newborn can result from a failure to express two hand-specific Hox genes—Hoxa13 and Hoxd13. If the expression of a third group of Hox genes fails, only the upper arm will form.31

  Most of the time, though, Hox genes do their job very well. And they do it in an impressive number of locations, helping form structures from the pelvis to the vertebrate brain. They also help to build the bodies of organisms as different as shrimp, jellyfish, worms, and even fruit flies, where the Hox circuit is as important as the segmentation circuit. In fact, the two work sequentially. After the segmentation circuit establishes segment numbers, the Hox circuit specifies segment identities—which segments will carry legs, which ones wings, and so on. And these circuits are only two among many that flies and most animals use to build their bodies, and have used ever since the first animals emerged hundreds of millions of years ago.

  Hox gene circuits were instrumental in the origin of new body parts—like limbs—as well as new body plans, like those of snakes. How exactly these innovations originated may be forever lost in the mists of life’s deep history, but one principle is crystal clear: They originated through changes in regulation.

  The same principle is just as clear in other, more ancillary innovations.

  Imagine a slender lizard that weaves its way through a dense meadow hunting for its next meal when suddenly a huge pair of eyes stares into its face. It freezes, knowing that it will be torn to pieces in a moment. But then two wings flap, and the eyes are gone like a mirage. No predator was near, just two enormous colored spots on the wings of a tasty butterfly.

  The eyespots of butterflies are lifesaving bluffs, formed by an unusually versatile regulator protein called distalless.32 A member of a circuit that molds the legs, wings, and antennae of flies, distalless has also been co-opted to paint eyespots on the butterfly’s wings. We know that distalless is part of an eyespot-specific expression code, because developing butterfly larvae make distalless exactly where the eyespots will later form. Some butterflies have smaller eyespots, others larger eyespots, some have only one eyespot, others several. Regardless, developing butterflies unfailingly express distalless in the location of their eyespots. And distalless is really a cause of eyespots rather than just correlated with their appearance: If one transplants distalless-producing cells of a developing wing to different wing locations, development will paint an eyespot there.33

  The cathedral of a butterfly’s body is built by regulation, from the nave of its main segments to the gargoyles of its eyespots. So are bodies with completely different blueprints, like those of plants, with their roots, stems, flowers, and leaves. When flowering plants first originated more than two hundred million years ago, they had simple leaves whose blades were undivided and formed one continuous surface. Later, simple leaves gave rise to the innovation of a dissected leaf, where many small leaflets subdivide a leaf blade (figure 17).

  FIGURE 17. Leaf shapes

  Dissecting a simple leaf into leaflets offers more than one advantage. Dissected leaves have a greater surface area than simple leaves. They can absorb more carbon dioxide for photosynthesis, which allows plants to grow faster, and they can prevent leaves from overheating in hot environments, which can slow down photosynthesis and damage the leaf.34 If dissected leaves are so useful, we might expect them to appear more than once in evolution, and indeed they did: Dissected leaves arose more than twenty times in flowering plants alone.35

  Each time this innovation required changes in regulation. When a plant seedling germinates and pushes through the soil, a tiny speck of tissue at its very tip contains dividing cells that enlarge the seedling and push it upward. It is here that leaves begin to form. Before you can see a nascent leaf with the naked eye, a cluster of multiple cells—the leaf primordium—is already set aside around the tip to become a leaf. Cells in this primordium express a regulator protein called KNOX. When Angela Hay and Miltos Tsiantis from Oxford University manipulated this protein in the modest weed known as the hairy bittercress, which sports dissected leaves, they found how crucial this regulator is. By decreasing the amount of KNOX, they could reduce the number of leaflets down to one, creating a simple leaf. Increasing the amount of KNOX created leaves with more leaflets. Plus, they found that KNOX plays this role not just in the hairy bittercress but in several other plant species with dissected leaves.36

  These examples and hundreds more illustrate the power of regulation to innovate. The lab notebooks of thousands of researchers and the pages of dozens of scholarly journals are overflowing with research on regulators like KNOX in plants, distalless in butterflies, and engrailed in fruit flies. Our own genome encodes more than two thousand different regulators in dozens of separate circuits.37 A half century of research has told us how important regulation is to building bodies old and new. It has helped us to understand the natural history of many innovations and the new expression codes behind them.

  But a list of examples, however long, cannot go beyond that. Lizards’ limbs and fishes’ fins are shaped by different variants of Hox circuits—different circuit genotypes—that produce different expression codes. Identifying any one such circuit variant does not explain how evolution found the one whose expression code is best suited for a task. (If there are too many circuit variants, this could be impossibly hard.) What’s more, while circuits change little by little in evolution, useful expression codes need to be preserved before new and better ones are found. No list of examples, however long, could tell us how innovation through regulation is even possible.

  If the problem is familiar, so is the solution: Study not just one circuit but many, an entire library of circuit genotypes and their expression phenotypes. The texts in this regulation library are the DNA genotypes that encode regulators and the words they recognize. But writing them like that would be unnecessarily long and tedious, as if you described a house through the position of all its molecules, rather than by an architect’s blueprint. Much better to write them as wiring diagrams like those of figure 16.

  The entire library comprises all possible such circuits—all possible wiring diagrams. To compute its size we need to count these wiring diagrams. That may seem hard, but it is surprisingly easy. Any regulator in a circuit, call it A, can influence another regulator, B, in three principal ways. Regulator A can activate B, it can repress it, or it can have no effect. The same holds for any other pair, say, A and C, or D and E, in the circuit of figure 16. One can activate the other, repress the other, or have no effect on it. These are the only three options. This simple idea takes us almost all the way to counting all five-gene circuits. What’s left is to count the number of gene pairs. The circuit of figure 16 has 5 × 5 = 25 of them, each with three regulation options.38 To find the total number of circuits, we then need to multiply three for the first gene pair with three for the second gene pair with three for the third pair, and so on, for all twenty-five pairs. Three multiplied with itself 25 times yields 3^25, or more than 800 billion circuits.
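
  The counting argument can be checked in a few lines of Python. This is only a minimal sketch: the function name count_circuits is made up for illustration, and it assumes, as the text does, that every one of the 5 × 5 gene pairs independently takes one of three options.

```python
def count_circuits(num_genes: int, options_per_pair: int = 3) -> int:
    """Count wiring diagrams when each gene pair independently has a fixed number of options.

    Hypothetical helper for illustration; pairs are counted as num_genes x num_genes,
    following the text's 5 x 5 = 25.
    """
    num_pairs = num_genes * num_genes
    return options_per_pair ** num_pairs


five_gene_circuits = count_circuits(5)
print(f"{five_gene_circuits:,}")  # 847,288,609,443
```

  The result, a little over 847 billion, is the "more than 800 billion" of the text.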

  FIGURE 18. Two neighbors in the circuit library

  An impressive number. Five genes. More than 800 billion circuits. Especially since actual regulation circuits can have many more than five genes. The Hox gene circuits of vertebrates, for example, comprise some forty-odd genes.39 To count the number of circuits that these genes could form, we use the same idea: Compute the number of gene pairs (40 × 40 = 1,600), and then multiply the number 3 with itself 1,600 times. The magnitude of the resulting number has the ring of familiarity. It is greater than a 1 with 700 zeroes behind it, more zeroes than would fit on this page.
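
  The size of the forty-gene library can be verified directly, because Python integers have no fixed upper limit. The sketch below assumes exactly forty genes (the text says "forty-odd") and three options per gene pair; the variable names are illustrative only.

```python
GENES = 40                 # assumption: exactly forty genes
PAIRS = GENES * GENES      # 40 x 40 gene pairs

library_size = 3 ** PAIRS  # exact integer, three options per pair
num_digits = len(str(library_size))

print(PAIRS)                     # 1600
print(num_digits)                # 764
print(library_size > 10 ** 700)  # True
```

  A number with 764 digits is indeed greater than a 1 followed by 700 zeroes, so the estimate in the text is a conservative one.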

  But even that number, impressive as it is, doesn’t yet capture all circuits. So far, we assumed that all regulators were equally influential, turning their target genes either on or off. But remember the king’s cabinet of counselors. Some regulators can be weak, others strong, and that difference further increases the number of circuits: Any two genes might face not three but five possibilities, no regulation, weak or strong activation, weak or strong repression. In this case, we would need to multiply the number 5—not the number 3—by itself many times. And why stop there? We could distinguish ever-finer gradations of activation or repression leading to ever-increasing numbers of possible circuits.40 Fortunately, research in my laboratory has shown that these finer gradations of influence don’t change the library’s organization—a good thing, since even three gradations create enough circuits to fill yet another hyperastronomical library.

  The circuit library and its genotype texts have much in common with the metabolic library and the protein library we encountered earlier. Clip or add a wire in a circuit through DNA mutations (remember that these “wires” are not made of metal, but symbolize regulatory connections between two genes that can be altered through DNA mutations) and you create one of the circuit’s neighbors, like that on the right of figure 18, where gene B no longer regulates gene D (see the thick black arrow in the left circuit). Each circuit has many such neighbors, more than three thousand for a circuit with forty genes. If we arrange all circuits on the corners of a hyperdimensional cube, one circuit per corner, then stepping away from a circuit is like moving along an edge of this hypercube to the next corner.41 And many edges lead away from each circuit, because this hypercube also has many dimensions, sixteen hundred for circuits of forty genes. It has even more corners, more than 10^700 of them, the number of texts in the entire library of forty-gene circuits.42
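
  The neighbor count can be reproduced by writing a circuit as a square table of numbers and changing one entry at a time. The encoding below (-1 for repression, 0 for no regulation, +1 for activation) and the function name neighbors are assumptions made for this sketch, not notation from the text.

```python
from typing import Iterator, List

# Assumed encoding: circuit[i][j] is the effect of gene j on gene i.
# -1 = repression, 0 = no regulation, +1 = activation.
Circuit = List[List[int]]
STATES = (-1, 0, 1)


def neighbors(circuit: Circuit) -> Iterator[Circuit]:
    """Yield every circuit that differs from `circuit` in exactly one interaction."""
    for i, row in enumerate(circuit):
        for j, current in enumerate(row):
            for new_state in STATES:
                if new_state != current:
                    mutant = [r[:] for r in circuit]  # copy, then change one wire
                    mutant[i][j] = new_state
                    yield mutant


unwired = [[0] * 40 for _ in range(40)]    # forty genes, no wires at all
print(sum(1 for _ in neighbors(unwired)))  # 3200
```

  Each of the 1,600 gene pairs can be switched to either of its two other options, which gives 2 × 1,600 = 3,200 neighbors whatever the starting circuit.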

  As in the other two libraries, each circuit on each corner has a neighborhood that includes all texts on nearby shelves—the circuits that differ from it in only one or a few wires. Evolution can easily explore this neighborhood in a few steps, DNA alterations that change as little as one DNA word, and create or destroy regulation between two genes. Walk beyond this neighborhood and you encounter circuits that are ever more distant—another familiar concept. Here, distance is the number of wires by which two circuits differ. Neighboring circuits are closest, and farthest apart are two circuits that do not share a single wire. They are texts in opposite corners of the hypercube library.
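
  Distance in this library is easy to compute once circuits are written as tables. The sketch below assumes the same illustrative encoding (-1 for repression, 0 for no regulation, +1 for activation); circuit_distance is a hypothetical name.

```python
from typing import List

Circuit = List[List[int]]  # assumed encoding: -1 repress, 0 none, +1 activate


def circuit_distance(a: Circuit, b: Circuit) -> int:
    """Number of gene pairs whose regulatory interaction differs between two circuits."""
    return sum(
        x != y
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


circuit_a = [[0, 1, 0], [0, 0, -1], [1, 0, 0]]
circuit_b = [[0, 1, 0], [0, 0, 0], [1, 0, 1]]  # one wire clipped, one wire added
print(circuit_distance(circuit_a, circuit_b))  # 2
```

  Neighbors are at distance 1, and two circuits that disagree on every gene pair are as far apart as the library allows.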

  Many circuit genotypes will be as meaningless as a random string of English letters. Others may encode meaningful words or sentences, even though the text as a whole may be incoherent, or even destructive like the mutant Hox circuits that create crippled arms without hands. The language of meaningful texts is once again a chemical language, that of gene regulation and expression codes that cells and tissues understand. It ultimately manifests itself in a backbone, a leaf, or a hand, each one a parcel of meaning embodied in flesh.43 And when evolution creates new embodied meaning, it does so through the kinds of mutations that turn a simple leaf into a dissected leaf.44

  A circuit’s meaning is expressed through the elaborate choreography of gene regulation I described earlier. Starting from a pattern of expressed regulators, like the one a fly imposes on its egg through its chemical signals, circuit genes regulate each other and change this pattern. Genes twinkle on and off until a circuit finds an equilibrium resembling the human sculptures that troupes of circus acrobats build with their bodies. In such a sculpture, the acrobats are in a stationary equipoise where the push of one body equals the pull of another, and where the structure would collapse if only one acrobat were to let go.

  After many years of research, we have learned enough about this kind of regulation to compute this equilibrium, as John Reinitz showed with his fly simulator.45 This means we are ready to read not just one, not a few, but millions of circuits. We can map an entire hyperastronomical library of them.
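
  To give a feel for what computing an equilibrium means, here is a deliberately simplified sketch. It is not the fly simulator described above. It assumes that each gene is either on (1) or off (0), and that a gene is on in the next step only if the activation it receives from genes that are currently on outweighs the repression. The update rule, the function names, and the example wiring are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

State = Tuple[int, ...]


def step(wiring: Sequence[Sequence[int]], state: State) -> State:
    """One round of regulation: gene i turns on if its net input is positive."""
    return tuple(
        1 if sum(w * s for w, s in zip(row, state)) > 0 else 0
        for row in wiring
    )


def find_equilibrium(
    wiring: Sequence[Sequence[int]], initial: State, max_steps: int = 100
) -> Optional[State]:
    """Iterate until the expression pattern stops changing; None if it never settles."""
    state = tuple(initial)
    for _ in range(max_steps):
        next_state = step(wiring, state)
        if next_state == state:
            return state
        state = next_state
    return None


# wiring[i][j]: effect of gene j on gene i (+1 activate, -1 repress, 0 none)
wiring = [
    [1, 0, 0],   # gene 0 keeps itself on
    [1, 0, 0],   # gene 1 is activated by gene 0
    [0, -1, 1],  # gene 2 activates itself but is repressed by gene 1
]
print(find_equilibrium(wiring, (1, 0, 1)))  # (1, 1, 0)
```

  In this toy circuit, gene 1 switches on and then shuts gene 2 off, after which nothing changes any more. That final on-off pattern plays the role of the circuit's expression phenotype.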

  We already know that this library contains unimaginably many circuits, but the number of their expression codes is not for the fainthearted either. If each gene in a forty-gene circuit can only be on or off, it contributes two possibilities to a gene expression pattern. To calculate the total number of possible expression patterns, we need to multiply two with itself, as many times as there are genes, to arrive at 2^40 possible phenotypes. This number is already greater than a trillion, but it grows much larger if we consider that a gene can be more than just on or off: it can express a small, medium, large, or very large amount of regulator protein. What’s more, several circuits often cooperate to shape any one body part, which multiplies the number of possible expression codes.46 Compared to the number of these possible meanings, the few hundreds of cell types and tissues of a complex body like ours are paltry. Even if we allow that all cells in a body must be laid out with precision in space, there are plenty of expression codes to go around, perhaps too many to find any one of them.
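
  The number of expression patterns follows the same multiplication rule, applied to genes rather than gene pairs. In this sketch the function name is hypothetical, and the five-level case (off plus four amounts of regulator) is one possible reading of the finer gradations mentioned above, not a figure from the text.

```python
def count_expression_patterns(num_genes: int, levels_per_gene: int = 2) -> int:
    """Number of expression patterns when each gene independently takes one of several levels."""
    return levels_per_gene ** num_genes


on_off_patterns = count_expression_patterns(40)
print(f"{on_off_patterns:,}")  # 1,099,511,627,776

# Assumption for illustration: off, small, medium, large, very large
graded_patterns = count_expression_patterns(40, levels_per_gene=5)
print(graded_patterns > on_off_patterns)  # True
```

  Even the plain on-off count is just over a trillion.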

 
