Other extremes too have no known ultimate limits. In the crust of the Earth and deep in the oceans, high pressures can contract and constrict the molecules of the cell, yet life is found at the bottom of the Mariana Trench, some eleven kilometers deep in the oceans, where pressures are a thousand times higher than at sea level. In the crust of the Earth, deep underground, life thrives. Adaptations are found to deal with these elevated pressures. Pores and transporters across the cell membrane help expel waste and take up nutrients; proteins are modified in these so-called barophiles, or pressure lovers. As you dig down into the crust, temperature will most likely limit life before pressure does, the geothermal gradient exposing life to prohibitive heat before the effects of pressure can immobilize a cell.
Here again we have much to learn. In the absence of the temperature problem, does life succumb to some pressure limit? The problem with high pressure is that it indirectly affects many other things, such as the solubility of gases and the behavior of fluids. Whether life at pressure extremes would ultimately give in because of the direct effects on the cell, or because the changing behavior of its milieu might starve it of nutrients or energy, is a matter still to be fully studied.
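The "thousand times" figure for the bottom of the Mariana Trench can be checked with a back-of-the-envelope hydrostatic calculation. The sketch below assumes a constant seawater density of 1,025 kilograms per cubic meter and standard gravity, and ignores the slight compression of water at depth; these values and the function name are illustrative assumptions, not figures from the text.

```python
# Rough hydrostatic estimate of pressure at depth, relative to sea level.
# Assumptions (not from the text): constant seawater density, standard
# gravity, and one standard atmosphere at the surface.
RHO_SEAWATER = 1025.0   # kg per cubic meter, assumed constant with depth
GRAVITY = 9.81          # meters per second squared
ONE_ATMOSPHERE = 101325.0  # pascals

def pressure_ratio(depth_m):
    """Return total pressure at depth_m divided by sea-level pressure."""
    hydrostatic = RHO_SEAWATER * GRAVITY * depth_m  # weight of the water column
    return (ONE_ATMOSPHERE + hydrostatic) / ONE_ATMOSPHERE

if __name__ == "__main__":
    # Eleven kilometers, the depth quoted for the Mariana Trench.
    print(f"about {pressure_ratio(11_000):.0f} times sea-level pressure")
```

Under these assumptions the ratio comes out at roughly eleven hundred, consistent with the order of magnitude quoted for the trench.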
Among the plethora of other extremes, one extreme might pose a hard limit to life. Ionizing radiation, like high temperatures, imparts energy to biological molecules and damages or destroys them. We know that life can resist the effects of radiation. Molecules such as DNA can be repaired where the strands are damaged. Proteins can be constructed again, and some pigments, such as the carotenoids, can quench the reactive oxygen states produced by the radiation as it slams into water. Life has an armory of responses for dealing with molecular damage caused by radiation, and when these responses are assembled into a single microbe, the results can be impressive.
The humble Chroococcidiopsis, a cyanobacterium that lives in rocks in the world’s deserts, can take a dose of about fifteen kilograys, about a thousand times the dose that would kill a human. This microbe joins the ranks of Deinococcus radiodurans, another bacterial life form that, through repair and damage mitigation, can hold off ten kilograys of radiation or more.
There must be an upper limit to radiation. Assault a cell with enough of this form of energy, and the ability to repair and make new molecules will be overwhelmed in a similar way to the destruction caused by high temperatures. On our planet, where only a few natural and artificial environments expose life to sustained high doses of radiation, excesses of radiation have likely not confronted evolution with a hard limit as often as, say, extremes of temperature have done. Nevertheless, we might imagine that such a boundary exists.
The biosphere is like a zoo, surrounded by a wall. Within it, all manner of living things, minuscule and giant, have evolved, guided by laws into predictable forms. Restrictive though these rules are, they permit the burgeoning of an experiment in biological complexity that is extraordinarily diversified in its minutiae. However, the biosphere’s potential is ruthlessly curtailed by the tough perimeter surrounding the zoo. Some of these limits are probably universal. No evolutionary roll of the dice can overcome a lack of a solvent within which to do biochemistry or the energetic extremes of high temperatures. The details, the temperature sensitivity of this and that protein, may well modify the exact transition between the living and the dead, particularly for individual life forms, maybe for life as a whole. But in broad scope, life’s boundaries, the insuperable laws of physics, establish a solid wall that bounds us all together.
This zoo is by no means expansive. From a cursory survey of the kaleidoscope of life on Earth, it is easy to think of life’s diversity as endless, and its small variations are. The physical space that life occupies at the planetary scale, and the physical and chemical conditions it can adapt to, within the vast range of conditions found across the known universe, are petite. We live in a diminutive bubble, circumscribed by universal extremes, within which the restricted trajectories of evolution explore their reach.
CHAPTER 7
THE CODE OF LIFE
“We have found the secret of life!”
These immortal words were uttered in The Eagle, a pub on Free School Lane in Cambridge, England, the day the structure of the genetic code, DNA, was discovered.
I suspect what was actually said was, “Jim, what are you having?” “Oh, a pint of lager please, Francis.” “OK. That’ll be a pint of lager and a Guinness, please, and two packets of pork scratchings.” Or something like that.
Anyway, far be it from me to ruin the romantic dreams of the credulous. But I have little hesitation in saying that in February 1953, when James Watson and Francis Crick, with inspiration from X-ray images made by Rosalind Franklin, proposed a structure of DNA, a monumental step forward was made in deducing the centerpiece of life. This molecule is the code that contains the instruction manual from which living things on Earth are made: the cipher of the cell.
When the secrets of this molecule were unraveled, it was not surprising that those who surveyed its features would, for many years to come, consider DNA a freak, a chance product of evolution—a molecule whose structure was special. Its architecture seemed so unlikely that if anyone was asked whether such a molecule could evolve on another planet, they would probably have answered that it wasn’t impossible, but that an event of such low probability would be astounding. In an early paper on the evolution of the genetic code, Crick himself described it as a “frozen accident” that occurred at the birth of life, got fixed in the very bedrock of living things, and, once there, could not be displaced without a catastrophic effect on the cell, probably causing its death. Once such a crucial code and its entourage of structures to read it were in place, the smallest error or alteration would be fatal. This view is compelling, but seems increasingly unlikely.
At this next level down in the hierarchy of life, from cells to the molecules that encode and fabricate their form, new light has been shed on evolution’s choices. Here too we have begun to see the indelible mark of physical principles operating through the chemistry of living things, channeling and cajoling the code of life into an edifice that seems to have very much more to it than the mere quirks of chance.
Take this molecule, a double helix. Unwind it flat, and place it down on the table in front of you, magnified. On your left and right are the two backbones running down the table toward you. Made of repeating units of phosphate and deoxyribose (a simple sugar), these two chemical struts of the DNA ladder hold the whole molecule together. Between them, running down through the middle, are the guts of the machine, the rungs of the ladder. Sequentially attached to the backbone on the left and right is the four-letter alphabet of the genetic code, made up of adenine (A), thymine (T), cytosine (C), and guanine (G). Strung along, one by one in endlessly varied combinations, these nucleobases spell out the information that the cell will read to grow, repair, and build copies of itself.
These four little molecules are peculiar, for they bind to other members of the group in a very specific way. An A can only bind with a T, and the C only with a G, and vice versa, to form a pair. As the letters of the code have this very fussy binding preference, if there is an A on the left, then there will be a complementary T on the right, and so on. Traversing through the center, attached to the two backbones of DNA on either side, are these base pairs, A’s and T’s linked together, spiraling down the center of the double helix, intermixed with pairs of C’s and G’s.
Right here, we have the first suggestion of something strange. Few molecules in nature have this very particular ability to lock onto another to form such a small and tight little family of structures. A freak accident, it seems.
This apparently odd property was not lost on Watson and Crick, who observed in their paper describing the assembly of DNA, “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.” Pull the two chains of the DNA apart down the middle, and now it is a relatively simple matter for the cell to produce two copies of DNA, since the one single strand can be used to resynthesize the other one. It knows that any A must be bound to a T, and C to a G. So two single strands can be used to make two new double-stranded DNA molecules.
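The copying mechanism described here can be sketched in a few lines of Python. This is a toy illustration only: the function names and the example sequence are made up, and the sketch ignores the fact that the two strands of real DNA run in opposite chemical directions.

```python
# Toy model of template-directed copying using the pairing rules in the
# text: A pairs with T, and C pairs with G.
# Simplification (an assumption of this sketch): strand orientation is
# ignored, so each strand is read left to right, base by base.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the partner strand implied by the pairing rules."""
    return "".join(PAIRS[base] for base in strand)

def replicate(left, right):
    """Pull a duplex apart and rebuild a partner for each single strand."""
    first_copy = (left, complement(left))     # old left strand, new right strand
    second_copy = (complement(right), right)  # new left strand, old right strand
    return first_copy, second_copy

if __name__ == "__main__":
    left = "ATGCCGTA"           # hypothetical example sequence
    right = complement(left)
    for duplex in replicate(left, right):
        print(duplex)
```

Because each base admits only one partner, both daughter duplexes come out identical to the parent, which is the point of Watson and Crick's remark.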
At the core of the code are those four chemicals, the four letters of the alphabet, A, T, G, and C. Surely it is just chance that the code has a magic number of four? Why not two, six, or eight?
Scientists think that long before the emergence of life, the world was monopolized by RNA, the close relative of DNA. To this day, RNA acts as the intermediate between the code in DNA and functioning proteins. Slightly more reactive and less stable than DNA, the RNA molecule has the remarkable ability to fold in on itself and, like proteins, to make active molecules that can catalyze chemical reactions and even replicate themselves. In this “RNA world” over four billion years ago, self-replicating molecules were dominated by RNA, with proteins attached here and there. Eventually, the sequence of letters in RNA, by some trick we do not yet know, is thought to have been coded into the more stable DNA molecule that today stores genetic information during the cell cycle.
Imagine a genetic code with a two-letter alphabet, say just C and G, the entire genetic code just a long Morse-code-like iteration of these two chemicals. In the RNA world, these two bases could bind as they do today to form C-G base pairs, allowing for the RNA molecule to fold and make complex shapes that can multiply and carry out chemical reactions. However, the binding is not very particular. Each base has a chance to bind with 50 percent of the other bases in the code (assuming they are split fifty-fifty in abundance, C’s might bond to any G’s and vice versa), making the association between bases rather unfussy. Now add another two bases—A and U (uracil, which replaces thymine in RNA)—to make four, and we have something that can do more-complex binding tricks and contains more information, more complexity. Each base can now link only with 25 percent of bases, and the structures can be more refined. Pairing becomes fussier and the molecules more intricate. In essence, the more bases there are, the more information you can have in a molecule or the shorter the molecules can be with the same information.
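The trade-off between alphabet size and molecule length can be made concrete with a little arithmetic. The sketch below assumes that all bases are equally abundant and that each base has exactly one kind of partner, as in the paragraph above; the function names and the 1,000-bit message size are illustrative choices, not values from the text.

```python
import math

# Information and pairing arithmetic for alphabets of different sizes.
# Assumptions of this sketch: bases are equally abundant, and each base
# pairs with exactly one other kind of base.

def bits_per_base(alphabet_size):
    """Information carried by one position, in bits."""
    return math.log2(alphabet_size)

def bases_needed(total_bits, alphabet_size):
    """Shortest molecule, in bases, that can hold total_bits of information."""
    return math.ceil(total_bits / bits_per_base(alphabet_size))

def partner_fraction(alphabet_size):
    """Fraction of all bases that a given base can pair with."""
    return 1 / alphabet_size

if __name__ == "__main__":
    for size in (2, 4, 6, 8):
        print(
            f"{size} letters: {bits_per_base(size):.2f} bits per base, "
            f"{bases_needed(1000, size)} bases for 1000 bits, "
            f"pairs with {partner_fraction(size):.0%} of bases"
        )
```

Going from two letters to four doubles the information per base and halves the length needed for the same message, while the fraction of possible partners falls from 50 percent to 25 percent. The sketch says nothing about replication errors, which are what limit larger alphabets.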
However, push that number beyond four, say to six or eight, and you have more information, but now there are other problems. As you add more bases, it gets more difficult to find ones that are sufficiently dissimilar to make it easy for them to be distinguished when the molecule replicates. One consequence is that the rate of errors is usually higher and mismatches are more common when the code multiplies. Computer modeling of these early replicating entities suggests that like Goldilocks making a genetic code, the number of bases in the real code, four, is just right.
Other lines of evidence have converged on the same conclusions. Studies using computer models of RNA molecules virtually multiplying and changing have found that of the various numbers of nucleobases possible, the use of four gives the molecule the greatest fitness and the greatest ability to evolve.
Like many such ideas, these flounder somewhat because we do not have a time machine. Did these molecules replicate as we imagine they did on early Earth? Was there an RNA world, and was it really as we imagine? No one would claim that we have a conclusive answer, but there is something uncanny about the outcome of these different tests: when we do them, we do not arrive at the startling realization that life on Earth probably has it all wrong. We do not discover that life would be much more effective and more likely to have evolved more efficiently if it had a code with a different number of bases. Instead, we keep coming back to what we observe in the structure of our own biology.
This conclusion does not rule out the possibility of frozen accidents, chance evolutionary paths locked into the structure of early life that, once there, could not easily be altered. Furthermore, the findings rely on a dim and distant past. Many of the hypotheses for why the code has four letters are rooted in a supposition about an RNA world, a four-letter advantage with its heyday in a world now long since gone. Despite these limits in our knowledge, the research suggests nonrandomness. The code of life and the way it is read are probably not mere contingency, one of a plenitude of possible paths. Instead, many of its routes and diversions, experiments and errors, ultimately led it to a structure that is predictable and congruent with physical processes and rules we are beginning to fathom.
The number four may have some significance, but surely the chemicals themselves could be anything? Surely what matters is that they are different and therefore by simply stringing them along a chain in different combinations that can be read, we can produce a diverse code with many permutations of “letters” to build the things needed to construct a life form?
Since the turn of the century, extraordinary advances have been made in modifying the natural genetic code. Motivated by a desire to “expand the alphabet of life,” as it is sometimes referred to, synthetic biologists can produce genetic codes with more than four letters. A larger alphabet would allow them to pack more information into the code (accepting that this increase can lead to more errors in replication) and to experiment with making cells that would produce new drugs and other useful products. This motivation has forced synthetic biologists to try to discover how the structure of the code evolved and whether different chemistries are possible. Could the code be something else among many possibilities?
Laboratory investigations with alternative bases that share similar chemical structures with the known code but with slightly different configurations of atoms have thrown up a series of other potential choices. The unwieldily named xanthosine and 2,4-diaminopyrimidine couple is one such base pair. Isoguanine and isocytosine, which share the same chemical formula as the traditional bases of guanine (G) and cytosine (C), but which have some of their atoms flipped into different positions, form yet another. Some isoguanine and isocytosine can even be incorporated into cells that can be deceived into replicating with these alternatives added into their DNA.
Experiments like these show that nature can use different codes, but to explain why nature picked the bases it did, researchers need to methodically try out all sorts of chemicals. Scientists at various institutions, from the Scripps Institute for Chemical Biology and Harvard University in the United States to the Eidgenössische Technische Hochschule in Switzerland, have painstakingly looked at pairing in many possible bases in RNA. Their work is like a journey across a landscape of chemicals, prodding around to see whether going in different directions makes any difference to pairing.
They tried to make RNA out of bases made of hexopyranoses, which share a chemical similarity to our familiar bases, but hexopyranoses are built on a six-membered ring and not a five-membered ring and are slightly larger. This greater size hinders them from forming a proper pair. Only in one instance, when some of the chemical groups (specifically, -OH groups) were removed from one of the rings, could base pairing occur, but this is not a likely natural chemical to be found in a genetic code. With this result alone, the research shows that the four letters chosen by life are not random and that the makeup and arrangement of atoms play an important part in how a genetic code can be assembled. Make the molecules too big, and they will not pair up.
On the chemists slogged, into wider territory. When they made isomers of RNA, where the chemical structure was the same but the chemical groups were attached to different positions, such as the laboriously named pentopyranosyl-(2’→4’) systems, the researchers made new base pairs that worked. Somewhat remarkably, some pairings are even stronger than those found in natural RNA. Does this suggest that here, in a backwater of the nucleic acid world, was an undiscovered set of compounds that would make better bases than the ones taken up by life, a small oasis of chemistry more propitiously placed to provide the key components of a genetic code?
One feature of nucleic acids is flexibility and the possibility of opening and closing base pairs to replicate or read the code into protein. In the hypothetical RNA world, base pairs had to be sufficiently strong so that a molecule would remain folded into the right structure, but sufficiently weak that it can be flexible, allowing for that folding in the first place. In this outpost of structures that have stronger binding than natural RNA, we probably have molecules that are too inflexible. Consequently, RNA might not have been made better had it used these other bases; its structure and choice of bases reflect an optimization of base pairing and not a maximization of the strength of base pairing.
Synthetic biologists will doubtless have much more to say about the choice of life’s genetic alphabet as they continue to try new chemical arrangements. This work will ultimately shed light on the fundamental and ancient choices made by evolution in building an information storage system. But for now, we can say that the choice of chemicals in this code seems to be determined by simple physical mechanisms.
Onward with our exploration, we might wonder whether reading the code into something useful entails more chance, less room for predictable physical channels. The first stage of reading the genetic code is to render a complementary copy of the DNA in RNA. Not surprisingly, this strand is called messenger RNA, for its task is to be the proverbial messenger, a complementary copy of the long DNA code that can be carried away and turned into the final product, protein. This RNA copy of the DNA code is synthesized by RNA polymerase, a large enzyme that ratchets along the DNA, binding bases together to slavishly create our tentacle-like messenger.
Along the length of this messenger RNA, bits of yet another RNA molecule assemble. These bits, called transfer RNAs, bring to the strand their little cargos of amino acids, the component blocks of proteins. Each transfer RNA has its own special amino acid, and each prefers to bind to a very particular part of the code.
The transfer RNA must bind to three letters of the code, a so-called codon. As the transfer RNAs shuffle and snuggle up to each group of three consecutive letters along the messenger RNA strand, their passenger amino acids come into contact and bind to one another to make a chain of amino acids. From this machinery, itself protected by the ribosome, a collection of giant RNA structures, a strand of amino acids appears, like a snake emerging from a hole. Once released from the ribosome, the long chain of amino acids will spontaneously fold together in intricate contortions. A protein has been formed. This newborn molecule is ready to carry out a chemical reaction, participate in building a membrane, or do one of the numerous things required to build a self-replicating life form.
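The two reading steps described above, copying DNA into messenger RNA and then reading the messenger three letters at a time, can be sketched as follows. The codon table is a small, deliberately incomplete excerpt of the standard genetic code; the template sequence and the function names are made up for illustration, and the sketch ignores strand direction, the ribosome, and protein folding.

```python
# Toy model of reading the code: DNA template -> messenger RNA -> amino acids.
# The table below is only a small excerpt of the standard genetic code;
# None marks a stop codon. Everything else here is an illustrative
# assumption, including the example template.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "UGG": "Trp",
    "UAA": None, "UAG": None, "UGA": None,
}

# Pairing used when RNA is built on a DNA template (U stands in for T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_dna):
    """Build the complementary messenger RNA, base by base."""
    return "".join(DNA_TO_RNA[base] for base in template_dna)

def translate(mrna):
    """Read the messenger codon by codon until a stop codon appears.

    Raises KeyError for codons missing from the excerpted table.
    """
    chain = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid is None:  # stop codon: release the chain
            break
        chain.append(amino_acid)
    return chain

if __name__ == "__main__":
    template = "TACAAACCGCGAACCATT"  # hypothetical template strand
    messenger = transcribe(template)
    print(messenger)
    print("-".join(translate(messenger)))
```

Each transfer RNA in the text corresponds to one lookup in the table: three letters in, one amino acid out, with the chain growing until a stop codon is reached.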