Oxygen
As far as our story of LUCA’s identity is concerned, the movement of genes from free-living bacteria into eukaryotes has a profound impact on how we must view the web of genetic relations between living things.
Clearly, the nuclei of eukaryotic cells contain bacterial genes abstracted from mitochondria. Any attempt to trace the earliest genetic heritage of eukaryotes on the basis of these genes would be misleading: they are a late graft rather than an ancestral trait of the eukaryotes. But in many respects the mitochondrial genes are easy to track. At least we know their context and their function. What we don’t know is how many of the rest of the genes in the eukaryotic nucleus were once subsumed in this manner; or indeed, how to tell which ones they are. This is the general problem posed by lateral gene transfer — the movement of genes from one organism into another by a means other than by direct inheritance.3 If genes circulate with the freedom of money in an economic union, it becomes virtually impossible to trace the descent of an organism — it may have inherited its genes vertically from its own ancestors, or laterally from an unrelated species. The further back we go in time, the more twisted and obscure this web becomes.
3 The phenomenon of lateral gene transfer appears to be relatively common among bacteria over evolutionary time. As well as exchanging genes with close relatives by conjugation, bacteria in general are able to take up pieces of DNA from their environment, and occasionally these will become incorporated into their own DNA.
Last Ancestor in an Age Before Oxygen • 155
In the late 1960s, the web of genetic relatedness between organisms came to obsess a young researcher at the University of Illinois, a biophysicist turned evolutionary biologist by the name of Carl Woese. Woese recognized that if entire genomes could be sequenced, the ‘average’ relatedness of different species might still shine through the superimposed layers of lateral gene movement. At the time, however, sequencing such a massive number of genes was not feasible. What was needed instead was a single gene that could be relied upon to have stayed put: a gene that would not be transmitted sideways, but only vertically to the next generation. The fate of such a gene would be linked irrevocably with individual lineages, allowing, in principle, a grand reconstruction of all evolution.
This rare gene would also need to be highly resistant to change. The problem here is that the sequence of ‘letters’ in a gene gradually changes over evolutionary time, as a result of random mutations that change, insert or delete letters. Most genetic mutations that affect the protein or RNA product of the gene are harmful, but some are ‘neutral’, that is, they have no effect on the production or function of the gene’s product, and a few are beneficial. As neutral or beneficial changes are not penalized by natural selection, they can accumulate over time. The outcome is that if you look at the ‘same’ genes from two species that have diverged from a common ancestor, their sequences will differ. In theory, the more closely related the species, the less the sequences will differ, as there has been little time for mutation to occur, whereas distantly related species will have more differences in sequence.
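The idea of measuring how far two copies of the ‘same’ gene have drifted apart can be made concrete with a few lines of Python. What follows is only a minimal sketch: the function name `percent_difference` and the three short sequences are invented for illustration and are not real gene data. It also assumes the sequences have already been aligned to the same length, so that insertions and deletions are ignored and only changed letters are counted.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count the positions where the two letters are not the same.
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Invented toy sequences of 20 letters each (hypothetical, not real genes).
ancestor = "ATGGCCAAGTTGACCGGTAA"
close_relative = "ATGGCCAAGTTGACCGGTAT"    # 1 letter changed
distant_relative = "ATGACCAAGTTCACGGGTTC"  # 5 letters changed

if __name__ == "__main__":
    print(percent_difference(ancestor, close_relative))    # 5.0
    print(percent_difference(ancestor, distant_relative))  # 25.0
```

The closer relative differs at one position in twenty and the more distant one at five in twenty, which is the pattern the argument relies on: fewer differences where there has been less time for mutations to accumulate.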
For example, the genes encoding the oxygen-carrying haemoglobins have diverged at a rate of about 1 per cent every 5 million years. This means that close relatives, which diverged only recently, have similar haemoglobin sequences, whereas distant relatives have quite different haemoglobins. Similar patterns apply to other essential and widely shared genes, such as that for the respiratory protein cytochrome c. Our gene for cytochrome c is approximately 1 per cent different from that of chimpanzees, 13 per cent different from that of kangaroos, 30 per cent different from that of tuna fish and 65 per cent different from that of the fungus Neurospora. Clearly, at this rate, the gradual drift of sequences may result in the complete loss of any sequence similarity between genes over billions of years, even if they do share a common ancestor.
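The arithmetic behind this ‘clock’ can be sketched in Python. The only figure taken from the text is the haemoglobin rate of about 1 per cent every 5 million years; the function names are hypothetical, and the strictly linear model (with a simple ceiling at 100 per cent) is an assumption made for illustration. Real sequences do not diverge at a perfectly steady rate, and different genes tick at different speeds, so the cytochrome c percentages quoted above should not be fed into the haemoglobin rate.

```python
# Rate quoted in the text for haemoglobin genes: about 1 per cent of
# sequence difference for every 5 million years since two species split.
MILLION_YEARS_PER_PERCENT = 5.0


def expected_difference(million_years: float) -> float:
    """Per cent difference expected after a given time, capped at 100."""
    if million_years < 0:
        raise ValueError("time must not be negative")
    # Assumed linear clock: difference grows in proportion to time.
    return min(100.0, million_years / MILLION_YEARS_PER_PERCENT)


def time_since_split(percent_difference: float) -> float:
    """Million years implied by an observed per cent difference."""
    if not 0.0 <= percent_difference <= 100.0:
        raise ValueError("percent_difference must lie between 0 and 100")
    return percent_difference * MILLION_YEARS_PER_PERCENT


if __name__ == "__main__":
    for myr in (5, 100, 500, 3000):
        print(f"{myr:>5} million years -> {expected_difference(myr):5.1f} per cent")
    print(f"10 per cent difference -> {time_since_split(10.0):.0f} million years")
```

Under these assumptions the clock runs out after 500 million years: every position has, on paper, had time to change. That is the sense in which a gene drifting at this rate is of little use for reconstructing events billions of years old.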
156 • LOOKING FOR LUCA
Some DNA sequences drift faster than others. The fastest changes take place in junk DNA, as these sequences do not code for anything and so are not subject to the restraining influences of natural selection. On the other hand, a few genes are so central to the life of the cell, as structurally important as a cantilever, that almost any tampering is detrimental. As any cell is likely to pay with its life for such changes, the ‘cantilever genes’ are the least likely to drift. Changes are almost never passed to the next generation because almost all the affected cells die. Even so, very rarely, a change will occur that is not penalized by natural selection. Changes in such genes in different species would accumulate very slowly over billions of years, and could be used to produce a branching tree of relationships that preserves a record of the earliest evolutionary patterns.
Do such genes exist? Woese reasoned that cells depend on a supply of building materials in the same way that a society depends on a supply of bricks and steel to build schools, factories and hospitals. Just as society would quickly grind to a halt if no building materials were available, Woese argued that life is unthinkable without proteins or the DNA code to ensure the subtlety and continuity of protein function. Protein synthesis must therefore be one of the most ancient and fundamental aspects of life, so it is no surprise to find that the pathways of protein synthesis are deeply embedded in the workings of a cell. As any changes in the genes controlling protein synthesis are highly likely to be fatal, these genes, more than any others, are likely to have been present in LUCA, to be very stable, accumulating relatively few genetic changes over time, and to be unlikely to move around the gene pool by lateral gene transfer.
We have seen that proteins are built on ribosomes. Ribosomes themselves are made from a mixture of proteins and yet another form of RNA, called ribosomal RNA. Both the proteins and the ribosomal RNA are encoded by DNA and so both are subject to the restraints of natural selection. Woese recognized that of all the components of a cell, ribosomes were the closest approximation to a cantilever, absolutely indispensable to all aspects of cellular function, and were therefore highly unlikely to undergo rapid mutation or wander around the gene pool. Furthermore, because the sequence of letters in ribosomal RNA is an exact replica of the gene, ribosomal RNA sequences could be compared directly, without recourse to the genes themselves. In the 1960s and 1970s this was invaluable, as ribosomal RNA was then much easier to isolate and sequence than the parent genes. Woese therefore settled on ribosomal RNA as a yardstick of evolution. He set about comparing ribosomal RNA sequences from his own lab and from the literature, to produce a map of the genetic relatedness of all life. This grand objective was taken up by many research groups, and the project quickly gathered momentum.
Along with everyone else working in the field, Woese expected to uncover an ancient ancestral genetic link between the prokaryotes and the eukaryotes — something analogous to the clear relationship between mitochondria and alpha-proteobacteria. Two great surprises were in store.
First, the gap between the two domains continued to yawn. No microbial missing link could be found, nor indeed, any continuum between bacterial and eukaryotic ribosomal RNA sequence, as would be expected if the eukaryotes had simply evolved from bacteria. Instead, the RNA sequences clustered obstinately into two distinct groups, as if they had nothing in common. This could only mean that the split between bacteria and eukaryotes had taken place very early indeed, perhaps not long after the first stirrings of life itself. This in turn meant that the eukaryotes could not have evolved gradually from bacteria over 2 billion years, as everyone had expected. The split must have happened very quickly and very early.
Then came the second surprise, announced by Woese and Fox in 1977, and now seen as one of the great paradigm shifts in biology. A deep divide emerged within the prokaryotic domain itself. A little-known group of prokaryotes, most of which inhabited extreme environments such as hot springs and hypersaline lakes, confounded all expectations when their ribosomal RNA was analysed. The analyses showed that they shared little more with the bacteria than the absence of a nucleus. As more of their ribosomal RNA was sequenced and compared, it became clear that the divergence was not just a new kingdom within the prokaryotes, but something much more basic: an entirely new domain, which has become known as the Archaea (Figure 9). Today, instead of five kingdoms, we recognize three great domains of life: the Bacteria, the Archaea and the Eukaryotes. We ourselves, as animals, occupy no more than a small corner of the Eukaryotes (Figure 10).
The existence of the Archaea allows us to paint a far more convincing picture of LUCA. We can now compare the characteristics of three different domains of life. Archaea are obviously comparable to bacteria in that they lack a cell nucleus, and so are defined as prokaryotes. The organization of their genes is also similar to that of bacteria: they have a single circular chromosome, they cluster groups of related genes into operons, and they carry little junk DNA. Other aspects of their organization, such as the structure and function of proteins in the cell membranes, bear a more superficial resemblance to bacteria. Most archaea have a cell wall but, unlike bacteria, a few do not. Again unlike bacteria, the cell wall contains no peptidoglycans. The similarities quickly tail away.
In other respects, the archaea lie much closer to the eukaryotes. Although they do not have as many genes as eukaryotes, archaea have on average more than twice as many genes as bacteria. The DNA of archaea is not naked, but is wrapped in proteins similar to those used by eukaryotes. The detailed mechanism of DNA replication and protein synthesis is much closer to that of the eukaryotes. For example, archaea switch their genes on and off using mechanisms very similar to those in eukaryotes. The protein constituents of the ribosomes also resemble those of the eukaryotes in their structure. Other details of ribosomal function, including the initiation of protein synthesis, the elongation of protein chains, and the termination steps, parallel the eukaryotic process. Finally, and most convincingly of all, genetic analyses of so-called paralogous gene pairs (the products of gene duplications in a common ancestor, followed by divergent evolution in different groups of descendants) indicate that archaea are indeed relatives of eukaryotes. In essence, archaea are prokaryotes with many features of eukaryotes. They are as close to a missing link as we are ever likely to find.
Figure 9: Predicted structures of ribosomal RNA from (a) the archaeon Halobacterium volcanii, (b) the eukaryote baker’s yeast, and (c) cow mitochondria. RNA is single-stranded, but, as in DNA, the letters can pair up to form bridges between two chains. In the case of RNA, a single chain doubles back on itself to form loops and hairpins (whereas the famous DNA double helix is in fact two distinct chains entwined in a helix). The ‘bubbles’ in this illustration are single-stranded RNA, in which the letters have not paired up. A comparison of the three ribosomal RNAs shows that the overall shape and structure of ribosomal RNA (its secondary structure) has been maintained throughout evolution. However, the actual sequence of letters has drifted substantially, and the sequence similarities are very low. The mitochondrial RNA structure is reminiscent of bacterial RNA, from which the mitochondria originated. Adapted with permission from Gutell et al., and Progress in Nucleic Acid Research and Molecular Biology.
[Figure 10 labels: Bacteria (Gram positives, purple bacteria, cyanobacteria); Archaea (Euryarchaeota, Crenarchaeota, halophiles, methanogens, hyperthermophiles); Eukarya (animals, fungi, plants, ciliates, single-celled eukaryotes); root: LUCA. Boxed minimum ages: 544 Ma, 1200–1000 Ma, 2100 Ma, 2700 Ma, 2700 Ma, 2800 Ma.]
Figure 10: A ‘rooted’ universal tree of life, showing the three domains of life.
The tree is based on sequence comparisons of ribosomal RNA, analysed by Carl Woese and his colleagues. The order and length of branches are proportional to the sequence similarities within and between the domains and the kingdoms of life; in other words, they are directly proportional to the genetic similarities between species. It is humbling to note that the animals, plants and fungi account for just a small corner of the Eucarya domain, and that there is less variation in ribosomal RNA sequences within the entire animal kingdom than there is between different groups of methanogens.
The common ‘root’ represents LUCA, the Last Universal Common Ancestor of bacteria, archaea and eukaryotes. Notice that the Archaea are intermediate between the Bacteria and Eucarya, as inferred from many of their detailed morphological and biochemical properties, as well as their ribosomal RNA sequences. The boxed dates indicate the minimum age of selected branches, based on fossil evidence and biochemical fingerprints, such as the characteristic membrane steroids found by Jochen Brocks and his colleagues in the shales underlying the Hamersley iron formation in Australia. Adapted with permission from Andrew Knoll and Science.
What does all this say about the identity of LUCA? It seems likely that the split between the Archaea and the Bacteria occurred very early in the history of life, perhaps 3.8 to 4 billion years ago. We assume that both archaea and bacteria retain some of the original features of LUCA herself.
Calculations suggest that eukaryotes split from archaea later, perhaps around 2.5 to 3 billion years ago, as they share far more fundamental traits with archaea than with bacteria (see Figure 10). We know that eukaryotes acquired mitochondria and chloroplasts around 2 billion years ago by engulfing bacteria. We also know that some of these bacterial genes became part of the chromosomes of the eukaryotic cells. Here we return to the problem of lateral gene transfer. If the eukaryotes are essentially a fusion of archaea and bacteria, then it is plain that lateral gene transfer has taken place across domains. If we wish to draw a portrait of LUCA by comparing the properties of the different domains, can we be sure that they are not completely mixed up?
Luckily, there is some evidence that lateral gene transfer is not common across domains. The development of the eukaryotes seems to have been a singular event, possibly propelled by the unique environmental conditions around the time of the snowball Earth of 2.3 billion years ago (Chapter 3). In general, however, the archaea have kept themselves very much to themselves, and give every appearance of having changed little since the beginnings of time. No archaea are pathogenic, which means that they do not infect eukaryotes, and so do not have much opportunity to mix their genes with eukaryotes in the course of intimate warfare. Nor do they compete with bacteria in other settings. Their predilection for extreme conditions isolates them from most other organisms, even bacteria. Hyperthermophilic archaea, such as Pyrolobus fumarii, live at searing temperatures, well over 100°C, and high pressures in deep-sea hydrothermal vents. Other archaea, such as Sulfolobus acidocaldarius, add acidity to the heat and live in sulphur springs in places like Yellowstone National Park, at pH values as low as 1, the equivalent of dilute sulphuric acid. At the other end of the pH scale, some archaea thrive in the soda lakes of the Great Rift Valley in East Africa and elsewhere, at a pH of 13 and above, enough to dissolve rubber boots. The halophilic archaea are the only organisms that can live in hypersaline lakes, such as the Great Salt Lake in Utah and the Dead Sea. The psychrophiles prefer the cold and grow best at 4°C in Antarctica (their growth is actually retarded at higher temperatures).
Many of these favoured environments have barely changed for billions of years. Without calamity or competition, the selection pressure for innovation and change must have been negligible. While it is true that some archaea do live in more normal environments (among plankton at the ocean surface, for example, and in swamps, sewage and the rumen of cattle), the genes of their extreme and reclusive cousins have surely had little traffic with the rest of life.
The extraordinary features of archaea quickly stimulated scientific and commercial interest, and the field blossomed as a distinct discipline during the 1990s. Enzymes that function normally at high temperatures and pressures are an answer begging for an application. Already enzymes extracted from archaea have been added to detergents and have been used for cleaning up contaminated sites such as oil spills. To enlist the skills of a microbe on an industrial scale, however, requires a working knowledge of its genes. Complete genome sequences have now been reported for representatives of all the known groups of archaea. These sequences at once confirm the great antiquity of archaea, and their splendid isolation over the aeons. But the greatest surprise is how many genes the archaea do have in common with bacteria.
Of the genes involved in energy production through respiration (aerobic or anaerobic), at least 16 have been found in both archaea and bacteria. From the close similarities in their sequences, it seems likely that these genes were present in LUCA, and were later inherited by both archaea and bacteria, as they diverged from each other to occupy their distinct evolutionary niches. This conclusion — that the 16 respiratory genes were present in LUCA herself — is supported by two independent lines of evidence, as argued by José Castresana and Matti Saraste of the European Molecular Biology Laboratory in Heidelberg.
The first line of evidence relates to evolutionary trees. The similarities between the DNA sequences of the 16 respiratory genes can be used to construct a tree of relatedness. This family tree is then superimposed over the family tree based on ribosomal RNA sequences. If the respiratory genes had been passed horizontally by lateral gene transfer, then closely related respiratory genes would be found in organisms that were otherwise only distantly related to each other. Put another way, the evolutionary histories of the respiratory genes would differ from the true evolutionary roots of their host organisms, just as the history of mitochondrial genes differs from that of the nuclear genes of eukaryotes. On the other hand, if the respiratory genes had stayed put in their respective organisms, then the evolutionary trees constructed from ribosomal RNA and the respiratory genes should