So how would we make those changes? First, we have to figure out what they are. Many (if not most) of the differences between the Asian elephant genomes and mammoth genomes can probably be identified by sequencing and assembling both genomes, lining them up, and scanning them for sites at which they differ from each other. Since we know that we will not be able to sequence and assemble a complete mammoth genome, we’ve already stumbled upon the first problem with this approach. Ignoring that problem, the next step is to design a strategy to change each of the elephant sites that differ into the mammoth version using genome-editing tools. If we assume that each edit will require its own CRISPR-RNA (the CRISPR-RNA is the part of the CRISPR/Cas9 system that finds and then binds to the part of the genome where the edit is to be made), then we need to design and deliver into the cell 70 million different CRISPR-RNAs. However, George Church’s lab has been improving techniques to insert larger and larger fragments of DNA at once, which may allow us to change multiple bases at the same time. Let’s assume that the technology gets really good, and we can make, on average, ten changes with each CRISPR-RNA. This would reduce the number of CRISPR-RNAs necessary to around 7 million.
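To make the "line them up and scan" step and the CRISPR-RNA arithmetic concrete, here is a minimal Python sketch. The sequences are invented toy strings, the function names are hypothetical, and real genome comparison relies on dedicated alignment software rather than a character-by-character loop.

```python
import math

def differing_sites(elephant: str, mammoth: str) -> list[int]:
    """Return 0-based positions where two pre-aligned sequences differ."""
    if len(elephant) != len(mammoth):
        raise ValueError("sequences must already be aligned to equal length")
    return [i for i, (e, m) in enumerate(zip(elephant, mammoth)) if e != m]

def guides_needed(n_differences: int, edits_per_guide: int) -> int:
    """CRISPR-RNAs needed if each one makes a fixed number of edits."""
    return math.ceil(n_differences / edits_per_guide)

# Toy aligned fragments (made up for illustration, not real sequence data).
elephant = "ACGTTGCAAC"
mammoth = "ACGATGCTAC"
print(differing_sites(elephant, mammoth))  # positions 3 and 7 differ

# Figures from the text: 70 million differences, 1 or 10 edits per CRISPR-RNA.
print(guides_needed(70_000_000, 1))
print(guides_needed(70_000_000, 10))
```

The second function is only the division described above: ten edits per CRISPR-RNA turns 70 million guides into 7 million.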
In their mammoth hemoglobin work, George Church’s revivalists designed two CRISPR-RNAs to make three changes to the hemoglobin gene (one CRISPR-RNA made a single edit and the other made two edits). Editing the elephant sequence takes place in three steps: First, everything necessary to edit the genome—the CRISPR-RNAs, Cas9 (the molecular scissors), and the mammoth template DNA—has to be delivered into the cell. Second, the CRISPR-RNAs have to find the part of the genome that they are intended to cut. Third, the cellular-repair machinery has to paste in the mammoth version of the gene.
Because the mammoth revivalists have actually performed this experiment, we can use their results to estimate the overall efficiency of the cut-and-paste process. In other words, we can ask, what proportion of edited elephant cells end up with all three changes? The mammoth revivalists found that each CRISPR-RNA had a different efficiency in targeting the right part of the genome (the “cut” step), and that the cellular machinery had a different efficiency in fixing each cut the way we want it to be fixed (the “paste” step). In this experiment, they estimated that the cut-and-paste efficiency of one of their CRISPR-RNAs was about 35 percent, and the other (the one that makes two changes) was about 23 percent. If the two edits succeed independently of each other, the chance that both succeed in the same cell is 0.35 × 0.23, or about 0.08. This means that only about 8 percent of cells ended up with all three changes.
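The 8 percent figure follows from multiplying the two efficiencies together. A minimal sketch, assuming the two CRISPR-RNAs succeed or fail independently of each other (an assumption of this example, and the function name is hypothetical):

```python
import math

def combined_efficiency(efficiencies):
    """Probability that every independent edit succeeds in the same cell."""
    return math.prod(efficiencies)

# Efficiencies reported in the text for the two CRISPR-RNAs.
both = combined_efficiency([0.35, 0.23])
print(f"{both:.3f}")  # close to 0.08, i.e. about 8 percent
```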
Even if we were able to reduce the number of CRISPR-RNAs that we need to make to, say, 100 (many fewer than the 7 or 70 million estimated above), and we assume, generously, that the efficiency of each of these is somewhere around 30 percent, that would mean that we would have to change at least 5 × 10⁵³ cells in order to find one cell in which all 100 changes were made at the same time. That’s a big number. To put this in some perspective (although perspective is very hard at this scale), scientists have estimated that there are around 40 trillion (4 × 10¹³) cells in a human body and 7.5 × 10¹⁸ grains of sand on Earth.
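To get a feel for how quickly this number grows, here is a rough sketch that works in powers of ten. It assumes every edit succeeds independently with the same efficiency, which is a simplification of this example and not something the experiment established. Under that simple model the count for 100 edits at 30 percent lands around 10 to the power 52; the exact value is very sensitive to the assumed efficiency, but anything in this range dwarfs the cell and sand-grain counts quoted above.

```python
import math

def log10_cells_needed(efficiency: float, n_edits: int) -> float:
    """log10 of the expected number of cells per fully edited cell,
    assuming each edit succeeds independently with the same efficiency."""
    return -n_edits * math.log10(efficiency)

exponent = log10_cells_needed(0.30, 100)
print(f"roughly 10^{exponent:.0f} cells")

# For comparison, the figures quoted in the text.
print(f"cells in a human body: 10^{math.log10(4e13):.1f}")
print(f"grains of sand on Earth: 10^{math.log10(7.5e18):.1f}")
```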
Fortunately, we may be able to narrow down the number of changes we need to make without resorting to targeting specific traits. First, some of the species-specific differences that we observe when we compare one Asian elephant genome and one mammoth genome will not exist if we were to compare all mammoths and all elephants. These sites will look at first like they differ between species because we have only a single individual of each species to compare. But, if we were to have multiple genomes from each species, we would notice that some differences are not fixed within either species but are instead variable among mammoths or among elephants. Since not every mammoth or not every elephant has these changes, we can conclude that these changes are not important in making a mammoth look and act like a mammoth (or making an elephant look and act like an elephant). We could therefore exclude these sites from genome editing.
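The filtering idea can be sketched in a few lines of Python. The sequences below are invented, and the function name is hypothetical; the point is only that a site counts as a fixed difference when every mammoth carries one base and every elephant carries a different one.

```python
def fixed_differences(mammoths: list[str], elephants: list[str]) -> list[int]:
    """Positions where all mammoths share one base and all elephants another."""
    fixed = []
    for i in range(len(mammoths[0])):
        mammoth_bases = {seq[i] for seq in mammoths}
        elephant_bases = {seq[i] for seq in elephants}
        # Variable sites (more than one base within a species) are skipped.
        if (len(mammoth_bases) == 1 and len(elephant_bases) == 1
                and mammoth_bases != elephant_bases):
            fixed.append(i)
    return fixed

# Toy aligned fragments from two individuals of each species (made up).
mammoths = ["ACGATGCTAC", "ACGATGCAAC"]
elephants = ["ACGTTGCAAC", "ACGTTGCAAC"]
print(fixed_differences(mammoths, elephants))  # only position 3 is fixed
```

With a single mammoth, position 7 would have looked like a species difference too. The second mammoth shows that it varies among mammoths, so it drops off the editing list.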
Another way to limit the number of necessary edits is to make only those changes that occur within genes. The genome is a big place, and only a small portion of the genome—around 1.5 percent of the human genome, for example—is made up of genes that code for proteins, while the rest of the genome is made up of other, noncoding DNA. Because genes code for proteins and proteins make phenotypes, the most important genetic differences between two species might be those that are found within the sequences of the genes themselves.
There are, unsurprisingly, several problems with this strategy. We do not, for example, know the location of all of the genes in the mammoth genome, so finding them will require educated guesswork—comparison with better-studied genomes—and even then we may not find all of the genes. Also, targeting only those differences that occur within genes may miss important differences that are found in the noncoding portion of the genome, such as differences that change when or how much of a gene is expressed. Differences in gene expression can result in different phenotypes even if the sequence of the gene itself is exactly the same.
Perhaps, then, we will need to make every change in the genome sequence. George Church believes that this will soon be feasible. The key, according to George, is to reduce the number of CRISPR-RNAs by cutting and pasting very long (very, very long) fragments of DNA. Instead of making only a few changes with each CRISPR-RNA, we will need to make thousands of changes, if not tens of thousands of changes, at once. Right now, George’s group is able to synthesize strands of DNA that are 50,000 base-pairs long. While the accuracy of such long synthetic sequences is still less than ideal, the technology is improving while the cost is going down. If it were possible to synthesize the entire mammoth genome in, say, 100,000 base-pair chunks, then we could cut-and-paste the entire mammoth genome into an Asian elephant genome using fewer than 350 CRISPR-RNAs.
Still, 350 is a big number and, following the logic above, would require an absurd number of cells even if each cut-and-paste experiment worked with exceptionally high efficiency. The logic presented above is not particularly logical, however, and does not describe how we would perform the experiment in reality. Rather than try to luck into a scenario in which 100 (or 350) things that are unlikely to happen all occur at the same time, we would perform the experiment in steps, where a few changes will be made and validated, and then a few more introduced to those cells that were edited successfully, and so on. The experiment would still be challenging, and it would still take a long time to complete, but it might just be feasible.
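The difference between the two strategies shows up clearly in a back-of-the-envelope comparison. This sketch assumes independent edits with equal efficiency, treats every round as containing the same number of edits, and ignores the time needed to grow and validate cells between rounds; all of these are simplifications of this example, and the function names are hypothetical.

```python
import math

def cells_all_at_once(p: float, n_edits: int) -> float:
    """Expected cells screened to find one with all edits made simultaneously."""
    return 1.0 / p ** n_edits

def cells_stepwise(p: float, n_edits: int, edits_per_round: int) -> float:
    """Expected cells screened in total when edits are made and validated
    a few at a time, carrying forward only the successfully edited cells."""
    rounds = math.ceil(n_edits / edits_per_round)
    return rounds * (1.0 / p ** edits_per_round)

# 100 edits at 30 percent efficiency each, as in the example above.
print(f"all at once: {cells_all_at_once(0.30, 100):.1e} cells")
print(f"two per round: {cells_stepwise(0.30, 100, 2):.0f} cells over 50 rounds")
```

Making the edits in rounds trades an impossible number of cells for a long series of experiments, a few hundred cells screened in total across fifty rounds in this toy case.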
Today, we do not know the complete genome sequence of the mammoth. However, we are likely to learn most of the mammoth genome sequence within the next few years. Today, we cannot edit an Asian elephant genome so that it looks entirely like a mammoth genome. This technology is also improving. In fact, this particular step in the de-extinction process is probably the fastest moving in terms of technology development.
MORE THAN THE SUM OF ITS NUCLEOTIDES
Genome editing will become an increasingly efficient way to transform all or part of a genome of a living species into something that resembles the genome of a species that is extinct. However, some important differences between species may have nothing at all to do with the sequence of their genomes. Simply changing the genome sequence might not, therefore, be sufficient to resurrect the extinct phenotype.
Genomes are complicated places. Genomes live in cells, which live in bodies, which live in environments. And in different cells, different bodies, and different environments, the same genomes—genomes that are identical in both the coding and noncoding portions—can produce very different phenotypes. Identical twins, for example, have identical genomes. But, as they get older, identical twins become more and more phenotypically and behaviorally different from each other. How can this happen, if their genomes are the same?
In addition to their genome, all organisms have what is called an epigenome. The epigenome is a confusing concept, and not all scientists define or describe the epigenome in the same way. As I understand it, the epigenome can be thought of as a suite of tags that are attached to the genome. These tags indicate whether a gene is turned on (making proteins) or off (not making proteins). Importantly, these tags are not actually part of the genome, which means they can be in flux throughout the organism’s life. Epigenetic tags can be heritable; that is, the epigenetic state of a particular gene is sometimes passed on from parent to offspring. These epigenetic tags might tell a cell to turn on only those genes necessary for being a heart cell, for example. Other epigenetic tags are not heritable in this traditional sense and, instead, may appear or change because of interactions that take place between the organism and the environment in which that organism lives.
A variety of environmental stimuli are known to affect the epigenome. An organism’s diet, the amount of stress or toxins it is exposed to, and how much physical exercise it gets will all alter the epigenome, changing which genes are expressed, when they are expressed, and how much they are expressed. By the time identical twins become adults, their epigenomes differ considerably, although their genomes remain identical. It is the combination of their genome sequence and the epigenetic differences that accumulate over each twin’s lifetime that results in their distinct phenotypes.
Will epigenetics complicate de-extinction efforts? We don’t know. If we edit an elephant cell to contain mammoth DNA sequences, it will, as it begins to develop, contain the elephant epigenome. In the womb, the embryo will be exposed to the developmental environment of an elephant: a mom that eats an elephant diet, lives in the elephant habitat, and expresses elephant genes. It will survive by virtue of the elephant placenta, which will be expressing elephant genes modified by that particular mother elephant’s epigenome.
While we cannot study the effects of the developmental environment using identical twins (because they develop in the same intrauterine environment), we know the health and diet of the mother during pregnancy can have profound effects on fetal development. Her diet can even affect health outcomes later in life, such as risk of heart disease and obesity. Fascinatingly, we also know that the mother’s diet before pregnancy can influence the epigenetic state of her genes, with consequences to the developing embryo. Almost certainly, the diet and amount of stress to which the mother elephant is exposed will affect her developing mammoth (or mammoth-like) embryo, but exactly what these effects will be remains unknown.
In some instances, a species-specific developmental environment is not critical to a successful gestation. Robert Lanza’s genetic-engineering firm, Advanced Cell Technology, successfully cloned both a gaur and a banteng (both living but endangered species that are closely related to cattle) using nuclear transfer and female domestic cows as surrogate mothers. Both pregnancies went well, and both calves appeared to thrive. It is unclear, however, how these animals might differ from clones that were born from surrogate mothers of their own species.
What about the environment after birth? Epigenetic changes accumulate throughout life and are driven by the environment in which an organism lives. How much of looking and acting like a mammoth is due to having a mammoth genome, and how much of it is due to living life in the steppe tundra? This is something we may have to wait to learn.
Understanding the genome and how the genome interacts with the environment is among the major technical hurdles standing in the way of successful de-extinction. It is unclear today how this hurdle will be surmounted. Will we finish sequencing the mammoth genome and learn where all of the genes are and what all of the genes do, so that we can make a minimal number of changes and still end up with a mammoth? Or will genome-editing technology advance to the point where we can make all the changes necessary to create a genome that is 100 percent mammoth-like? Will we devise a way to infer the epigenetic state of ancient tissues, as a first approach to learning which genes should be turned on or off in unextinct individuals?
Answers to these questions may come soon. Knock-in and knock-out experiments—where scientists either turn on or turn off specific genes in organisms like yeast, fruit flies, and mice—are being used to discover where genes are, what they do, and how they interact with each other. Large, population-level human genome-sequencing projects are being used to identify specific genetic changes that are associated with distinct phenotypes, such as adaptation to life at high altitudes or susceptibility to cancers or other diseases. These experiments are homing in on ways to identify what are likely to be the most “important” changes to make. At the same time, the technology behind CRISPR/Cas9 systems is developing rapidly. These systems have so far been used to edit the genomes of more than twenty different species, chopping out and inserting fragments of the genome that are on the order of tens of thousands of nucleotides long. We may eventually arrive at a solution where it is possible to edit an entire genome.
Ancient epigenomes may even be within reach, thanks in part to how DNA degrades over time. It turns out that DNA methylation, which is one way that the epigenome marks the genome, interacts with DNA degradation in an interesting and useful way. In methylation, the epigenome modifies the genome by attaching a methyl group (CH3) to a cytosine—one of the four nucleotide bases that make up DNA. DNA degradation also affects cytosine bases, but in a different way. Cytosine bases are often deaminated as DNA degrades—they lose part of their chemical structure (an amine group) and become uracil, which is a nucleotide base that is otherwise not found in DNA. When methylated cytosine bases become deaminated, however, the interaction between the two chemical modifications converts the cytosine into thymine, another of the four nucleotides found in DNA, rather than uracil. The ancient epigenome can be reconstructed by distinguishing deaminated cytosine bases that become thymine bases (which degraded after being tagged by the epigenome) from those that become uracil bases (which also degraded but were not epigenetically tagged).
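The logic of that last step can be written as a toy classifier. This is only a sketch of the idea, not of how any research group actually did it: the sequences are invented, the function name is hypothetical, and real analyses work statistically across many sequenced fragments and depend on how the laboratory protocol handles uracil.

```python
def infer_methylation(reference: str, damaged: str) -> dict[int, str]:
    """Classify each reference cytosine by what it became in a damaged copy.

    T means a deaminated methylated cytosine, U means a deaminated
    unmethylated cytosine, and an intact C is uninformative.
    """
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference, damaged)):
        if ref != "C":
            continue  # only cytosines carry this signal
        if obs == "T":
            calls[i] = "methylated"
        elif obs == "U":
            calls[i] = "unmethylated"
        elif obs == "C":
            calls[i] = "undamaged"
    return calls

# Toy example: the original sequence and a degraded copy of it (made up).
reference = "ACGTCCGA"
damaged = "ATGTCUGA"
print(infer_methylation(reference, damaged))
```

Here the cytosine at position 1 turned into thymine, so it is called methylated; the one at position 5 turned into uracil, so it is called unmethylated; and the one at position 4 is intact and tells us nothing.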
This approach was first used by Ludovic Orlando’s research group at the University of Copenhagen in Denmark to reconstruct the epigenome of a 4,000-year-old Paleo-Eskimo from Greenland’s Saqqaq culture. Soon afterward, a team of scientists from the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, and the Hebrew University of Jerusalem, Israel, mapped the epigenome of two archaic hominins—a Neandertal and a Denisovan. The team found around 2,000 differences between the reconstructed epigenomes of the archaic hominins and the epigenomes of modern humans, some of which may underlie some of the skeletal differences between us and our archaic cousins.
While technologies to sequence, edit, and understand genomes are all developing at a rapid pace, new tools that become available tend to work best for those species that are the best studied. Far less is understood about elephants than about mice, fruit flies, or humans, and the same is true for many of the candidate species for de-extinction. These tools can be adapted for research on other species, but, for now, the hurdles standing in the way of fully reconstructing the genomes of extinct species remain high. George Church, however, is very tall.
Plate 1. Martha, the last known living passenger pigeon, in her enclosure at the Cincinnati Zoo in Ohio, USA. Photo courtesy of the Wisconsin Historical Society, WHI-25764.
Plate 2. Bones of mammoths (first), reindeer (second), bison (third) and horses (fourth) collected along the banks of the Kolyma River, Duvanniy Yar, Siberia. All of the approximately 1,000 bones depicted here were collected in a single day and over an area of about 1 hectare. Photo credit: Sergey Zimov.
Plate 3. Leg bones from three passenger pigeons whose genomes are being sequenced at the University of California, Santa Cruz, as part of the passenger pigeon de-extinction project. These were among the remains excavated by Dr. Greg Sohrweide from a site in Onondaga, New York, USA, and date to the 1690s. Photo credit: Andre Elias Rodrigues Soares.
Plate 4. Sorting the remains of stingless bees from fragments of ancient amber in the ancient DNA facility at the Pennsylvania State University. Although amber-preserved insects were once thought to harbor preserved ancient DNA, research has shown that DNA does not survive in amber, even over relatively short periods of time. Photo credit: Mathias Stiller.
Plate 5. Field sampling of ice age bones. Only a small amount of tissue is required for DNA extraction and analysis. Here, a small fragment of bone is removed from a sample collected on the Taimyr Peninsula, Siberia, during our 2008 field season. Photo credit: Beth Shapiro.
Plate 6. Placer mining near Dawson City, Yukon Territory, Canada. Here, gold miners blast the frozen soil with high-pressure water to expose the gold-bearing gravels beneath. As the soil is washed away, bones, teeth, tusks and other remains are revealed and can be collected. Photo credit: Tyler Kuhn and Mathias Stiller.
Plate 7. The partial skull of an ice age horse recovered from an active placer mine near Dawson City, Yukon Territory, Canada. Photo credit: Tyler Kuhn and Mathias Stiller.
Plate 8. A cervical vertebral bone from a mammoth is slowly exposed by placer-mining activities near Dawson City, Yukon Territory, Canada. Sometimes, several bones from the same animal are recovered in close proximity. This particular mammoth bone was recovered in 2010 with four other vertebrae. Photo credit: Tyler Kuhn.
Plate 9. A mammoth tusk exposed by placer mining near Dawson City, Yukon Territory, Canada. Although it took several days for the soil surrounding the tusk to thaw completely, we eventually recovered the entire 2.5 m, 45 kg tusk. The tusk is now part of the paleontological collection of the Department of Tourism and Culture in Whitehorse, Yukon Territory, Canada. Photo credit: Tyler Kuhn.
Plate 10. As running water cuts through the permafrost, the remains of ice age organisms are exposed. Geologists believe that a small stream began to cut through this area of permafrost near the Yana River in northeastern Siberia around 60 years ago. When the cut reached an ancient lake, rapid erosion formed what is now the Batagaika crater. Such fresh exposures are common along rivers throughout Beringia. Photo credit: Love Dalén.
How to Clone a Mammoth Page 15