
Neanderthal Man


by Pääbo, Svante


  In April 2007, in preparation for the Cold Spring Harbor Genome Meeting, Jim and David sent me their first analysis of the Neanderthal sequences we had generated with the 454 technique. To test the method, they had first analyzed a present-day European individual at SNPs where another European and an African were known to differ from each other. They found that the European matched the other European at 62 percent of the SNPs and the African at 38 percent of the SNPs. Thus, as we had expected, on average people from the same part of the world shared more SNP variants with each other than with people from other parts of the world. They were able to compare the Neanderthal sequence to 269 positions where the European and African individuals differed and they found that the Neanderthal matched the European at 134 positions and the African at 135 positions. This was as close to 50:50 as the data could possibly be and it perfectly fit my preconceived idea that there had been no admixture. I liked this result for another reason as well. It meant that what we had was the DNA of a person who seemed to be equally related to Europeans and Africans. In short, there couldn’t be very much DNA contamination from present-day humans among our Neanderthal sequences, since any such contamination would likely have come from a European individual and therefore make the Neanderthal look closer to the European than to the African individual.

  On May 8, 2007, the day before the meeting started, all the members of what was now officially called the Neanderthal Genome Analysis Consortium met for the first time at Cold Spring Harbor. I started the meeting by describing the tag that we had introduced to rule out any contamination that occurred after the libraries left our clean room. I also talked about the three archaeological sites (see Chapter 12) and the bones from which we now had generated data. We had 1.2 million nucleotides of Neanderthal DNA determined from Vindija with our new tagged library approach. We also had about 400,000 nucleotides from the type specimen from Neander Valley in Germany, the bone from which we had determined the mtDNA segment in 1997. Finally, we had 300,000 nucleotides from El Sidrón, the cave in Spain where Javier Fortea and his team had collected bones under sterile conditions for us.

  My description of the Neanderthal sites was a welcome relief from the rather arcane technical discussion of how we had extracted and sequenced DNA from the bones and how these sequences might be analyzed. Everybody was impressed that the Neanderthal seemed to be equally distant from an African and a European individual, but David Reich correctly pointed out that with just 269 SNPs, we could only exclude a very large genetic contribution from Neanderthals into Europeans. In fact, the 90 percent confidence interval for the 49.8 percent estimate of the SNPs matching the European was 45.0 to 55.0 percent. This meant that, with 90 percent confidence to be correct, we could only say that Neanderthals hadn’t contributed more than 5 percent of the genome to Europeans. In other words, there was a 10 percent chance that Neanderthals had contributed more than 5 percent. This uncertainty drove home for me a powerful advantage of molecular genetic analyses over paleontological analyses. If we had been discussing the forms, shapes, holes, and ridges of Neanderthal bones, we couldn’t have made any realistic estimate of how sure we were of what we found. Neither could we have been confident of being able to collect more data to resolve the issue with greater confidence. With DNA, we could.
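As a rough check on these figures, the interval can be approximated with a standard normal approximation to the binomial. This is only a sketch: the text does not say which method David used, and the function name and the choice of approximation are assumptions made for illustration.

```python
import math

def proportion_ci(matches, total, z=1.645):
    """Normal-approximation confidence interval for a proportion.

    z = 1.645 gives a two-sided 90 percent interval. The choice of this
    approximation is an assumption; the source does not name a method.
    """
    p = matches / total
    se = math.sqrt(p * (1 - p) / total)  # standard error of the proportion
    return p, p - z * se, p + z * se

# 134 of the 269 informative SNPs matched the European individual.
p, low, high = proportion_ci(134, 269)
print(f"estimate {p:.1%}, 90% interval {low:.1%} to {high:.1%}")
```

With only 269 SNPs the interval spans roughly five percentage points on either side of the estimate, within a fraction of a point of the 45.0 to 55.0 percent quoted above, which is why only a large contribution could be excluded.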

  David had also used the SNPs Jim had detected in present-day humans in other analyses. He compared the DNA sequence for each SNP to that of the chimpanzee to determine which of the two variants, or alleles, was ancestral, and which was derived. The further back in time the Neanderthal population became separated from modern human populations, the less often the Neanderthal would carry the newer, derived SNP alleles found in people today. When David analyzed 951 SNPs that had been discovered in Africans, he found that a present-day European carried derived alleles at 31.9 percent of the SNPs. When he analyzed our Neanderthal sequences, he found that they carried the derived alleles at 17.1 percent of the SNPs, about half as often as the present-day European. Given certain assumptions, such as constant population size over time, this suggested that the Neanderthals separated from Africans some 300,000 years ago. I was delighted by these results. The sequences we had determined clearly came from a creature with a history very different from that of people living today. However, David dampened my enthusiasm by again pointing out that we didn’t have very much data yet. In fact, the 90 percent confidence interval for the percentage of derived alleles in the Neanderthal ranged from 11 to 26 percent. Still, we were clearly on the right track.

  After we shifted to using the Illumina sequencing machines and began generating DNA sequences at a much faster rate, our twice-per-month phone meetings with the consortium became longer and we started having them every week. In January 2009, as the AAAS meeting drew near, I pleaded with David and Nick to do a quick analysis of our 454 sequences, which represented about 20 percent of all our data. Although I still didn’t think there had been any interbreeding between Neanderthals and modern humans, I wanted David to come up with an estimate of how small any contribution by Neanderthals to Europeans could maximally be without our detecting it. In other words, how big a contribution could we exclude? That was the number I wanted to present at the press conference and the AAAS meeting.

  On February 6, 2009, I received an e-mail from David. It said, “We now have strong evidence that the Neanderthal genome sequence is more closely related to non-Africans than to Africans.” I was totally taken aback. David found that our Neanderthal sequences matched Europeans at 51.3 percent of SNPs. This may not seem very different from 50 percent, but we now had so much data that the uncertainty was just 0.22 percent, which meant that even if we subtracted 0.22 from 51.3, we still had a number that was different from 50 percent. I realized I might have to revise my ideas and concede that there had been genetic mixing between Neanderthals and the ancestors of Europeans. But there was another observation that made me wonder if there was something wrong with the analysis after all. When David compared the Chinese and African genomes, the Neanderthals matched the Chinese 51.54 percent of the time and the uncertainty was 0.28 percent despite the fact that there had never been Neanderthals in China. David himself was intrigued as well as worried by these results. We both agreed that this finding was potentially very exciting but also that the results had the potential to be spectacularly wrong. There was a frantic exchange of e-mails and David, Nick, and I agreed that we should keep our admixture result secret at the press conference and AAAS meeting. If we mentioned it, all the media would write about it. If it then later turned out to be due to some sort of error, we would look like idiots. Instead, I decided to talk about less hot topics in Chicago. Discussions about the potential admixture would have to be postponed to a meeting that the consortium would have in Croatia just after the AAAS meeting.
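To see why a figure as close to 50 percent as 51.3 percent was so striking, it helps to express the excess in units of the quoted uncertainty. The sketch below assumes that the "uncertainty" is one standard error, which the text does not state explicitly; the function name is hypothetical.

```python
def deviations_from_even(percent_match, uncertainty):
    """How many 'uncertainties' the match rate lies above 50 percent.

    Assumption: the quoted uncertainty is treated as one standard error.
    """
    return (percent_match - 50.0) / uncertainty

european = deviations_from_even(51.3, 0.22)
chinese = deviations_from_even(51.54, 0.28)
print(f"European vs. African: {european:.1f} uncertainties above 50 percent")
print(f"Chinese vs. African:  {chinese:.1f} uncertainties above 50 percent")
```

Under that assumption both comparisons sit more than five uncertainties away from an even split, so chance alone was an unlikely explanation. A systematic error in the analysis was the remaining worry.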

  Chapter 17

  First Insights

  __________________

  Two days after returning from Chicago, I was again in an airplane, this time on my way to Zagreb to give a lecture on our project at the Croatian Academy of Sciences and Arts. The next day I flew south to Dubrovnik, where our consortium and our Croatian collaborators were to meet in a hotel on the coast outside the city. We were not there just to celebrate, but to hammer out how we would analyze and publish the Neanderthal genome.

  But the flight to Dubrovnik didn’t go quite as planned. The Dubrovnik airport is squeezed between mountains and the sea and has a bad reputation for difficult side winds. It was at this airport that US Secretary of Commerce Ron Brown died in an airplane accident in 1996. The US Air Force investigation later attributed the crash to pilot error and a poorly designed landing approach. As we approached the airport it was windy and the plane was jumping around. The Croatian pilot, probably wisely, decided not to try to land. Instead he flew to Split, a city some 230 kilometers away. We arrived there late in the evening and were packed into an overcrowded bus that took us through the night to Dubrovnik. I was exhausted when our first session started at 9:00 a.m.

  Despite how tired I was, I felt energized by the presence of almost all the twenty-five members of our analysis consortium in the conference room (see Figure 17.1). Together, we were now going to tease out the information in the 40,000-year-old DNA sequences we had determined. I gave the first talk, an overview of the data we now had in hand. This was followed by a technical presentation by Tomi about his library preparation. Ed described how we estimated the level of present-day human DNA contamination, the issue that had plagued our first paper back in 2006. Our “traditional” mtDNA analysis yielded an estimate of 0.3 percent. By the time of the meeting, we had also devised an additional analysis not based on mtDNA. It relied on using the large numbers of DNA fragments we had from certain portions of the genome—specifically, the sex chromosomes, X and Y. Because females carry two X chromosomes while males carry one X chromosome and one Y chromosome, if a bone came from a female, we should find only X chromosome fragments and no Y chromosome fragments. Therefore, any Y chromosome fragments we detected in libraries derived from a female’s bone would be indicative of contamination by modern males.

  This analysis, suggested during one of our Friday meetings in Leipzig, initially sounded simple. But as with so many of the things Ed did, it was not as straightforward as it seemed. The complication was that, although the X and Y chromosomes are morphologically distinct, some of their parts share a close evolutionary relationship. The DNA they share as a result of this relationship could confuse the analysis when we mapped our short DNA fragments. To avoid this problem, Ed identified 111,132 nucleotides on the Y chromosome that weren’t similar to anything else in the genome, even if these bits were fragmented into pieces as small as 30 nucleotides in length. When he looked among the Neanderthal DNA fragments, he found just four fragments carrying these Y chromosomal sequences; if the bones we used had all come from males, he would have expected to see 666 of them.

  Figure 17.1. The consortium meeting in Dubrovnik, Croatia, in February 2009. Photo: S. Pääbo, MPI-EVA.

  Thus, he inferred that all three bones came from female Neanderthals, and that the four Y chromosomal DNA fragments must have come from DNA contamination. This suggested that we had about 0.6 percent male contamination. This estimate was not perfect since we could detect only contaminating DNA from men, but it suggested that the level of contamination was low and similar to what we had estimated from the mtDNA.
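The arithmetic behind the 0.6 percent figure is a simple ratio of observed to expected Y-chromosome fragments. A minimal sketch, with a hypothetical function name:

```python
def male_contamination(observed_y, expected_y_if_male):
    """Share of fragments attributable to male contamination.

    observed_y: Y-specific fragments actually found in the libraries.
    expected_y_if_male: fragments expected had every bone come from a male.
    """
    return observed_y / expected_y_if_male

rate = male_contamination(4, 666)
print(f"estimated male contamination: {rate:.1%}")
```

The estimate only captures contamination from men, as noted above, so it is best read as an indication of scale, not as a total.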

  We discussed other ways to estimate contamination. Philip Johnson from Monty’s group at Berkeley suggested an approach that relied on examining nucleotide positions where most people today have a derived allele but where a Neanderthal individual had the ancestral, ape-like allele. In cases where a different DNA fragment from the same or another Neanderthal individual didn’t turn out to carry the ancestral allele, Philip suggested that we take a mathematical approach and model the likelihood that this was due to either normal variation among Neanderthals, sequencing errors, or contamination from present-day humans. When Philip later implemented this, the extent of contamination again turned out to be below 1 percent. We finally had estimates of contamination that I trusted and which showed that the quality of our sequences was excellent!

  Martin talked about the Illumina data, which we hadn’t yet mapped. It made up more than 80 percent of all the fragments sequenced, or almost 1 billion DNA fragments. Much of the discussion centered on the challenges Udo faced in modifying the computer algorithm so that it would map these fragments quickly on the computer cluster in Germany. Although the analysis of the whole genome would have to wait until Udo had mapped all the fragments, we nonetheless discussed how we would do it. The first question was how different the Neanderthal genome was from that of present-day humans. Answering this seemingly simple question was complicated by the errors in the Neanderthal sequences due either to modifications of nucleotides in the ancient DNA or to errors caused by the sequencing technology. Illumina generated up to one error in every hundred nucleotides. To compensate for this, we had sequenced each ancient molecule many times over. But we still estimated that the errors in the Neanderthal DNA sequences added up to about five times as many as in the “gold standard,” the human reference genome sequence. Therefore, if we simply counted how many nucleotides differed between the Neanderthal and human genomes, we would be counting mostly errors in our Neanderthal genome.

  Ed had a way around this problem. It relied on disregarding all differences that were seen only in the Neanderthal fragments and instead scoring what the Neanderthal carried at positions where the human genome had changed and now differed from the ape genomes. To do this, he simply found all the positions where the human genome differed from the chimpanzee and macaque genomes. He then checked whether the Neanderthal carried the modern human-like nucleotide or the ape-like nucleotide at those positions. If the Neanderthals carried the modern human-like nucleotide, the mutation that caused it was old and predated the split between the Neanderthal DNA fragment and the human reference genome. If the Neanderthals carried the ape-like nucleotide, the mutation was recent and happened in humans after they split from the Neanderthals. Thus, the percentage of substitutions where the Neanderthal was “ape-like” as a fraction of all substitutions along the human lineage gave an estimate of how far back along the human lineage the Neanderthal DNA sequences split from DNA sequences in humans today. The answer was 12.8 percent.
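The counting scheme can be illustrated on a toy alignment. This is a sketch of the idea only, not Ed's actual pipeline: the sequences are invented, the function name is hypothetical, and two details are assumptions made here, namely that the chimpanzee and macaque must agree with each other and that a Neanderthal base matching neither state is ignored as a likely error.

```python
def ape_like_fraction(human, chimp, macaque, neanderthal):
    """Fraction of human-lineage substitutions where the Neanderthal
    still carries the ape-like (ancestral) nucleotide."""
    informative = 0
    ape_like = 0
    for h, c, m, n in zip(human, chimp, macaque, neanderthal):
        # Assumption: use only positions where both apes agree and the
        # human reference differs, i.e. the change arose on the human lineage.
        if c == m and h != c:
            if n == h:
                informative += 1      # old mutation, shared with Neanderthal
            elif n == c:
                informative += 1
                ape_like += 1         # recent mutation, after the split
            # Any other base is skipped as a probable sequencing error.
    return ape_like / informative if informative else float("nan")

# Invented 10-base alignment with three human-lineage changes.
human       = "ACGTACGTAC"
chimp       = "ACGAACCTAT"
macaque     = "ACGAACCTAT"
neanderthal = "ACGTACCTAC"
print(ape_like_fraction(human, chimp, macaque, neanderthal))
```

Because differences seen only in the Neanderthal fragments never enter the count, most sequencing errors and chemical damage in the ancient DNA drop out of the estimate.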

  If we assume that our common ancestor with chimpanzees lived 6.5 million years ago, this would mean that the last men and women to transmit their DNA sequences both to people living today and to Neanderthals lived 830,000 years ago. When Ed did the same calculation for pairs of people living today, their common DNA ancestors were found to have lived about 500,000 years ago. So Neanderthals were clearly more distantly related to people living today than people today are to one another; in other words, the Neanderthals were about 65 percent more distantly related to me than I was to another person in the room in Dubrovnik. I could not stop myself from secretly peeking at some of my friends in the sun-lit room and imagining a Neanderthal sitting among us. For the first time I now had a direct genetic estimate of how much closer I was to one of them than to a Neanderthal.
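The conversion from the 12.8 percent figure to years is a direct scaling against the assumed human-chimpanzee split. A short sketch of that arithmetic, with hypothetical names and the rounded dates taken from the text:

```python
HUMAN_CHIMP_SPLIT_YEARS = 6_500_000  # assumed date of the common ancestor

def split_age(ape_like_fraction, lineage_years=HUMAN_CHIMP_SPLIT_YEARS):
    """Scale the ape-like fraction along the human lineage into years."""
    return ape_like_fraction * lineage_years

neanderthal_split = split_age(0.128)
print(f"Neanderthal-human DNA split: about {neanderthal_split:,.0f} years ago")

# Compare the rounded ages quoted in the text.
extra = 830_000 / 500_000 - 1
print(f"Neanderthals are about {extra:.0%} more distant than two living people")
```

The result depends entirely on the 6.5 million year calibration; a different date for the common ancestor with chimpanzees would shift both ages proportionally while leaving their ratio unchanged.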

  The biggest question on everyone’s mind was whether or not Neanderthals and modern humans had interbred. This was David’s question to answer, and although he hadn’t been able to join us in Dubrovnik, he explained over a speaker phone the analyses that suggested interbreeding. We discussed his results not only in the sessions but throughout coffee breaks and long, lavish and delicious Mediterranean meals that our hosts had organized. The question even dominated the morning runs that Johannes and I took on the outskirts of Dubrovnik, distracting us from the city’s medieval beauty and the damage suffered during the recent Balkan war, although not so much that we failed to stick to the paved roads to avoid land mines. Our conversations invariably centered on the intimate relations that may have taken place between modern humans and Neanderthals, who until 30,000 years ago had lived in the very area where we were jogging.

  One thing that worried us was that all of our admixture analyses relied on Nick’s count of nucleotide matches between the Neanderthal data and either African, European, or Chinese individuals. That left us vulnerable to errors in Nick’s computer code, whose output Nick himself was the first to stress needed checking. An error could also come from some subtle but systematic differences in the techniques used to sequence the modern humans, or from the way Jim Mullikin had mapped them to the human reference genome to find SNPs. The effects of error could be great even if the errors were small; after all, we were talking about differences of only 1 or 2 percent.

  During the sessions we collected a list of things to do to check Nick’s and David’s results. Jim would align his modern human sequences to the chimpanzee genome instead of the human genome to eliminate any bias that might come from the fact that the human reference genome came partly from a European and partly from an African individual. But we also felt that we needed to generate our own DNA sequences from present-day humans. By doing so, we could be certain that they were all produced and analyzed in exactly the same way. Accordingly, if there were systematic problems in our process, we could be certain that the sequences had the same types of errors in them. We decided to sequence the genomes from one person from Europe and one from Papua New Guinea. That might seem an odd choice, but it was prompted by the intriguing observation that we saw an admixture signal that was as strong in China as in Europe. Conventional wisdom had it that Neanderthals had never been in China, but I have always been ready to question paleontological conventional wisdom. Maybe there had been what I liked to call “Marco Polo Neanderthals” in China? After all, Johannes had shown in 2007 that Neanderthals—or at least humans carrying Neanderthal mtDNA—had lived in southern Siberia, some 2,000 kilometers further east of where paleontologists had thought they lived. Maybe some of them had made it to China? However, we were sure that there had never, ever been any Neanderthals in Papua New Guinea, so if we saw the admixture signal there, too, then Neanderthal genes had entered the ancestors of Papuans before they came to Papua, and presumably before Chinese and Europeans separated from each other. We also included a West African, a South African, and a Chinese person in our sequencing plans. With the genomes of these five individuals, we would then do all the analyses again to see if the results held up.

 
