DNA: The Secret of Life


by Watson, James


  The next order of business was to investigate and develop alternative sequencing technologies with a view to reducing overall cost to about 50 cents a base pair. Several pilot projects were launched. Ironically, the method that eventually paid off, fluorescent dye-based automated sequencing, did not fare especially well during this phase. In retrospect, the pilot automated machine effort should have been carried out by Craig Venter, an NIH staff researcher who had already proved adept at getting the most out of the procedure. He had applied to do it, but Lee Hood, as the technology's original developer, was preferred. This early rebuff of Venter was to have repercussions later.

  In the end, the HGP did not involve the wholesale invention of new methods of analyzing DNA; rather, it was the improvement and automation of familiar methods that ultimately enabled a progressive scaling up from hundreds to thousands and then to millions of base pairs of sequence. Critical to the project, however, was a revolutionary technique for generating large quantities of particular DNA segments (you need large quantities of a given segment, or gene, if you are going to sequence it). Until the mid-eighties, amplifying a particular DNA region depended on the Cohen-Boyer method of molecular cloning: you would cut out your piece of DNA, insert it into a plasmid, and then insert the modified plasmid into a bacterial cell. The cell would then replicate, duplicating each time your inserted DNA segment. Once sufficient bacterial growth had occurred, you would purify your DNA segment out from the total mass of DNA in the bacterial population. This procedure, though refined since Boyer and Cohen's original experiments, was still cumbersome and time-consuming. The development of the polymerase chain reaction (PCR) was therefore a great leap forward: it achieves the same goal, selective amplification of your piece of DNA, within a couple of hours, and without any need to mess around with bacteria.

  PCR was invented by Kary Mullis, then an employee of Cetus Corporation. By his own account, "The revelation came to me one Friday night in April, 1983, as I gripped the steering wheel of my car and snaked along a moonlit mountain road into northern California's redwood country." It is remarkable that he should have been inspired in the face of such peril. Not that the roads in Northern California are particularly treacherous, but as a friend – who once saw the daredevil Mullis in Aspen skiing down the center of an icy road through speeding two-way traffic – explained to the New York Times: "Mullis had a vision that he would die by crashing his head against a redwood tree. Hence he is fearless wherever there are no redwoods." Mullis received the Nobel Prize in Chemistry for his invention in 1993 and has since become ever more eccentric. His advocacy of the revisionist theory that AIDS is not caused by HIV has damaged both his credibility and public health efforts (see Plate 39).

  PCR is an exquisitely simple process. By chemical methods, we synthesize two primers – short stretches of single-stranded DNA, usually about twenty base pairs in length – that correspond in sequence to regions flanking the piece of DNA we are interested in. These primers bracket our gene. We add the primers to our template DNA, which has been extracted from a sample of tissue. The template effectively consists of the entire genome, and the goal is to massively enrich our sample for the target region. When DNA is heated up to 95°C, the two strands come apart (see Plate 40). This allows each primer to bond to the twenty-base-pair stretches of template whose sequences are complementary to the primer's. We have thus formed two small twenty-base-pair islands of double-stranded DNA along the single strands of the template DNA. DNA polymerase – the enzyme that copies DNA by incorporating new base pairs in complementary positions along a DNA strand – will only start at a site where the DNA is already double-stranded. DNA polymerase therefore starts its work at the double-stranded island made by the union of the primer and the complementary template region. The polymerase makes a complementary copy of the template DNA starting from each primer, thereby copying the target region. At the end of this process, the total amount of target DNA will have doubled. Now we repeat the heating step, and the whole process occurs again; once more, the number of copies of the DNA bracketed by the two primers doubles. Each cycle of this process results in a doubling of the target region. After twenty-five cycles of PCR – which means in less than two hours – we have a 2^25-fold (about 34-million-fold) increase in the amount of our target DNA. In effect, the resulting solution, which started off as a mixture of template DNA, primers, DNA polymerase enzyme, and free As, Ts, Gs, and Cs, is a concentrated solution of the target DNA region.
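
  The doubling arithmetic can be checked with a short sketch. It assumes every cycle doubles the target perfectly, which is an idealization; the function name pcr_copies and its parameters are hypothetical, chosen only for illustration.

```python
def pcr_copies(initial_copies: int, cycles: int) -> int:
    """Ideal copy count after `cycles` rounds, assuming perfect doubling each cycle."""
    return initial_copies * 2 ** cycles


# Fold increase from a single starting copy after twenty-five cycles.
fold_increase = pcr_copies(1, 25)
print(f"{fold_increase:,}")  # 33,554,432, i.e. roughly 34 million
```

  Real reactions tend to fall somewhat short of a clean doubling per cycle, so the figure is best read as an upper bound.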

  A major early problem with PCR was that DNA polymerase, the enzyme that does the work, is destroyed at 95°C. It was therefore necessary to add it afresh in each of the process's twenty-five cycles. Polymerase is expensive, and so it was soon apparent that PCR, for all its potential, would not be an economically practical tool if it involved literally burning huge quantities of the stuff. Happily Mother Nature came to the rescue. Plenty of organisms live at temperatures much higher than the 37°C that is optimal for E. coli, the original source of the enzyme; and these creatures' proteins – including enzymes like DNA polymerase – have adapted over eons of natural selection to cope with serious heat. Today PCR is typically performed using a form of DNA polymerase derived from Thermus aquaticus, a bacterium that lives in the hot springs of Yellowstone National Park.

  PCR quickly became a major workhorse of the Human Genome Project. The process is basically the same as that developed by Mullis, but it has been automated. No longer dependent on legions of bleary-eyed graduate students to effect the painstaking transfer of tiny quantities of fluid into plastic tubes, a state-of-the-art genome lab features robot-controlled production lines. PCR robots engaged in a project on the scale of sequencing the human genome inevitably churn through vast quantities of the heat-resistant polymerase enzyme. HGP scientists therefore especially resented the unnecessarily hefty royalties added to the cost of the enzyme by the owner of the PCR patent, the European industrial-pharmaceutical giant Hoffmann-LaRoche.

  The other workhorse was the DNA sequencing method itself. Again, the underlying chemistry was not new: the HGP used the same method worked out by Fred Sanger in the mid-seventies. Innovation came as a matter of scale, through the mechanization of sequencing.

  Sequencing automation was initially developed in Lee Hood's Caltech lab. As a high-school quarterback in Montana, Hood led his team to successive state championships; he would carry the lesson of teamwork into his academic career. Peopled by an eclectic mixture of chemists, biologists, and engineers, Hood's lab became a leader in technological innovation.

  Automated sequencing was actually the brainchild of Lloyd Smith and Mike Hunkapiller. Then in Hood's lab, Hunkapiller approached Smith about a sequencing method using a different colored dye for each base type. In principle the idea promised to make the Sanger process four times more efficient: instead of four separate sets of sequencing reactions, each run in a separate gel lane, color-coding would make it possible to do everything with a single set of reactions, and run the result in a single gel lane. Smith was initially pessimistic, fearing the quantities of dye implied by the method would be too small to detect. But being an expert in laser applications, he soon conceived a solution using special dyes that fluoresce under a laser.

  Following the standard Sanger method, a procession of DNA fragments would be created and sorted by the gel according to size. Each fragment would be tagged with a fluorescent dye corresponding to its chain-terminating dideoxy nucleotide; the color emitted by that fragment would thereby indicate the identity of that base. A laser would then scan across the bottom of the gel, activating the fluorescence, and an electric eye would be in place to detect the color being emitted by each piece of DNA. This information would be fed straight into a computer, obviating the excruciating data-entry process that dogged manual sequencing.
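
  The readout logic described above can be sketched in a few lines. This is an illustration only, not the software of any actual sequencing machine: the dye colors in DYE_TO_BASE and the function name read_sequence are assumptions, since the text does not say which color was paired with which base.

```python
# Hypothetical dye-to-base assignment; the real color scheme is not given in the text.
DYE_TO_BASE = {"green": "A", "red": "T", "yellow": "G", "blue": "C"}


def read_sequence(fragments):
    """Recover a base sequence from (length, dye_color) pairs.

    Each fragment ends in a dye-tagged chain-terminating base, so a fragment
    of length n reports the base at position n. Sorting by length stands in
    for the gel, and the color lookup stands in for the detector.
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(DYE_TO_BASE[color] for _, color in ordered)


# Fragments listed in arbitrary order, as they would be before size-sorting.
fragments = [(3, "yellow"), (1, "green"), (4, "blue"), (2, "red")]
print(read_sequence(fragments))  # ATGC
```

  Because all four colors travel together, a single lane carries the information that previously required four.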

  Hunkapiller left Hood's lab in 1983 to join a recently formed instrument manufacturer, Applied Biosystems, Inc. (ABI). It was ABI that produced the first commercial Smith-Hunkapiller sequencing machine. Since then, the efficiency of the process has been enormously improved: gels – unwieldy and slow – have been discarded and replaced with high-throughput capillary systems – thin tubes in which the DNA fragments are size-sorted very rapidly. Today, the latest generation of ABI's sequencing machines is phenomenally fast, some thousand times speedier than the prototype. With minimal human intervention (about fifteen minutes every twenty-four hours), these machines can produce as much as half a million base pairs of sequence per day. It was ultimately this technology that made the genome project doable.

  While DNA sequencing strategies were being optimized during the first part of the Human Genome Project, the mapping phase forged ahead. The immediate goal was a rough outline of the entire genome that would guide us in determining where each block of eventual sequence was located. The genome had to be broken up into manageable chunks, and it would be those chunks that would be mapped. Initially we pursued this objective using yeast artificial chromosomes (YACs), a means devised by Maynard Olson of importing large pieces of human DNA into yeast cells. Once implanted, YACs are replicated together with the normal yeast chromosomes. But attempts to load up to a million base pairs of human DNA into a single YAC exposed methodological problems. Segments, it was discovered, were getting shuffled, and since mapping is all about the order of genes along the chromosome, this shuffling of sequences was just about the worst thing that could happen. BACs (bacterial artificial chromosomes), developed by Pieter de Jong in Buffalo, came to the rescue. These are smaller, just 100,000 to 200,000 base pairs long, and much less prone to shuffling.

  For those attacking the human genome map head on – groups in Boston, Iowa, Utah, and France – the critical first steps involved finding genetic markers – locations where the same stretch of DNA drawn from two different individuals differed by one or more base pairs. These sites of variation would serve as landmarks for orienting our efforts throughout the genome. In short order the French effort, under Daniel Cohen and Jean Weissenbach, produced excellent maps at Généthon, a factorylike genomic research institute funded by the French Muscular Dystrophy Association. Like the Wellcome Trust across the English Channel, the French charity took up some of the slack created by insufficient government support. When, in the final push, detailed physical mapping of BACs became necessary, John McPherson's program at the genome center at Washington University was the major contributor.

  As the HGP lurched into high gear, the debate persisted about the best way to proceed. Some pointed out that a large portion of the human genome is what we in the trade call "junk," stretches of DNA that apparently don't code for anything. Indeed those stretches that encode proteins – genes – constitute only a small fraction of the total. Why therefore, these critics asked, should we sequence the entire genome – why bother with the junk? There is actually an extremely quick-and-dirty way to secure a snapshot of all the coding genes in the genome, using the reverse transcriptase technology described in chapter 5. Purify a sample of messenger RNA from any type of tissue; if your source is the brain, you will have a sample of RNA for all the genes expressed in the brain. Using reverse transcriptase, you can then create DNA copies (known as cDNAs) of these genes and the cDNAs can then be sequenced.

  This quick-and-dirty approach, however, was no substitute for doing the whole thing. As we now know, many of the most interesting parts of the genome lie outside genes, constituting the control mechanisms that switch the genes on and off. And so, in the case of the cDNA analysis of brain tissue just described, you will have an overview of the genes switched on in the brain but no idea how they are switched on: the hugely important regulatory regions of DNA are not transcribed into RNA by the RNA polymerase enzyme that copies the DNA strand into messenger RNA.

  Working at the relatively cash-strapped Medical Research Council (MRC) in Britain, Sydney Brenner pioneered this cDNA-based approach to large-scale gene discovery. With limited research funds, he figured that sequencing cDNAs was the most cost-effective way of using what little money he had. Keen to reap the commercial benefits of the sequences, the MRC prevented Brenner from publishing them until British pharmaceutical firms had a chance to position themselves to profit from them.

  On a visit to Sydney Brenner's lab, Craig Venter was impressed by this cDNA strategy. He could hardly wait to return to his NIH lab outside Washington, D.C., where he would apply the technique himself to produce a treasure trove of new genes. By sequencing even a small part of each one, Venter could determine whether or not it was new to science. In June 1991 an NIH official urged him to apply for patents on 337 of these new genes, although he had, in many instances, no clue about their function. A year later, having applied the technique more broadly, Venter added 2,421 sequences to the list submitted to the patent office. In my judgment, the very notion of blindly patenting sequences without knowledge of what they do was outrageous: what precisely was one protecting? This conduct could only be seen as a preemptive financial claim on a truly meaningful discovery someone else might yet make. I expounded my objections to the higher-ups at NIH, but to no avail. And the agency's persistence in endorsing the practice – a policy that was later reversed – spelled the beginning of the end of my career as a government bureaucrat. I had mixed feelings when Bernadine Healy, head of NIH, forced me to resign in 1992. Four years in the Washington pressure cooker had been enough. But what really mattered to me was that by the time of my departure, the Human Genome Project was undeflectably on course.

  Venter's taste of the commercial possibilities of patenting chunks of the genome whetted his appetite for more. But he wanted it both ways: to remain a part of the academic community, in which information was freely shared and salaries were small; and also to enter the business arena, in which his discoveries could be kept under wraps until the patent cleared and he could cash in. With the help of a fairy godfather, venture capitalist Wallace Steinberg (the inventor of the Reach toothbrush), Venter got his wish in 1992. Steinberg supplied $70 million to set up not one but two organizations: a nonprofit, The Institute for Genomic Research (TIGR, pronounced "tiger"), to be headed by Venter, and a sister company, Human Genome Sciences (HGS), to be headed by commercially inclined molecular biologist William Haseltine. It would work this way: TIGR, the research engine, would crank out cDNA sequences, and HGS, the business arm, would market the discoveries. HGS would always have six months to review TIGR's data prior to publication, except when the findings indicated potential to develop a drug, in which case HGS would have a year.

  Having grown up in California, Venter initially chose surfing over higher education. But a traumatic yearlong tour as a medical assistant in Vietnam during the war seemed to focus his mind, and on his return to the United States he acquired in short order an undergraduate degree and a Ph.D. in physiology and pharmacology from the University of California, San Diego. His migration from academia into the commercial sector made sense viewed in relation to his personal finances: by his own reckoning, he had $2,000 in the bank when he founded TIGR. But he was quick to turn his fortunes around: early in 1993 the British pharmaceutical company SmithKline Beecham, anxious for a stake in the genome gold rush, paid $125 million for the exclusive commercial rights to Venter's growing list of new genes. And a year later, the New York Times revealed that Venter's 10 percent share of HGS was itself worth $13.4 million. Not afraid to spend it, he dropped $4 million on an eighty-two-foot racing yacht, whose spinnaker he adorned with a twenty-foot image of himself.

  In the 1970s William Haseltine had been at Harvard as a graduate student under the joint direction of Wally Gilbert and myself. Afterward, he would run an innovative HIV research center at the medical school's Dana Farber Cancer Center. But it was his marriage to the multimillionaire socialite Gale Hayman (creator of the 1980s must-have perfume Giorgio Beverly Hills) that gave him the most visibility and ensured Haseltine had rather more than $2,000 in the bank when he set up HGS. Even before he went corporate, his jet-setting had provoked comment from members of his Harvard Medical School laboratory. "What's the difference between Bill Haseltine and God?" Answer: "God is everywhere; Haseltine is everywhere but Boston, where he's supposed to be."

  Precious little skill or ingenuity was involved in Venter and Haseltine's scramble to patent every human gene they could find on the basis of cDNA sequencing. TIGR and HGS were simply the biotech equivalent of the kids who round up all the toys at the playground just so no other kid can play with them (see Plate 41).

  In 1995, HGS filed a patent for a gene called CCR5. HGS's preliminary sequence analysis suggested that the gene encoded a cell-surface protein in the immune system, and was therefore worth "owning" since such proteins may potentially serve as targets for drugs affecting the immune system. CCR5 was one of a batch of 140 patents for similar genes that HGS applied for. But in 1996 researchers discovered the role of CCR5 in the pathway by which HIV, the virus that causes AIDS, invades the immune system's T cells. They also found that mutations in CCR5 were responsible for AIDS resistance: it had been observed that some gay men – who turned out to have mutated CCR5 genes – never contracted the disease despite repeated exposure to HIV. Thus CCR5 was and remains clearly destined to play an important part in our assault on HIV. But although it made no contribution whatsoever to the hard work and solid science that determined CCR5's central role in AIDS infection, HGS stands to profit enormously from simply having got its hands on the gene first; and by exacting a fee for every attempted application of the knowledge, its CCR5 patent will sorely tax an area of medical research that desperately needs every penny it has. Haseltine's response is by turns unapologetic – "If somebody uses this gene in a drug discovery program after the patent has been issued . . . and does it for commercial purposes, they have infringed the patent" – and indignant: "We'd be entitled not just to damages, but to double and triple damages."

 
