This was—and still is—the question for commercial personal genomics, the same one posed by Rabbi Hillel some two thousand years earlier: If not now, when?
Both 23andMe and Navigenics were born of this frustration and impatience. And both companies may have shared a catalyst.
In November 2006, a Wall Street Journal article about Augie Nieto, inventor of the Lifecycle exercise bike, described his tragic diagnosis of amyotrophic lateral sclerosis, or Lou Gehrig’s disease, at the age of forty-eight.64 Like breast cancer, ALS is mediated mostly by genes of strong effect in 5 to 10 percent of cases and partially mediated by a collection of much weaker genes in the other 90 to 95 percent. Even had Lou Gehrig never made his extraordinary “luckiest man on the face of this earth” speech, ALS would still be among the most heart-wrenching of diseases. While the body breaks down, the limbs atrophy, and one loses the ability to move and speak, the mind remains intact. It is an agonizing, inexorable process; most ALS patients die within five years due to respiratory failure.65
Nieto, however, had financial resources most other ALS patients didn’t. He was not going to succumb without a fight. He assembled a team of physicians and scientists, and through his newly hatched ALS foundation, “Augie’s Quest,” sponsored research into the genetic basis of the disease with the explicit goal of finding a cure. Among the researchers was Dietrich Stephan, then at the Translational Genomics Research Institute in Phoenix. At Nieto’s urging and with his financial backing, TGen collected 1,250 ALS samples in three months. The TGen team then typed the samples for hundreds of thousands of markers and assembled a list of twenty-five promising ALS susceptibility genes. Nieto’s doctor asked for his patient’s genotypes. Stephan wanted to oblige him but found that he could not.66
There were two main issues. “The genotypes were done in a research environment and not in a CLIA environment.”* In other words, TGen’s lab was not a certified clinical diagnostic lab and therefore the institute was forbidden from returning results to research participants. “Second, the IRB mandates total de-identification of research subjects/samples.”67 The Institutional Review Board (IRB), that is, the ethical review board charged with approving, monitoring, and reviewing TGen’s biomedical research on humans (every hospital and academic medical center has one or more IRBs; I serve on Duke’s), demanded that research samples be kept anonymous in order to protect subjects’ privacy and confidentiality. Whether the subject wanted to waive that right was immaterial. Augie Nieto could not view his own genetic data.
The investors behind 23andMe saw the Journal article. “This is why your company has to be here,” they told Linda Avey and Anne Wojcicki. “To give people access.”68 (And presumably to fill a market niche.)
Dietrich Stephan conceded that there was another issue, the same one he had confronted when trying to move complex genetic tests into clinical practice. “At the time [2006] we had absolutely no understanding of how to communicate probabilistic risk based on SNPs of very low effect size and what SNPs met the threshold for being ‘real.’” In other words, while a particular genetic marker might appear in ALS patients more than it does in controls, perhaps even a lot more, that was just the beginning; there might be ten, twenty, or fifty such markers. And even if there weren’t, an association between marker and disease was still a long way from understanding the significance of that marker, what it meant in the context of other DNA markers and/or the environment, and how it might be used to develop therapies. “We still don’t know [what’s ‘real’] for ALS. It’s actually one of the difficult problems that prompted the formation of Navigenics.”69
One could argue that Knome, too, arose from similar circumstances: demand for a service that was not yet available. As soon as news of the PGP became public, George Church began to get requests from wealthy people who were prepared to have him sequence their entire genomes on a fee-for-service basis. While he was encouraged by the fact that people were interested, it got to be a headache. “I felt this would be distracting from our academic mission both for my lab and for the PGP, both of which are active in nonprofit research operations. This seemed to me to be a textbook case for starting a company: to get it out of my hair. I thought it would be a good way of calling people’s bluffs and making sure they actually did want a whole-genome sequence.”70
Church founded Knome (he pronounces it “Know me”; the CEO says “Nome”) in 2007; it began enrolling customers at the end of the year. For $350,000, you could get your entire genome sequenced—all 6 billion base pairs, five to ten times over.71 When I mentioned it to my frugal wife she raised her eyebrows. “You’re getting quite a discount,” she said.
What do these origin stories mean? Commercial personal genomics was brought to term via multiple paths. For 23andMe, starting a company was a way to circumvent the inadequacies of publicly funded biomedical research and bring a “holistic view of genomics” to the masses, that is, genetics and self-knowledge in the form of social networking and ancestry. For Navigenics it was a way to make complex medical genetic risk information available to eager (and presumably well-heeled) consumers. For George, starting Knome was simply a way to make commercial personal genomics go away, to segregate it cleanly from his academic and nonprofit enterprises.
None of this is to say that the principals, with the possible exception of George, were not interested in making money—I am not that naïve. But there had to be easier ways to make money. And to lump the top-tier personal genomics companies in with Internet-based vitamin supplement salesmen and other varieties of modern snake oil or late-night infomercial fare was both facile and unfair. This was commerce, yes, but it was also rebellion. And it was dubious commerce, at least initially. While the VC dollars continued to flow, by 2010 no one had gotten rich selling personal genomic services to the public. Indeed, some had already lost their shirts.
In the near term, the elephant in the room would remain determining what all of this information meant. But what was “near term”? When discussing the PGP with friend and Broad Institute geneticist Stacey Gabriel, I said without thinking, “I’m not so worried about interpretation of the sequence. We have the rest of our lives to do that.”
“That’s good,” she said. “Because it’s going to take that long.”72
Stacey’s colleague Pardis Sabeti conceded that we were still in the very early days of this stuff, but insisted that that was not the point. “It’s unavoidable. This knowledge will be accessible and people will access it.”73
Statistical geneticist Tara Matise was more agnostic. She had arranged to have her father get his APOE genotype, since Alzheimer’s was his biggest concern. As for herself, she could afford it but was in no hurry. “My family is pretty healthy, luckily, and I have not made time to think about how many surprises I want.”74
I was still many, many months away from getting the full sequence of my twenty thousand genes, but a few weeks before the Navigenics soirée, Jason Bobe sent me an email: “ … here is your snp data. I’ve taken a look at it and I’m sorry to report that it’s pretty much all junk DNA.”75
Finally—something to peruse! But jeez Louise, the raw data file went on for days. I would need help to get through it. And I would get it. But even then, of the half million SNPs George’s lab typed me for, I would take a hard look at fewer than three hundred.
And even that slight peek inside of Pandora’s box was more than enough to blind me.
* Another source of difference can be found in much bigger stretches of DNA that vary in how many times they are present in an individual. You, for example, may carry five copies of a million-base-pair stretch of DNA on chromosome 17 while your next-door neighbor may have seven copies of the same region. These recently discovered bits are known as copy-number variants (CNVs). In 2008, the personal genomics company Navigenics typed customers for approximately 1 million CNVs.
* “Greg Mendel” was later “outed” as 23andMe cofounder Linda Avey’s husband.
† This was true—even the companies themselves disagreed as to the extent that race and ethnicity had an impact on individual risks. deCODEme tended to discount ethnicity while Navigenics viewed it as very important. Why does this matter? Genetics is all about context; genes do not operate in a vacuum. You and I, for example, may both have the same gene variant that affects the way we metabolize vitamin D. But if I live in Greenland and you live in sunny West Africa, that variant may have much different effects on our resistance to melanoma, our skin pigmentation, etc. Furthermore, if your ancestors have lived somewhere for thousands of years, you have likely inherited a whole mess of gene variants that are of particular relevance to survival in the local climate.
* In many ways, GWAS have been a disappointment: we’ve found a lot of disease genes, but they are weak—they don’t explain very much of the various diseases and that makes it hard to use that information in the clinic. If Crohn’s disease is caused by twenty genes, how do we design a drug that targets twenty proteins?
* A notable exception was the Gene Sherpa (http://thegenesherpa.blogspot.com/), run by a clinical genetics fellow, Steve Murphy. Murphy was engaged in his own high-risk entrepreneurial pursuit, the first freestanding genomic medicine practice. From the first he was a vocal critic of personal genomics companies.
* CLIA stands for Clinical Laboratory Improvement Amendments of 1988. It is the mechanism by which the Centers for Medicare & Medicaid Services regulate clinical laboratory testing of human specimens. There are some 189,000 CLIA-certified labs, most in the United States. CLIA is meant to ensure accuracy, timeliness, and reliability of lab test results. Some people—like me—are not convinced it does that, at least with respect to genetic testing.
5 Better Living Through Chemistry
Even though it was early February, with temperatures in the sixties and a gusty wind buffeting the Gulf Coast, the lobby of the Marco Island Marriott Beach Resort still smelled like coconut. And not cheap suntan-lotion coconut, mind you, but the actual coconut one would pull off a tree and crack open with a hammer—sweet but not overpowering; an upscale pheromone evocative of drinks with umbrellas, bikinis, white sand, and turquoise surf beckoning from just twenty yards away. Flip-flopped and sunglassed tourists waddled in from the heated pool and assorted Jacuzzis.
Sprinkled among them were a few hundred academic genome scientists and their industrial counterparts—that is, representatives from a select group of companies who cater to university molecular biology types. The nerds and the suits both expected to be pampered, and the opening night reception of the Advances in Genome Biology and Technology meeting suggested that they would not be disappointed. In the dark, spread out in the lush grass next to the pool, there was a tiki motif at work: palm trees, torches stuck in the ground, a band warbling Jimmy Buffett covers in the background, various meats on and off the bone, a bevy of other fresh food stations, and an open bar stocked with beer and wine and rum. The staff-to-attendee ratio was high: a festive-shirt-wearing person was always ready to help, to pour coffee or clear away one’s plate.
In part the lavishness came off almost as an act of defiance. With the memory of the 2005 hurricane season and the malevolent sisters Rita and Katrina already receding after a couple of years, this bit of Gulf Coast seemed intent on behaving as though it were oblivious to its precarious location, let alone to whatever curveballs La Niña and climate change might hurl its way. The Marco Island beach was, as ever, dotted with wealthy pensioners moving in and out of their space-age, Bauhaus-on-steroids condos.
At the meeting itself, the opulence was subsidized. When I checked in, the clerk at the front desk handed me two nondescript key cards inside a Marriott envelope. I asked where registration for the genomics meeting was and he stopped short for a moment and then pulled back the envelope before I could pick it up. “I am going to give you different keys, sir,” he announced. The new ones were emblazoned with a golden double helix and Applied Biosystems, Inc.’s catchphrase for its newest DNA sequencing machine: “The next generation is SOLiD™.” (SOLiD™, as I would be reminded on many occasions over the next three days, stands for “sequencing by oligonucleotide ligation and detection.”) I felt as though I had been given a backstage pass to a rock concert. But that was only the beginning: ABI and a handful of other sequencing companies had underwritten the coffee breaks, the meals, the poster sessions, and the boatloads of goodies stuffed inside our complimentary High Sierra backpacks—the pens, the lab notebook, the candy, the beach balls, the digital timer. Plenty of swag for my daughters.
DNA sequencing has only existed since the 1970s and neither of the two original methods was ever patented.1 So how did it become a multibillion-dollar bonanza? Until the last couple of years, the overwhelming majority of DNA sequencing was based on a principle developed by the unassuming English biochemist Fred Sanger, earning him the second of his two Nobel Prizes in 1980 (the first, in 1958, was for figuring out how to deduce the sequence of amino acids that make up proteins, the end products whose identities are embedded in the DNA code). Around the same time, a chemically based method of DNA sequencing was developed by George’s mentor, Walter Gilbert, and Gilbert’s student, Allan Maxam. Both methods were labor-intensive in the beginning; Sanger’s method was easier to automate and eventually overtook Maxam and Gilbert’s. And even though one could not generate much sequence in a single experiment in the early days, Gilbert said that something had shifted. “In 1975 Allan and I and Fred made sequencing a [laboratory] staple. We changed the problem from impossible to pretty easy.”2
Sanger’s DNA sequencing method, which seems to me no less ingenious now than it did when I first learned about it in the 1980s, exploited the same enzyme our cells use to manufacture DNA: DNA polymerase (most enzymes bear the suffix -ase). Essentially, “Sanger sequencing” involves putting DNA polymerase in a tube along with the DNA one wants to sequence, plus DNA building blocks, or deoxynucleotides, each of which contains one of the four DNA bases, adenine (A), thymine (T), guanine (G), and cytosine (C). The “deoxys” are the raw material that the polymerase enzyme uses to extend the DNA chain. But the key to the method was that Sanger also added a small quantity of slightly altered versions of the bases. These “dideoxy” versions could also be added to the growing chain the polymerase is churning out, but each one is a dead end—the chain cannot be extended from a dideoxynucleotide: imagine a section of railroad track with a bumper at one end preventing the track from being elongated. Thus, after the enzyme does its work, the test tube is full of DNA strands of various lengths, each one capped at a random place by a chain-ending dideoxy A, T, G, or C. If one could resolve those chains by size, Sanger reasoned, then it should become possible to read the sequence of a DNA molecule from one end to the other: each nucleotide like the rung of a ladder.3
But how to resolve the different-sized molecules? For nearly two decades the method of choice was a gel made of a thin layer of a latex-like chemical called polyacrylamide. The principle is simple: when exposed to an electric field, shorter DNA molecules migrate through a polyacrylamide gel faster than longer ones. Thus, by loading the contents of the test tube full of different-sized DNA fragments into a vertical gel and cranking up the voltage, one could get an overlapping ladder of DNA molecules and read it in order. Automated Sanger sequencing reads yield eight hundred bases per run, sometimes more (the protein-coding portion of a typical human gene is about two thousand bases long).4
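For readers who think in code, the logic of chain termination and size separation can be sketched in a few lines of Python. This is a toy model under loud assumptions, not a description of any real sequencing software: the function names are invented for illustration, every possible chain-terminated fragment is assumed to form exactly once, and primers, reaction chemistry, and read-length limits are ignored.

```python
import random

# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template):
    """Return the strand a polymerase would build against the template.

    Both strands are written 5' to 3', so the new strand is the
    reverse complement of the template.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template))


def chain_terminated_fragments(new_strand):
    """Model the contents of the test tube after the reaction.

    Assumption: one fragment ends at every position, capped by the
    dideoxy version of the base at that position. Each fragment is
    recorded as (length, terminal_base).
    """
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_ladder(fragments):
    """Resolve fragments by size and read the terminal bases in order.

    Sorting by length stands in for the gel: the shortest fragment
    travels farthest, the longest travels least.
    """
    return "".join(base for _, base in sorted(fragments))


template = "GATTACA"  # hypothetical stretch of DNA to be sequenced
fragments = chain_terminated_fragments(synthesize(template))
random.Random(0).shuffle(fragments)  # the tube holds them in no particular order
print(read_ladder(fragments))
```

The sort is the whole trick: once the fragments are ordered by length, the identity of the last base on each one spells out the newly made strand, one rung at a time.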
This was all well and good—exciting even: Sanger sequencing meant that the genomic Rosetta Stone could now be sounded out, even if it wasn’t clear what the words meant. For me, reading a clean piece of DNA sequence that might be harboring a disease-causing mutation was one of the thrills of my graduate student experience. But the setup—the “workflow,” as the corporate sales reps call it—was decidedly less thrilling. Pouring gels between pairs of glass plates, starting over when they developed bubbles, letting them solidify (“polymerize”), loading them with extreme care, disassembling them several hours later, transferring them to paper and exposing the paper to film, and then reading the sequence by hand … all of it got to be a drag. When I was working on my master’s in the late 1980s, my adviser used to walk through his human genetics lab and insist to his technicians, students, and postdocs, “This is not a factory.” Methinks he protested too much: We had an assembly line where we performed nearly identical experiments examining genetic markers, running gels, and seeing what came of them. Day after laborious day, each one divided from the next only by lots of beer and a little bit of sleep. It wasn’t quite assembling widgets, but I’d argue it was every bit a factory. And so it became with sequencing and me: I seemed to be most successful when I got into a kind of Zen state and didn’t overthink things. (For me, this was difficult—without strong medication of one kind or another, I am a pretty lousy Buddhist.)
Whatever its sweatshop-like qualities, the public and private versions of the Human Genome Project initially used more or less the same assembly-line Sanger approach. Several of the major sequencing centers hired dozens of people whose only job was to pour gels; they would often come in to work in the middle of the night. Variations of this workflow led to the sequence of a composite human genome ahead of schedule and under budget.* Indeed, within a few years automated capillary DNA sequencing, spurred mainly by demand from the HGP, produced yet another revelation. Suddenly there was no gel; it had been replaced by capillary tubes into which the four sequencing reactions were injected. By the late 1990s, there were two commercial capillary platforms. Molecular Dynamics (later Amersham and eventually GE Healthcare) offered the MegaBACE beginning in 1998. In December of that year, Applied Biosystems (then PerkinElmer) began shipping its PRISM 3700, which was eventually succeeded by the 3730.5 By then the game was afoot: the major taxpayer-funded public sequencing centers had had a fire lit under them by the upstart Craig Venter, the iconoclastic public face of a private initiative that wound up sequencing the human genome in parallel to—and in competition with—the government-funded Human Genome Project. With Venter’s heretical commercial entry into human genome sequencing and ambitious plans to annotate and sell the information, there was a huge, ready-made market for whichever sequencing platform could walk the walk.6
Here Is a Human Being Page 9