“And,” he continued, “I’m going to actually tell you some of the problems with that right now.”
Two-tiered testing for Lyme disease diagnosis is so named because it involves two sequential blood tests. The first measures antibodies to see if a patient’s immune system has produced enough to indicate potential exposure. Then, if the first test is positive, a second one, called the Western blot, checks to see if antibodies bind to specific Borrelia burgdorferi proteins, producing smudges on a test strip, called bands. These bands also indicate exposure—but with a catch.
A low-key fellow with spectacles, cropped white hair, and a red striped tie, Raymond Dattwyler then explained in eighteen minutes the flaws in the blood tests that have long defined who is and is not infected with Lyme disease. His manner was matter-of-fact, as if everyone had known this for years. The technology was based on cultures that missed distinctive proteins, he said, while including others not specific to B. burgdorferi. “That criteria that was developed in the 1980s and the 1990s, there’s a lot of problems with that.” It is “not that good.” And: “You can be seropositive for the rest of your life, it doesn’t mean you’re infected.”
I have attended other conferences and conducted many interviews in which the flaws of the Lyme diagnostic were similarly vented. But the speakers were mainly patient advocates and, especially, Lyme practitioners—doctors who have been disparaged by guidelines adherents for making Lyme diagnoses in the face of tests that had come back negative. Dattwyler’s comments, however, were coming from the side that had designed Lyme disease testing.
Under the diagnostic rules set by the panel on which he served, and adopted by the CDC in 1995, Lyme patients must achieve a minimum number of bands to test positive on the Western blot test—two out of three for one type of antibody or five out of ten for another. Get four of ten and, sorry, that is a negative. Yet the blot is curiously constructed, even arbitrary, designed to detect certain proteins while leaving out others that may be important. For one, it is skewed toward detecting one manifestation of Lyme disease over another. The five-of-ten-band scenario was modeled on a 1993 study showing the bands correctly diagnosed 96 percent of arthritis-related cases—but just 72 percent of neurological cases. The blot also includes a band that detects the flagella, common to many bacteria, along with bands that are far more indicative of Borrelia burgdorferi, something like a trunk on an elephant. Lyme doctors in the United States sometimes rely on these significant markers to diagnose the disease, ignoring the need for two of three or five of ten bands when symptoms and clinical judgment suggest Lyme disease.
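The decision rule described above can be written out in a few lines. What follows is only an illustrative sketch of the logic as this chapter describes it, not a diagnostic tool: the names `BlotResult`, `blot_positive`, and `two_tier_result` are hypothetical, and because the text does not name the two antibody types, the panels are labeled simply by how many bands each is scored against.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlotResult:
    """Band counts from a Western blot (hypothetical structure for illustration)."""
    three_band_hits: int  # bands present on the panel scored out of three
    ten_band_hits: int    # bands present on the panel scored out of ten


def blot_positive(blot: BlotResult) -> bool:
    """Apply the thresholds described in the text: 2 of 3, or 5 of 10."""
    return blot.three_band_hits >= 2 or blot.ten_band_hits >= 5


def two_tier_result(first_tier_positive: bool,
                    blot: Optional[BlotResult] = None) -> str:
    """Run the two tests in sequence; the blot only counts after a positive first tier."""
    if not first_tier_positive:
        return "negative"  # the second test is never reached
    if blot is None:
        raise ValueError("a positive first tier requires a Western blot result")
    return "positive" if blot_positive(blot) else "negative"


# Four of ten bands is still a negative, however close it looks.
print(two_tier_result(True, BlotResult(three_band_hits=0, ten_band_hits=4)))
print(two_tier_result(True, BlotResult(three_band_hits=0, ten_band_hits=5)))
```

The sketch makes the hard cutoff visible: nothing in the rule distinguishes a highly specific band from a nonspecific one, and a patient one band short is scored exactly like a patient with none.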
When Dattwyler spoke, it had been two decades since the choices were made of what bands to include, what to leave out, and how many were needed for diagnosis, decisions that had immense consequences for multitudes. Here, at this conference in Ottawa, was one of the most powerful directors of Lyme policy and practice in the United States and the world agreeing the technology was flawed, old, in need of replacement. “We didn’t have good definitions of what was in those Western blots,” he told the gathering. “Those were just bands on a gel.”
So what do those bands, or rather their absence, mean in real life? For a boy, sixteen, living near Boston, they meant seven weeks in a psychiatric hospital. After relapsing from a previous bout of Lyme disease, the boy had been ruled negative for Lyme disease after registering four of the requisite five bands on the Western blot—as close to positive as humanly possible. Although Lyme disease is well known to cause serious psychiatric symptoms, doctors diagnosed the boy with “pure mental illness,” not Lyme disease, according to an article in International Medical Case Reports journal. Enter a pathologist from New Milford, Connecticut, named Sin Hang Lee who had devised a test using DNA sequencing to search for the pathogen in human blood. When he submitted his findings on the boy’s blood to the GenBank repository of genetic sequences of the National Institutes of Health, they perfectly matched the DNA footprint for Borrelia burgdorferi.
But as often happens with research that bucks traditional Lyme dogma, Lee’s report was criticized in the scientific literature. While his test found Borrelia DNA, he was unable to culture the organism, leaving the case, as a 2017 letter in JAMA Internal Medicine put it, “unproven.” Beyond this, however, even if the boy had been positive in two-tiered testing—and this is a comment on the fallibility of the technology itself—he likely would have been ruled a “false positive,” with antibodies showing up from his previous infection. In short, Lyme testing is often a lose-lose proposition. Test negative and get no treatment. Test positive and get no treatment if already treated. This incenses Lee, a feisty former Yale professor who had escaped communist China in 1961 and is angry over the use of twin tests that miss many cases. What are the odds that the DNA he found in the boy’s blood wasn’t B. burgdorferi after hitting a match in the NIH GenBank? “It’s mathematically almost impossible,” he said.
Several months after the Ottawa conference, I reached out to Raymond Dattwyler, who is a tell-it-like-it-is kind of guy from the Bronx, where he was raised by working-class, high-school-educated parents, of whom he is rightly proud. He was direct in his criticisms of the two tiers of Lyme testing and how they have been used, and he said that they must go. “They were a stopgap measure. Those were never supposed to be cast in concrete. [They were] supposed to be used until something better came along,” he told me. “Twenty years ago, I would’ve said they’re fine. Now I say, ‘oh shit, we were wrong.’ It doesn’t look as good as we thought it was.”
When I questioned the upshot of this flawed instrument, what it meant for sick, undiagnosed patients, Dattwyler lapsed into the qualifiers guideline authors have used to acknowledge the testing regimen’s flaws while simultaneously defending its use. “The biggest problem is not sensitivity,” or too many false negatives, Dattwyler told me. “It’s specificity: too many false positives.” The major flaw, in other words, was not the negative tests among people who actually had Lyme disease, although that is certainly a problem if you are one of them. The real problem, the one Lyme researchers have been far more concerned with, as I wrote in chapter 4, was people who were wrongly diagnosed with Lyme disease and treated with antibiotics when they did not have it. The false positives. The double test, the high bar, the bright bands on a Western blot: these were all designed to avoid just that, weeding out people early on who might have this or that antibody but not really Lyme disease. But what about all those missed cases, I asked, the people who did not manifest the typical rash and tested negative? “You miss the early thing,” Dattwyler said bluntly, “because your tests suck and not everybody gets the rash and doctors don’t realize the rash is variable.”
“But later,” he said, “you don’t miss many at all.”
The Upshot
In April of 1996, the New York State Department of Health wrote a letter to the CDC about its concerns over the new two-tiered technology. Agency officials had gone back and reviewed their Lyme disease cases from before two-tiered testing was adopted to see how the new criteria for diagnosing and counting Lyme disease cases would play out. Officials were concerned. “If we followed a case confirmation scheme which incorporated the new two-test requirement for serologic [blood test] confirmation on our 1995 cases, 1,237 cases would not have been confirmed.” That meant that 31 percent of all diagnosed cases in 1995 would have been ruled negative. The letter cited one case in particular: A patient had tested positive on the first tier and negative on the Western blot—a CDC negative overall—but had a form of facial paralysis that is a signal indicator of Lyme disease. “Do I confirm the case…?” the letter writer asked. Over time, the answer became crystal clear: You don’t.
The warring camps and hunkered-down mentality that have dominated Lyme disease are a function of the diagnostics that Dr. Dattwyler spoke so frankly of at Ottawa and to me. Indeed, the major issue driving the Lyme controversy for two decades has been the lack of a dependable test to determine if someone is currently infected with Lyme disease. A 2013 Virginia law mandated that doctors inform potential Lyme patients that “current laboratory testing for Lyme disease can be problematic and standard laboratory tests often result in false negative and false positive results.” Even when it works, the test indicates only the presence of antibodies, which can last long after a prior infection, and not of the pathogen itself. That glaring gap in the Lyme diagnosis paradigm has hurt patients who need care and, beyond this, hampered research: How can we reliably enroll patients in studies, know if antibiotics work, and chart the effects of treatment if tests fail in a portion of cases?
The better question might be why two-tiered testing has been so fiercely defended for so long, why its square pegs have been jammed into round holes. In 2012, I interviewed a leading Lyme disease researcher-physician who has long been allied with the Infectious Diseases Society of America side. The researcher said, but later asked that I not use, this rather innocuous quote in regard to the test regime: “I don’t think there’s any question that everybody would like to have something better.” In the world of Lyme disease politics, I learned, there was a distinct aversion to stepping outside the company line, which holds that the test is fine. Barbara Johnson, a CDC microbiologist with close ties to IDSA Lyme leaders, wrote this in a book chapter in 2012: “An extensive peer-reviewed scientific literature supports the rationale for and performance of two-tiered serological testing.”
That statement works only if one believes the Lyme diagnostic’s low accuracy—about half of tests are correctly positive at all stages—is normal and acceptable. This is a view the CDC has long embraced. “During the first few weeks of infection, such as when a patient has an erythema migrans rash,” it has officially proclaimed, “the test is expected to be negative.” The body simply hasn’t produced enough antibodies. But the false negatives are okay, the CDC has held, because Lyme disease can be diagnosed based on early symptoms or by the Lyme disease rash. There are two problems with that.
First, a Lyme diagnosis is so controversial that many doctors want proof before treating. At three different points, the IDSA treatment guidelines advise physicians not to treat potential Lyme patients who do not have a rash or a positive test. In cases involving early neurologic, arthritic, and cardiac symptoms, the guidelines say symptoms simply “are too nonspecific to warrant a purely clinical diagnosis.” Confirmation, they say, requires “laboratory support” or “serologic testing.” This is an unambiguous way of telling physicians not to use their judgment, even in the face of symptoms and likely exposure.
Second, the CDC’s study of 150,000 patients found the rash in 69.2 percent of cases; officially, the CDC maintains 70 to 80 percent of infections manifest it. But even a rash does not guarantee correct diagnosis since it may not look like the classic reddish “bull’s eye” with a clear center. CDC photos show six variations, among many, including with a “bluish” hue, a “central crust,” and “dusky centers.” Just 9 percent of ninety-five people who developed Lyme rashes had the true bull’s eye, according to a 2002 study in the Annals of Internal Medicine. At Johns Hopkins School of Medicine, researchers studying 165 early Lyme patients reported good news and bad: 87 percent actually had a rash—higher than the CDC estimates—but about a quarter of those were still initially misdiagnosed. Yet without this misnamed, sometimes misidentified, and often overlooked skin lesion, the guidelines insist on a positive test before diagnosis.
This is what happens in the real world. Because just 30 to 40 percent of tests are correctly positive in the early weeks of infection, because the rash is unpredictable, and because Lyme symptoms are common to other maladies, a share of people leave their doctors’ offices undiagnosed and untreated. Some go on to feel better, get on with their lives, and then suffer crippling problems later on. That’s the Lyme progression when it goes untreated. Recall that even 10 to 20 percent of early treated patients suffer lingering problems. Most tragically, doctors have been encouraged, in cases with no rash, to allow infections to fester, then test later, even though patients may have symptoms and ticks may be active and infected locally. In patients without the rash, wrote Lyme pioneer Allan Steere and colleagues in 2016, “manifestations of Lyme borreliosis are typically diagnosed by recognition of characteristic clinical signs and symptoms along with serological testing.” The keywords in that sentence: along with. Diagnose by symptoms, Steere is saying, but also have a positive test. The assertion was made in a review of the literature published in Nature Reviews Disease Primers, one of many recitations of previous studies that have hammered home the mainstream Lyme message.
The CDC’s laissez-faire pronouncements, its reassurances that a faulty technology works, and the advisories of the most esteemed names in Lyme disease, I’d argue, have made doctors complacent, believing, wrongly, that either a rash or, sooner or later, the twin tests will diagnose their Lyme cases. In fact, neither can be counted on to occur, most especially early on but later too. Further, reassurances that the tests work have stalled urgently needed research. If it isn’t broke, as the saying goes.
Roberta L. DeBiasi, a pediatrician, wrote somewhat more realistically on the tests than CDC’s Barbara Johnson, if in somewhat dry medical prose, in a 2014 article in Current Infectious Disease Reports: “Many attempts have been made to evaluate serologic testing” for Lyme disease, she stated. “For even this basic measure of test validity, there is marked controversy in the medical literature.”
When I began reporting on Lyme disease in 2012, I asked Gary Wormser, the lead author on the Lyme disease guidelines and the physician most associated with Lyme policy in the United States, if the tests worked. It was the last time he would speak with me. I subsequently wrote an article that questioned the validity of the tests. He said then, in a comment that captures the one-hand, other-hand nature of two-tiered Lyme diagnosis: “We don’t recommend testing for people with the rash. A negative test doesn’t prove anything. If you’re sick six months, six years and you don’t have a positive test, give me a break.” This is the prevailing principle of Lyme diagnosis: The tests don’t work early, but most certainly work later. No rash, no positive test, no Lyme disease. What’s the issue?
Under this regime, nonrash patients with equivocal symptoms, such as flu-like illness, headache, and fever, may be told to return for testing if symptoms persist. Yet even then, cases may be missed. A CDC continuing education tutorial advises doctors that “convalescent phase” patients, the second stage after acute, will correctly test positive in standard two-tiered testing just 26 to 61 percent of the time—the range of four studies quoted that demonstrates the tenuous nature of Lyme diagnosis. Later on, patients with “early disseminated” Lyme disease, with symptoms like meningitis and facial palsy, the four studies reported, will be positive 73 to 88 percent of the time. That’s better but misses potentially one in four cases. It isn’t until the “late disseminated” phase that two-tiered testing reaches accuracy heights of 95 to 100 percent, the tutorial advises. Yet those are some of the toughest cases to treat.
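To see what those ranges mean in practice, it helps to turn them around and count the misses. The short sketch below does only that arithmetic, using the percentages quoted from the CDC tutorial; the names `SENSITIVITY_PERCENT` and `missed_per_100` are hypothetical, and the figures are the tutorial's ranges as reported here, not new data.

```python
# Share of truly infected patients who test positive, by stage, as quoted
# above from the CDC tutorial (low end, high end), in whole percentages.
SENSITIVITY_PERCENT = {
    "convalescent": (26, 61),
    "early disseminated": (73, 88),
    "late disseminated": (95, 100),
}


def missed_per_100(stage: str) -> tuple[int, int]:
    """Return (fewest, most) infected patients missed out of every 100 tested."""
    low, high = SENSITIVITY_PERCENT[stage]
    return 100 - high, 100 - low


for stage in SENSITIVITY_PERCENT:
    fewest, most = missed_per_100(stage)
    print(f"{stage}: {fewest} to {most} of every 100 infected patients test negative")
```

At the low end of the early disseminated range, 27 of every 100 infected patients test negative, which is where the figure of roughly one in four comes from.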
In 2016, British researchers looked at eighteen published studies and found the tests correctly positive just 54 percent of the time overall, a low figure that reflects early failure rates. Notably, these researchers found no standard definition of each Lyme stage—early, late, convalescent—which, they said, “prevented clear evaluation of test sensitivity.” When they looked at results by manifestation, they found good results in arthritis cases—96 percent accuracy. But for neurological Lyme disease, which can lead to memory and cognitive problems, numbness in the extremities, or psychiatric disorders, the study said testing was correctly positive in 87 percent of cases overall, leaving a significant share of potentially impaired people undiagnosed.
Beyond this, studies that measure the accuracy of Lyme disease tests should be viewed skeptically. Some rely on a kind of circular logic, selecting patients on which to validate the tests who have been known to suffer Lyme disease precisely because they had already tested positive in two-tiered testing. Researchers writing in Clinical Infectious Diseases in 2008 acknowledged this flaw: “It is problematic to determine the frequency” of positive tests in cases involving neurologic, cardiac, or joint problems because positive testing is “a part of the case definition.” A CDC-led study also acknowledged, “the possibility of selection bias toward reactive samples cannot be discounted.”
These and other flaws became eminently clear when a team led by Mariska Leeflang, a Dutch epidemiologist and testing expert, reviewed the methodology behind seventy-eight studies on the efficacy of Lyme disease tests. Leeflang’s 2016 article in the journal BMC Infectious Diseases concluded that every one of those studies suffered from “a high risk of bias” in at least one of four categories. In the end, her team’s exhaustive review did not find “sufficient evidence” to endorse current Lyme diagnostics. “These [study] designs are very likely to overestimate sensitivity and specificity,” Leeflang told me, which is to say they inflate the tests’ apparent ability to correctly identify both infected and uninfected people. In other words, the performance rates are best-case, not real-world, figures.