Snowball in a Blizzard
In writing this, I am ignoring an entire area of controversy, that of selective publication bias. A 2008 study in the New England Journal of Medicine looked at antidepressant drug trials that had to be registered with the Food and Drug Administration, and found that out of thirty-six trials that did not show benefit, only three were published as negative, while eleven were published but “spun” to appear more positive than was merited, and the remaining twenty-two weren’t published at all. That contrasted with thirty-eight antidepressant trials that did show benefits, of which all but one were published. The large majority of these trials involved SSRIs. “Selective reporting of clinical trial results may have adverse consequences for researchers, study participants, health care professionals, and patients,” the authors write in conclusion, in what was almost certainly the New England Journal’s award winner for understatement of 2008.
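The arithmetic behind that contrast can be sketched in a few lines. The counts below are the ones quoted in the passage; the variable names are my own, and the comparison of "registered" versus "published" shares is simply one way of reading those counts, not a reproduction of the study's own analysis.

```python
# Counts quoted in the passage (FDA-registered antidepressant trials).
negative_total = 36
negative_published_as_negative = 3
negative_published_spun = 11   # published, but framed as positive
negative_unpublished = 22

positive_total = 38
positive_published = 37        # "all but one"

# Sanity check: the three negative-trial categories account for all 36.
assert (negative_published_as_negative
        + negative_published_spun
        + negative_unpublished) == negative_total

# What the full registry shows.
all_trials = negative_total + positive_total
registered_positive_share = positive_total / all_trials

# What a reader of the journals would see.
published = (positive_published
             + negative_published_spun
             + negative_published_as_negative)
apparent_positive = positive_published + negative_published_spun
published_positive_share = apparent_positive / published

print(f"Registered trials showing benefit: "
      f"{positive_total}/{all_trials} = {registered_positive_share:.0%}")
print(f"Published trials appearing positive: "
      f"{apparent_positive}/{published} = {published_positive_share:.0%}")
```

On these counts, roughly half of the registered trials showed benefit, while nearly all of the published ones appeared to.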
The fact that the value of an SSRI is probably most pronounced in the most depressed patients hasn’t stopped SSRIs from being marketed as something of a panacea: according to CDC data, about one in every ten Americans now takes an antidepressant, the majority of which are SSRIs. These aren’t prescriptions being written by psychiatrists doing careful assessments with Hamilton Depression Scales, however. A recent study in the Journal of Clinical Psychiatry estimated that nearly three-quarters of all antidepressant prescriptions are given by primary care physicians, a group much less likely to be adept with depression scales and with little time to read the literature on outcome studies.†
Even among psychiatrists, the utilization of depression scales is hardly widespread. A survey of British psychiatrists published in 2002 indicated that nearly 60 percent of them never used a depression scale to monitor response to treatment. Presumably, however, shrinks are able to assess the severity of depression with greater accuracy than primary care physicians even without formal scales, due in part to their clinical expertise, as well as the fact that PCPs must manage the entirety of their patients’ medical issues, while psychiatrists concern themselves exclusively with the soul.
Primary care physicians must manage an astonishing number of their patients’ medical problems, of which depression is only one in a list that can be extensive, particularly if the patient is elderly. This also means that the assessment of depression can often be given short shrift, with the “diagnosis” made following a rapid-fire question-and-answer session that can range from eating habits and exercise regimens to discussions about “bad habits” like cigarettes and alcohol, to say nothing of a physical exam and a review of any pertinent laboratory tests. The diagnosis of depression can be made at an even greater remove in the case of the elderly, who are frequently accompanied by family members who serve as surrogate “historians,” quickly summarizing details for hurried and harried doctors whose minds can be just as focused on documenting all the findings as on considering their implications. In this environment, it is all too easy for a busy physician to provide an overly casual diagnosis of depression.
And once patients start an antidepressant, they tend to stay on it, especially because the very doctors who might have made a hasty diagnosis in the first place do not often consider what the endpoint should be, and do not have a strategy to evaluate whether the patient has improved. By such means a condition that used to be considered a largely transitory state morphs into a permanent feature of a patient’s medical profile, as unchanging as DNA. Whether there are any discernible benefits for these patients, and whether there may be risks from side effects or interactions with other drugs, is much less clear. The long-term use of antidepressants is very much in the middle of the spectrum of certainty. There is a paucity of information about the risks versus benefits in people who stay on treatment for years, even though millions may take such drugs for that long.
The Escalation of Medicalization
What I have tried to describe above is something approaching the catch-22 situation of modern drugs. We have built a formidable formulary of medications that would have been the envy of healers of yore. Statins really are quite remarkable as well as lifesaving, and SSRIs are at least reasonably safe drugs that may be genuinely beneficial for certain patients. But the trend over the past decade or more has been to gradually expand the indications for various drugs.
My point here is that, as the indications for these medications continue to reach into populations previously considered “normal,” our confidence that we are helping our patients becomes ever more shaky, and whatever benefits can be accurately measured become ever smaller. As I have tried to show, when this happens, our uncertainty increases, and the harms become more worrisome. When that middle point is reached on the spectrum of certainty, the Latin phrase floats to the front of consciousness: primum non nocere. First, do no harm.
This holds true for medications well beyond drugs for cardiac disease and depression: it can be seen in the guidelines for hypertension that I discussed earlier, or in the gradual lowering of the blood sugar levels that define diabetes, to name but two critical categories. No sane physician denies that hypertension and diabetes are conditions that must be treated, and moreover that our modern treatments are unquestioned scientific triumphs. However, the lifesaving value of lowering one’s systolic blood pressure from 180 to 150 is much, much greater than that of lowering it from 150 to 120, a point I will return to at the end of the book. Similarly, reducing someone’s hemoglobin A1C (which measures the severity of diabetes) by two points is tremendously beneficial if one drops it from 10 to 8; it’s not nearly as lifesaving when it drops from 8 to 6.
What I’ve also hinted at thus far is that there are other drivers of medication prescription and medication use besides pure scientific knowledge of a drug’s precise benefit. It is unclear whether pharmaceutical companies consciously sought to have primary care physicians become the main dispensaries of antidepressants, but it would certainly make perfect business sense to encourage this group to feel comfortable writing SSRI prescriptions. It bears emphasis that PCPs are busy, and the Hamilton Scale clearly cannot be performed accurately without a lengthy evaluation. But these PCPs, who are almost certainly struggling to stay on top of the revised statin guidelines or the mammography guidelines or the blood pressure management guidelines, among many other medical matters, almost never have the luxury of time to make a careful assessment about the risks and benefits of SSRIs. Because SSRIs are generally well tolerated (like statins), many PCPs must assume that it is easier to just prescribe (and maintain) an SSRI than it is to have a prolonged discussion about whether a given patient will experience a true benefit. The avoidance of such topics in a primary care office must benefit the financial bottom line of drug companies, whether that was by design or was simply a happy accident.
Drugs are neither miracles nor curses—they are, alas, both. Their value can be properly assessed when the size of their benefit is weighed against the risks of their use. Doctors must weigh these two variables every day when faced with their patients’ needs. In some cases, the task is easy: the smoking septuagenarian fresh from a recent hospital stay for a heart attack really does need to start taking a statin barring some compelling reason not to.* Yet does an otherwise healthy fifty-year-old whose father smoked require a statin with the same level of urgency? It’s the same drug, but when we consider that drug in relation to the second patient we suddenly find ourselves somewhere else on the spectrum of certainty. Understanding these shifts—both as physicians and as patients—is critical to understanding the true value of a drug, and whether we’re willing to tolerate the harm that it may do us for the benefits that it may offer.
In medical parlance, the “compelling reason not to take it” is referred to as a contraindication.
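One way to see why the same drug lands in different places on the spectrum for those two patients is to work through the arithmetic of absolute benefit. The sketch below is illustrative only: the baseline risks and the relative risk reduction are hypothetical numbers chosen for round results, not figures from any statin trial, and the function name is my own.

```python
from __future__ import annotations


def absolute_benefit(baseline_risk: float,
                     relative_risk_reduction: float) -> tuple[float, float]:
    """Return (absolute risk reduction, number needed to treat).

    baseline_risk: chance of the bad outcome without the drug (0 to 1).
    relative_risk_reduction: fraction of that risk the drug removes (0 to 1).
    """
    if not 0 < baseline_risk <= 1:
        raise ValueError("baseline_risk must be in (0, 1]")
    if not 0 < relative_risk_reduction <= 1:
        raise ValueError("relative_risk_reduction must be in (0, 1]")
    arr = baseline_risk * relative_risk_reduction
    nnt = 1 / arr  # patients treated to prevent one event
    return arr, nnt


# HYPOTHETICAL values, for illustration only.
RELATIVE_RISK_REDUCTION = 0.25
hypothetical_baseline_risk = {
    "septuagenarian smoker after a heart attack": 0.30,
    "otherwise healthy fifty-year-old": 0.03,
}

for patient, risk in hypothetical_baseline_risk.items():
    arr, nnt = absolute_benefit(risk, RELATIVE_RISK_REDUCTION)
    print(f"{patient}: absolute risk reduction {arr:.2%}, "
          f"about {nnt:.0f} treated to prevent one event")
```

Under these assumed numbers the drug removes the same fraction of risk in both patients, yet the absolute benefit differs tenfold, while the potential for side effects stays roughly the same. That difference is the shift along the spectrum.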
As we move to the next chapter, we’ll look more directly at how uncertainty and the corporate world’s need for profits can prove a combustible combination and change the lives of millions of Americans in practically the blink of an eye, or at least a few minutes after a health report is broadcast by the media and circulated on the Internet. Just as in this chapter, the final arbiter of a medication’s value will be found in a double-blind, placebo-controlled trial.
However, unlike the statins and SSRIs I discussed in this chapter, there was already an established practice of giving a medication before critical trials were performed. How did that come about? And what was the scientific evidence that supported its use? The answer is that a different type of study was used. We still commonly use this type of study today, and although it can be an effective tool to understand medicine, it introduces new elements of uncertainty. Exploring the parameters of that uncertainty is where I will begin, by looking at a small group of young women who developed a strange and devastating disease just over fifty years ago.
7
THE CORRELATION/CAUSATION PROBLEM, OR WHY DARK CHOCOLATE MAY NOT LOWER YOUR RISK OF HEART FAILURE
The science was accurate but it was extrapolated beyond imagination.
—CYNTHIA PEARSON, EXECUTIVE DIRECTOR OF THE NATIONAL WOMEN’S HEALTH NETWORK, ON HORMONE REPLACEMENT THERAPY RESEARCH, 2002
Uncertainty in medicine takes a variety of forms. When we looked at the drug trials in the previous chapter, we witnessed a variety of questions, each dealing in whole or in part with uncertainty, that affect our perception of the value of a drug. First, what will be the yardstick by which we will measure a drug’s value? Will it be something indisputable (like dying) or something subject to interpretation, and thus more difficult to quantify and measure (like feelings)? Second, how big will any potential observed benefit need to be before we consider it a success? Third, what are the potential harms of the treatment? Every type of treatment we offer our patients, from drugs to surgery to electroshock therapy, involves consideration of each of these factors. Because the answers to these questions differ for each treatment, and because the answers tend to fall onto a continuum rather than cozy themselves into a tidy binary yes-or-no category, doctors and patients alike need to carefully consider the data before “knowing” that a drug is right for them.
Nevertheless, the power of the double-blind, randomized, placebo-controlled trial lies in its ability to ask these questions in an organized and systematic way. We can say this drug saves this many lives (great!) but comes at the cost of these side effects, of which these particular effects are truly dangerous (not great). Such trials do not settle questions; they give us a framework by which we can ponder uncertainty and decide where to place a drug’s value on the spectrum.
In this chapter, we add a new form of uncertainty to the mix. Drug trials like the kind I have just discussed look forward in time. But not all clinical research is performed this way, and sometimes we look backward to see whether some remote event led to a disease that we’re seeing now. We can compare people who are suffering from a disease to those who are not and look for meaningful differences between the two, whether they be lifestyles, education levels, general emotional outlook, or medications, among many other factors. But we’re not waiting around to see which of them will develop disease and testing whether something prevents it; we’re looking at the disease right now and wondering whether something in the past brought them to their current state.
For instance, we could design a study where we took one thousand people with lung cancer, compared them to one thousand people who don’t have lung cancer, and looked for a difference in the proportion of people who smoked cigarettes.* Here the people with lung cancer are the cases and the people without it are the controls. At least part of that language should seem familiar, because the controls in this research serve the same kind of baseline role as the “do nothing” control category that we saw in drug trials. From there, we can calculate the percentage of smokers in the case group, do the same for the control group, and evaluate the relative proportions to see whether there is a statistically significant difference between the two. At the end, we have generated a case-control study. (We could also start from the factor itself, comparing those who have smoked with those who have not, and see whether smoking is associated with more or less disease.) Research of this type is known as retrospective, and its calling card is that it looks for meaningful associations in the past. We call those meaningful associations correlations.
Discussed in the Appendix.
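To make the comparison concrete, here is a minimal sketch of the first design described above: one thousand people with lung cancer set against one thousand without, compared on how many smoked. The counts are hypothetical, invented purely for illustration, and the function and field names are my own. The significance check is a standard two-proportion z-test, which is one common choice among several.

```python
import math


def compare_smoking(smokers_with_cancer: int, n_with_cancer: int,
                    smokers_without_cancer: int, n_without_cancer: int) -> dict:
    """Compare the share of smokers in two groups defined by disease status."""
    nonsmokers_with = n_with_cancer - smokers_with_cancer
    nonsmokers_without = n_without_cancer - smokers_without_cancer
    cells = (smokers_with_cancer, nonsmokers_with,
             smokers_without_cancer, nonsmokers_without)
    if min(cells) <= 0:
        raise ValueError("every cell of the 2x2 table must be positive")

    share_with = smokers_with_cancer / n_with_cancer
    share_without = smokers_without_cancer / n_without_cancer

    # Odds ratio from the 2x2 table: odds of having smoked among those
    # with cancer, divided by the same odds among those without.
    odds_ratio = ((smokers_with_cancer * nonsmokers_without)
                  / (nonsmokers_with * smokers_without_cancer))

    # Two-proportion z-test using the pooled proportion.
    pooled = ((smokers_with_cancer + smokers_without_cancer)
              / (n_with_cancer + n_without_cancer))
    std_err = math.sqrt(pooled * (1 - pooled)
                        * (1 / n_with_cancer + 1 / n_without_cancer))
    z = (share_with - share_without) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided

    return {"share_with_cancer": share_with,
            "share_without_cancer": share_without,
            "odds_ratio": odds_ratio,
            "z": z,
            "p_value": p_value}


# HYPOTHETICAL counts, for illustration only.
result = compare_smoking(smokers_with_cancer=800, n_with_cancer=1000,
                         smokers_without_cancer=400, n_without_cancer=1000)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

A small p-value here says only that the difference in smoking between the two groups is unlikely to be a fluke of sampling. It establishes an association and says nothing, by itself, about what caused what.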
The additional layer of uncertainty involves whether a correlation can be considered equivalent to causation. Can one know that smoking causes lung cancer from studies like these? This seems a naïve question because everyone knows almost by instinct that of course smoking causes lung cancer. Yet one of the greatest biostatisticians of the mid-twentieth century remained skeptical of the link when looking at the early retrospective studies of smoking and lung cancer, in part because of his insistence that correlation does not equal causation. (That he was a chain smoker might have influenced his interpretation of the data; you can meet him, and his fibrotic lungs, later in the Appendix.)
“Correlation does not equal causation”—graduate students in public health or biostatistics memorize this phrase when learning about retrospective studies. If I take my umbrella to work, and it rains as I walk, it didn’t rain because I did so. Here the chain of causation is totally reversed. Again, this is the easy example, as we can immediately grasp the chain of relationship between rain and umbrella use. Just because two things are related by time and one thing happened to happen before the other, it hardly means that the first thing caused the second thing to occur. So why bother with this kind of research at all if it is so hopelessly mired in an Escher-like unsolvable loop?
For starters, to dismiss retrospective studies as useless is to turn our collective scientific back on an enormous volume of high-quality research that really can teach us things about our bodies, what makes them work and what makes them fail. Consequently, having a nuanced understanding of the scientific ambiguity produced by the correlation/causation problem is a valuable next stop on the tour of uncertainty. Moreover, studies don’t have to be retrospective to have correlation/causation problems; we’ll also look at some prospective studies that are plagued by the same issue.
In this chapter, we’ll encounter correlations that really did turn out to be causations. But we’ll also see some pretty dramatic failures. Only by looking at both can we start to appreciate the value of research that utilizes correlation, which is commonly retrospective in nature.
Tragedy, Illustrated Through Statistics
In 1965, a young doctor named Arthur Herbst joined the faculty of the Boston Lying-In Hospital, a venerated institution in Boston circles, which had been established more than a century before as one of America’s first health-care facilities devoted exclusively to obstetrics.* Herbst had graduated from Harvard Medical School not long before, and after finishing his residency took up his clinical work with aspirations of becoming an academic physician, pushing the boundaries of what was known about the practice of obstetrics and gynecology.
The Lying-In Hospital would merge in 1966 with the Free Hospital for Women (itself nearly one hundred years old) and after a series of successive mergers would become part of what is now called Brigham and Women’s Hospital.
It would not take long for him to get his wish, and, as is so often the manner of physicians who become known to posterity, his academic accolades came about through the dutiful and perfunctory documentation of the sufferings of patients. Just as Herbst joined the faculty, a small number of young women were sent to the Lying-In for vaginal bleeding. In and of itself, this was, then as now, a fairly common problem. But these particular women would leave the hospital with the most unlikely of diagnoses, not to mention having their sense of womanhood utterly shattered. For the women, who numbered about a half dozen, underwent biopsies to discover they had a cancer known as clear cell adenocarcinoma, and, following the biopsies, underwent hysterectomies to save their lives.
Clear cell adenocarcinoma under any circumstances is unusual, but in young women it was essentially unheard of, as this was widely known at the time to be a cancer of postmenopausal women. This made no sense. Herbst must have treated hundreds, perhaps thousands, of women during this time, yet six or seven unusual cancers in a span of a few years made an impression. There was no description of this phenomenon before. So Herbst, along with a senior colleague named Robert Scully, wrote a paper about these cases in what is known as a case series.
The case series is, in some sense, about the lowest form of medical investigation, meant only to communicate some novel finding. Look here, a case series says, something odd is happening. It is a flare sent up through the technical language of journals, asking aloud, in effect, whether anyone else has seen similar stuff. Herbst and Scully’s paper was published in the journal Cancer in April 1970.
The acceptance of such a paper in a prestigious journal like Cancer bodes well for the career of a young clinical researcher, but if the paper had ended there, it would not merit particular attention more than forty years later. The medical literature is studded with reports such as this, some of which are meticulous in their collection of details and learned in their explanation of the importance of their findings. But there are many reports of strange illnesses that befall people, or groups of people, that have been reported for centuries. A case series in and of itself simply isn’t particularly profound. Herbst made no effort to describe the mechanism by which these rather remarkable cases came to pass but merely noted that they simply were. Indeed, the paper did not even venture a guess as to why this strange cancer would suddenly be so prevalent, and the absence of a guess is startling.†