by Steven Hatch
Both clear cell adenocarcinoma and AIDS, at the time of their initial descriptions when researchers were trying to understand their causes, were rare diseases. Moreover, both were found to have risk factors strongly associated with the disease, in the forms of DES and poppers. Very few people who did not have these diseases used either drug, but nearly all of those who were afflicted did use them. That provided very strong circumstantial evidence of causation, yet only one of them turned out to be truly important to understanding the disease, while the other was a red herring.
Coffee, Heart Attacks, and Correlation
Those are, in effect, the simple situations. When we try to understand the causes of much more common diseases, the kind that most of us are likely to die from, the correlation/causation problem is much more difficult to tease apart. Because many people suffer from common diseases like heart disease, various cancers, and dementia, it’s very challenging to find common denominators particular to one group of patients when comparing it to healthy controls.
At the same time, the blurring of correlation with causation is even more likely to occur when the news media is eager for a catchy story. If the purported risk on which the news bit is based is something common, that’s even better, because it has a greater chance of grabbing the attention of the red-meat eater, or television watcher, or booze consumer, or whatever else is being discussed. Here, for instance, is a typical example where researchers are trying to learn about the possible causes of heart attacks and looking at a common risk factor: coffee consumption. The following is a portion of an article from the Internet health website Medscape:
Coffee May Trigger Heart Attack: Attacks After Single Cup, Light Drinkers Most at Risk
That cup of coffee you’re craving might not be such a good idea.
Research in the September issue of Epidemiology suggests coffee can trigger a heart attack within an hour in some people. Java junkies can take some comfort from the finding that the risk was highest among light coffee drinkers (those who consumed up to one cup a day). For those people, the risk of heart attack increased fourfold when they indulged. Couch potatoes and those with other risk factors for heart disease were also at greater risk of having a heart attack after drinking a cup of coffee, the study showed. As a result of these findings, “people at high risk for a heart attack who are occasional or regular coffee drinkers might consider quitting coffee altogether,” [say the researchers].
The new study was based on 503 cases of nonfatal heart attacks in Costa Rica. The researchers asked participants about their coffee consumption in the hours and days before their heart attack. Although the study was conducted in Costa Rica, the researchers say the results are relevant to the U.S. because Americans and Costa Ricans have similar caffeine habits. (my emphasis)
This is a special kind of a case-control study, where the people who had the heart attacks themselves serve as their own controls:* they do this by reporting how much coffee they’ve consumed roughly over the past month, and then the statisticians look for an uptick in any of the factors immediately prior to having the heart attack. It’s a common study design to identify risk factors for “acute events” like heart attacks, and these researchers carefully tried to “control for confounders”—that is, they did their best to make sure that it was really the coffee that was associated temporally with the heart attack and not, say, having a cigarette with the morning cup of joe. They also asked about other risk factors besides coffee and cigarettes. Although it’s a very well-done study, it’s still subject to the limitations of case-control studies in general, which I’ll discuss forthwith, and in failing to point out those limitations, the article oversells its conclusions. Dramatically.
* Technically, it’s called a case-crossover design.
Leave aside the issue of whether it’s absolutely safe for Americans to extrapolate data obtained from a study of Costa Ricans—although they may have similar caffeine habits, they probably don’t have similar dietary, drinking, and exercise habits, among several other factors. First, the story falls into the classic trap of confusing correlation and causation: at most what can be said is that coffee consumption is associated with having a heart attack, which is not quite the same thing as saying that it’s the cause of heart attacks. The researchers present a rationale in both their paper and in the article as to how coffee could cause a heart attack and why occasional drinkers might be at the highest risk. But they didn’t prove it to be the cause of the heart attacks because, as I’ve discussed, that’s not possible to do in a case-control study.
Second, the study results are very much influenced by the problem of recall bias. Imagine yourself, lying in a hospital bed in Costa Rica, mulling over the major change that just took place in your life, and suddenly a researcher comes to you and starts asking questions about how much coffee you drank just before you had the heart attack. It’s a leading question, as are all of the questions about all of the risk factors. It’s not the fault of the researchers; it’s just the normal process of recovering from some devastating medical event. Patients commonly review their lifestyle decisions after such illnesses and theorize no less than scientists as to why they became sick. They can be heavily influenced by guilt and many other emotions, which can warp their reports to researchers. The question isn’t whether a case-control study like this can avoid recall bias, because it can’t; the question is how much confidence one can have in the conclusions.
Finally, imagine being given a questionnaire that asked you to recall everything you’ve consumed (all the different foods and drinks, how much alcohol, how many cigarettes) during the past twenty-four days. In addition, you need to give a reasonable estimate of what time you had those things, because the heart attacks in this study happened more often in the mornings: Was that because of the time of day or the coffee? Plus you need to report your level of physical activity, your emotional state, and a few other factors to boot. Could you recollect the details of the past month of your life accurately enough to be of use to a scientist? It’s difficult under optimal circumstances; I can barely remember what I ate yesterday, let alone report how much coffee I had a week ago.
For all these reasons, identifying genuine risk factors commonly used in the general population is notoriously difficult and subject to all manner of uncertainty. A risk factor like coffee, if it is a risk for cardiovascular disease at all, is almost certainly small in its effects. We know there are millions of non–coffee drinkers who have heart attacks and millions of java fiends who do not. As a consequence, the number of people researchers need to recruit for a study looking for a slightly higher prevalence of coffee consumption in a heart-disease cohort is large. Risk factors with small effects are much more sensitive to random sampling issues: if you happen to recruit, just as a matter of chance, ten or twenty extra non–coffee drinkers into your cohort of 2,000 subjects with heart disease, you might find no effect where in fact there really was one; if you include the same number into the healthy control group, you might conclude the opposite. Plus, there is the issue of recall bias, as well as the problems introduced by standardization—if the questionnaire asks about cups of coffee per day, not all people are drinking the same size cups, to say nothing of the difference in caffeine between a mega mocha grande with an espresso chaser and the local donut shop’s sixteen-ounce standard brew. Finally, after all of these obstacles, there is the irreducible problem of correlation and causation.
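To make the sampling problem concrete, here is a minimal Python sketch. Every number in it is hypothetical and chosen only for illustration; none comes from the coffee study or any other. It assumes a heart-disease group and a healthy control group of 2,000 people each, with coffee drinking only slightly more common among the patients (61 percent versus 60 percent), and it summarizes the association with an odds ratio, a standard measure in case-control research, where 1.0 means no association at all.

```python
def odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    case_odds = exposed_cases / (total_cases - exposed_cases)
    control_odds = exposed_controls / (total_controls - exposed_controls)
    return case_odds / control_odds


# Hypothetical group sizes and coffee-drinking counts (illustrative assumptions).
GROUP_SIZE = 2000
DRINKERS_AMONG_PATIENTS = 1220   # 61% of the heart-disease group
DRINKERS_AMONG_CONTROLS = 1200   # 60% of the healthy control group
CHANCE_SHIFT = 20                # extra non-drinkers recruited purely by chance

# The assumed "true" association: small, but real.
true_or = odds_ratio(DRINKERS_AMONG_PATIENTS, GROUP_SIZE,
                     DRINKERS_AMONG_CONTROLS, GROUP_SIZE)

# Twenty patient slots happen to go to non-drinkers instead of drinkers.
erased_or = odds_ratio(DRINKERS_AMONG_PATIENTS - CHANCE_SHIFT, GROUP_SIZE,
                       DRINKERS_AMONG_CONTROLS, GROUP_SIZE)

# The same chance shift lands in the control group instead.
inflated_or = odds_ratio(DRINKERS_AMONG_PATIENTS, GROUP_SIZE,
                         DRINKERS_AMONG_CONTROLS - CHANCE_SHIFT, GROUP_SIZE)

print(f"assumed true odds ratio:          {true_or:.3f}")
print(f"20 extra non-drinkers (patients): {erased_or:.3f}")
print(f"20 extra non-drinkers (controls): {inflated_or:.3f}")
```

Under these assumed numbers, a chance swap of twenty subjects in the patient group is enough to erase the association entirely, and the same swap in the control group roughly doubles its apparent size. A risk factor with a large effect would barely register a shift that small.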
In short, studies such as these, which involve identifying some common risk factor of some common disease and asking whether they are associated, almost by default find themselves in the dead middle of the spectrum of certainty. They seek to find causation where at best only correlation may exist, and high levels of uncertainty are inherent in their structural design because they introduce recall bias and lump together people who had different degrees of exposure to the risk factor, good or bad.
Sweet Dreams: Does Eating Chocolate Lower Your Risk of Heart Failure?
Over the past several years, one such risk factor has become a media darling: dark chocolate. However, unlike the coffee described above, which is a “negative” risk factor, dark chocolate has, mercifully, been studied mainly with respect to its beneficial properties. (Coffee, actually, is something of a mixed bag, with some studies associating it with harms and others with benefits.) Chocolate is actually a complex amalgam of compounds, but starting in the early 2000s a small number of research papers began focusing on one category called flavonoids. These are chemicals found in high levels in chocolates, especially in the purer form of dark chocolate. Some early studies in test tubes showed that flavonoids have a variety of effects that might be favorably linked to reducing heart disease risk. They have antioxidant effects, alter the endothelium (the cells that line the inside of blood vessels), and possibly inhibit platelets from clumping, which is a particularly beneficial side effect because platelet clumping is one of the initial processes in the cascade of events that causes heart attacks. The research looked genuinely promising, and for that matter it still does.
Whether medical science has definitively demonstrated that regularly ingesting dark chocolate really reduces one’s risk of heart failure is another matter entirely. A variety of studies have been performed over the past decade, and there has been a steady drumbeat in the popular press about its ambrosial effects. Take a paper from the Journal of Internal Medicine in 2009. It is slightly different in structure from the DES paper in that it is prospective rather than retrospective in nature; that is, patients were recruited at a given point in time and then followed into the future—in the case of this study, for eight years. More than a thousand Swedish people who had experienced their first heart attack were given questionnaires looking at dozens of risk factors, then followed in Swedish national registries for hospitalizations and deaths. They concluded that the more chocolate the people consumed, the more likely they were to survive. This received a cheerful review in the New York Times, among many other news outlets, and is but one example in dark chocolate’s fawning press coverage that was at full steam by the time the story was published.
The highest-quality prospective studies follow patients before they develop disease and then look for risk factors by comparing who goes on to develop disease versus who doesn’t. The great Framingham Heart Study—which deserves more than this small detour, alas—is one of the classic examples of prospective cohort research. The study began in 1948 in Framingham, then a small town about an hour west of Boston. They enrolled just over 5,000 men and women and met with them year after year, having the subjects fill out questionnaires and undergo physical exams and blood tests. Eventually, some subjects in this cohort had heart attacks or strokes or developed some other form of cardiovascular disease, while others didn’t. Researchers and statisticians could then start finding factors that were associated with disease. And find them they have: more than 1,200 research papers have come out of FHS. Nearly every major risk factor for cardiovascular disease familiar to even moderately informed laypeople—like smoking, diabetes, a sedentary lifestyle, high blood pressure, obesity, and high “bad” cholesterol—has either been discovered or confirmed by research in Framingham.
By contrast, the Swedish chocolate study is a variant where the cohort is formed by people who’ve already been diagnosed with heart disease and then are followed to see who died sooner. This unusual study design should leave one far more circumspect about its conclusions than if a similar effect were found as part of the Framingham research.
A large number of the dark chocolate studies, like the Swedish study, suffer from this blending of correlation with causation, both in the research itself and especially in its media coverage. Maybe the reason chocolate eaters were less likely to die in this time span is that they were simply healthier people to begin with and eating chocolate was merely a cultural marker associated with happiness and healthiness. There is the further problem of standardization: Especially when there are multiple purported mechanisms that could be due to a number of different compounds found in chocolate, it is difficult in a study like this to know how much of which kind of chocolate might be physiologically beneficial. None of these people were taking doses standardized to their weight, height, or body mass. And chocolate itself is a product that is cultivated in different parts of the world, leading to the question of whether it’s a particular flavonoid unique to African chocolate that might have benefits compared to chocolate processed in Honduras. For all of these reasons, one can overinterpret the findings in myriad ways, and conclude that dark chocolate is a life-giving elixir when in fact it is nothing more than a much-loved confection.
I am not a member of the anti–dark chocolate or antiflavonoids lobby: there is always a bowl of chocolate in my office for my residents and students to snack on to make the day more pleasant. Rather, I’m pointing out why I believe we’re living in an age of uncertainty about its effects, and why you should read news reports such as those on dark chocolate with a certain amount of caution. In this particular case, the New York Times piece helpfully includes such a caveat. “Before concluding that a box of Godiva truffles is health food, chocolate lovers may want to consider some of the study’s weaknesses,” the author writes. “It is an observational study, not a randomized trial, so cause and effect cannot be definitively established.” Such warnings are often left out; I practically wept for joy when I read these sentences.
So how do researchers manage to learn whether strong correlations are equivalent to the causes of a given disease? The ideal answer is to do a prospective clinical trial, the randomized double-blind placebo-controlled study discussed in the previous chapter, exclusively focused on the one single variable of interest. These studies are expensive and time-consuming, so there has to be a pretty big incentive to tackle such a project when there’s already circumstantial evidence that some thing—environmental chemical, drug, radiation exposure—really does cause, or protect people from, disease. In one instance, just such a prospective clinical trial led to the mother of all correlation/causation confusions, causing billions of dollars of losses for several drug companies, but one in particular: Wyeth Pharmaceuticals.
Correlation/Causation and Hormone Replacement Therapy
That story begins in 1966 when a gynecologist named Robert Wilson wrote a book that stands as one of the watershed moments in what we might call the medicalization of everyday life: that is, turning natural processes into “diseases” for which doctors, and by extension the pharmaceutical industry, can offer “cures.” In the book, titled Feminine Forever, Wilson argued that menopausal women could enhance nearly all aspects of their lives by taking estrogen therapy. “Many women simply refuse to recognize menopause for what it is—a serious, painful, and often crippling disease,” he wrote. In a line that practically defies belief today, he claimed that estrogen would counteract this by ensuring that a woman’s “breasts and genital organs will not shrivel. She will be much more pleasant to live with and will not become dull and unattractive.” Simply to peruse Dr. Wilson’s thoughts today is an exercise in trying to keep one’s lower jaw attached to the rest of one’s face. It is difficult now to conceive that a book of such unadorned paternalism (both physician paternalism and male paternalism) could be the best seller that Feminine Forever was, influencing millions of women for nearly a generation.
Dr. Wilson’s financial relationships with several pharmaceutical companies who had a stake in expanding the hormone market have since been well documented, and even adjusted to the social mores of the 1960s it wasn’t pretty. Not long after Feminine Forever debuted, questions were being raised about its objectivity; an article in the New Republic documented that Wilson was effectively paid to write the book by Wyeth Pharmaceuticals. A large volume of commentary has since been generated on the marketing and business aspects of hormone replacement therapy. In effect, critics have charged that books like Feminine Forever are representative examples of what happens when physicians hitch their wagons to the star of Big Pharma: whole disease categories are invented to create new markets, and physicians like Wilson become highly compensated dupes, wittingly or not, to the whole enterprise.
Regardless of the financially symbiotic relationship between Dr. Wilson and his corporate masters, in terms of the correlation/causation story, the relevance of Feminine Forever is that, because it was so influential, it effectively began an enormous natural experiment on the risks and benefits of hormone therapy in women. Of course, all freely marketed drugs create natural experiments. For the most part these go unmeasured, but following the boom in hormone replacement therapy, researchers started taking a careful look at whether the benefits touted by Dr. Wilson had any basis in reality.
For the most part, they did indeed appear to. There were some cautionary tales: research published in the New England Journal of Medicine in 1975 showed that estrogen-only therapy increased the risk of developing uterine cancer. Typically, such a negative study in the world’s premier medical journal causes a massive freeze in prescriptions. But pharmaceutical companies argued that the problem observed in the New England Journal study was caused by “unopposed estrogen,” which was in hormonal terms something akin to all accelerator and no brake. In its place, they offered the newly developed combination of estrogen and progesterone. By including the “brake” hormone of progesterone, the combination supposedly restored the natural balance that unopposed estrogen disrupted. Therefore, the risks highlighted by the study were claimed to be no longer valid. It was a savvy corporate reply, and although it was pure theory because no clinical trials were performed to test this hypothesis, hormone replacement sales continued nearly unabated.
At any rate, despite the occasional study such as this, an increasing amount of research was being published indicating that hormone replacement therapy really could reap substantial health benefits beyond controlling menopausal symptoms and leaving one feeling youthful. Some studies showed that women who took hormone replacement had a lower incidence of cardiovascular disease—that is, they had fewer heart attacks and strokes. Other studies showed that they had stronger bones and fewer fractures when compared to nonusers, indicating that hormone replacement might prevent osteoporosis.