by Steven Hatch
Compare the absence of a discussion about a proposed mechanism for the clear cell cases to another famous case series, the 1981 CDC Morbidity and Mortality Weekly Report in which the first cases of what is now called AIDS were reported. Although the CDC staff must have been no less mystified than was Herbst by what they were seeing in previously healthy young gay men dying of a strange pneumonia known as Pneumocystis carinii, they make a foray into considering the cause. The easy part was that they all had the same sexual practices—almost anybody would have noted that this wasn’t coincidental—but they go further and get the essential drift of how HIV works the very first time out: “All the above observations suggest the possibility of a cellular-immune dysfunction related to a common exposure that predisposes individuals to opportunistic infections,” they wrote. That’s a pretty accurate description of HIV, and they provided this depiction two years before the virus itself was discovered.
It’s what happened after the Cancer paper that completely changed Dr. Herbst’s professional life, to say nothing of the lives of millions of women. Still, outside the world of medicine—indeed, even largely within the world of medicine—Arthur Herbst is barely known. The names of early 1970s celebrities such as second-tier baseball players and mediocre actors from sitcoms like The Partridge Family (which debuted about the time of the Cancer paper) are much more likely to be recognized than his by even well-informed Americans. In fact, it isn’t even close. And yet, due in large part to Herbst’s research, people think about pregnancy in a completely, utterly different way than they had before.
What happened was this: Dr. Herbst was approached by the mother of one of the women included in the case series. The Cancer paper had received some media publicity, and this mother had followed the coverage. She contacted Herbst and asked him a question. You know, I was wondering, she said. When I was pregnant, I took a medication to stop bleeding. Could it be connected to my daughter’s cancer?
Intrigued, Herbst ended up asking the other mothers of the young women from the case series, and was surprised to find that essentially all of them had taken a drug called diethylstilbestrol, or DES, during their pregnancies. DES was a synthetic estrogen, initially approved by the FDA in 1941 mainly for symptoms associated with menopause. Not long after it came to market, the drug’s maker, Bristol-Myers Squibb, sought approval for its use in aiding the pregnancies of women with a history of miscarriage, which it was granted in 1947. Within a few years, however, there was evidence that DES might not be especially useful in helping these women carry their babies to term: a group of obstetricians from the University of Chicago performed a double-blind, placebo-controlled trial to evaluate its effectiveness and found that it made no difference in pregnancy outcomes. That didn’t stop DES from being widely prescribed to another generation of pregnant women, even though multiple obstetric textbooks noted that it wasn’t an effective treatment. Its heyday, though, was in the early 1950s, just about the time these young women were born.
Herbst reviewed the data on the women with the clear cell cancer with several colleagues, and they collectively decided to perform something akin to a retrospective survey to learn whether DES might be associated with this rare cancer. In other words, they set up a case-control study like that discussed above. The seven women from the case series, along with an eighth who had come to their attention from a private-practice physician, were grouped into one cohort and then “matched” with a larger group of young women of the same age who didn’t have cancer. They asked questions about their medical, and especially gynecological, histories: Was there some factor that separated the young women with cancer from the control group? The women reported whether they smoked, whether they had pets, what cosmetics they used, how old they were when they began having periods—anything to try to tease out what might be linked to this unusual disease. They didn’t find any differences.
Then they looked at the mothers of the girls and asked questions about their pregnancies, and based on what they found the practice of obstetrics was never the same. On the following page is the Table from the article, published in the New England Journal of Medicine in 1971, showing the relevant data.*
Table 1 of the article briefly summarizes the clinical aspects of the cases: how old they were when symptoms began, year of birth, year of treatment, and the therapy. The final column provides their clinical status as of the article’s publication in 1971: one had died the year of her procedure, while the others were described as “living and well.” Living? Sure. Well? Well, one supposes that Dr. Herbst and his colleagues were talking in the narrow, clinical sense of “well,” rather than their overall spiritual state at the time, because all of the women had just undergone total hysterectomies and would never bear children.
They found three factors that were significantly more common among the case mothers compared to the control mothers: bleeding during pregnancy, a history of previous miscarriage, and the use of DES. The statistical analysis of significance is discussed in the Appendix, but the table is a mathematical description of the likelihood that such factors could be different by chance. For instance, three of the eight case mothers, or 37.5 percent, had bleeding during their pregnancy, while only one of the thirty-two control mothers—a measly 3 percent—had the same problem. The likelihood that this was just due to random chance was calculated to be less than one in twenty.
This was an intriguing finding, but what practically leaped off the page was the DES use: seven of eight of the case mothers used DES during the first trimester of their pregnancies; none of the thirty-two control mothers had done so. The difference in those proportions is much more dramatic; the likelihood that nearly all of the case mothers would use DES and none of the control mothers would, just due to random chance, was calculated to be less than 1 in 100,000.
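For readers who want to see where figures like "less than one in twenty" can come from, here is a minimal sketch in Python. It uses a one-sided Fisher's exact test, which is an assumption made here for illustration: the text above does not say which test the authors ran, and an analysis that respects the matching would give somewhat different numbers. The function name `fisher_one_sided` is invented for this example; the counts are the ones quoted above.

```python
from math import comb


def fisher_one_sided(case_exposed, case_total, control_exposed, control_total):
    """Chance of seeing at least `case_exposed` exposed case mothers
    if exposure were unrelated to case status (table margins held fixed)."""
    everyone = case_total + control_total
    exposed = case_exposed + control_exposed
    # The most extreme table puts every exposed mother in the case group,
    # limited by how many case mothers there are.
    most_extreme = min(case_total, exposed)
    ways_total = comb(everyone, exposed)
    ways_as_extreme = sum(
        comb(case_total, k) * comb(control_total, exposed - k)
        for k in range(case_exposed, most_extreme + 1)
    )
    return ways_as_extreme / ways_total


# Counts quoted in the text: 8 case mothers, 32 control mothers.
p_bleeding = fisher_one_sided(3, 8, 1, 32)  # bleeding during pregnancy
p_des = fisher_one_sided(7, 8, 0, 32)       # DES use

print(f"bleeding: p = {p_bleeding:.4f}")
print(f"DES use:  p = {p_des:.2e}")
```

Under these assumptions the bleeding comparison comes out at roughly 0.02 and the DES comparison at roughly 4 in 10 million, both consistent with the thresholds quoted above.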
There were two consequences of that one column in that one table. The first was that, later that year, the FDA withdrew approval for DES to be marketed for the prevention of miscarriage. (It was pulled entirely from circulation four years later.) The second was that nobody, whether inside or outside medicine, ever thought about pregnancy in quite the same way.
It is hard now, in an age when many women won’t so much as look at a tuna steak during pregnancy for fear of being contaminated by trace amounts of heavy metals, to imagine a time when doctors provided drugs to pregnant women without much thought about how they might affect the baby. But until the 1950s and 1960s many, if not most, physicians believed that drugs could not cross the placenta, and thus rigorous safety studies were never performed on pregnant women.
FIGURE 7.1. The single column that changed the practice of obstetrics. Of eight young women who had been diagnosed with the rare vaginal cancer known as clear cell adenocarcinoma, all but one of their mothers had used the drug diethylstilbestrol during pregnancy. By contrast, none of the thirty-two mothers of otherwise healthy “matched controls” had used the drug.
SOURCE: Herbst, A., “Adenocarcinoma of the Vagina: Association of Maternal Stilbestrol Therapy with Tumor Appearance in Young Women,” New England Journal of Medicine 1971; 284(16):878–881.
The drug that would initially lead to a reconsideration of this bit of received wisdom was, ironically, never approved for use in pregnancy in the United States: thalidomide. Thalidomide was initially developed as a sedative by a German pharmaceutical firm in the late 1950s but was quickly appreciated for its antinausea properties and subsequently marketed for morning sickness. Nearly fifty countries in all approved thalidomide. In the United States, however, one woman working at the Food and Drug Administration, Dr. Frances Oldham Kelsey, found the safety studies on thalidomide wanting, and she insisted on further studies proving its safety before it could be approved. The studies never came because around the time of Dr. Kelsey’s act of bureaucratic stubbornness, thousands of thalidomide-using mothers in Europe (where the drug had been approved and was commonly prescribed) bore children with malformed limbs, a condition known as phocomelia. Many of the children did not survive. Thalidomide was withdrawn within a few years, having never been stocked on pharmacy shelves in the United States. Dr. Kelsey saved tens of thousands of lives, and in doing so became one of medicine’s greatest heroes.* She retired from the FDA, Nestor-like in her knowledge of the institution, at the age of ninety in 2005, and she is still living in British Columbia as I complete this book in mid-2015.
Although she is a bit more well known than Arthur Herbst, she is significantly more obscure than physicians such as Deepak Chopra and Andrew Weil, whose contributions to the profession, even if viewed only through the lens of public education and outreach, can be described as harmless but mostly useless, at absolute best.
Thalidomide began the paradigm shift with respect to environmental effects on fetal development, and the DES paper a decade later largely completed it. By the end of the 1970s, the Food and Drug Administration introduced a set of drug classifications based on fetal risk. Category A drugs were found to be safe based on quality studies; Category B drugs did not appear to show risk, but there was not the high-quality evidence seen in the Category A drugs; Category C drugs—a fairly confusing category for clinicians—had animal but not human studies showing adverse fetal effects, so they may be considered helpful “despite risks”; Category D drugs were likely harmful but might be warranted because of their use (i.e., to save the mother’s life); and Category X drugs were unambiguously harmful and had no legitimate uses in pregnant women. (Thalidomide, which is still used today for such varied diseases as leprosy and multiple myeloma, was, of course, Category X.) Today, this classification scheme remains in effect, and doctors carefully scrutinize drugs for their pregnancy category before prescribing them to their pregnant patients, all a consequence of the one-two punch formed by the research on thalidomide and DES.
If this category scheme has a familiar feel to it, it should: the FDA fetal risk classification is essentially its own spectrum of certainty, specifically in relation to the harms that a given drug carries for the baby.
The DES data was at once striking and slightly quizzical, for although the association between the drug and the disease seemed to be beyond doubt, clear cell adenocarcinoma of young women remained a very rare disease despite the fact that millions of women had used DES during their pregnancies. If you had the cancer, the odds were overwhelming that your mother had taken DES (in technical language, this is known as the odds ratio). Yet that wasn’t true in the reverse: if a mother took DES, the chance her daughter would develop cancer—statistically, the absolute risk—was estimated to be about one in one thousand, based on later studies.
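The gap between those two numbers can be made concrete with a short Python sketch. The case and control counts and the one-in-one-thousand absolute risk come from the text; everything marked hypothetical (the share of mothers who took DES and the background risk in unexposed daughters) is an invented value used only to show the arithmetic, not an estimate from any study. Because none of the control mothers took DES, the raw odds ratio is undefined, so the sketch adds 0.5 to every cell (the Haldane-Anscombe correction), which is a choice made here and not necessarily what the original authors did.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for a 2x2 table, adding 0.5 to every cell when any cell is zero."""
    cells = [exposed_cases, unexposed_cases, exposed_controls, unexposed_controls]
    if 0 in cells:
        cells = [count + 0.5 for count in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


def share_of_cases_exposed(p_exposed, risk_exposed, risk_unexposed):
    """Fraction of all cases that arise among the exposed (Bayes' rule)."""
    from_exposed = p_exposed * risk_exposed
    from_unexposed = (1 - p_exposed) * risk_unexposed
    return from_exposed / (from_exposed + from_unexposed)


# From the text: 7 of 8 case mothers and 0 of 32 control mothers used DES.
des_or = odds_ratio(7, 1, 0, 32)

# From the text: roughly 1 in 1,000 exposed daughters developed the cancer.
ABSOLUTE_RISK = 1 / 1000

# Hypothetical values, for illustration only.
HYPOTHETICAL_P_EXPOSED = 0.05               # share of mothers who took DES
HYPOTHETICAL_BASELINE_RISK = 1 / 1_000_000  # risk in unexposed daughters

share = share_of_cases_exposed(
    HYPOTHETICAL_P_EXPOSED, ABSOLUTE_RISK, HYPOTHETICAL_BASELINE_RISK
)

print(f"corrected odds ratio: {des_or:.0f}")
print(f"absolute risk if exposed: {ABSOLUTE_RISK:.3%}")
print(f"share of cases with an exposed mother (hypothetical inputs): {share:.1%}")
```

With these inputs nearly every case traces back to an exposed mother, yet an exposed daughter's own risk stays at a tenth of one percent. Both statements are true at once because the cancer is so rare to begin with.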
But the most important aspect of the DES study is perhaps one of the least understood, and as I said at the outset of the chapter, it directly relates to this new layer of uncertainty introduced by the backward-looking aspect of this research. The paper didn’t demonstrate that DES caused clear cell adenocarcinoma; it merely noted that the two things seemed to be linked. In an interview commemorating the fortieth anniversary of its publication, Dr. Herbst made this point perfectly clear. “Actually, at the time, [we wanted to write the paper] in clear language that didn’t result in claiming more than an association,” he said, underscoring that they weren’t suggesting causation at that juncture. The only way that causation could be proved was by performing laboratory studies—first in a test tube, then in animals—that showed the mechanism by which the DES led to cancer. Those studies would be done over the next generation, and as a result much more is understood about the behavior of DES in utero, as well as the passage of all sorts of drugs across the placenta and what kinds of effects they might have on a developing fetus. In essence, it paved the way for an entire subspecialty in obstetrics as well as pharmacology.
Correlations—the careful, statistical juxtaposition of risk factors to diseases—are the raison d’être of the kind of research typified by the DES study. They can hint at a causal relationship between one and the other, but they can never prove it. In the case of the early DES research, the data might have indicated that there was something about these particular women (say, a genetic defect) and that the DES use might merely have been a substitute for this hidden abnormality. Perhaps, given that all the women were from Massachusetts, the reason lay in some hidden environmental factor and the DES use was merely a coincidence.
To be clear, my goal here isn’t to suggest that DES isn’t causally linked to clear cell cancer: so much quality research has been done since that first paper that such a claim would stretch credulity. (Likewise, for those intrigued by the notion that a great biostatistician would doubt that smoking is linked to lung cancer, stay tuned until the Appendix.) Instead, I’m trying to point out that the process by which researchers stumbled across this tragic side effect is subject to limitations. Correlations do not equal causations. The fact that they occasionally seem to can be a trap, sometimes a multibillion-dollar trap, as we’ll soon see.
The DES study was successful for a number of reasons. First, the cancers were, in the medical parlance, “zebras”—that is, they stood out for their unusualness. Young women simply weren’t known to have this kind of cancer before the mid-1960s. That helped in the search for potential causes because if a risk factor could be identified that was peculiar to the patients, it would make a strong circumstantial argument that the risk factor might be the cause itself.
Second, the questionnaire that the researchers prepared for the “DES mothers” focused on their pregnancies. That’s a relatively short period for the women to recall, during a time when they were more likely than usual to be aware of what medications they were taking. Even then, one of the seven mothers who took DES wasn’t certain that she had, in fact, taken it, and her obstetrician was consulted to confirm its use. The lack of reliability in retrospective recall, as I’ll discuss in a moment, can have a profound impact on studies that look for correlations.
Third, DES was a relatively new drug. It had been around for not much more than twenty years and had a very narrow usage. Obviously, it had no indication for men, and even in women it was indicated for a fairly small group. That meant that tracking its effects was a much simpler task than looking at, say, the effects of common chemicals in the environment.
All of these factors made DES an ideal candidate to study in relation to a rare cancer, where a strongly positive correlation might be reasonably assumed to suggest a causal link. Even so, sorting out genuine risks from mere incidental associations in rare diseases is still a challenge. When AIDS first came onto the medical scene, all of the patients described in the literature were young gay men. Much of the early epidemiology focused on their sexual behavior, and one of the risk factors that was strongly associated with the disease was the use of drugs known as “poppers.” Belonging to a chemical class named alkyl nitrites, poppers created brief but intense highs and were considered to be aphrodisiacs. Essentially all of the first patients with AIDS* reported using poppers: if a case-control study were performed on these men in the same way the DES study was done, with heterosexual men as controls, the data on popper use would look no different from that of DES.
These first patients in 1981 and 1982 were not diagnosed with “AIDS” as there was no name for the disease initially, which may seem difficult to believe in an age where routine blizzards are now heralded with names on the Weather Channel even before they strike. The first well-known name for AIDS was GRID, or Gay Related Immune Deficiency. “AIDS” was proposed as a name in August 1982 to account for patients, among them Haitians and hemophiliacs, who had similar presentations but were not gay. More than anything else, it was the presence of AIDS in nongay populations that led most scientists to suspect that the disease was caused by a virus rather than recreational drugs then in use among gay men.
Of course, we know now that poppers had nothing at all to do with AIDS and were simply a marker: that is, a risk factor that tracked with the disease. AIDS was caused by a virus that could spread only through sex and blood. Young gay men who had multiple sexual partners took poppers as part of the party culture where the virus was spreading, using them deliberately to augment the sexual experience. Poppers, then, were merely a marker for having sex, the very activity that was leading to the transmission of the virus, but that was a reflection of that particular subculture and played no direct part in causing AIDS. (To this day, in AIDS-denialist circles, poppers are still sometimes invoked as the cause of AIDS-related diseases such as Kaposi’s sarcoma.) This association, however, seemed plausible enough in the early days of the epidemic that papers like “Toxicity, Immunosuppressive Effects and Carcinogenic Potential of Volatile Nitrites: Possible Relationship to Kaposi’s Sarcoma” could be found in mainstream scientific journals as late as 1984, one year after the announcement of the discovery of HIV.
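How a mere marker can masquerade as a cause is easy to show with a little arithmetic. The Python sketch below is a toy model, not a reconstruction of any real data: every probability in it is hypothetical and chosen only for illustration. A hidden exposure drives the disease, the marker is simply more common among those with the hidden exposure, and the marker itself has no effect on risk at all.

```python
# All probabilities are hypothetical, chosen only to illustrate confounding.
P_HIDDEN = 0.2                                 # share with the hidden exposure
P_MARKER_GIVEN_HIDDEN = {True: 0.9, False: 0.1}
P_DISEASE_GIVEN_HIDDEN = {True: 0.3, False: 0.0001}  # marker plays no role


def risk_given_marker(uses_marker):
    """P(disease | marker status), averaged over the hidden exposure."""
    people = 0.0
    diseased = 0.0
    for hidden in (True, False):
        p_hidden = P_HIDDEN if hidden else 1 - P_HIDDEN
        p_marker = P_MARKER_GIVEN_HIDDEN[hidden]
        p_status = p_marker if uses_marker else 1 - p_marker
        people += p_hidden * p_status
        diseased += p_hidden * p_status * P_DISEASE_GIVEN_HIDDEN[hidden]
    return diseased / people


def risk_within_stratum(uses_marker, hidden):
    """P(disease | marker status, hidden exposure); the marker is ignored by design."""
    return P_DISEASE_GIVEN_HIDDEN[hidden]


# Ignoring the hidden exposure, marker users look far more likely to fall ill.
crude_risk_ratio = risk_given_marker(True) / risk_given_marker(False)

# Comparing like with like, the marker makes no difference.
stratified_risk_ratios = [
    risk_within_stratum(True, hidden) / risk_within_stratum(False, hidden)
    for hidden in (True, False)
]

print(f"crude risk ratio: {crude_risk_ratio:.1f}")
print(f"risk ratios within each stratum: {stratified_risk_ratios}")
```

In this toy model the crude comparison makes the marker look dangerous, while the comparison within each stratum shows a risk ratio of exactly 1. A case-control study that cannot see the hidden exposure has no way to tell the two situations apart.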
The “poppers theory” wasn’t bad science. In fact, it was essentially of the same scientific quality as the DES / clear cell cancer theory. What the poppers theory does demonstrate is how carefully one must approach even strong correlations between risk factors and disease. The theory that poppers caused AIDS was wrong, while the theory that DES use in a mother caused her daughter’s cancer was right. The only reason we know this to be true is the many subsequent studies. As more information poured in on AIDS patients, the notion that poppers were involved became less and less plausible; as more information poured in on young women with clear cell adenocarcinoma, the notion that DES was involved became stronger and stronger.