It wasn’t just the drugs. Everything we associate with modern medicine happened in that time, and it was a barrage of miracles: kidney dialysis machines allowed people to live on despite losing two vital organs. Transplants brought people back from a death sentence. CT scanners could give three-dimensional images of the inside of a living person. Heart surgery rocketed forward. Almost every drug you’ve ever heard of was invented. Cardiopulmonary resuscitation (the business with the chest compressions and the electric shocks to bring you back) began in earnest.
Let’s not forget polio. The disease paralyses your muscles, and if it affects those of your chest wall, you literally cannot pull air in and out: so you die. Well, reasoned the doctors, polio paralysis often retreats spontaneously. Perhaps, if you could just keep these patients breathing somehow, for weeks on end if necessary, with mechanical ventilation, a bag and a mask, then they might, with time, start to breathe independently once more. They were right. People almost literally came back from the dead, and so intensive care units were born.
Alongside these absolute undeniable miracles, we really were finding those simple, direct, hidden killers that the media still pine for so desperately in their headlines. In 1950 Richard Doll and Austin Bradford Hill published a preliminary ‘case-control study’—where you gather cases of people with a particular disease, and find similar people who don’t have it, and compare the lifestyle risk factors between the groups—which showed a strong relationship between lung cancer and smoking. The British Doctors Study in 1954 looked at 40,000 doctors—medics are good to study, because they’re on the GMC register, so you can find them again easily to see what happened later in their life—and confirmed the finding. Doll and Bradford Hill had been wondering if lung cancer might be related to tarmac, or petrol; but smoking, to everybody’s genuine surprise, turned out to cause it in 97 per cent of cases. You will find a massive distraction on the subject in this footnote.*
* In some ways, perhaps it shouldn’t have been a surprise. The Germans had identified a rise in lung cancer in the 1920s, but suggested—quite reasonably—that it might be related to poison-gas exposure in the Great War. In the 1930s, identifying toxic threats in the environment became an important feature of the Nazi project to build a master race through ‘racial hygiene’.
Two researchers, Schairer and Schöniger, published their own case-control study in 1943, demonstrating a relationship between smoking and lung cancer almost a decade before any researchers elsewhere. Their paper wasn’t mentioned in the classic Doll and Bradford Hill paper of 1950, and if you check in the Science Citation Index, it was referred to only four times in the 1960s, once in the 1970s, and then not again until 1988, despite providing valuable information. Some might argue that this shows the danger of dismissing sources you dislike. But Nazi scientific and medical research was bound up with the horrors of cold-blooded mass murder, and the strange puritanical ideologies of Nazism. It was almost universally disregarded, and with good reason. Doctors had been active participants in the Nazi project, and joined Hitler’s National Socialist Party in greater numbers than any other profession (45 per cent of them were party members, compared with 20 per cent of teachers).
German scientists involved in the smoking project included racial theorists, but also researchers interested in the heritability of frailties created by tobacco, and the question of whether people could be rendered ‘degenerate’ by their environment. Research on smoking was directed by Karl Astel, who helped to organise the ‘euthanasia’ operation that murdered 200,000 mentally and physically disabled people, and assisted in the ‘final solution of the Jewish question’ as head of the Office of Racial Affairs.
The golden age—mythical and simplistic though that model may be—ended in the 1970s. But medical research did not grind to a halt. Far from it: your chances of dying as a middle-aged man have probably halved over the past thirty years, but this is not because of any single, dramatic, headline-grabbing breakthrough. Medical academic research today moves forward through the gradual emergence of small incremental improvements, in our understanding of drugs, their dangers and benefits, best practice in their prescription, the nerdy refinement of obscure surgical techniques, identification of modest risk factors, and their avoidance through public health programmes (like ‘five-a-day’) which are themselves hard to validate.
This is the major problem for the media when they try to cover medical academic research these days: you cannot crowbar these small incremental steps—which in the aggregate make a sizeable contribution to health—into the pre-existing ‘miracle-cure-hidden-scare’ template.
I would go further, and argue that science itself works very badly as a news story: it is by its very nature a subject for the ‘features’ section, because it does not generally move ahead by sudden, epoch-making breakthroughs. It moves ahead by gradually emergent themes and theories, supported by a raft of evidence from a number of different disciplines on a number of different explanatory levels. Yet the media remain obsessed with ‘new breakthroughs’.
It’s quite understandable that newspapers should feel it’s their job to write about new stuff. But if an experimental result is newsworthy, it can often be for the same reasons that mean it is probably wrong: it must be new, and unexpected, it must change what we previously thought; which is to say, it must be a single, lone piece of information which contradicts a large amount of pre-existing experimental evidence.
There has been a lot of excellent work done, much of it by a Greek academic called John Ioannidis, demonstrating how and why a large amount of brand-new research with unexpected results will subsequently turn out to be false. This is clearly important in the application of scientific research to everyday work, for example in medicine, and I suspect most people intuitively understand that: you would be unwise to risk your life on a single piece of unexpected data that went against the grain.
In the aggregate, these ‘breakthrough’ stories sell the idea that science—and indeed the whole empirical world view—is only about tenuous, new, hotly contested data and spectacular breakthroughs. This reinforces one of the key humanities graduates’ parodies of science: as well as being irrelevant boffinry, science is temporary, changeable, constantly revising itself, like a transient fad. Scientific findings, the argument goes, are therefore dismissible.
While this is true at the bleeding edges of various research fields, it’s worth bearing in mind that Archimedes has been right about why things float for a couple of millennia. He also understood why levers work, and Newtonian physics will probably be right about the behaviour of snooker balls forever.*
* I cheerfully admit to borrowing these examples from the fabulous Professor Lewis Wolpert.
But somehow this impression about the changeability of science has bled through to the core claims. Anything can be rubbished.
But that is all close to hand-waving. We should now look at how the media cover science, unpick the real meanings behind the phrase ‘research has shown’, and, most importantly of all, examine the ways in which the media repeatedly and routinely misrepresent and misunderstand statistics.
‘Research has shown…’
The biggest problem with science stories is that they routinely contain no scientific evidence at all. Why? Because papers think you won’t understand the ‘science bit’, all stories involving science must be dumbed down, in a desperate bid to seduce and engage the ignorant, who are not interested in science anyway (perhaps because journalists think it is good for you, and so should be democratised).
In some respects these are admirable impulses, but there are certain inconsistencies I can’t help noticing. Nobody dumbs down the finance pages. I can barely understand most of the sports section. In the literature pull-out, there are five-page-long essays which I find completely impenetrable, where the more Russian novelists you can rope in the cleverer everybody thinks you are. I do not complain about this: I envy it.
If you are simply presented with the conclusions of a piece of research, without being told what was measured, how, and what was found—the evidence—then you are simply taking the researchers’ conclusions at face value, and being given no insight into the process. The problems with this are best explained by a simple example.
Compare the two sentences ‘Research has shown that black children in America tend to perform less well in IQ tests than white children’ and ‘Research has shown that black people are less intelligent than white people.’ The first tells you about what the research found: it is the evidence. The second tells you the hypothesis, somebody’s interpretation of the evidence: somebody who, you will agree, doesn’t know much about the relationship between IQ tests and intelligence.
With science, as we have seen repeatedly, the devil is in the detail, and in a research paper there is a very clear format: you have the methods and results section, the meat, where you describe what was done, and what was measured; and then you have the conclusions section, quite separately, where you give your impressions, and mesh your findings with those of others to decide whether they are compatible with each other, and with a given theory. Often you cannot trust researchers to come up with a satisfactory conclusion on their results—they might be really excited about one theory—and you need to check their actual experiments to form your own view. This requires that news reports are about published research which can, at least, be read somewhere. It is also the reason why publication in full—and review by anyone in the world who wants to read your paper—is more important than ‘peer review’, the process whereby academic journal articles are given the once-over by a few academics working in the field, checking for gross errors and the like.
In the realm of their favourite scares, there is a conspicuous over-reliance by newspapers on scientific research that has not been published at all. This is true of almost all of the more recent headline stories on new MMR research, for example. One regularly quoted source, Dr Arthur Krigsman, has been making widely reported claims for new scientific evidence on MMR since 2002, and has yet to publish his work in an academic journal to this day, six years later. Similarly, the unpublished ‘GM potato’ claims of Dr Arpad Pusztai that genetically modified potatoes caused cancer in rats resulted in ‘Frankenstein food’ headlines for a whole year before the research was finally published, and could be read and meaningfully assessed. Contrary to the media speculation, his work did not support the hypothesis that GM is injurious to health (this doesn’t mean it’s necessarily a good thing—as we will see later).
Once you become aware of the difference between the evidence and the hypothesis, you start to notice how very rarely you get to find out what any research has really shown when journalists say ‘research has shown’.
Sometimes it’s clear that the journalists themselves simply don’t understand the unsubtle difference between the evidence and the hypothesis. The Times, for example, covered an experiment which showed that having younger siblings was associated with a lower incidence of multiple sclerosis. MS is caused by the immune system turning on the body. ‘This is more likely to happen if a child at a key stage of development is not exposed to infections from younger siblings, says the study.’ That’s what The Times said.
But it’s wrong. That’s the ‘hygiene hypothesis’, that’s the theory, the framework into which the evidence might fit, but it’s not what the study showed: the study just found that having younger siblings seemed to be somewhat protective against MS.
It didn’t say what the mechanism was, it couldn’t say why there was a relationship, such as whether it happened through greater exposure to infections. It was just an observation. The Times confused the evidence with the hypothesis, and I am very glad to have got that little gripe out of my system.
How do the media work around their inability to deliver scientific evidence? Often they use authority figures, the very antithesis of what science is about, as if they were priests, or politicians, or parent figures. ‘Scientists today said…Scientists revealed…Scientists warned’. If they want balance, you’ll get two scientists disagreeing, although with no explanation of why (an approach which can be seen at its most dangerous in the myth that scientists were ‘divided’ over the safety of MMR). One scientist will ‘reveal’ something, and then another will ‘challenge’ it. A bit like Jedi knights.
There is a danger with authority-figure coverage, in the absence of real evidence, because it leaves the field wide open for questionable authority figures to waltz in. Gillian McKeith, Andrew Wakefield and the rest can all get a whole lot further in an environment where their authority is taken as read, because their reasoning and evidence are rarely publicly examined.
Worse than that, where there is controversy about what the evidence shows, it reduces the discussion to a slanging match, because a claim such as ‘MMR causes autism’ (or not), is only critiqued in terms of the character of the person who is making the statement, rather than the evidence they are able to present. There is no need for this, as we shall see, because people are not stupid, and the evidence is often pretty easy to understand.
It also reinforces the humanities graduate journalists’ parody of science, for which we now have all the ingredients: science is about groundless, changeable, didactic truth statements from arbitrary unelected authority figures. When they start to write about serious issues like MMR, you can see that this is what people in the media really think science is about. The next stop on our journey is inevitably going to be statistics, because this is one area that causes unique problems for the media. But first, we need to go on a brief diversion.
13 Why Clever People Believe Stupid Things
The real purpose of the scientific method is to make sure nature hasn’t misled you into thinking you know something you actually don’t know.
Robert Pirsig, Zen and the Art of Motorcycle Maintenance
Why do we have statistics, why do we measure things, and why do we count? If the scientific method has any authority—or as I prefer to think of it, ‘value’—it is because it represents a systematic approach; but this is valuable only because the alternatives can be misleading. When we reason informally—call it intuition, if you like—we use rules of thumb which simplify problems for the sake of efficiency. Many of these shortcuts have been well characterised in a field called ‘heuristics’, and they are efficient ways of knowing in many circumstances.
This convenience comes at a cost—false beliefs—because there are systematic vulnerabilities in these truth-checking strategies which can be exploited. This is not dissimilar to the way that paintings can exploit shortcuts in our perceptual system: as objects become more distant, they appear smaller, and ‘perspective’ can trick us into seeing three dimensions where there are only two, by taking advantage of this strategy used by our depth-checking apparatus. When our cognitive system—our truth-checking apparatus—is fooled, then, much like seeing depth in a flat painting, we come to erroneous conclusions about abstract things. We might misidentify normal fluctuations as meaningful patterns, for example, or ascribe causality where in fact there is none.
These are cognitive illusions, a parallel to optical illusions. They can be just as mind-boggling, and they cut to the core of why we do science, rather than basing our beliefs on intuition informed by a ‘gist’ of a subject acquired through popular media: because the world does not provide you with neatly tabulated data on interventions and outcomes. Instead it gives you random, piecemeal data in dribs and drabs over time, and trying to construct a broad understanding of the world from a memory of your own experiences would be like looking at the ceiling of the Sistine Chapel through a long, thin cardboard tube: you can try to remember the individual portions you’ve spotted here and there, but without a system and a model, you’re never going to appreciate the whole picture.
Let’s begin.
Randomness
As human beings, we have an innate ability to make something out of nothing. We see shapes in the clouds, and a man in the moon; gamblers are convinced that they have ‘runs of luck’; we take a perfectly cheerful heavy-metal record, play it backwards, and hear hidden messages about Satan. Our ability to spot patterns is what allows us to make sense of the world; but sometimes, in our eagerness, we are oversensitive, trigger-happy, and mistakenly spot patterns where none exist.
In science, if you want to study a phenomenon, it is sometimes useful to reduce it to its simplest and most controlled form. There is a prevalent belief among sporting types that sportsmen, like gamblers (except more plausibly), have ‘runs of luck’. People ascribe this to confidence, ‘getting your eye in’, ‘warming up’, or more, and while it might exist in some games, statisticians have looked in various places where people have claimed it to exist and found no relationship between, say, hitting a home run in one inning, then hitting a home run in the next.
Because the ‘winning streak’ is such a prevalent belief, it is an excellent model for looking at how we perceive random sequences of events. This was used by an American social psychologist called Thomas Gilovich in a classic experiment. He took basketball fans and showed them a random sequence of X’s and O’s, explaining that they represented a player’s hits and misses, and then asked them if they thought the sequences demonstrated ‘streak shooting’.
Here is a random sequence of figures from that experiment. You might think of it as being generated by a series of coin tosses.
OXXXOXXXOXXOOOXOOXXOO
The subjects in the experiment were convinced that this sequence exemplified ‘streak shooting’ or ‘runs of luck’, and it’s easy to see why, if you look again: six of the first eight shots were hits. No, wait: eight of the first eleven shots were hits. No way is that random…
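The point is easy to check for yourself. Below is a minimal sketch in Python (not from the book; the trial count and the streak threshold of three are arbitrary choices of mine, purely for illustration) that simulates sequences of fair coin tosses the same length as the one above and counts how often they contain a run of three or more identical outcomes. Genuinely random sequences turn out to be ‘streaky’ in this sense far more often than not.

```python
import random

def longest_run(seq):
    """Length of the longest run of identical outcomes in seq."""
    best = current = 1
    for prev, nxt in zip(seq, seq[1:]):
        current = current + 1 if nxt == prev else 1
        best = max(best, current)
    return best

trials = 100_000
length = 21  # same length as the X/O sequence shown above

# Count how many random sequences contain a 'streak' of 3 or more
streaky = sum(
    longest_run([random.choice("XO") for _ in range(length)]) >= 3
    for _ in range(trials)
)
print(f"{streaky / trials:.0%} of random sequences of {length} tosses contain a run of 3 or more")
```

Run it and you will find that the overwhelming majority of purely random sequences contain at least one such run: the ‘streaks’ the subjects saw are exactly what chance alone produces.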