Big Science has a Big, Bad Secret: it doesn’t work. This is down to a combination of perverse incentives, careerism and commercialization. The incentivization of bad science is nothing new. Donald T. Campbell, an American social scientist, coined his eponymous law in 1979: ‘If researchers are incentivized to increase the number of papers published, they will modify their methods to produce the largest possible number of publishable results rather than the most rigorous investigations.’ The medical statistician Douglas Altman wrote an editorial for the British Medical Journal in 1994 entitled ‘The scandal of poor medical research’. This editorial won a 2015 poll of BMJ readers for the paper the journal should be most proud of publishing. Altman simply articulated clearly and concisely what everyone within academic medicine knew:
Put simply, much poor research arises because researchers feel compelled for career reasons to carry out research that they are ill equipped to perform, and nobody stops them. Regardless of whether a doctor intends to pursue a career in research, he or she is usually expected to carry out research with the aim of publishing several papers.
… The poor quality of much medical research is widely acknowledged, yet disturbingly the leaders of the medical profession seem only minimally concerned about the problem and make no apparent efforts to find a solution… The issue here is not one of statistics as such. Rather it is a more general failure to appreciate the basic principles underlying scientific research, coupled with the ‘publish or perish’ climate… We need less research, better research, and research done for the right reasons.
Over the past few years, Big Scientists have become increasingly concerned by what they call ‘the Replication Crisis’: the indisputable fact that most research findings are never repeated – replicated – to confirm that the findings are real. Most ‘positive’ studies are never repeated to see if the finding withstands further scrutiny: less than 1 per cent of all psychological research, for example, is ever replicated. The Royal Society – the world’s grandest scientific institution – produces an open-access online journal, Royal Society Open Science, which in 2016 published a paper entitled ‘The natural selection of bad science’ by Paul Smaldino (from the University of California) and Richard McElreath (from the Max Planck Institute in Leipzig). Using a Darwinian model of natural selection, the authors argued that Big Science is driven by multiple perverse incentives to produce bad work. Academic promotion in science depends very much on publication metrics, which are based on the overall number of papers, and how often these papers are cited by other researchers. These metrics encourage scientists to value quantity over quality. The rate of production of new scientific papers is increasing exponentially: global scientific output doubles every nine years. Most of the increase is driven by perverse incentives. Richard Horton, the editor of the Lancet, wrote: ‘Part of the problem is that no one is incentivized to be right.’
The new breed of biomedical career researcher is often a great salesman for his work. Real scientists – like Harry Collins’s gravitational wave physicists – tend to be reticent, self-effacing, publicity-shy and full of doubt and uncertainty, unlike the gurning hucksters who seem to infest medical research. Smaldino and McElreath wearily observed:
In the years between 1974 and 2014, the frequency of the words ‘innovative’, ‘ground-breaking’ and ‘novel’ in PubMed abstracts increased by 2,500 per cent or more. As it is unlikely that individual scientists have really become 25 times more innovative in the past 40 years, one can only conclude that this language evolution reflects a response to increasing pressures for novelty, and more generally to stand out from the crowd.
They argue ‘that the incentives to generate a lengthy CV have particular consequences on the ecology of scientific communities’. Journals have a bias for ‘positive’ results: this incentivizes research techniques and statistics that have a high rate of false positives. The majority of the false positive publications are not due to deliberate fraud, but to practices such as ‘p-hacking’ – the common practice of putting raw data through statistical software until a ‘significant’ p-value is found. (‘P’ stands for probability: a p-value of 0.05 means that, if there were no real effect, a result at least this extreme would occur by chance 1 time in 20; a p-value of 0.01 means 1 time in 100, and so on. A p-value below 0.05 is conventionally regarded as the minimum standard for statistical ‘significance’.) ‘P-hacking’ is also known as ‘data torture’ and ‘data dredging’. What is to be done? Smaldino and McElreath are not optimistic:
Institutional change is difficult to accomplish, because it requires coordination on a large scale, which is often costly to early adopters. Yet such change is needed to ensure the integrity of science… A more drastic suggestion is to alter the fitness landscape entirely by changing the selection pressures: the incentives for success. This is likely to be quite difficult.
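The arithmetic behind p-hacking can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from Smaldino and McElreath: it assumes a hypothetical study comparing two groups drawn from the same distribution (so there is no real effect), a simple z-test with known unit variance, and outcomes that are independent of one another. The function names and parameters are invented for the example. If a researcher measures 20 such outcomes and reports whichever one reaches p < 0.05, the chance of at least one ‘significant’ finding is 1 − 0.95^20, or about 64 per cent.

```python
import math
import random


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means.

    Assumes equal group sizes and a known standard deviation of 1.
    """
    n = len(group_a)
    diff = sum(group_a) / n - sum(group_b) / n
    z = diff / math.sqrt(2.0 / n)
    return math.erfc(abs(z) / math.sqrt(2.0))


def study_finds_something(n_outcomes, n_per_group, rng):
    """Simulate one study with NO real effect.

    Returns True if any of the measured outcomes reaches p < 0.05.
    """
    for _ in range(n_outcomes):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if two_sided_p(a, b) < 0.05:
            return True
    return False


def false_positive_rate(n_outcomes, n_studies=1000, n_per_group=20, seed=1):
    """Fraction of simulated null studies that report a 'significant' result."""
    rng = random.Random(seed)
    hits = sum(
        study_finds_something(n_outcomes, n_per_group, rng)
        for _ in range(n_studies)
    )
    return hits / n_studies


def expected_rate(n_outcomes, alpha=0.05):
    """Analytic chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_outcomes


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, round(expected_rate(k), 3), false_positive_rate(k))
```

With a single pre-specified outcome the simulated rate should sit close to 1 in 20; with 20 outcomes it should climb towards two in three. Real analyses of one dataset are usually correlated rather than independent, so the exact figures would differ, but the direction of the effect is the same.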
Big Science has recognized its big problem, and is trying – or knows that it is important to be seen to try – to fix it. In April 2015, a meeting was held at the Wellcome Trust in London, under the collective auspices of the Academy of Medical Sciences, the Wellcome Trust, the Medical Research Council and the Biotechnology and Biological Sciences Research Council. This was called a ‘Symposium on the Reproducibility and Reliability of Biomedical Research’, which sounds distinctly bland, but the meeting was the first serious attempt to address the undisputed fact that medical research has lost its way. The meeting was a semi-secret affair, with attendees asked to observe Chatham House rules – that is, participants are free to use the information, but cannot identify the speaker or their affiliation. Those who worked for government agencies were particularly anxious not to be quoted. Richard Horton wrote in his own journal shortly after the symposium: ‘Why the paranoid concern for secrecy and non-attribution? Because this symposium – on the reproducibility and reliability of biomedical research – touched on one of the most sensitive issues in science today: the idea that something has gone fundamentally wrong with one of our greatest human creations.’
The summary document of the meeting concluded that there is no single cause of irreproducibility. The factors they identified included: (1) p-hacking; (2) HARKing, short for Hypothesising After the Results are Known – inventing a plausible explanation for the result that was obtained, after the data have been inspected; (3) not publishing studies unless they have a ‘significant’ result; (4) lack of statistical power, i.e. recruiting so few subjects that it is impossible to tell if an effect is real; (5) technical errors; (6) inadequate description of experimental methods, such that other researchers cannot repeat the study; and (7) weak experimental design. They acknowledged that ‘cultural factors, such as a highly competitive research environment and the high value placed on novelty and publication in high-profile journals, may also play a part’. How can all this be fixed? The eminent scientists who attended the meeting at the Wellcome Trust had some typically banal suggestions, such as ‘providing further education in research methods for scientists’, ‘due diligence by bodies that fund research’, and ‘greater openness and transparency’.
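Item (4) in the list above, lack of statistical power, can also be put into numbers. The following is a minimal sketch, not a method described in the symposium report: it assumes a two-group comparison analysed with a two-sided z-test, equal group sizes, and an effect size expressed in standard deviations. The function name and the example figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_group(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    effect_size is the true difference in means, in standard deviations.
    Returns the probability that the study yields p < alpha.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected value of the z statistic under the assumed true effect.
    shift = effect_size * sqrt(n_per_group / 2.0)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (10, 30, 100):
        print(n, round(power_two_group(0.5, n), 2))
```

Under these assumptions, a study with 10 subjects per group has roughly a one-in-five chance of detecting a moderate effect of half a standard deviation, whereas 100 per group gives better than nine in ten. A small study that does report a ‘significant’ result is therefore more likely to be a fluke or an overestimate.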
It was inevitable with so many and such varied perverse incentives that outright fraud would become commonplace in medical research. Although only 2 per cent of researchers have admitted to falsifying data, the true figure is thought to be much higher. The website Retraction Watch tracks scientific papers which have been retracted or withdrawn by their authors for reasons of fraud or falsification. Dr Yoshitaka Fujii, a Japanese anaesthetist, is currently top of the Retraction Watch leaderboard with 183 retracted studies. Laypeople are especially shocked by scientific fraud: when Al Gore led a congressional inquiry into scientific fraud in 1981, the historian Daniel Kevles observed that ‘for Gore and for many others, fraud in the biomedical sciences was akin to pederasty among priests’. Most doctors have come across scientific fraud, and are less shocked by it. I knew one researcher whose deliberate falsification of data was an open secret in the hospital where I then worked; his ‘work’ had been published in several major journals. A friend and colleague co-authored several of these papers. I told him about the researcher’s methods, and was answered with a weary shrug. Deliberate fraud – so-called ‘scientific pornography’ – is indeed shocking, but is a minor problem compared to the combination of careerism, vested interest, self-deception and perverse incentives which creates so much bad science. Researchers are generally too careful and too cunning to carry out deliberate fraud when they can achieve the same result by other, less detectable means.
Medical journals are embedded in this problem. When I qualified, it was possible – just – for a diligent and disciplined doctor to keep up both with developments in medicine in general, and with their own speciality. One read the great general journals (the Lancet, British Medical Journal and New England Journal of Medicine), as well as perhaps two specialist journals. These journals then carried the authority of Moses’ Tablets of Stone. Since then, the exponentially rising quantity of biomedical research output has to go somewhere, and that somewhere is a journal. The prestige of a journal is now based on a metric called the impact factor. The impact factor is calculated as the number of citations received in a given year by articles the journal published in the previous two years, divided by the number of those articles. The impact factor of the New England Journal of Medicine, for example, is 72.4, for the Lancet 44 and for the Irish Medical Journal 0.31. The performance of medical academics is judged on metrics such as citation count and the h-index (calculated from the number of publications and number of citations of each paper). Inevitably, academics game these metrics. Goodhart’s law (named after the British economist) states that once a variable is adopted as a policy target, it rapidly loses its ability to capture the phenomenon or characteristic that is supposedly being measured; adoption of a new indicator ‘leads to changes in behaviour with gaming to maximize the score, perverse incentives, and unintended consequences’. Mario Biagioli, professor of law and of science and technology at the University of California, Davis, cited this law in an analysis of how individual researchers and institutions game bibliometric indices, such as impact factors, citation indices and rankings. This gaming has become increasingly sophisticated: one new wheeze is for researchers, when submitting a paper to a journal, to give the journal fake email addresses of potential ‘reviewers’. (Journals send all submitted papers to be reviewed by experts in the field, a practice known as ‘peer review’.) The authors then use these fake email addresses to supply flattering reviews back to the journal, thus increasing the chance of publication. In some universities – typically in emerging countries – researchers are unofficially obliged to cite the work of other researchers in the institution, to increase their ‘citation index’. Biagioli concluded: ‘The audit culture of universities – their love affair with metrics, impact factors, citation statistics and rankings – does not just incentivize this new form of bad behaviour. It enables it.’
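The two metrics mentioned above are simple enough to compute by hand, which is part of why they are so easy to game. The sketch below uses the standard definitions: the h-index is the largest number h such that h of a researcher’s papers have each been cited at least h times, and the two-year impact factor is the citations received in a year by items from the previous two years, divided by the number of those items. The function names and all of the figures are hypothetical, chosen only to illustrate how a little coordinated citing moves the score.

```python
def h_index(citations_per_paper):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations_per_paper, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def impact_factor(citations_this_year, items_prev_two_years):
    """Two-year impact factor.

    citations_this_year: citations received this year to items the journal
        published in the previous two years.
    items_prev_two_years: number of citable items published in those years.
    """
    if items_prev_two_years <= 0:
        raise ValueError("journal must have published at least one item")
    return citations_this_year / items_prev_two_years


if __name__ == "__main__":
    # Hypothetical researcher with five papers.
    honest = [10, 8, 5, 4, 3]
    # The same papers after colleagues add two courtesy citations to each.
    gamed = [c + 2 for c in honest]
    print(h_index(honest), h_index(gamed))

    # Hypothetical journal: 200 items over two years, cited 500 times this year.
    print(impact_factor(500, 200))
```

In this invented example, ten extra citations spread across five papers raise the h-index from 4 to 5 without a single new finding, which is the kind of behaviour Goodhart’s law predicts.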
When biomedical science expanded dramatically in the post-Second World War decades, so too did the number of journals available to publish all this new research. Scientific publishing has global revenues of more than £19 billion, placing it somewhere between the record and film industries in size. The profit margins of scientific publishers are greater than those of any of the tech giants: in 2010, Elsevier reported profits of £724 million on £2 billion in revenue, a margin of 36 per cent. Their business model is truly remarkable: the product (scientific papers) is given to them for free, and the purchasers of the product are mainly government-funded institutes and universities. In his brilliant 2017 Guardian essay ‘Is the staggeringly profitable business of scientific publishing bad for science?’ Stephen Buranyi wrote:
It is as if the New Yorker or the Economist demanded that journalists write and edit each other’s work for free, and asked the government to foot the bill. Outside observers tend to fall into a sort of stunned disbelief when describing this setup. A 2004 parliamentary science and technology committee report on the industry drily observed that ‘in a traditional market suppliers are paid for the goods they provide’. A 2005 Deutsche Bank report referred to it as a ‘bizarre’ ‘triple-pay’ system, in which ‘the state funds most research, pays the salaries of most of those checking the quality of research, and then buys most of the published product’.
The person who did most to create this unbeatable business model was Robert Maxwell. Born Ján Hoch in what was then Czechoslovakia, he reinvented himself during the war as a British officer, and became the millionaire ‘Robert Maxwell’. Immediately after the war, the British government decided that, although British science was exploding, its scientific journals were dismal. They chose to pair the British publisher Butterworths with the German publisher Springer, which was thought to have greater commercial expertise. At this time, Maxwell was shipping scientific articles to Britain on behalf of Springer. The Butterworths directors knew Maxwell, and hired him and a former spy, the Austrian metallurgist Paul Rosbaud. In 1951, Maxwell bought both Butterworths’ and Springer’s shares, and set up a new company which he called Pergamon. Rosbaud saw that new journals would be required to accommodate all the research produced in the post-war scientific boom. He hit on the simple but brilliant idea of persuading prominent academics that their field needed a new journal, and then installed the same prominent academics as the editors. Maxwell entertained the scientists at his Oxfordshire mansion, Headington Hill Hall; they were easily seduced. In 1959, Pergamon was publishing 40 journals; by 1965, the number rose to 150.
Maxwell understood that the business was almost limitless. After Watson and Crick’s discovery of the double-helix structure of DNA, he decided that the future was in biomedical sciences. Maxwell called the business ‘a perpetual financing machine’. Journals now set the agenda for science, and in the process researchers were incentivized to produce work that appealed to the journal editors, particularly the new, glamorous, high-impact basic science journals, such as Cell, Nature and Science. After Maxwell’s mysterious death in 1991 – he fell overboard from his yacht – Pergamon and its 400 journals were bought by Elsevier. In the late 1990s, it was widely predicted that the Internet would make these companies obsolete, but Elsevier accommodated to the new reality by selling electronic access to its journals in bundles of hundreds. In 2015, the Financial Times labelled Elsevier ‘the business the Internet could not kill’. Robert Maxwell correctly predicted in 1988 that in the future there would only be a handful of huge scientific publishing companies, who would operate in an electronic age with no printing or delivery costs, leading to almost ‘pure profit’.
Maxwell would have admired the pure shamelessness of the ‘predatory’ journals. They emerged around ten years ago to meet the urgent need of researchers to get published – anywhere. They will publish anything, as long as the authors pay. It has been estimated that there are now 8,000 such journals, publishing 420,000 articles a year. I receive emails from them most days, inviting me to contribute, or to become a member of their editorial boards. The existence of this new thriving industry is the logical conclusion of the dominance of journals over scientists, and the prevailing culture of medical research. Nobody reads the predatory journals, but then, most articles in ‘respectable’ journals are also unread: half of all articles published are never cited. Richard Smith, who edited the British Medical Journal from 1991 to 2004, joked that the publishers of medical journals are like mustard manufacturers: they make their money from material that is never used.
Papers submitted to the ‘respectable’ journals are sent out for external peer review. There is now a general consensus that this process is deeply flawed; most papers eventually find a home of greater or lesser prestige. If your paper is rejected, send it to another journal, and so on until one eventually publishes your work. Drummond Rennie, deputy editor of the influential Journal of the American Medical Association, wrote:
There seems to be no study too fragmented, no hypothesis too trivial, no literature citation too biased or too egotistical, no design too warped, no methodology too bungled, no presentation of results too inaccurate, too obscure, and too contradictory, no analysis too self-serving, no argument too circular, no conclusions too trifling or too unjustified, and no grammar and syntax too offensive for a paper to end up in print.
Many researchers are in open revolt against the journals. They argue that all research should be published openly, online, with all the data available for inspection; all trial protocols should be similarly published, and all trials registered. More importantly, they argue, trial proposals should be scrutinized and need to pass some basic requirements to determine that they are useful, that they will answer a real question. Innovations such as Wellcome Open Research and F1000Research have shown the journals that scientists could run science publishing without them. Even Elsevier has concluded that the journal era is nearing an end: they now describe themselves as a ‘Big Data Company’, and a ‘global information analytics business that helps institutions and professionals progress science, advance healthcare, and improve performance’. Elsevier is positioning itself to become the only company that sells publishing services such as software to scientists, becoming a force to rival Facebook or Google. Richard Smith warned: ‘Elsevier will come to know more about the world’s scientists – their needs, demands, aspirations, weaknesses, and buying patterns – than any other organization. The profits will come from those data and that knowledge. The users of Facebook are both the customers and the product, and scientists will be both the customers and the product of Elsevier.’ But a revolt is already under way: Swedish and German universities cancelled their Elsevier subscriptions, and the website Sci-Hub, which Elsevier has sued, provides free access to 67 million research articles. The European Commission has called for full open access to all scientific publications by 2020, and has invited bids for the development of an EU-wide open-access science publishing platform, a move that has been criticized by Jon Tennant, a researcher working on public access to scientific knowledge, as ‘finding new ways of channelling public money into private hands’. 
Many believe that the only way to sort this out is for the scientific community to take control of how their work is communicated.
Can Medicine Be Cured Page 5