by Helen Razer
But statistics, right from its very inception, was always about power and politics. It had been an Englishman who first referred to ‘political arithmetic’—William Petty, an early demographer (among other things, in the pre-twentieth-century tradition of polymathy) who followed in the footsteps of another Englishman, John Graunt. Graunt had produced the first work of demography, Natural and Political Observations on the London Bills of Mortality in the 1660s.
Graunt and Petty, who was also an economist and who urged better statistical information to improve tax collection, correctly understood that accumulating demographic data was an inherently political act. The growth of statistics was an inevitable consequence of the growing dominance of monarchies in the early modern period, even if government did not yet have many of the recognisable characteristics of the modern bureaucratic form.* In particular, the demands of warfare—of which monarchies were very fond—the shift to permanent (and increasingly national, rather than mercenary) armies and economic policies like mercantilism that conceived international trade as a zero-sum, competitive game drove the emergence of the centralised state. Such entities needed to understand the size and structure of their population so as to know how many men of military age they could butcher, their economic resources to pay for wars and even how long their citizens lived for, given the early modern habit of granting leases, awarding pensions and selling annuities for several lives rather than a set period of years.
The early growth of statistics was accompanied by the beginnings of probability theory by English and European mathematicians—in particular, the work of Thomas Bayes, which would re-emerge in the twentieth century—thereby providing the tools to start better using the data being collated. But while that was occurring, the English grappled with the politics inherent in political arithmetic. A bill to conduct the first national census in England in 1753 was defeated by ‘country’ opponents of the government—conservative landed aristocrats who claimed to defend the traditional liberties of Englishmen (mainly property rights) from a central government bureaucracy—aka the ‘court’. The ‘country’ opposition wasn’t just opposed to a centralised bureaucracy in theory—they themselves controlled the local bureaucratic, law enforcement and ecclesiastical apparatus that would have collated census data across England.
As a result, the first national census wasn’t conducted in the UK until nearly half a century later, after the first census had been conducted in the United States. In the US, contrarily, despite a similar antipathy towards centralised power to that which thwarted the UK census, censuses were not merely considered non-threatening, they were written into the Constitution as the basis for taxation and state representation in Congress. But back across the Atlantic, as if to prove the point of the ‘country’ opponents, the transformation of France from ancien régime to Napoleonic empire via revolutionary chaos saw a dramatic rise in governmental statistical compilation, even via basics such as Napoleon’s standardisation of measurements throughout France. Statistics and centralisation of power went hand in ink-stained hand.
The innately political nature of the compilation and use of data continued to be demonstrated as political arithmetic—now rebranded as the shorter, but harder to pronounce, ‘statistics’—and probability expanded in the nineteenth century. Statisticians and mathematicians developed key tools of probability, such as the law of large numbers, the least squares method and normal distribution, and re-argued the medieval debate about nominalism in determining categories and classifications of data, while data collection played a bigger and bigger role in public debate. At the same time, complaints that statistics were being exploited inappropriately or by vested interests to pursue self-serving agendas grew. The French medical profession bitterly divided over the meaning of data from cholera epidemics in 1830s Paris. New statistical techniques combined with data collected from English parishes were used in the debate on Poor Law reform, with some statisticians arguing that tough welfare laws were correlated with lower poverty rates. And a big spur to debate over statistical methods in the late nineteenth and early twentieth centuries was the efforts of eugenicists and social Darwinists to identify links between heredity and intelligence. Much of the hard work of merging social statistics and probability into a single subject area was done by men eager to argue non-white races were intellectually inferior.*
A form of that eugenics argument was still going on in the 1990s with the most notorious recent example of statistical Stupid in the service of an agenda, Richard Herrnstein and Charles Murray’s The Bell Curve. The book was named after the curve of normal distribution, on the left side of which, the authors suggested, African Americans and immigrants were to be disproportionately found. The coverage of the book prompted criticism about the failure of many US journalists to identify the poor methodology underpinning its conclusions about the links between race and intelligence, as scientists, psychologists and statisticians raced to show the profound flaws in the book.
The discovery of normal distribution curves in sociological and medical data accompanied the earliest data gathering, but it was the Flemish statistician Adolphe Quetelet who made them famous in the early to mid-1800s. In particular, Quetelet claimed that the normal curve that represented, say, the distribution of human height or weight, was also applicable to human morality, measured by marriage, crime and suicide statistics. Quetelet was thus a ‘moral statistician’ as much as any other type. He considered the average human, the one closest to the mean at the centre of the normal curve, to be an ideal of moderation, with all others a kind of imperfect copy, too tall or too short, too heavy or too light, too unethical or too officious, all small departures from the will of the Creator for the perfect being found at the apex of the curve.
Quetelet’s view initially informed the sociology of Émile Durkheim at the end of the nineteenth century, albeit from a different perspective: for Durkheim in his early work, the mean human was the product of social forces, the most representative creation of the society in which they had been born and raised. But later, the mean became instead synonymous with mediocrity; the average human for Durkheim was average indeed, lacking strongly positive or negative qualities, a compromise, moderate in everything and good in nothing, including talent and ethics.
This shows how inherently political statistics are: the shift in characterisation of those on the left-hand side of the normal curve to ‘below average’ led to that segment of the population being deemed by some to be a threat to Western societies. Francis Galton was a cousin of Charles Darwin and an eminent late-Victorian polymath across many fields, including statistics, in which he devised the concept of standard deviation. Galton also invented the term ‘eugenics’ and argued for the ‘weak’—those on the left side of the curve—to be kept celibate and for eminent families—the right-siders—to intermarry in order to improve the racial stock. His protégé, Karl Pearson, one of the key figures in the twentieth-century development of statistics, also advocated race war and rejected the utility of trying to improve social conditions. ‘No degenerate and feeble stock will ever be converted into healthy and sound stock by the accumulated effects of education, good laws, and sanitary surroundings,’ Pearson insisted. Pearson believed himself to be a stickler for intellectual rigour, telling the British Medical Journal in 1910 that social scientists ought not to prostitute statistics for controversial or personal purposes.
Herrnstein and Murray’s arguments were thus simply a reiteration, almost verbatim, of the arguments of eugenicists a century before. They used their statistics to call for less welfare, lamenting that America was subsidising ‘low-IQ women’ (i.e. African American and immigrant women) to have children rather than encouraging high-IQ women. The eventual result, they argued, would be a kind of IQocalypse in which the intelligent (whites and Asians) lived in fortified compounds shielded from the teeming slums of low-IQ masses.
The economics of political arithmetic
An important development in statistics in the twentieth century was the development of input–output models and the pioneering of econometrics, and their use once computers became available after World War II to help process large amounts of data. A key driver of the proliferation of this kind of Stupid in public debate has thus been the spread of computable general equilibrium (CGE) economic models that use real economic data to model the likely impacts of policy changes or economic reforms.
These once required significant IT power to run—the ‘computer’ used to model the British economy in the early 1950s was a two-metre tall machine that used coloured water. This limited their use to academic institutions and official policymaking bodies like central banks. But even small computers can now run such models easily, and anyone can download simplified versions of widely used economic models like that of the Australian economy developed by Monash University. Economic consultants and academic institutions now advertise their models, either self-developed or bought from institutions with a proven track record, as a key part of their service offering to potential clients.
And there are more economists than ever before to run such models. Since the 1980s, Australian higher education institutions have seen a dramatic increase in students choosing to study business administration and economics, increasing their numbers per annum almost threefold between 1983 and 2000, by which time business admin and economics was, despite slower growth in the 1990s, the most popular field of study for Australian students. The US saw a similar rapid rise in the number of economics undergraduates from the mid-1990s onwards, while all Anglophone economies seem to have seen increases in the numbers of economics students since the global financial crisis showed the profession in such a favourable light. The English-speaking world appears to be suffering from a plague of economists, far in excess of what banks and governments need, leaving the rest to wander the streets holding ‘will model for food’ signs.
The other key ingredient such models rely on is the input–output tables put together by government statisticians that detail the relationships between and interdependence of different industries and sectors of the economy. But input–output tables themselves, without the labour or expense of CGE modelling, are often used to generate output, jobs and income multipliers for particular industries, thus providing the ingredients of the reverse magic pudding used by industry lobbyists to demonstrate the case for assistance.
So rife had the misuse of these multipliers by economic consultants, industries and governments arguing for handouts become in Australia in the 1990s that the Australian Bureau of Statistics actually stopped providing multipliers in its input–output tables, in essence saying that if people were going to misuse them they could produce them themselves. ‘Users of the I–O tables can compile their own multipliers as they see fit, using their own methods and assumptions to suit their own needs from the data supplied in the main I–O tables,’ the ABS said. ‘I–O multipliers are likely to significantly over-state the impacts of projects or events. More complex methodologies, such as those inherent in Computable General Equilibrium (CGE) models, are required to overcome these shortcomings.’
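For readers who want to see how little work it takes to produce a multiplier, here is a minimal sketch in Python. It assumes a made-up two-sector economy (the coefficients in `A` are invented, not drawn from any ABS table) and computes the simplest kind of output multiplier, the column sums of the Leontief inverse. The function names are hypothetical and this is not necessarily how any particular consultant compiles their figures.

```python
# Hypothetical two-sector economy: every figure here is invented for illustration.
# A[i][j] is the input bought from sector i per dollar of sector j's output.
A = [[0.2, 0.3],
     [0.1, 0.4]]


def leontief_inverse_2x2(a):
    """Return (I - A)^-1 for a 2x2 matrix of input coefficients."""
    m00, m01 = 1 - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], 1 - a[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det],
            [-m10 / det, m00 / det]]


def output_multipliers(a):
    """Column sums of the Leontief inverse: total output generated across
    both sectors per extra dollar of final demand for each sector."""
    inv = leontief_inverse_2x2(a)
    return [inv[0][j] + inv[1][j] for j in range(2)]


multipliers = output_multipliers(A)
for sector, m in enumerate(multipliers):
    print(f"Sector {sector}: simple output multiplier {m:.2f}")
```

The calculation treats the input coefficients as fixed and lets every sector expand without running short of labour or capital or moving any prices. Those are the kinds of assumptions that leave such multipliers prone to over-stating impacts, as the ABS warned.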
Magical multipliers and fictional industry reports about the huge benefits or costs of particular policies or misused data should have little direct impact on policymakers, who have access to more reliable and genuinely independent assessments of policy impacts. That doesn’t stop governments from pursuing bad policy, of course, but it means they have less excuse—the NSW government, for example, repeatedly pointed to evidence of declining violence in Sydney before caving in and agreeing to a get-tough reform package on alcohol-related violence. But dodgy reports can be effective at influencing the media and voters.
Reports, or extracts from them, are thus often given ahead of release to journalists or outlets regarded as sympathetic, because they complement an outlet’s ideology or partisanship, and then offered to readers under that most abused term ‘exclusive’. Alternatively, journalists too innumerate or too time-pressured to subject reports to basic scrutiny get them. Few journalists outside science and economic rounds have sufficient grounding to subject economic modelling to rigorous scrutiny, and few have the time to dig through evidence that would undermine material served up to them by lobbyists, NGOs and industry. Moreover, it enables the media to create the illusion of journalism. The classic description of news is what somebody does not want you to print, and the rest is advertising. Bad maths is advertising masquerading as news, filling column inches and the minutes between ad breaks in news bulletins, and without even being paid for. The benefit to media outlets, instead of revenue, is a saving, generating the appearance of journalism without the need to invest in the resources required for actual newsgathering, leaving the consumer to do the work of critically examining what’s been offered.
The discomfort or lack of interest of many journalists when it comes to numbers is a cliché that has been the subject of complaints for decades—journalists were, it’s long been said, the kids at school who topped English, not maths or science. But it becomes plain when they write more traditional stories where hard data is available but needs to be researched and explained, rather than handed over as a gift. Stories about crime trends, for example, rarely contain actual evidence about crime rates, even though crime statistics, which have been trending downwards in many Western countries in recent years, aren’t hard to unearth. Instead, journalists prefer anecdotes over data: anecdotes are harder to discredit and provide an immediate human hook for media consumers, regardless of how meaningless or unrepresentative personal stories may be. Actual crime data is likely to provide an unappealing counterweight to individual stories of out-of-control thugs, alcohol-fuelled violence or rampant cybercriminals.
Lies, damned lies and opinion polling
Different problems arise from numbers that journalists should be more comfortable with: those produced by polling organisations. Opinion polling began as a nineteenth-century creation of American newspapers and magazines (the first poll was for the 1824 presidential election) and only became professionalised and more statistically rigorous in the twentieth century. That was after a magazine called Literary Digest predicted a landslide to Republican Alf Landon in the 1936 presidential election. The reason you’ve never heard of President Landon is that he actually lost in a landslide to FDR and the Digest shut soon after, setting an example of media accountability that, alas, has rarely been followed since.
In most democracies, the media continues to be a key customer of polling companies, particularly around elections, although it is now rare for a media company, rather than a marketing company, to own a polling organisation, as News Corporation does with Newspoll in Australia. The relationship tends towards one of interdependence or, perhaps, symbiosis, though it’s unclear whether the media or marketers are the parasite. A polling company without a media outlet struggles to match the influence or profile of companies that are linked to national media. For media companies, which invest in the costly process of polling either by owning a pollster or contracting with one, a poll provides influence and precious column inches for their political journalists.
Historically, Australia has had relatively good-quality polling, mainly because we force people to vote, which removes the challenge of predicting voter turnout that bedevils US polling, and we refuse to let them exhaust their vote on minor parties, removing the lottery of first-past-the-post psephology in the UK. And there’s a lot of polling, too, for such a small country: until recently, there have been around ten national polls a month in Australia outside elections.
But problems arise in the interpretation of the results by journalists. Partisanship among journalists and outlets plays a role—my side edges down 3 points, but yours plummets 2—but more common is the practice of retrofitting narratives onto polls. Having invested in the expensive process of polling 1000 people (usually riding on regular omnibus marketing polling conducted by the pollster), media outlets feel obliged to get their money’s worth by dramatising the results, regardless of what they are.
This yields the sight of even very good journalists being compelled to explain small changes in polls, including those within the margin of error of the poll (around 3–4 per cent for a sample size of 1000), as arising from specific political events or a change in tactics by a party, establishing a narrative even when none exists. Thus, rises or falls in polls, even those resulting purely from statistical noise, generate their own positive or negative coverage as journalists rely on post hoc ergo propter hoc logic and scour preceding political events for explanations of shifts that may be entirely random.
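The margin of error quoted above can be checked with a few lines of arithmetic. The sketch below, in Python, uses the textbook formula for a simple random sample at the conventional 95 per cent confidence level (z = 1.96). It assumes pure random sampling, which real polls only approximate, and the function names and the example figures of 48 and 46 per cent are invented for illustration.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample,
    returned as a fraction (0.03 means 3 percentage points)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def shift_within_noise(previous, current, sample_size, z=1.96):
    """True if the change between two independent polls of the same size
    is no bigger than sampling error alone could produce."""
    se_diff = math.sqrt(
        previous * (1 - previous) / sample_size
        + current * (1 - current) / sample_size
    )
    return abs(current - previous) <= z * se_diff


moe = margin_of_error(1000)
print(f"Margin of error for n=1000: {moe * 100:.1f} points")

# Hypothetical example: a party 'plummets' from 48 to 46 per cent.
print(shift_within_noise(0.48, 0.46, 1000))
```

On these assumptions a two-point move between consecutive polls of 1000 people sits comfortably inside what chance alone would produce, so it needs no political explanation at all.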
Sometimes even this doesn’t work, when polls go in a direction not anticipated by journalists, thus requiring ‘government has failed to benefit from . . .’ narratives. Testing of such explanations is never undertaken, and the many assumptions embedded in them remain unrevealed: not merely that the change in polling results is statistically significant rather than random variation, but that voters have paid sufficient attention to politics to react to events that precede the polling outcome and that the reaction has occurred in a time frame that has been detected in a poll.
Even if polls and polling interpretation don’t have a lot of influence on voters—in fact, there is little evidence that polls change voters’ minds, such as via any bandwagon effect—they do have an impact on politicians, especially in countries with a short political cycle, like Australia, where a federal election is never more than thirty-six months away and parties have taken to removing even electorally successful leaders at the first hint of trouble. The result is a strange feedback loop of Stupid, in which meaningless numbers are interpreted as meaningful and influence the behaviour of those ostensibly the subject of the numbers.