Expert Political Judgment


by Philip E. Tetlock


  Probability Theorists. Even readers swayed by the foregoing arguments may remain reluctant to “go all the way” with radical skepticism. One source of this reluctance is the repeated success that almost all of us feel we have had in explaining the past. Whether we anchor our explanations of the past in qualitative case studies or in multivariate regression models, hope springs eternal that our accounts of what just happened will confer predictive leverage on what will happen next.

  Robyn Dawes pours cold water on this comforting thought. The outcomes we most want to forecast—usually disasters—tend to be rare. And Dawes argues that, even for rare events we can explain reasonably well, such as airplane crashes, there is no guarantee we will do as well at predicting the future. If anything, we can almost guarantee the opposite: disappointment.28

  Dawes illustrates his thesis with the grisliest job of the National Transportation Safety Board (NTSB): piecing together postmortems of plane crashes. The crash of Western flight 2605 in Mexico City on October 31, 1979, captures the challenges NTSB investigators confront. The plane landed at night on the left runway—which was closed to traffic because it was under construction—and crashed into a truck. Looking backward in time, investigators identified at least five plausible causes of the crash.

  Fatigue. Fifteen minutes before the crash the pilot said, “Morning, Dan,” and Dan responded with a muffled “Morning.” Dan, the navigator, had had only four hours’ sleep in the last twenty-four hours. The pilot had had five hours. Later Dan said, “I think I’m going to sleep all night” (all said about ten minutes prior to his death).

  Poor Visibility. Air traffic control then instructed the pilots to approach by tracking the radar beam on the left runway but shifting to the right for landing. Only the right runway was illuminated by approach lights. However, visibility was poor, so the construction on the left runway and the lack of landing lights were not apparent.

  Radio Failure. Two minutes before the crash, the pilot asked, “What happened to that fucking radio?” The copilot replied, “I just don’t have any idea…. It just died.” The pilots thus had no radio contact two minutes before landing on the wrong runway.

  Vague Communication. After radio contact was restored, sixty-five seconds before the crash, the air traffic controller told the pilots, “2605, you are left of the track.” By bad luck, the plane had been slightly left of the left runway. The pilot replied, “Yeah, we know.” If the tower had been more explicit, the crash might have been averted.

  Stress. Forty-three seconds prior to the crash, an overburdened air traffic controller confused the two runways. “OK, sir. Approach lights on runway 23 left by the runway closed to traffic.” In fact, the radar beam was on the left runway and the approach lights on the right runway, which was not closed to traffic. Thirteen seconds later, the pilot realized that the plane was heading to the wrong runway, but it was too late.

  This sad tale is laced with quirky details, but it contains lessons of broad applicability. We often want to know why a particular consequence—be it a genocidal bloodbath or financial implosion—happened when and how it did. Examination of the record identifies a host of contributory causes. In the plane crash, five factors loom. It is tempting to view each factor by itself as a necessary cause. But the temptation should be resisted. Do we really believe that the crash could not have occurred in the wake of other antecedents? It is also tempting to view the five causes as jointly sufficient. But believing this requires endorsing the equally far-fetched counterfactual that, had something else happened, such as a slightly different location for the truck, the crash would still have occurred.

  Exploring these what-if possibilities might seem a gratuitous reminder to families of victims of how unnecessary the deaths were. But the exercise is essential for appreciating why the contributory causes of one accident do not permit the NTSB to predict plane crashes in general. Pilots are often tired; bad weather and cryptic communication are common; radio communication sometimes breaks down; and people facing death frequently panic. The NTSB can pick out, post hoc, the ad hoc combination of causes of any disaster. They can, in this sense, explain the past. But they cannot predict the future. The only generalization that we can extract from airplane accidents may be that, absent sabotage, crashes are the result of a confluence of improbable events compressed into a few terrifying moments.

  If a statistician were to conduct a prospective study of how well retrospectively identified causes, either singly or in combination, predict plane crashes, our measure of predictability—say, a squared multiple correlation coefficient—would reveal gross unpredictability. Radical skeptics tell us to expect the same fate for our quantitative models of wars, revolutions, elections, and currency crises. Retrodiction is enormously easier than prediction.
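
  A minimal simulation can make the statistician’s point concrete. The sketch below is my own illustration, not Dawes’s or the NTSB’s: the cause frequencies and the crash mechanism are invented. It shows that when contributory causes are individually common but their fatal confluence is rare, every crash can be “explained” after the fact, yet the squared multiple correlation of those causes with crashes stays close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # simulated flights

# Contributory causes, individually common (frequencies invented for illustration)
fatigue   = rng.random(n) < 0.30
bad_wx    = rng.random(n) < 0.30
radio_out = rng.random(n) < 0.05
vague_atc = rng.random(n) < 0.20
stress    = rng.random(n) < 0.40

# A crash requires the improbable confluence of all five, plus further bad luck
confluence = fatigue & bad_wx & radio_out & vague_atc & stress
crash = confluence & (rng.random(n) < 0.5)

# Retrodiction: every crash exhibits all five "causes" after the fact
print("crashes:", int(crash.sum()))
print("crashes with all five causes present:", int((crash & confluence).sum()))

# Prediction: regress crashes on the five causes and compute R^2
X = np.column_stack([fatigue, bad_wx, radio_out, vague_atc, stress]).astype(float)
A = np.column_stack([np.ones(n), X])
y = crash.astype(float)
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta
r2 = 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(f"squared multiple correlation: {r2:.4f}")  # tiny, on the order of 0.005 here
```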

  Figure 2.2. The first panel displays the bewildering array of possible relationships between causal antecedents and possible futures when the observer does not yet know which future will need to be explained. The second panel displays a simpler task. The observer now knows which future materialized and identifies those antecedents “necessary” to render the outcome inevitable. The third panel “recomplexifies” the observer’s task by imagining ways in which once-possible outcomes could have occurred, thereby recapturing past states of uncertainty that hindsight bias makes it difficult to reconstruct. The dotted arrows to faded E’s represent possible pathways between counterfactual worlds and their conjectured antecedents.

  How can this be? Looking forward in time, we confront the first panel of figure 2.2. We don’t yet know what we have to explain. We need to be alert to the multiplicity of ways in which potential causes could produce a multiplicity of potential effects. Let’s call this complex pattern the “many-many relationship between antecedents and consequences.” Looking backward in time, we confront the second panel of figure 2.2. We now do know what we need to explain. We can concentrate our explanatory efforts on why one of the many once-possible consequences occurred. Let’s call this simple pattern the “many-one relationship between antecedents and consequences.” The pattern is, however, deceptively simple. It plays on the cognitive illusion that the reasons we can identify why a known outcome had to occur give us a basis for predicting when similar outcomes will occur.

  Retrospective explanations do not travel well into the future. The best protection against disappointment is that recommended in the third panel of figure 2.2: work through counterfactual thought exercises that puncture the deceptive simplicity of the many-one relationship by imagining ways in which outcomes we once deemed possible could have come about. Chapter 7 will show that, although these exercises do not boost our predictive accuracy, they do check our susceptibility to hindsight bias: our tendency to exaggerate the degree to which we saw it coming all along. Humility is attainable, even if forecasting accuracy is not.

  Psychological Skeptics

  Ontological skeptics need no psychology. They trace indeterminacy to properties of the external world—a world that would be just as unpredictable if we were smarter. Psychological skeptics are not so sure. They suspect there are opportunities to peer into the future that we miss for reasons linked to the internal workings of the human mind. Psychological skeptics are thus more open to meliorist arguments that observers with the “right mental stuff” will prove better forecasters. Indeed, every obstacle that psychological skeptics identify to good judgment is an invitation to “we could fix it” interventions. Here we identify four key obstacles: (1) our collective preference for simplicity; (2) our aversion to ambiguity and dissonance; (3) our deep-rooted need to believe we live in an orderly world; and (4) our seemingly incorrigible ignorance of the laws of chance.

  PREFERENCE FOR SIMPLICITY

  However cognitively well equipped human beings were to survive on the savannah plains of Africa, we have met our match in the modern world. Picking up useful cues from noisy data requires identifying fragile associations between subtle combinations of antecedents and consequences. This is exactly the sort of task that work on probabilistic-cue learning indicates people do poorly.29 Even with lots of practice, plenty of motivation, and minimal distractions, intelligent people have enormous difficulty tracking complex patterns of covariation such as “effect y1 rises in likelihood when x1 is falling, x2 is rising, and x3 takes on an intermediate set of values.”
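
  To see what such a pattern looks like as raw experience, here is a toy sketch of my own, not drawn from the cited cue-learning studies; the cue frequencies and the 70/30 contingency are invented. The diagnostic configuration occurs on only a small minority of trials, and even then the “effect” is merely more likely, not guaranteed, which is why frequency-tracking observers struggle.

```python
import numpy as np

rng = np.random.default_rng(1)
trials = 1_000

# Hypothetical cues: two binary trends and one continuous level
x1_falling = rng.random(trials) < 0.5
x2_rising  = rng.random(trials) < 0.5
x3         = rng.random(trials)          # "intermediate" taken here as 0.4-0.6

# The conjunctive, probabilistic rule described in the text:
# y1 is more likely only when all three conditions line up, and even then only 70/30
signal = x1_falling & x2_rising & (np.abs(x3 - 0.5) < 0.1)
y1 = rng.random(trials) < np.where(signal, 0.7, 0.3)

print("fraction of diagnostic trials:", signal.mean())      # roughly 0.05
print("P(y1 | diagnostic trials):", y1[signal].mean())      # roughly 0.7
print("P(y1 | other trials):", y1[~signal].mean())          # roughly 0.3
```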

  Psychological skeptics argue that such results bode ill for our ability to distill predictive patterns from the hurly-burly of current events.30 Insofar as history repeats itself, it does not do so in a ploddingly mechanistic fashion.31 Much analogical reasoning from history is, however, plodding. Consider the impact of the Vietnam War on the political consciousness of late twentieth-century pundits who saw a variety of conflicts, almost surely too great a variety, as Vietnam-style quagmires. The list includes Nicaragua, Haiti, Bosnia, Colombia, Afghanistan, and Iraq (all new American Vietnams), Afghanistan (the Soviet Union’s Vietnam), Chechnya (Russia’s Vietnam), Kashmir (India’s Vietnam), Lebanon (Israel’s Vietnam), Angola (Cuba’s Vietnam), the Basque territory (Spain’s Vietnam), Eritrea (Ethiopia’s Vietnam), Northern Ireland (Britain’s Vietnam), and Kampuchea (Vietnam’s Vietnam).32 We know—from many case studies—that overfitting the most superficially applicable analogy to current problems is a common source of error.33 We rarely hear policy makers, in private or public, invoking mixtures of probabilistic analogies: “Saddam resembles Hitler in his risk taking, but he also has some of the shrewd street smarts of Stalin, the vaingloriousness of Mussolini, and the demagoguery of Nasser, and the usefulness of each analogy depends on the context.”

  AVERSION TO AMBIGUITY AND DISSONANCE

  People for the most part dislike ambiguity—and we shall discover in chapter 3 that this is especially true of the hedgehogs among us. History, however, heaps ambiguity on us. It not only requires us to keep track of many things; it also offers few clues as to which things made critical differences. If we want to make causal inferences, we have to guess what would have happened in counterfactual worlds that exist—if “exist” is the right word—only in our imaginative reenactments of what-if scenarios. We know from experimental work that people find it hard to resist filling in the missing data points with ideologically scripted event sequences.34 Indeed, as chapter 5 will show, observers of world politics are often enormously confident in their counterfactual beliefs,35 declaring with eerie certainty that they know pretty much exactly what would have happened in counterfactual worlds that no one can visit or check out.

  People for the most part also dislike dissonance—a generalization that again particularly applies to the hedgehogs we shall meet in chapter 3. They prefer to organize the world into neat evaluative gestalts that couple good causes to good effects and bad to bad.36 Unfortunately, the world can be a morally messy place in which policies that one is predisposed to detest sometimes have positive effects and policies that one embraces sometimes have noxious ones. Valued allies may have frightful human rights records; free trade policies that improve living standards in the Third World may reward companies that exploit child labor; despised terrorists may display qualities that, in other contexts, we might laud as resourceful and even courageous; regimes in rogue states may have more popular support than we care to admit. Dominant options—that beat the alternatives on all possible dimensions—are rare.

  NEED FOR CONTROL

  Most of us find it irksome to contemplate making life-and-death decisions on no sounder basis than a coin toss.37 Nihilistic fatalism of this sort runs against the mostly can-do grain of human nature. Moreover, it should be especially irksome to the specialists in our sample, people who make their living thinking and writing about varied facets of international affairs, to adopt this despairing stance, undercutting as it does not just their worldviews but also their livelihoods. This argument suggests that people will generally welcome evidence that fate is not capricious, that there is an underlying order to what happens. The core function of political belief systems is not prediction; it is to promote the comforting illusion of predictability.

  THE UNBEARABLE LIGHTNESS OF OUR UNDERSTANDING OF RANDOMNESS

  No amount of methodological hocus-pocus will improve the accuracy of our forecasts in games of pure chance. If a casino with a financial death wish installed a roulette wheel with 60 percent black and 40 percent red slots and kept payoffs unchanged, the best strategy would always be to bet on the most likely outcome: black. The worst strategy would be to look for patterns until one convinces oneself that one has found a formula that justifies big bets on the less likely outcome. The reward for thought, at least thought of the gambler’s-fallacy caliber, will be to hemorrhage money.38

  Our reluctance to acknowledge unpredictability keeps us looking for predictive cues well beyond the point of diminishing returns.39 I witnessed a demonstration thirty years ago that pitted the predictive abilities of a classroom of Yale undergraduates against those of a single Norwegian rat. The task was predicting on which side of a T-maze food would appear, with appearances determined—unbeknownst to both the humans and the rat—by a random binomial process (60 percent left and 40 percent right). The demonstration replicated the classic studies by Edwards and by Estes: the rat went for the more frequently rewarded side (getting it right roughly 60 percent of the time), whereas the humans looked hard for patterns and wound up choosing the left or the right side in roughly the proportion they were rewarded (getting it right roughly 52 percent of the time). Human performance suffers because we are, deep down, deterministic thinkers with an aversion to probabilistic strategies that accept the inevitability of error. We insist on looking for order in random sequences. Confronted by the T-maze, we look for subtle patterns like “food appears in alternating two left/one right sequences, except after the third cycle when food pops up on the right.” This determination to ferret out order from chaos has served our species well. We are all beneficiaries of our great collective successes in the pursuit of deterministic regularities in messy phenomena: agriculture, antibiotics, and countless other inventions that make our comfortable lives possible. But there are occasions when the refusal to accept the inevitability of error—to acknowledge that some phenomena are irreducibly probabilistic—can be harmful.
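
  The arithmetic behind those two figures is worth making explicit: always betting the more frequent side is right 60 percent of the time, whereas matching one’s guesses to the 60/40 frequencies is right only 0.6 × 0.6 + 0.4 × 0.4 = 52 percent of the time. The sketch below is a stylized reconstruction, not the original classroom procedure, with the human modeled as a pure probability matcher.

```python
import numpy as np

rng = np.random.default_rng(2)
trials = 100_000
food_left = rng.random(trials) < 0.6        # food appears on the left 60% of trials

# "Rat" strategy: maximize - always choose the more frequently rewarded side
rat_correct = food_left                      # always guesses left

# "Human" strategy: probability matching - guess left 60% and right 40% of the time
guess_left = rng.random(trials) < 0.6
human_correct = guess_left == food_left

print(f"maximizing accuracy: {rat_correct.mean():.3f}")    # ~0.60
print(f"matching accuracy:   {human_correct.mean():.3f}")  # ~0.52 = 0.6*0.6 + 0.4*0.4
```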

  Political observers run the same risk when they look for patterns in random concatenations of events. They would do better by thinking less. When we know the base rates of possible outcomes—say, the incumbent wins 80 percent of the time—and not much else, we should simply predict the more common outcome. But work on base rate neglect suggests that people often insist on attaching high probabilities to low-frequency events.40 These probabilities are rooted not in observations of relative frequency in relevant reference populations of cases, but rather in case-specific hunches about causality that make some scenarios more “imaginable” than others. A plausible story of how a government might suddenly collapse counts for far more than how often similar outcomes have occurred in the past. Forecasting accuracy suffers when intuitive causal reasoning trumps extensional probabilistic reasoning.41
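
  A small illustration of why the base rate is so hard to beat, scored with a quadratic (Brier-style) probability rule: the 80 percent incumbency figure comes from the example above, but the rival “scenario-driven” forecasts and their frequency are invented, and the vivid scenarios here carry no real information about the outcome.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
incumbent_wins = (rng.random(n) < 0.8).astype(float)   # base rate: 80%

def brier(prob, outcome):
    """Quadratic probability score: lower is better, 0 is perfect."""
    return float(np.mean((prob - outcome) ** 2))

# Extensional forecaster: report the base rate every time
base_rate_forecast = np.full(n, 0.8)

# Scenario-driven forecaster: in half the races a vivid collapse story
# (uncorrelated with the outcome) drags the incumbent's probability down to 0.3
vivid_story = rng.random(n) < 0.5
scenario_forecast = np.where(vivid_story, 0.3, 0.8)

print(f"base-rate forecaster: {brier(base_rate_forecast, incumbent_wins):.3f}")  # ~0.16
print(f"scenario forecaster:  {brier(scenario_forecast, incumbent_wins):.3f}")   # ~0.29, worse
```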

  Psychological skeptics are also not surprised when people draw strong lessons from brief runs of forecasting failures or successes. Winning forecasters are often skilled at concocting elaborate stories about why fortune favored their point of view. Academics can quickly spot the speciousness of these stories when the forecaster attributes her success to a divinity heeding a prayer or to planets being in the correct alignment. But even these observers can be gulled if the forecaster invokes an explanation in intellectual vogue.

  At this point, skeptics throw up their hands. They remind us of the perils of drawing confident inductive inferences from small samples of unknown origin and, if in a patient mood, append a lecture on the logical fallacy of affirming the consequent: “Beware of people who argue, ‘If A, then B,’ observe B, and then declare, ‘A is true.’” If you go down that path, you will wind up awarding “forecaster of the year” awards to an unseemly procession of cranks. Who wants to congratulate apartheid supporters in South Africa for their prescient predictions of the dismal state of sub-Saharan Africa? Much mischief can be wrought by transplanting this hypothesis-testing logic, which flourishes in controlled lab settings, into the hurly-burly of real-world settings where ceteris paribus never is, and never can be, satisfied.

  ADVANCING TESTABLE HYPOTHESES

  Combining these varied grounds for radical skepticism, we can appreciate how reasonable people could find themselves taking the seemingly unreasonable position that experts add precious little, perhaps nothing, to our ability to see into the future. Moreover, skeptics make testable predictions about the unpredictability of the world. The six core tenets of radical skepticism are as follows:

  1. Debunking hypotheses: humans versus chimps and extrapolation algorithms of varying sophistication. Like the weather, the political world has pockets of turbulence: political and financial crises during which we, the consumers of expertise, feel the greatest need for guidance but during which such guidance is least useful. Even the most astute observers will fail to outperform random prediction generators—the functional equivalent of dart-throwing chimps—in affixing realistic likelihoods to possible futures.

  Of course, it is not always obvious when one has entered or exited turbulence. It is far easier ex post than ex ante to pinpoint the qualitative breakpoints that mark where old patterns of predictability break down and new ones emerge.42 Turbulence is, moreover, the exception: in the often long periods of stability between episodes, blind chance ceases to be the right performance baseline. We should therefore raise the bar and ask whether experts can outperform not the chimp but extrapolation algorithms of varying sophistication. The Technical Appendix describes several such algorithms: (a) crude base rate algorithms that attach probabilities to outcomes that correspond to the frequency with which those outcomes pop up in narrowly or widely defined comparison populations of cases; (b) cautious or aggressive case-specific extrapolation algorithms that, for each state in our sample, predict the continuation of its recent past into its near-term future; (c) formal statistical equations (such as generalized autoregressive distributed lag models) that piece together optimal linear combinations of predictors in the dataset.
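
  The flavor of these baselines can be conveyed in a few lines of code. The functions below are my own stylized stand-ins, not the algorithms of the Technical Appendix: they assume a simple yearly 0/1 outcome history per state, and the last function is a bare-bones least-squares autoregression standing in for the formal statistical models in (c).

```python
import numpy as np

def base_rate_forecast(comparison_pool):
    """(a) Crude base rate: probability = frequency of the outcome in a
    narrowly or widely defined comparison population of cases."""
    return float(np.mean(comparison_pool))

def case_extrapolation(recent_history, weight=0.75):
    """(b) Case-specific extrapolation: predict continuation of a state's own
    recent past; a higher weight extrapolates more aggressively, a lower one
    shrinks cautiously toward 50/50."""
    return weight * float(np.mean(recent_history)) + (1 - weight) * 0.5

def autoregressive_forecast(series, lags=2):
    """(c) Minimal autoregressive stand-in: ordinary least squares on lagged
    values of the series, clipped to the probability scale."""
    y = series[lags:]
    X = np.column_stack([series[lags - k - 1 : len(series) - k - 1] for k in range(lags)])
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    next_x = np.concatenate([[1.0], series[-1 : -lags - 1 : -1]])
    return float(np.clip(next_x @ beta, 0.0, 1.0))

# Invented example: ten years of 0/1 crisis outcomes for one state
history = np.array([0, 0, 1, 0, 0, 0, 1, 0, 0, 0], dtype=float)
print(base_rate_forecast(history))       # wide-pool frequency: 0.2
print(case_extrapolation(history[-3:]))  # continuation of the recent past
print(autoregressive_forecast(history))  # lag-based statistical extrapolation
```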

 
