People predict by making up stories
People predict very little and explain everything
People live under uncertainty whether they like it or not
People believe they can tell the future if they work hard enough
People accept any explanation as long as it fits the facts
The handwriting was on the wall, it was just the ink that was invisible
People often work hard to obtain information they already have
And avoid new knowledge
Man is a deterministic device thrown into a probabilistic Universe
In this match, surprises are expected
Everything that has already happened must have been inevitable
At first glance it resembles a poem. What it was, in fact, was early fodder for his and Danny’s next article, which would also be their first attempt to put their thinking in such a way that it might directly influence the world outside of their discipline. Before returning to Israel, they had decided to write a paper about how people made predictions. The difference between a judgment and a prediction wasn’t as obvious to everyone as it was to Amos and Danny. To their way of thinking, a judgment (“he looks like a good Israeli army officer”) implies a prediction (“he will make a good Israeli army officer”), just as a prediction implies some judgment—without a judgment, how would you predict? In their minds, there was a distinction: A prediction is a judgment that involves uncertainty. “Adolf Hitler is an eloquent speaker” is a judgment you can’t do much about. “Adolf Hitler will become chancellor of Germany” is, at least until January 30, 1933, a prediction of an uncertain event that eventually will be proven either right or wrong. The title of their next paper was “On the Psychology of Prediction.” “In making predictions and judgments under uncertainty,” they wrote, “people do not appear to follow the calculus of chance or the statistical theory of prediction. Instead, they rely on a limited number of heuristics which sometimes yield reasonable judgments and sometimes lead to severe and systematic error.”
Viewed in hindsight, the paper looks to have more or less started with Danny’s experience in the Israeli army. The people in charge of vetting Israeli youth hadn’t been able to predict which of them would make good officers, and the people in charge of officer training school hadn’t been able to predict who among the group they were sent would succeed in combat, or even in the routine day-to-day business of leading troops. Danny and Amos had once had a fun evening trying to predict the future occupations of their friends’ small children, and had surprised themselves by the ease, and the confidence, with which they had done it. Now they sought to test how people predicted—or, rather, to dramatize how people used what they now called the representativeness heuristic to predict.
To do this, however, they needed to give them something to predict.
They decided to ask their subjects to predict the future of a student, identified only by some personality traits, who would go on to graduate school. Of the then nine major courses of graduate study in the United States, which would he pursue? They began by asking their subjects to estimate the percentage of students in each course of study. Here were their average guesses:
Business: 15 percent
Computer Science: 7 percent
Engineering: 9 percent
Humanities and Education: 20 percent
Law: 9 percent
Library Science: 3 percent
Medicine: 8 percent
Physical and Life Sciences: 12 percent
Social Science and Social Work: 17 percent
For anyone trying to predict which area of study any given person was in, those percentages should serve as a base rate. That is, if you knew nothing at all about a particular student, but knew that 15 percent of all graduate students were pursuing degrees in business administration, and were asked to predict the likelihood that the student in question was in business school, you should guess “15 percent.” Here was a useful way of thinking about base rates: They were what you would predict if you had no information at all.
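A minimal sketch, not from the book, of what "predicting from base rates alone" amounts to: with no information about a particular student, the best estimate for any field is simply its share of the graduate population. The figures are the subjects' average guesses listed above; the function name is invented for illustration.

```python
# Base rates: the subjects' average estimates of how graduate students
# were distributed across the nine fields (from the passage above).
base_rates = {
    "Business": 0.15,
    "Computer Science": 0.07,
    "Engineering": 0.09,
    "Humanities and Education": 0.20,
    "Law": 0.09,
    "Library Science": 0.03,
    "Medicine": 0.08,
    "Physical and Life Sciences": 0.12,
    "Social Science and Social Work": 0.17,
}

def predict_with_no_information(field: str) -> float:
    """With no information about a student, the best guess for the
    probability that they are in a given field is the base rate itself."""
    return base_rates[field]

print(predict_with_no_information("Business"))  # 0.15
```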
Now Danny and Amos sought to dramatize what happened when you gave people some information. But what kind of information? Danny spent a day inside the Oregon Research Institute stewing over the question—and became so engrossed by his task that he stayed up all night creating what at the time seemed like the stereotype of a graduate student in computer science. He named him “Tom W.”
Tom W. is of high intelligence, although lacking in true creativity. He has a need for order and clarity, and for neat and tidy systems in which every detail finds its appropriate place. His writing is rather dull and mechanical, occasionally enlivened by somewhat corny puns and by flashes of imagination of the sci-fi type. He has a strong drive for competence. He seems to have little feel and little sympathy for other people and does not enjoy interacting with others. Self-centered, he nonetheless has a deep moral sense.
They would ask one group of subjects—they called it the “similarity” group—to estimate how “similar” Tom was to the graduate students in each of the nine fields. That was simply to determine which field of study was most “representative” of Tom W.
Then they would hand a second group—what they called the “prediction” group—this additional information:
The preceding personality sketch of Tom W. was written during Tom’s senior year in high school by a psychologist, on the basis of projective tests. Tom W. is currently a graduate student. Please rank the following nine fields of graduate specialization in order of the likelihood that Tom W. is now a graduate student in each of these fields.
They would not only give their subjects the sketch but inform them that it was a far from reliable description of Tom W. That it had been written by a psychologist, for a start; they would further tell subjects that the assessment had been made years earlier. What Amos and Danny suspected—because they had tested it first on themselves—is that people would essentially leap from the similarity judgment (“that guy sounds like a computer scientist!”) to some prediction (“that guy must be a computer scientist!”) and ignore both the base rate (only 7 percent of all graduate students were computer scientists) and the dubious reliability of the character sketch.
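To make concrete what not ignoring the base rate would have looked like, here is a rough Bayesian sketch of the Tom W. problem. Only the 7 percent base rate comes from the passage; the likelihoods of a Tom W.-style sketch are invented for illustration, and the point is only that weakly reliable evidence should nudge the base rate, not replace it.

```python
# Hypothetical sketch of the normative calculation the subjects skipped.
# The likelihoods (how probable a Tom W.-like sketch is for a student in
# each group) are invented for illustration; only the base rates come
# from the passage above.
base_rates = {"Computer Science": 0.07, "All other fields": 0.93}
likelihood_of_sketch = {"Computer Science": 0.30, "All other fields": 0.05}

unnormalized = {
    field: base_rates[field] * likelihood_of_sketch[field]
    for field in base_rates
}
total = sum(unnormalized.values())
posterior = {field: value / total for field, value in unnormalized.items()}

# Even if the sketch were six times more likely to describe a computer
# scientist, the posterior is only about 0.31 -- far from the
# near-certainty the subjects expressed.
print(round(posterior["Computer Science"], 2))
```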
The first person to arrive for work on the morning Danny finished his sketch was an Oregon researcher named Robyn Dawes. Dawes was trained in statistics and legendary for the rigor of his mind. Danny handed him the sketch of Tom W. “He read it over and he had a sly smile, as if he had figured it out,” said Danny. “And he said, ‘Computer scientist!’ After that I wasn’t worried about how the Oregon students would fare.”
The Oregon students presented with the problem simply ignored all objective data and went with their gut sense, and predicted with great certainty that Tom W. was a computer scientist. Having established that people would allow a stereotype to warp their judgment, Amos and Danny then wondered: If people are willing to make irrational predictions based on that sort of information, what kind of predictions might they make if we give them totally irrelevant information? As they played with this idea—they might increase people’s confidence in their predictions by giving them any information, however useless—the laughter to be heard from the other side of the closed door must have grown only more raucous. In the end, Danny created another character. This one he named “Dick”:
Dick is a 30 year old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues.
Then they ran another experiment. It was a version of the book bag and poker chips experiment that Amos and Danny had argued about in Danny’s seminar at Hebrew University. They told their subjects that they had picked a person from a pool of 100 people, 70 of whom were engineers and 30 of whom were lawyers. Then they asked them: What is the likelihood that the selected person is a lawyer? The subjects correctly judged it to be 30 percent. And if you told them that you were doing the same thing, but from a pool that had 70 lawyers in it and 30 engineers, they said, correctly, that there was a 70 percent chance the person you’d plucked from it was a lawyer. But if you told them you had picked not just some nameless person but a guy named Dick, and read them Danny’s description of Dick—which contained no information whatsoever to help you guess what Dick did for a living—they guessed there was an equal chance that Dick was a lawyer or an engineer, no matter which pool he had emerged from. “Evidently, people respond differently when given no specific evidence and when given worthless evidence,” wrote Danny and Amos. “When no specific evidence is given, the prior probabilities are properly utilized; when worthless specific evidence is given, prior probabilities are ignored.”*
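The normative logic the subjects abandoned can be spelled out. In Bayesian terms, a description that fits lawyers and engineers equally well has a likelihood ratio of 1 and should leave the base rate untouched. The sketch below is not from the paper; it simply restates that arithmetic.

```python
# Worthless evidence in Bayesian terms: Danny's description of "Dick"
# fits lawyers and engineers equally well, so its likelihood ratio is 1
# and the posterior should simply equal the prior (the base rate).
def posterior_lawyer(prior_lawyer: float, likelihood_ratio: float) -> float:
    """P(lawyer | evidence), where likelihood_ratio is
    P(evidence | lawyer) / P(evidence | engineer)."""
    odds = (prior_lawyer / (1 - prior_lawyer)) * likelihood_ratio
    return odds / (1 + odds)

# Pool of 70 engineers and 30 lawyers, uninformative description:
print(posterior_lawyer(0.30, 1.0))  # 0.30 -- not the 0.50 the subjects gave
# Pool of 70 lawyers and 30 engineers:
print(posterior_lawyer(0.70, 1.0))  # 0.70
```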
There was much more to “On the Psychology of Prediction”—for instance, they showed that the very factors that caused people to become more confident in their predictions also led those predictions to be less accurate. And in the end it returned to the problem that had interested Danny since he had first signed on to help the Israeli army rethink how it selected and trained incoming recruits:
The instructors in a flight school adopted a policy of consistent positive reinforcement recommended by psychologists. They verbally reinforced each successful execution of a flight maneuver. After some experience with this training approach, the instructors claimed that contrary to psychological doctrine, high praise for good execution of complex maneuvers typically results in a decrement of performance on the next try. What should the psychologist say in response?
The subjects to whom they posed this question offered all sorts of advice. They surmised that the instructors’ praise didn’t work because it led the pilots to become overconfident. They suggested that the instructors didn’t know what they were talking about. No one saw what Danny saw: that the pilots would have tended to do better after an especially poor maneuver, or worse after an especially great one, if no one had said anything at all. Man’s inability to see the power of regression to the mean leaves him blind to the nature of the world around him. We are exposed to a lifetime schedule in which we are most often rewarded for punishing others, and punished for rewarding them.
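A small simulation, with invented numbers and not drawn from the book, shows the effect the flight instructors were observing: if each maneuver’s score is a stable skill level plus random noise, performance after an unusually bad trial tends to be better and performance after an unusually good trial tends to be worse, whether or not anyone praises or punishes the pilot.

```python
import random

# Regression to the mean with no feedback at all: each maneuver's score
# is a fixed skill level plus independent random noise (numbers invented
# for illustration).
random.seed(0)
SKILL, NOISE, TRIALS = 70.0, 10.0, 100_000

scores = [SKILL + random.gauss(0, NOISE) for _ in range(TRIALS)]

after_bad = [scores[i + 1] for i in range(TRIALS - 1) if scores[i] < SKILL - NOISE]
after_good = [scores[i + 1] for i in range(TRIALS - 1) if scores[i] > SKILL + NOISE]

# Both averages come out near 70: pilots "improve" after a scolding-worthy
# maneuver and "decline" after a praiseworthy one, with no praise or
# punishment involved.
print(sum(after_bad) / len(after_bad), sum(after_good) / len(after_good))
```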
* * *
When they wrote their first papers, Danny and Amos had no particular audience in mind. Their readers would be the handful of academics who happened to subscribe to the highly specialized psychology trade journals in which they published. By the summer of 1972, they had spent the better part of three years uncovering the ways in which people judged and predicted—but the examples that they had used to illustrate their ideas were all drawn directly from psychology, or from the strange, artificial-seeming tests that they had given high school and college students. Yet they were certain that their insights applied anywhere in the world that people were judging probabilities and making decisions. They sensed that they needed to find a broader audience. “The next phase of the project will be devoted primarily to the extension and application of this work to other high-level professional activities, e.g., economic planning, technological forecasting, political decision making, medical diagnosis, and the evaluation of legal evidence,” they wrote in a research proposal. They hoped, they wrote, that the decisions made by experts in these fields could be “significantly improved by making these experts aware of their own biases, and by the development of methods to reduce and counteract the sources of bias in judgment.” They wanted to turn the real world into a laboratory. It was no longer just students who would be their lab rats but also doctors and judges and politicians. The question was: How to do it?
They couldn’t help but sense, during their year in Eugene, a growing interest in their work. “That was the year it was really clear we were onto something,” recalled Danny. “People started treating us with respect.” Irv Biederman, then a visiting associate professor of psychology at Stanford University, heard Danny give a talk about heuristics and biases on the Stanford campus in early 1972. “I remember I came home from the talk and told my wife, ‘This is going to win a Nobel Prize in economics,’” recalled Biederman. “I was so absolutely convinced. This was a psychological theory about economic man. I thought, What could be better? Here is why you get all these irrationalities and errors. They come from the inner workings of the human mind.”
Biederman had been friends with Amos at the University of Michigan and was now a member of the faculty at the State University of New York at Buffalo. The Amos he knew was consumed by possibly important but probably insolvable and certainly obscure problems about measurement. “I wouldn’t have invited Amos to Buffalo to talk about that,” he said—as no one would have understood it or cared about it. But this new work Amos was apparently doing with Danny Kahneman was breathtaking. It confirmed Biederman’s sense that “most advances in science come not from eureka moments but from ‘hmmm, that’s funny.’” He persuaded Amos to pass through Buffalo in the summer of 1972, on his way from Oregon to Israel. Over the course of a week, Amos gave five different talks about his work with Danny, each aimed at a different group of academics. Each time, the room was jammed—and fifteen years later, in 1987, when Biederman left Buffalo for the University of Minnesota, people were still talking about Amos’s talks.
Amos devoted talks to each of the heuristics he and Danny had discovered, and another to prediction. But the talk that lingered in Biederman’s mind was the fifth and final one. “Historical Interpretation: Judgment Under Uncertainty,” Amos had called it. With a flick of the wrist, he showed a roomful of professional historians just how much of human experience could be reexamined in a fresh, new way, if seen through the lens he had created with Danny.
In the course of our personal and professional lives, we often run into situations that appear puzzling at first blush. We cannot see for the life of us why Mr. X acted in a particular way, we cannot understand how the experimental results came out the way they did, etc. Typically, however, within a very short time we come up with an explanation, a hypothesis, or an interpretation of the facts that renders them understandable, coherent, or natural. The same phenomenon is observed in perception. People are very good at detecting patterns and trends even in random data. In contrast to our skill in inventing scenarios, explanations, and interpretations, our ability to assess their likelihood, or to evaluate them critically, is grossly inadequate. Once we have adopted a particular hypothesis or interpretation, we grossly exaggerate the likelihood of that hypothesis, and find it very difficult to see things any other way.
Amos was polite about it. He did not say, as he often said, “It is amazing how dull history books are, given how much of what’s in them must be invented.” What he did say was perhaps even more shocking to his audience: Like other human beings, historians were prone to the cognitive biases that he and Danny had described. “Historical judgment,” he said, was “part of a broader class of processes involving intuitive interpretation of data.” Historical judgments were subject to bias. As an example, Amos talked about research then being conducted by one of his graduate students at Hebrew University, Baruch Fischhoff. When Richard Nixon announced his surprising intention to visit China and Russia, Fischhoff asked people to assign odds to a list of possible outcomes—say, that Nixon would meet Chairman Mao at least once, that the United States and the Soviet Union would create a joint space program, that a group of Soviet Jews would be arrested for attempting to speak with Nixon, and so on. After the trip, Fischhoff went back and asked the same people to recall the odds they had assigned to each outcome. Their memories of those odds were badly distorted: they all believed they had assigned higher probabilities to the outcomes that actually occurred than they in fact had. That is, once they knew the outcome, they thought it had been far more predictable than they had found it to be before, when they had tried to predict it. A few years after Amos described the work to his Buffalo audience, Fischhoff named the phenomenon “hindsight bias.”†
In his talk to the historians, Amos described their occupational hazard: the tendency to take whatever facts they had observed (neglecting the many facts that they did not or could not observe) and make them fit neatly into a confident-sounding story:
All too often, we find ourselves unable to predict what will happen; yet after the fact we explain what did happen with a great deal of confidence. This “ability” to explain that which we cannot predict, even in the absence of any additional information, represents an important, though subtle, flaw in our reasoning. It leads us to believe that there is a less uncertain world than there actually is, and that we are less bright than we actually might be. For if we can explain tomorrow what we cannot predict today, without any added information except the knowledge of the actual outcome, then this outcome must have been determined in advance and we should have been able to predict it. The fact that we couldn’t is taken as an indication of our limited intelligence rather than of the uncertainty that is in the world. All too often, we feel like kicking ourselves for failing to foresee that which later appears inevitable. For all we know, the handwriting might have been on the wall all along. The question is: was the ink visible?