But measurement alone is not enough to prevent us from misleading ourselves. Commonsense reasoning can also mislead us with respect to more philosophical questions about society—like how we assign blame, or how we attribute success—where measurement may be impossible. In these circumstances too we will not be able to restrain our commonsense intuition from coming up with seemingly self-evident answers. But once again, we can suspect it, and instead look for ways to think about the world that benefit from understanding the limits of common sense.
CHAPTER 9
Fairness and Justice
August 4, 2001, was a Saturday, and Joseph Gray was having a fun day. Gray, a fifteen-year veteran of the New York City Police Department, had finished the late shift that morning, in the 72nd precinct in Brooklyn, and he and a bunch of his colleagues decided to stick around the station house to have a few beers. Shortly before noon, by which stage a few beers had turned into several beers, several of them decided to have lunch at the nearby Wild, Wild West topless bar. Officer Gray, apparently, was particularly pleased with the decision, as he stayed there all afternoon and into the evening, even after the rest of his friends had left. It was puzzling behavior, considering he had to report to work again later that night, but perhaps he was hoping to get there a few hours before his shift started and sleep it off. Regardless, by the time he poured himself into his burgundy Ford Windstar van, he had drunk somewhere between twelve and eighteen beers—enough to put his blood alcohol content at over twice the legal limit.
What happened next isn’t completely clear, but the record indicates that as Officer Gray drove north on Third Avenue, under the Gowanus Expressway overpass, he ran a red light. Definitely not good, but also perhaps not a big deal. On any other Saturday evening he might have sailed right on through and gotten safely to Staten Island, where he planned to pick up one of his drinking partners from earlier in the day before returning to the station. But on this particular night, he was not to be so lucky. Nor were twenty-four-year-old Maria Herrera; her sixteen-year-old sister, Dilcia Peña; and Herrera’s four-year-old son, Andy, who were crossing the avenue at 46th Street at that moment. Officer Gray struck the three of them at full speed, killing them all and dragging the poor boy’s body for nearly half a block under his front fender before coming to a halt. As he emerged from his vehicle, witnesses claimed his eyes were glassy, his voice was slurred, and he kept asking, “Why did they cross?” over and over again. But the nightmare didn’t end there. Maria Herrera was also eight-and-a-half months pregnant. Her unborn baby, Ricardo, was delivered by cesarean section at Lutheran Medical Center, and the doctors there fought to save his life. But they failed. Twelve hours after his mother died, so did baby Ricardo, leaving his father, Victor Herrera, alone in the world.
Almost two years later, Joseph Gray was sentenced in State Supreme Court to the maximum penalty of five to fifteen years in prison on four counts of second-degree manslaughter. Gray pleaded with the judge for mercy, claiming that he’d never done “anything intentional in my entire life to hurt another human being,” and more than one hundred supporters wrote letters to the court attesting to his decency. But Justice Anne Feldman was unsympathetic, pointing out that driving a half-ton van along city streets while intoxicated was “equivalent to waving a loaded gun around a crowded room.” The four thousand members of the Herreras’ community, who signed a petition demanding the maximum sentence, clearly concurred with the judge. Many felt that Gray had gotten off easy. Certainly Victor Herrera did. “Joseph Gray, fifteen years is not enough for you,” he told the courtroom. “You will get out of prison one day. And when you do, you will still be able to see your family. I will have nothing. You killed everything I have.”1
Even reading about these events years after they took place, it’s impossible not to feel the grief and anger of the victims’ family. As Victor Herrera expressed it to one reporter, God had blessed him with the family he’d dreamed of; then one drunk and reckless man had taken it all away from him in an instant. It’s a horrible thought, and Herrera has every right to hate the man who destroyed his life. Nevertheless, as I read about the repercussions—the protests outside the police station, the condemnation of neighbors and politicians, the shock waves through the community, and of course the eventual sentence—I couldn’t help but think about what would have happened had Joseph Gray come along an instant later. Naturally there would have been no accident, and Maria Herrera, her sister, and her son would have gone along their merry way. She would have given birth to Ricardo weeks later, hopefully lived a long and happy life, and would never have thought twice about the van speeding erratically along Third Avenue that summer evening. Joseph Gray would have picked up his fellow officer in Staten Island, who presumably would have insisted on driving back to Brooklyn. Gray might have gotten a reprimand from his supervisor, or he might have gotten away with it altogether. But regardless, he would have gone home to his wife and three children the next day and gotten on with his quiet, unremarkable existence.
ALL’S NOT WELL THAT ENDS WELL
OK, I know what you’re thinking. Even if Gray’s driving drunk did not make the accident inevitable, it did increase the likelihood that something bad would happen, and his punishment was justified in terms of his behavior. But if that’s true, then versions of his crime play out all the time. Every day, police officers—not to mention public officials, parents, and others—get drunk and drive their cars. Some of them are as drunk as Joseph Gray was that night, and some of them drive just as irresponsibly. Most of them don’t get caught, and even the few who do are rarely sent to jail. Few are subject to the punishment and public vilification that befell Joseph Gray, who was labeled a monster and a murderer. So what was it about Joseph Gray’s actions that made him so much worse than all these others? No matter how reprehensible, even criminal, you think his actions that day were, they would have been exactly as bad had he walked out of the bar a minute later, or had the light been green, or had the Herreras been momentarily delayed while walking down the street, or had they seen the car coming and sped up or slowed down. Nevertheless, even if you subscribe to Justice Feldman’s logic that everyone who is driving a van drunk down a city street is a potential killer of mothers and children, it is hard to imagine sentencing every driver who has had a few too many drinks—or these days, anyone texting or talking on a cell phone—to fifteen years in prison, simply on the grounds that they might have killed someone.
That the nature of the outcome should matter is about as commonsense an observation as one can think of. If great harm is caused, great blame is called for—and conversely, if no harm is caused, we are correspondingly inclined to leniency. All’s well that ends well, is it not? Well, maybe, but maybe not. To be clear, I’m not drawing any conclusion about whether Joseph Gray got a fair trial, or whether he deserved to spend the next fifteen years of his life in prison; nor am I insisting that all drunk drivers should be treated like murderers. What I am saying, however, is that in being swayed so heavily by the outcome, our commonsense notions of justice inevitably lead us to a logical conundrum. On the one hand, it seems an outrage not to punish a man who killed four innocent people with the full force of the law. And on the other hand, it seems grossly disproportionate to treat every otherwise decent, honest person who has ever had a few too many drinks and driven home as a criminal and a killer. Yet aside from the trembling hand of fate, there is no difference between these two instances.
Quite possibly this is an inconsistency that we simply have to live with. As sociologists who study institutions have long argued, the formal rules that officially govern behavior in organizations and even societies are rarely enforced in practice, and in fact are probably impossible to enforce both consistently and comprehensively. The real world of human interactions is simply too messy and ambiguous a place ever to be governed by any predefined set of rules and regulations; thus the business of getting on with life is something that is best left to individuals exercising their common sense about what is reasonable and acceptable in a given situation. Most of the time this works fine. Problems get resolved, and people learn from their mistakes, without regulators or courts of law getting involved. But occasionally an infraction is striking or serious enough that the rules have to be invoked, and the offender dealt with officially. Looked at on a case-by-case basis, the invocation of the rules can seem arbitrary and even unfair, for exactly the reasons I have just discussed, and the person who suffers the consequences can legitimately wonder “why me?” Yet the rules nevertheless serve a larger, social purpose of providing a rough global constraint on acceptable behavior. For society to function it isn’t necessary that every case get dealt with consistently, as nice as that would be. It is enough simply to discourage certain kinds of antisocial behavior with the threat of punishment.2
Seen from this sociological perspective, it makes perfect sense that even if some irresponsible people are lucky enough to get away with their actions, society still has to make examples of violators occasionally—if only to keep the rest of us in check—and the threshold for action that has been chosen is that harm is done. But just because sociological sense and common sense happen to converge on the same solution in this particular case does not mean that they are saying the same thing, or that they will always agree. The sociological argument is not claiming that the commonsense emphasis on outcomes over processes is right—just that it’s a tolerable error for the purpose of achieving certain social ends. It’s the same kind of reasoning, in fact, that Oliver Wendell Holmes used to defend freedom of speech—not because he was fighting for the rights of individuals per se, but because he believed that allowing everyone to voice their opinion served the larger interest of creating a vibrant, innovative, and self-regulating society.3 So even if we end up shrugging off the logical conundrum raised by cases like Joseph Gray’s as an acceptable price to pay for a governable society, it doesn’t follow that we should overlook the role of chance in determining outcomes. And yet we do tend to overlook it. Whether we are passing judgment on a crime, weighing up a person’s career, assessing some work of art, analyzing a business strategy, or evaluating some public policy, our evaluation of the process is invariably and often heavily swayed by our knowledge of the outcome, even when that outcome may have been driven largely by chance.
THE HALO EFFECT
This problem is related to what management scientist Phil Rosenzweig calls the Halo Effect. In social psychology, the Halo Effect refers to our tendency to extend our evaluation about one particular feature of another person—say that they’re tall or good-looking—to judgments about other features, like their intelligence or character, that aren’t necessarily related to the first feature at all. Just because someone is good-looking doesn’t mean they’re smart, for example, yet subjects in laboratory experiments consistently evaluate good-looking people as smarter than unattractive people, even when they have no reason to believe anything about either person’s intelligence. Not for no reason, it seems, did John Adams once snipe that George Washington was considered a natural choice of leader by virtue of always being the tallest man in the room.4
Rosenzweig argues that the very same tendency also shows up in the supposedly dispassionate, rational evaluations of corporate strategy, leadership, and execution. Firms that are successful are consistently rated as having visionary strategies, strong leadership, and sound execution, while firms that are performing badly are described as suffering from some combination of misguided strategy, poor leadership, or shoddy execution. But as Rosenzweig shows, firms that exhibit large swings in performance over time attract equally divergent ratings, even when they have pursued exactly the same strategy, executed the same way, under the same leadership all along. Remember that Cisco Systems went from the poster child of the Internet era to a cautionary tale in a matter of a few years. Likewise, for six years before its spectacular implosion in 2001, Enron was billed by Fortune magazine as “America’s most innovative company,” while Steve & Barry’s—a now-defunct low-cost clothing retailer—was heralded by the New York Times as a game-changing business only months before it declared bankruptcy. Rosenzweig’s conclusion is that in all these cases, the way firms are rated has more to do with whether they are perceived as succeeding than what they are actually doing.5
To be fair, Enron’s appearance of success was driven in part by outright deception. If more had been known about what was really going on, it’s possible that outsiders would have been more circumspect. Better information might also have tipped people off to lurking problems at Steve & Barry’s and maybe even at Cisco. But as Rosenzweig shows, better information is not on its own any defense against the Halo Effect. In one early experiment, for example, groups of participants were told to perform a financial analysis of a fictitious firm, after which they were rated on their performance and asked to evaluate how well their team had functioned on a variety of metrics like group cohesion, communication, and motivation. Sure enough, groups that received high performance scores consistently rated themselves as more cohesive, motivated, and so on than groups that received low scores. The only problem with these assessments was that the performance scores were assigned at random by the experimenter—there was no difference in performance between the high and low scorers. Rather than highly functioning teams delivering superior results, in other words, the appearance of superior results drove the illusion of high functionality. And remember, these were not assessments made by external observers who might have lacked inside information—they were by the very members of the teams themselves. The Halo Effect, in other words, turns conventional wisdom about performance on its head. Rather than the evaluation of the outcome being determined by the quality of the process that led to it, it is the observed nature of the outcome that determines how we evaluate the process.6
Negating the Halo Effect is difficult, because if one cannot rely on the outcome to evaluate a process then it is no longer clear what to use. The problem, in fact, is not that there is anything wrong with evaluating processes in terms of outcomes—just that it is unreliable to evaluate them in terms of any single outcome. If we’re lucky enough to get to try out different plans many times each, for example, then by keeping track of all their successes and failures, we can indeed hope to determine their quality directly. But in cases where we only get to try out a plan once, the best way to avoid the Halo Effect is to focus our energies on evaluating and improving what we are doing while we are doing it. Planning techniques like scenario analysis and strategic flexibility, which I discussed earlier, can help organizations expose questionable assumptions and avoid obvious mistakes, while prediction markets and polls can exploit the collective intelligence of their employees to evaluate the quality of plans before their outcome is known. Alternatively, crowdsourcing, field experiments, and bootstrapping—discussed in the last chapter—can help organizations learn what is working and what isn’t and then adjust on the fly. By improving the way we make plans and implement them, all these methods are designed to increase the likelihood of success. But they can’t, and should not, guarantee success. In any one instance, therefore, we need to bear in mind that a good plan can fail while a bad plan can succeed—just by random chance—and therefore try to judge the plan on its own merits as well as on the known outcome.7
TALENT VERSUS LUCK
Even when it comes to measuring individual performance, it’s easy to get tripped up by the Halo Effect—as the current outrage over compensation in the financial industry exemplifies. The source of the outrage, remember, isn’t that bankers got paid lots of money—because we always knew that—but rather that they got paid lots of money for what now seems like disastrously bad performance. Without doubt there is something particularly galling about so-called pay for failure. But really it is just a symptom of a deeper problem with the whole notion of pay for performance—a problem that revolves around the Halo Effect. Consider, for example, all the financial-sector workers who qualified for large bonuses in 2009—the year after the crisis hit—because they made money for their employers. Did they deserve to be paid bonuses? After all, it wasn’t them who screwed up, so why should they be penalized for the foolish actions of other people? As one recipient of the AIG bonuses put it, “I earned that money, and I had nothing to do with all of the bad things that happened at AIG.”8 From a pragmatic perspective, moreover, it’s also entirely possible that if profit-generating employees aren’t compensated accordingly, they will leave for other firms, just as their bosses kept saying. As the same AIG employee pointed out, “They needed us to stay, because we were still making them lots of money, and we had the kind of business we could take to any competitor or, if they wanted, that we could wind down profitably.” This all sounds reasonable, but it could just be the Halo Effect again. Even as the media and the public revile one group of bankers—those who booked “bad” profits in the past—it still seems reasonable that bankers who make “good” profits deserve to be rewarded with bonuses. Yet for all we know, these two groups of bankers may be playing precisely the same game.
Imagine for a second the following thought experiment. Every year you flip a coin: If it comes up heads, you have a “good” year; and if it comes up tails, you have a “bad” year. Let’s assume that your bad years are really bad, meaning that you lose a ton of money for your employer, but that in your good years you earn an equally outsized profit. We’ll also adopt a fairly strict pay-for-performance model in which you get paid nothing in your bad years—no cheating, like guaranteed bonuses or repriced stock options allowed—but you receive a very generous bonus, say $10 million, in your good years. At first glance this arrangement seems fair—because you only get paid when you perform. But a second glance reveals that over the long run, the gains that you make for your employer are essentially canceled out by your losses; yet your compensation averages out at a very handsome $5 million per year. Presumably our friend at AIG doesn’t think that he’s flipping coins, and that my analogy is therefore fundamentally misconceived. He feels that his success is based on his skill, experience, and hard work, not luck, and that his colleagues committed errors of judgment that he has avoided. But of course, that’s precisely what his colleagues were saying a year or two earlier when they were booking those huge profits that turned out to be illusory. So why should we believe him now any more than we should have believed them? More to the point, is there a way to structure pay-for-performance schemes that only reward real performance?
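For readers who want to check the arithmetic of the coin-flip thought experiment, here is a minimal sketch in Python. The $10 million bonus and the fair coin come from the text; the size of the yearly gain or loss (`PROFIT`) is a hypothetical figure chosen only for illustration, as are the function names. It is one simple way to make the point concrete, not a model of any real compensation scheme.

```python
import random

BONUS = 10_000_000    # bonus paid in a "good" year (from the text)
PROFIT = 100_000_000  # hypothetical size of a year's gain or loss (assumption)


def expected_per_year(p_good=0.5):
    """Return (expected employer gain, expected pay) per year.

    The employer gain is measured before any bonus is paid out.
    """
    employer = p_good * PROFIT - (1 - p_good) * PROFIT
    pay = p_good * BONUS
    return employer, pay


def simulate(years, seed=0):
    """Flip a fair coin once per year; return average employer gain and pay."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    employer_total = 0
    pay_total = 0
    for _ in range(years):
        if rng.random() < 0.5:   # heads: a "good" year
            employer_total += PROFIT
            pay_total += BONUS
        else:                    # tails: a "bad" year, no pay at all
            employer_total -= PROFIT
    return employer_total / years, pay_total / years


if __name__ == "__main__":
    print(expected_per_year())
    print(simulate(100_000))
```

With a fair coin the expected gain to the employer is zero while the expected pay is $5 million per year, and a long simulated career drifts toward the same averages. In any single year, though, the record shows either a star or a disaster, which is exactly where the Halo Effect takes hold.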