Super Crunchers
Now something is changing. Business and government professionals are relying more and more on databases to guide their decisions. The story of hedge funds is really the story of a new breed of number crunchers—call them Super Crunchers—who have analyzed large datasets to discover empirical correlations between seemingly unrelated things. Want to hedge a large purchase of euros? Turns out you should sell a carefully balanced portfolio of twenty-six other stocks and commodities that might include Wal-Mart stock.
What is Super Crunching? It is statistical analysis that impacts real-world decisions. Super Crunching predictions usually bring together some combination of size, speed, and scale. The sizes of the datasets are really big—both in the number of observations and in the number of variables. The speed of the analysis is increasing. We often witness the real-time crunching of numbers as the data come hot off the press. And the scale of the impact is sometimes truly huge. This isn’t a bunch of egghead academics cranking out provocative journal articles. Super Crunching is done by or for decision makers who are looking for a better way to do things.
And when I say that Super Crunchers are using large datasets, I mean really large. Increasingly business and government datasets are being measured not in mega- or gigabytes but in tera- and even petabytes (1,000 terabytes). A terabyte is the equivalent of 1,000 gigabytes. The prefix tera comes from the Greek word for monster. A terabyte is truly a monstrously large quantity. The entire Library of Congress is about twenty terabytes of text. Part of the point of this book is that we need to start getting used to this prefix. Wal-Mart’s data warehouse, for example, stores more than 570 terabytes. Google has about four petabytes of storage which it is constantly crunching. Tera mining is not Buck Rogers’s fantasy—it’s being done right now.
In field after field, “intuitivists” and traditional experts are battling Super Crunchers. In medicine, a raging controversy over what is called “evidence-based medicine” boils down to a question of whether treatment choice will be based on statistical analysis or not. The intuitivists are not giving up without a fight. They claim that a database can never capture clinical expertise nurtured over a lifetime of experience, that a regression can never be as good as an emergency room nurse with twenty years of experience who can tell whether a kid looks “hinky.”
We tend to think that the chess grandmaster Garry Kasparov lost to the Deep Blue computer because of IBM’s smarter software. That software is really a gigantic database that ranks the power of different positions. The speed of the computer is important, but in large part it was the computer’s ability to access a database of 700,000 grandmaster chess games that was decisive. Kasparov’s intuitions lost out to data-based decision making.
Super Crunchers are not just invading and displacing traditional experts; they’re changing our lives. They’re not just changing the way that decisions are made; they’re changing the decisions themselves. Baseball scouts are losing out to gearheads not just because it’s a lot cheaper to crunch numbers than to fly scouts out to Palookaville. The scouts are losing because they make poorer predictions. Super Crunchers and experts, of course, don’t always disagree. Number crunching sometimes confirms traditional wisdom. The world isn’t so perverse that the traditional experts were wrong 100 percent of the time or were even no better than chance. Still, number crunching is leading decision makers to make different and, by and large, better choices.
Statistical analysis in field after field is uncovering hidden relationships among widely disparate kinds of information. If you’re a politician and want to know who is most likely to give you a contribution and what form of solicitation is most likely to be successful, you don’t need to guess, follow rules of thumb, or trust grizzled traditionalists. Increasingly, it is possible to tease out measurable effects of separate attributes to tell you what kinds of persuasion are likely to work the best. Trolling through databases can reveal underlying causes that traditional experts never even considered.
Data-based decision making is on the rise all around us:
Rental car companies and insurers are refusing service to people with poor credit scores because data mining tells them that low credit scores correlate with a higher likelihood of having an accident.
Nowadays when a flight is canceled, airlines will skip over their frequent fliers and give the next open seat to the customer whom data mining identifies as the one whose continued business is most at risk. Instead of following a first-come, first-served rule, companies will condition their behavior on literally dozens of consumer-specific factors.
The “No Child Left Behind” Act, which requires schools to adopt teaching methods supported by rigorous data analysis, is causing teachers to spend up to 45 percent of class time training kids to pass standardized tests. Super Crunching is even shifting some teachers toward class lessons where every word is scripted and statistically vetted.
Intuitivists beware. This book will detail a dizzying array of Super Crunching stories and introduce you to the people who are making them happen. The number-crunching revolution isn’t just about baseball or even sports in general. It is about all the rest of our lives as well. Many times this Super Crunching revolution is a boon to consumers as it helps sellers and governments make better predictions about who needs what. At other times, however, consumers are playing against a statistically stacked deck. Number crunching can put the little guy at a real disadvantage, since sellers can better predict how much they can squeeze out of us.
Steven D. Levitt and Stephen J. Dubner showed in Freakonomics dozens of examples of how statistical analysis of databases can reveal the secret levers of causation. Levitt and John Donohue (both my coauthors and friends, about whom you will hear more later) showed that seemingly unrelated events like the abortion rate in 1970 and the crime rate in 1990 have an important connection. Yet Freakonomics didn’t talk much about the extent to which quantitative analysis is impacting real-world decisions. In contrast, this book is about just that—the impact of number crunching. Decision makers inside and outside of business are using statistical analysis in ways you’d never imagine to drive all kinds of choices.
All of industry, worldwide, is being remade around the database capacities of modern computers. The expectation (and fear) of the 1950s and ’60s—in books like Vance Packard’s The Hidden Persuaders—that sophisticated social engineering, at the behest of big government and big corporations, was about to take over the world has been suddenly resurrected for a new generation. But where we once expected big government to solve all human problems by command and control, we now observe something similar arising in the form of massive data networks.
Why Me?
I’m a number cruncher myself. Even though I teach law at Yale, I learned econometrics while I was studying at MIT for a Ph.D. I’ve crunched numbers on everything from bail bonds and kidney transplantation to concealed handguns and reckless sex. You might think that your basic Ivy-tower egghead is completely disconnected from real-world decision making (and yes, I am the kind of absentminded professor who was once so engrossed in writing an article on a train that I went to Poughkeepsie instead of New Haven). Still, even data mining by eggheads can sometimes have an impact on the world.
A few years back Steve Levitt and I teamed up to figure out something very practical—the impact of LoJack on auto theft. LoJack is a small radio transmitter that is hidden in one of many possible locations within a car. When the car is reported stolen the police remotely activate the transmitter, and then specially equipped police cars can track the precise location of the stolen vehicle. LoJack is highly effective as a vehicle-recovery device. LoJack Corporation knew this and proudly advertised a 95 percent recovery rate. But Steve and I wanted to test whether LoJack helped reduce auto theft generally. The problem with lots of anti-theft devices is that they might just shift crime around. If you use “the Club” on your car, it probably doesn’t stop crime, it just causes the thief to walk down the street and steal another car. The cool thing about LoJack is that it’s hidden. In a city covered by LoJack, a thief doesn’t know whether a particular car has it or not.
This is just the kind of perversity that Levitt likes to explore. Freakonomics reviewers really got it right when they said that Steve looks at things differently. Several years ago, I had an extra ticket and invited Steve to come with me to see Michael Jordan play with the Chicago Bulls. Steve figured he’d enjoy the game more if he was invested in it, but (in sharp contrast to me) he didn’t care that much about whether the Bulls won or lost. So just before the game, he hopped online and placed a substantial bet that Chicago would win. Now he really was invested in the game. The online bet changed his incentives.
In an odd way, LoJack is also a device for changing incentives. Before LoJack, many professional thieves were almost untouchable. LoJack changed all that. With LoJack, cops not only recover the vehicle, they often catch the thief. In Los Angeles alone, LoJack has broken up more than 100 chop shops. If you steal 100 cars in a LoJack town, you’re almost certain to steal some that have LoJack in them. We wanted to test whether LoJack scared thieves from taking cars generally. If it does, LoJack creates what economists call a “positive externality.” When you put the Club on your car, you probably are increasing the chance the next guy’s car will be stolen. If enough people put LoJack in their cars, however, Steve and I thought that they might be helping their neighbors by scaring professional car thieves from taking any cars.
Our biggest problem was convincing LoJack to share any of its sales data with us. I remember repeatedly calling and trying to convince them that if Steve and I were right, it would provide another reason for people to buy LoJack. If LoJack reduces the chance that thieves will take other people’s cars, then LoJack might be able to convince insurance companies to give LoJack users more substantial discounts. A junior executive finally did send us tons of helpful data. But to be honest, LoJack just wasn’t that interested in the research at first.
All that changed when they saw the first draft of our paper. After looking at auto theft in fifty-six cities over fourteen years, we found that LoJack had a huge positive benefit for other people. In high-crime areas, a $500 investment in LoJack reduced the car theft losses of non-LoJack users by $5,000. Because we had LoJack sales broken down by both year and city, we could generate a pretty accurate estimate about the proportion of cars with LoJack that were on the road. (For example, in Boston, where the state mandated the largest insurance discount, over 10 percent of the cars had LoJack.) We looked to see what happened to auto theft in the city as a whole as the number of LoJack users increased. Since LoJack service began in different cities in different years, we could estimate the impact of LoJack separate from the general level of crime in that year. In city after city, as the percentage of cars with LoJack increased, the rate of auto theft fell dramatically. Insurance companies weren’t giving nearly big enough discounts for LoJack, because they weren’t taking into account how much LoJack reduced payouts on even unprotected cars.
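The identification idea in this paragraph, comparing cities that got LoJack in different years while netting out each city's baseline and each year's general crime level, can be sketched as a two-way fixed-effects estimate. What follows is a minimal illustration on invented data, not the specification used in the actual study: the three cities, the year shocks, the start years, and the assumed effect size are all hypothetical.

```python
# Synthetic balanced panel: every number here is invented for illustration.
TRUE_EFFECT = -50.0  # assumed change in thefts per one-point rise in LoJack share
N_YEARS = 6
year_shock = [0.0, 10.0, -5.0, 20.0, 15.0, -10.0]  # crime shocks common to all cities

# Hypothetical cities: (baseline theft level, first year of LoJack service or None).
cities = [(900.0, 2), (700.0, 4), (500.0, None)]


def lojack_share(start, year):
    """LoJack share of cars, in percentage points, growing one point per year of service."""
    if start is None or year < start:
        return 0.0
    return float(year - start + 1)


# Rows are cities, columns are years.
share = [[lojack_share(start, t) for t in range(N_YEARS)] for _, start in cities]
theft = [[base + year_shock[t] + TRUE_EFFECT * share[i][t] for t in range(N_YEARS)]
         for i, (base, _) in enumerate(cities)]


def demean_two_way(panel):
    """Remove city means and year means (adding back the grand mean) from a balanced panel."""
    n_cities, n_years = len(panel), len(panel[0])
    grand = sum(map(sum, panel)) / (n_cities * n_years)
    city_mean = [sum(row) / n_years for row in panel]
    year_mean = [sum(row[t] for row in panel) / n_cities for t in range(n_years)]
    return [[panel[i][t] - city_mean[i] - year_mean[t] + grand
             for t in range(n_years)] for i in range(n_cities)]


def fixed_effects_slope(x, y):
    """OLS slope of y on x after sweeping out city and year effects from both."""
    xd, yd = demean_two_way(x), demean_two_way(y)
    num = sum(a * b for rx, ry in zip(xd, yd) for a, b in zip(rx, ry))
    den = sum(a * a for rx in xd for a in rx)
    return num / den


estimate = fixed_effects_slope(share, theft)
print(round(estimate, 6))
```

Because the synthetic data contain no noise, the estimate recovers the assumed effect up to floating-point rounding. The staggered start years matter: if every city had adopted LoJack in the same year, the year effects would absorb all the variation and nothing would be left to estimate.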
Steve and I never bought LoJack stock (because we didn’t want to change our own incentives, to tell the truth) but we knew we were sitting on valuable information. When our working paper went public the stock jumped 2.4 percent. Our study has helped convince other cities to adopt the LoJack technology and has spurred slightly higher insurance discounts (but they’re still not nearly large enough!).
The bottom line here is that I care passionately about number crunching. I have been a cook myself in the data-mining café. Like Ashenfelter, I am the editor of a serious journal, the Journal of Law, Economics, and Organization, where I have to evaluate the quality of statistical papers all the time. I’m well placed to explore the rise of data-based decision-making because I have been both a participant and an observer. I know where the bodies are buried.
Plan of Attack
The next five chapters will detail the rise of Super Crunching across society. The first three chapters will introduce you to two fundamental statistical techniques—regressions and randomized trials—and show how the art of quantitative prediction is reshaping business and government. We’ll explore the debate over “evidence-based” medicine in Chapter 4. And Chapter 5 will look at hundreds of tests evaluating how data-based decision making fares in comparison with experience- and intuition-based decisions.
The second part of the book will step back and assess the significance of this trend. We’ll explore why it’s happening now and whether we should be happy about it. Chapter 7 will look at who’s losing out—in terms of both status and discretion. And finally, Chapter 8 will look to the future. The rise of Super Crunching doesn’t mean the end of intuition or the unimportance of on-the-job experience. Rather, we are likely to see a new era where the best and the brightest are comfortable with both statistics and ideas.
In the end, this book will not try to bury intuition or experiential expertise as norms of decision making, but will show how intuition and experience are evolving to interact with data-based decision making. In fact, there is a new breed of innovative Super Crunchers—people like Steve Levitt—who toggle between their intuitions and number crunching to see farther than either intuitivists or gearheads ever could before.
CHAPTER 1
Who’s Doing Your Thinking for You?
Recommendations make life a lot easier. Want to know what movie to rent? The traditional way was to ask a friend or to see whether reviewers gave it a thumbs-up.
Nowadays people are looking for Internet guidance drawn from the behavior of the masses. Some of these “preference engines” are simple lists of what’s most popular. The New York Times lists the “most emailed articles.” iTunes lists the top downloaded songs. Del.icio.us lists the most popular Internet bookmarks. These simple filters often let surfers zero in on the greatest hits.
Some recommendation software goes a step further and tries to tell you what people like you enjoyed. Amazon.com tells you that people who bought The Da Vinci Code also bought Holy Blood, Holy Grail. Netflix gives you recommendations that are contingent on the movies that you yourself have recommended in the past. This is truly “collaborative filtering,” because your ratings of movies help Netflix make better recommendations to others and their ratings help Netflix make better recommendations to you. The Internet is a perfect vehicle for this service because it’s really cheap for an Internet retailer to keep track of customer behavior and to automatically aggregate, analyze, and display this information for subsequent customers.
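The "people who bought this also bought that" idea can be sketched as simple co-occurrence counting. This is a minimal illustration under stated assumptions, not a description of how Amazon or Netflix actually compute recommendations: the purchase histories are invented, "Book C", "Book D", and "Book E" are placeholder titles, and the function names `build_co_counts` and `also_bought` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories: one set of titles per customer.
baskets = [
    {"The Da Vinci Code", "Holy Blood, Holy Grail"},
    {"The Da Vinci Code", "Holy Blood, Holy Grail", "Book C"},
    {"The Da Vinci Code", "Book C"},
    {"The Da Vinci Code", "Holy Blood, Holy Grail"},
    {"Book D", "Book E"},
]


def build_co_counts(baskets):
    """For every title, count how often each other title appears in the same basket."""
    co_counts = defaultdict(Counter)
    for basket in baskets:
        # permutations yields every ordered pair (a, b) with a != b
        for a, b in permutations(basket, 2):
            co_counts[a][b] += 1
    return co_counts


def also_bought(title, co_counts, k=2):
    """Return up to k titles most often bought alongside `title` (ties broken by name)."""
    ranked = sorted(co_counts[title].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


co_counts = build_co_counts(baskets)
print(also_bought("The Da Vinci Code", co_counts))
```

Every new basket updates the counts, so each customer's behavior changes what the next customer is shown. That feedback is the "collaborative" part, and it is also why a single unusual purchase can skew later suggestions.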
Of course, these algorithms aren’t perfect. A bachelor buying a one-time gift for a baby could, for example, trigger the program into recommending more baby products in the future. Wal-Mart had to apologize when people who searched for Martin Luther King: I Have a Dream were told they might also appreciate a Planet of the Apes DVD collection. Amazon.com similarly offended some customers who searched for “abortion” and were asked “Did you mean adoption?” The adoption question was generated automatically simply because many past customers who searched for abortion had also searched for adoption.
Still, on net, collaborative filters have been a huge boon for both consumers and retailers. At Netflix, nearly two-thirds of the rented films are recommended by the site. And recommended films are rated half a star higher (on Netflix’s five-star ranking system) than films that people rent outside the recommendation system.
While lists of most-emailed articles and best-sellers tend to concentrate usage, the great thing about the more personally tailored recommendations is that they diversify usage. Netflix can recommend different movies to different people. As a result, more than 90 percent of the titles in its 50,000-movie catalog are rented at least monthly. Collaborative filters let sellers access what Chris Anderson calls the “long tail” of the preference distribution. The Netflix recommendations let its customers put themselves in rarefied market niches that used to be hard to find.
The same thing is happening with music. At Pandora.com, users can type in a song or an artist that they like and almost instantaneously the website starts streaming song after song in the same genre. Do you like Cyndi Lauper and Smash Mouth? Voilà, Pandora creates a Lauper/Smash Mouth radio station just for you that plays these artists plus others that sound like them. As each song is playing, you have the option of teaching the software more about what you like by clicking “I really like this song” or “Don’t play this type of song again.”
It’s amazing how well this site works for both me and my kids. It not only plays music that each of us enjoys, but it also finds music that we like by groups we’ve never heard of. For example, because I told Pandora that I like Bruce Springsteen, it created a radio station that started playing the Boss and other well-known artists, but after a few songs it had me grooving to “Now” by Keaton Simons (and because of on-hand quick links, it’s easy to buy the song or album on iTunes or Amazon). This is the long tail in action because there’s no way a nerd like me would have come across this guy on my own. A similar preference system lets Rhapsody.com play more than 90 percent of its catalog of a million songs every month.
MSNBC.com has recently added its own “recommended stories” feature. It uses a cookie to keep track of the sixteen articles you’ve most recently read and uses automated text analysis to predict what new stories you’ll want to read. It’s surprising how accurate a sixteen-story history can be in kickstarting your morning reading. It’s also a bit embarrassing: in my case American Idol articles are automatically recommended.