by Robert Litan
Love it or hate it, Big Data is here to stay. There is a growing literature on the topic that is difficult to keep up with. At this writing, I highly recommend two books cited in the endnote for those interested in the subject.16 An earlier book, Super Crunchers, by economist and lawyer Ian Ayres of Yale Law School, anticipated the growth of Big Data analytics and is also worth reading if for no other reason than he was way out front on this topic before it became as popular as it has become.17
One final observation about all this is worth noting. The Big Data movement came largely out of the business world rather than academia, and thus is the exception to the rule of this chapter (although it is roughly consistent with the course of events described in the topic areas covered by the next two chapters). At this writing, in mid-2014, universities are just beginning to catch up to industry’s need for a whole new generation of data scientists—individuals who have training in multiple fields, primarily statistics and computer science, but also economics and perhaps one or more of the physical or biological sciences. As just one example, Georgetown received a $100 million donation in September 2013 to launch a new public policy school, one of whose primary missions will be data analytics. Carnegie Mellon and the University of California at Berkeley already have made their marks in the field. I expect a growing number of other schools to join them in the years ahead.
Econometrics and Sports: Moneyball
It may not exactly be Big Data, but the data generated by athletic performances is certainly interesting to millions of Americans and the owners of the teams who put them on the field. Perhaps no sport is more measured or attracts more data geeks than professional baseball. The best-known sports geek of them all is Bill James, who helped launch the baseball data revolution from a makeshift office in the back of his house in Lawrence, Kansas. James and his craft were catapulted into fame by Michael Lewis in his book Moneyball (the basis for the movie of the same name).18
Moneyball entails the use of statistics to discover and exploit inefficiencies in the valuation of individual players (baseball players in the first instance) by determining how and to what extent these players contribute to their teams’ performance. The book Moneyball credits the Oakland Athletics and their general manager, Billy Beane, with being the first practitioners of this mode of analysis, but in fact other teams were making use of some of the same techniques, now also widely referred to as sabermetrics, at or near the same time.
The fundamental idea behind sabermetrics is to identify the key variables that most contribute to the performance of both players and teams. Since you are now familiar with the basic premise of regression analysis, you won’t be surprised to learn that the Oakland A’s used various forms of it to evaluate baseball players’ batting statistics (and also their fielding statistics) in college and the minor leagues to discover overlooked or undervalued players to build a relatively inexpensive winning team. Put another way, teams practicing moneyball use baseball data to find undervalued players in much the same way that Warren Buffett and other value investors use financial data to discover undervalued stocks (one of the topics covered in Chapter 8).
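To make the regression idea concrete, here is a minimal sketch in Python. It is not the A’s actual model, which the text does not describe in detail. The numbers are invented for illustration, and the choice of on-base percentage as the single explanatory variable, along with the names fit_line and predicted_runs, are my own assumptions.

```python
# Hypothetical illustration: every number below is invented, not real baseball data.

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (the common 1/n factor cancels in the ratio)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented team-level data: on-base percentage and runs scored per game
obp = [0.300, 0.310, 0.320, 0.330, 0.340, 0.350]
runs_per_game = [3.9, 4.1, 4.4, 4.5, 4.9, 5.0]

slope, intercept = fit_line(obp, runs_per_game)

def predicted_runs(on_base_pct):
    """Runs per game the fitted line associates with a given on-base percentage."""
    return intercept + slope * on_base_pct

print(f"Estimated extra runs per game for each 0.010 of OBP: {slope * 0.010:.2f}")
```

A team could then compare what the fitted line says a statistic is worth with what the market pays for it. A player whose numbers predict more runs than his salary implies is, in this simplified sense, undervalued.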
Today, virtually all baseball teams engage in some form of moneyball, although none to my knowledge use it exclusively. More typical is the way the St. Louis Cardinals employ it—two different teams of experts, traditional scouts who rely on their gut and feel from observing young prospects, and the quants or analysts who go by the numbers, are mixed together to decide who to draft and trade. The stakes are huge, and no one has yet perfected the art of picking all the right people. The Cardinals’ scouting staff, for example, analyzed all of the baseball draft results from all teams between 1990 and 2013 and found that if a club signed nine players from a single year’s draft (in which more than 20 players are taken by each team, counting all rounds) who eventually made it to the major leagues, that would put the team in the 95th percentile of all teams (namely in the top two).19 Comparable data for just the more recent years, when presumably many, if not all, teams are using analytic techniques to help them identify talent, are not available, and one hopes are better. Still, picking young baseball players who are likely to have successful professional careers remains part art and part science, though moneyball techniques are pushing things in the scientific direction.
This is evident from the large and growing interest in sports analytics among fans of all types of sports teams, as well as among academic scholars. For example, if you’re into sports and want to know how a clever economist can come up with really interesting insights into what works and doesn’t, at least statistically, I highly recommend Scorecasting by Tobias Moskowitz (the economist) and L. Jon Wertheim, executive editor of Sports Illustrated.20 For a thorough discussion of the uses and limits of sabermetrics in baseball, where it all started, you can’t do better than The Sabermetrics Revolution: Assessing the Growth of Analytics in Baseball by Benjamin Baumer (a mathematician) and Andrew Zimbalist (one of the leading “sports economists” in the country, baseball in particular).21
The growing academic and real-world research in sports analytics turns out, not surprisingly, to be of more than academic interest. The annual MIT Sloan Sports Analytics Conference, for example, has become the premier forum for discussing the growing importance of the application of analytics to a range of sports. The conference has attracted growing numbers of attendees since its founding in 2006, and representatives from all major sports and all corners of the country come to it every year to discuss the latest trends and developments.
Can all sports be moneyballed? In other words, is it possible to apply analytics to the other major sports—football, basketball, hockey, and soccer—to discover and exploit different inefficiencies? Are individual sports like golf or tennis easier to moneyball? Are there certain sports that simply can’t be moneyballed? And perhaps most important to the vast majority of us who are not professional athletes, can or will a type of moneyball be used to assess our performance in the workplace?
The answers to all these questions seem to be yes, although it will take more time for analytical techniques to penetrate some sports than others. The speed of adoption in various sports will depend on the types of variables and metrics that are unique to each sport, and whether that data can be collected, analyzed, and exploited effectively. Some sports, like football, are more team-oriented and therefore have less individuality than baseball. The insights of moneyball and economics suggest, however, that there could be a lot of low-hanging analytical fruit in sports other than baseball, since they haven’t been explored as extensively yet.
Basketball is one sport outside of baseball where moneyball is starting to make inroads. At the 2011 MIT Sloan Sports Analytics Conference, for example, the backdrop in the main panel room featured a picture of Kobe Bryant taking a fadeaway shot just as Shane Battier was sticking his hand in Bryant’s face. The image illustrated a well-known analytical finding by the front office of the Houston Rockets (Battier’s team at the time): Bryant is a much less efficient scorer (as is likely the case with most other players) when a hand from an opposing player obstructs his view.22
Still, one sign that basketball has a way to go to catch up to baseball in analytical techniques is that professional basketball teams have been reluctant to discuss what measures they look at, citing the information as proprietary. Only when these measures become standardized and are widely adopted by all teams and made public, analogous to on-base percentage and similar publicly available baseball statistics, is moneyball likely to become part of the mainstream in basketball or any other sport.
As for the workplace, most companies already have a variety of ways in which they measure the performance of their employees, both by the quantity and quality of their output. There is an entire human relations sub-industry that has grown up around this subject. As sports analytics become increasingly sophisticated and well accepted across a number of sports, do not be surprised if some of the lessons from the athletic world spill over into the corporate world (and perhaps vice versa).
Bloomberg Sports and William Squadron
You can’t engage in moneyball or any sort of statistical exercise without data. In the pre-digital world, Bill James was the king of baseball statistics. In the digital age, the data king in baseball, and potentially in soccer and other sports, is Bloomberg.
At one level, this should not be surprising because Bloomberg is a financial data company. Sports data are similar in most respects, except that behind every hit or pitch there is also a video, which sets them apart from financial data. Bloomberg Sports, overseen by William Squadron, has brought baseball data to a whole new level: the type, speed, and location of every pitch and its aftermath (taken, swung at, hit, and where) are fully integrated in a single system. The Bloomberg Sports database enables managers and players to know instantaneously the tendencies, or probabilities, of how specific pitchers will fare against specific batters, ideally in specific situations (with runners on base or none at all) and vice versa, all based on their prior histories.
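The tendencies described above are, at bottom, conditional frequencies computed from past encounters. The following Python sketch shows one simple way such a lookup might be built. It is not a description of the Bloomberg system, whose internals are not covered here. The record layout, the player names, and the function matchup_rates are all hypothetical.

```python
from collections import defaultdict

# Invented records for illustration only:
# (pitcher, batter, runners_on_base, outcome)
records = [
    ("Pitcher A", "Batter X", False, "hit"),
    ("Pitcher A", "Batter X", False, "out"),
    ("Pitcher A", "Batter X", False, "out"),
    ("Pitcher A", "Batter X", False, "out"),
    ("Pitcher A", "Batter X", True, "hit"),
    ("Pitcher A", "Batter X", True, "out"),
]

def matchup_rates(records, outcome="hit"):
    """Share of past encounters ending in `outcome`, by pitcher, batter, and situation."""
    counts = defaultdict(lambda: [0, 0])  # key -> [outcome count, total encounters]
    for pitcher, batter, runners_on, result in records:
        key = (pitcher, batter, runners_on)
        counts[key][1] += 1
        if result == outcome:
            counts[key][0] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

rates = matchup_rates(records)
for (pitcher, batter, runners_on), rate in rates.items():
    situation = "runners on" if runners_on else "bases empty"
    print(f"{batter} vs. {pitcher} ({situation}): {rate:.3f}")
```

With only a handful of encounters per matchup, raw frequencies like these are noisy, which is one reason a large and detailed database matters.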
Although he is not an economist, Squadron brings a unique set of skills, as an attorney, and more importantly, as a business innovator in the sports industry over the course of his career. He and Fox Sports colleagues Stan Honey and Jerry Gepner spun a sports technology company out of News Corp (owner of the Fox television network) that is perhaps best known for developing the yellow line that magically appears on TV screens during football games to indicate where the first down marker is, as well as showing the K Zone—the strike zone—for each pitcher-batter combination in televised baseball games.23 Later, Squadron moved to Bloomberg where he is showing how the back office of data is crucial to the further development and refinement of statistical techniques applied to the sports business (just as large databases, or big data, are the foundations for a growing data mining industry).
Regulatory Moneyball
I concede that after all this talk about sports, economics, and statistics, it may be something of a letdown to conclude this chapter by talking about regulation. But Cass Sunstein, one of the nation’s leading legal scholars and a former regulatory official, has cleverly explained that the main task of regulators is or should be the practice of regulatory moneyball.24 Given the huge impact that federal and other regulations have on business and society, discussion of this topic alone would justify the trillion-dollar label in the title of this book, and I hope that fact alone will pique your interest.
Yes, I did say trillion, and that and more may be the aggregate costs and benefits of just the body of federal regulation, even more counting state and local regulation. Admittedly, there remains a debate over the precise price tag, which I do not intend to resolve. My main purpose here is simply to focus on technique—the act of comparing the benefits and costs of rules before implementing them.
You would think such a simple idea—which many economists over many decades have championed—would not be controversial, but it has been one of the most contested notions in the policy arena over the past several decades. In fact, I began my career, after finishing law and graduate schools, as a staff economist at the Council of Economic Advisers (CEA) in 1977, when the political discussion about using cost-benefit analysis (CBA) in regulatory decision making, something which most people informally and routinely do in their everyday lives, was quite intense. The discussion and debate over CBA continues to this day.
Here’s how it all started. The precursor of CBA in the federal government was the inflation impact statement (IIS), which the administration of Gerald Ford required executive branch regulatory agencies to prepare before issuing final rules. The Carter administration, led by the Council of Economic Advisers, reformulated the IIS as something closer to a full cost-benefit analysis. CEA also headed a multi-agency Regulatory Analysis Review Group, which was formed to review the analyses of agencies’ proposed rules.
After President Reagan was elected, he further formalized the regulatory review process by issuing an executive order that placed it in the Office of Information and Regulatory Affairs (OIRA) within the Office of Management and Budget. OIRA exists to this day, and is viewed as an important institutional check on the quality of cost-benefit analyses performed by executive branch regulatory agencies, whether or not their underlying statutes permit the balancing of costs against benefits in issuing rules themselves. Some agencies therefore don’t use CBA to make decisions under some statutes, although the analytical technique tends to find its way into decision making indirectly in many cases.
Although CBA has been controversial through the years—consumer and many environmental groups generally have opposed its use while business has been more friendly—every president since Reagan, both Democratic and Republican, has reaffirmed and refined its implementation. Several contentious issues remain, however. One is the appropriate discount rate to apply to likely benefits and costs in future years (the future values are discounted because a dollar today is more valuable than one received in later years). A second issue relates to the values assigned to avoiding deaths and injuries, in particular whether these values should be adjusted by age (if so, then a strictly economic calculus would assign greater values to avoiding deaths and injuries to younger than to older people).
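A short calculation shows why the discount rate is so contested. In the Python sketch below, the cost and benefit figures are invented, and the 3 percent and 7 percent rates are illustrative assumptions rather than values drawn from any particular rule. The function name present_value is likewise my own.

```python
def present_value(amounts, rate):
    """Discount a stream of yearly amounts; amounts[0] arrives today (year 0)."""
    return sum(amount / (1 + rate) ** year for year, amount in enumerate(amounts))

# Invented rule: costs are paid up front, benefits arrive over the next three years
costs = [140, 0, 0, 0]
benefits = [0, 50, 50, 50]

for rate in (0.03, 0.07):
    net = present_value(benefits, rate) - present_value(costs, rate)
    verdict = "passes" if net > 0 else "fails"
    print(f"Discount rate {rate:.0%}: net benefit {net:.2f}, so the rule {verdict}")
```

In this made-up case the same rule passes a cost-benefit test at the lower rate and fails at the higher one, because a higher rate shrinks benefits that arrive later while leaving up-front costs untouched.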
Does regulatory moneyball have limits? Of course it does. Many benefits of regulatory rules, for example, cannot be monetized or quantified in any objective or scientific way. In addition, there are ethical issues involved in assigning values to lives, discounting them to take account of time value of money, or varying them by age.
In the end, however, regulatory moneyball (or CBA) is an input—a very important one, but not the only one—into regulatory decisions, just as real moneyball (sabermetrics) has become one, albeit not the only, important factor in the sports business.
The Bottom Line
It is hard to know where economics ends and statistics begins because the two fields are so intertwined. This is clearly the case in academia, where empirical economics is essentially applied statistics. In business, statistical analysis is becoming more important, especially in the age of big data. Companies using statisticians to refine their marketing or their production processes may not be aware of the close connection between economics and statistics. Nor may some sports enthusiasts be aware of the growing role of statistical analysis by the teams and players they root for. But one of the defining features of twenty-first century economies will be their reliance on and use of techniques for data analysis. Economists played a major role in this movement at its inception and will continue to help shape it in the future.
At the same time, economics as a separate academic discipline will also be affected and shaped by big data and the growing importance of analytical techniques in academia and the business world. I close the book in Chapter 16 with some thoughts about this topic.
Notes
1. Michael Lewis, Moneyball: The Art of Winning an Unfair Game (New York: W.W. Norton & Company, 2004).
2. New York Times, “Jan Tinbergen, Dutch Economist and Nobel Laureate, Dies at 91,” June 14, 1994.
3. David Henderson, “Jan Tinbergen” in The Concise Encyclopedia of Economics (Indianapolis: Liberty Fund, Inc., 2008).
4. Ibid., “Lawrence Klein.”
5. Lawrence R. Klein—Biographical, www.nobelprize.org.
6. Ibid.
7. Henderson, Concise Encyclopedia, “Lawrence Klein.”
8. “The Prize in Economics 1980,” press release, www.nobelprize.org/nobel_prizes/economic-sciences/laureates/1980/press.html.
9. “Lawrence R. Klein–Biographical,” www.nobelprize.org. See also Glenn Rifkin, “Lawrence R. Klein, Economic Theorist, Dies at 93,” New York Times, October 21, 2013.
10. “The Prize in Economics 1980.”
11. James Surowiecki, The Wisdom of Crowds (Norwell, MA: Anchor Press, 2005).
12. Larissa MacFarquhar, “The Bench Burner,” New Yorker, December 10, 2001, 87.
13. This profile is based largely on a personal interview with Professor Fisher, June 10, 2013.
14. Based on personal communications with Hal Varian, Google’s chief economist.
15. Bob Tita, “In-House Economists Are Hot Again,” Wall Street Journal, February 27, 2014.
16. Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work and Think (Chicago: Eamon Dolan/Houghton Mifflin Harcourt, 2013); and Bill Franks, Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics (Hoboken, NJ: John Wiley & Sons, 2012).
17. Ian Ayres, Super Crunchers: Why Thinking by the Numbers Is the Best Way to Be Smart (New York: Bantam Books, 2007).
18. Lewis, Moneyball.
19. Ben Reiter, “Three Days in June,” Sports Illustrated, October 28, 2013, 34–39.
20. Tobias Moskowitz and L. Jon Wertheim, Scorecasting: The Hidden Influences Behind How Sports Are Played and Games Are Won (New York: Three Rivers Press, 2011).
21. Benjamin Baumer and Andrew Zimbalist, The Sabermetrics Revolution: Assessing the Growth of Analytics in Baseball (Philadelphia: University of Pennsylvania Press, 2014).
22. Marc Tracy, “Which Sport Is Most Immune to Moneyball?” New Republic, March 7, 2013. See also Michael Lewis, “The No-Stats All-Star,” New York Times, February 13, 2009.