by Nate Silver
I have some quarrels with the paper’s methodology. Intrade’s forecasts beat FiveThirtyEight’s only after Wolfers and Rothschild made certain adjustments to them after the fact; otherwise FiveThirtyEight won.13 Perhaps more important, a new forecast at FiveThirtyEight fairly often moved the Intrade price in the same direction, suggesting that the bettors there were piggybacking off it to some extent.
Nevertheless, there is strong empirical and theoretical evidence that there is a benefit in aggregating different forecasts. Across a number of disciplines, from macroeconomic forecasting to political polling, simply taking an average of everyone’s forecast rather than relying on just one has been found to reduce forecast error,14 often by about 15 or 20 percent.
But before you start averaging everything together, you should understand three things. First, while the aggregate forecast will essentially always be better than the typical individual’s forecast, that doesn’t necessarily mean it will be good. For instance, aggregate macroeconomic forecasts are much too crude to predict recessions more than a few months in advance. They are somewhat better than individual economists’ forecasts, however.
Second, the most robust evidence indicates that this wisdom-of-crowds principle holds when forecasts are made independently before being averaged together. In a true betting market (including the stock market), people can and do react to one another’s behavior. Under these conditions, where the crowd begins to behave more dynamically, group behavior becomes more complex.
Third, although the aggregate forecast is better than the typical individual’s forecast, it does not necessarily hold that it is better than the best individual’s forecast. Perhaps there is some polling firm, for instance, whose surveys are so accurate that it is better to use their polls and their polls alone rather than dilute them with numbers from their less-accurate peers.
When this property has been studied over the long run, however, the aggregate forecast has often beaten even the very best individual forecast. A study of the Blue Chip Economic Indicators survey, for instance, found that the aggregate forecast was better over a multiyear period than the forecasts issued by any one of the seventy economists that made up the panel.15 Another study by Wolfers, looking at predictions of NFL football games, found that the consensus forecasts produced by betting markets were better than about 99.5 percent of those from individual handicappers.16 And this is certainly true of political polling; models that treat any one poll as the Holy Grail are more prone to embarrassing failures.17 Reducing error by 15 or 20 percent by combining forecasts may not sound all that impressive, but it’s awfully hard to beat in a competitive market.
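The three caveats above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the method of any study cited here: the true value (0.0), the number of forecasters, and the noise model (each forecast is the truth plus independent Gaussian noise, with some forecasters noisier than others) are all hypothetical.

```python
import random

def simulate(n_forecasters=20, n_events=500, seed=0):
    """Compare the averaged forecast with the individual forecasts.

    Hypothetical setup: every event has true value 0.0, and forecaster i
    reports the truth plus independent Gaussian noise with its own spread.
    """
    if n_forecasters < 2:
        raise ValueError("need at least two forecasters to average")
    rng = random.Random(seed)
    # Assumed skill levels: noise spread runs from 1.0 (best) to 3.0 (worst).
    spreads = [1.0 + 2.0 * i / (n_forecasters - 1) for i in range(n_forecasters)]
    individual_abs_err = [0.0] * n_forecasters
    aggregate_abs_err = 0.0
    for _ in range(n_events):
        forecasts = [rng.gauss(0.0, s) for s in spreads]
        for i, f in enumerate(forecasts):
            individual_abs_err[i] += abs(f)  # error versus the true value 0.0
        aggregate_abs_err += abs(sum(forecasts) / n_forecasters)
    individual_mae = [e / n_events for e in individual_abs_err]
    return {
        "aggregate": aggregate_abs_err / n_events,       # error of the average forecast
        "typical": sum(individual_mae) / n_forecasters,  # average individual error
        "best": min(individual_mae),                     # error of the best individual
    }

result = simulate()
print(result)
```

The aggregate can never do worse than the typical forecaster on this measure, because the absolute value of an average is never larger than the average of the absolute values. Whether it also beats the best forecaster depends on the assumptions: here the errors are independent, which is exactly the condition the second caveat says cannot be taken for granted in a betting market.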
So I told Wolfers and Rothschild that I was ready to accept the principle behind their conclusion, if not all the details. After all, bettors at Intrade can use FiveThirtyEight’s forecasts to make their predictions as well as whatever other information they deem to be relevant (like the forecasts issued by our competitors, some of which are also very good). Of course, the bettors could interpret that information in a biased fashion and get themselves into trouble. But it is not like the FiveThirtyEight forecasts—or anybody else’s—are beyond reproach.
Wolfers seemed disappointed that I was willing to concede so much ground. If I wasn’t sure I could beat Intrade, why not just join them and adopt their predictions as my own?
“I’m surprised by your reaction, actually,” he told me. “If there’s something else that should beat it and does beat it, what’s the point of doing what you’re doing?”
For one thing, I find making the forecasts intellectually interesting—and they help to produce traffic for my blog.
Also, while I accept the theoretical benefits of prediction markets, I don’t know that political betting markets like Intrade are all that good right now—the standard of competition is fairly low. Intrade is becoming more popular, but it is still small potatoes compared with the stock market or Las Vegas. In the weeks leading up to the Super Tuesday primaries in March 2012, for instance, about $1.6 million in shares were traded there;18 by contrast, $8 million is traded in the New York Stock Exchange in a single second. The biggest profit made by any one trader from his Super Tuesday bets was about $9,000, which is not enough to make a living, let alone to get rich. Meanwhile, Intrade is in a legal gray area and most of the people betting on American politics are from Europe or from other countries. There have also been some cases of market manipulation*19 or blatant irrational pricing20 there. And these markets haven’t done very well at aggregating information in instances where there isn’t much information worth aggregating, like in trying to guess the outcome of Supreme Court cases from the nebulous clues the justices provide to the public.
Could FiveThirtyEight and other good political forecasters beat Intrade if it were fully legal in the United States and its trading volumes were an order of magnitude or two higher? I’d think it would be difficult. Can they do so right now? My educated guess21 is that some of us still can, if we select our bets carefully.22
Then again, a lot of smart people have failed miserably when they thought they could beat the market.
The Origin of Efficient-Market Hypothesis
In 1959, a twenty-year-old college student named Eugene Fama, bored with the Tufts University curriculum of romance languages and Voltaire, took a job working for a professor who ran a stock market forecasting service.23 The job was a natural fit for him; Fama was a fierce competitor who had been the first in his family to go to college and who had been a star athlete at Boston’s Malden Catholic High School despite standing at just five feet eight. He combed through data on past stock market returns looking for anything that could provide an investor with an advantage, frequently identifying statistical patterns that suggested the stock market was highly predictable and an investor could make a fortune by exploiting them. The professor almost always responded skeptically, advising Fama to wait and see how the strategies performed in the real world before starting to invest in them. Almost always, Fama’s strategies failed.
Equally frustrated and fascinated by the experience, Fama abandoned his plans to become a high school teacher and instead enrolled at the University of Chicago’s Graduate School of Business, where in 1965 he managed to get his Ph.D. thesis published. The paper had something of the flavor of the baseball statistician Bill James’s pioneering work during the 1980s, leaning on a mix of statistics and sarcasm to claim that much of the conventional wisdom about how stocks behaved was pure baloney. Studying the returns of dozens of mutual funds in a ten-year period from 1950 to 1960, Fama found that funds that performed well in one year were no more likely to beat their competition the next time around.24 Although he had been unable to beat the market, nobody else really could either:
A superior analyst is one whose gains . . . are consistently greater than those of the market. Consistency is the crucial word here, since for any given short period of time . . . some people will do much better than the market and some will do much worse.
Unfortunately, by this criterion, this author does not qualify as a superior analyst. There is some consolation, however . . . . [O]ther more market-tested institutions do not seem to qualify either.25
The paper, although it would later be cited more than 4,000 times,26 at first received about as much attention as most things published by University of Chicago graduate students.27 But it had laid the groundwork for efficient-market hypothesis. The central claim of the theory is that the movement of the stock market is unpredictable to any meaningful extent. Some investors inevitably perform better than others over short periods of time—just as some gamblers inevitably win at roulette on any given evening in Las Vegas. But, Fama claimed, they weren’t able to make good enough predictions to beat the market over the long run.
Past Performance Is Not Indicative of Future Results
Very often, we fail to appreciate the limitations imposed by small sample sizes and mistake luck for skill when we look at how well someone’s predictions have done. The reverse can occasionally also be true, such as in examining the batting averages of baseball players over short time spans: there is skill there, perhaps even quite a bit of it, but it is drowned out by noise.
In the stock market, the data on the performance of individual traders is noisy enough that it’s very hard to tell whether they are any good at all. “Past performance is not indicative of future results” appears in mutual-fund brochures for a reason.
Suppose that in 2007 you wanted to invest in a mutual fund, one that focused mostly on large-capitalization American stocks like those that make up the Dow Jones Industrial Average or the S&P 500. You went to E*Trade, which offered literally hundreds of choices of such funds and all sorts of information about them, like the average return they had achieved over the prior five years. Surely you would have been better off investing in a fund like EVTMX (Eaton Vance Dividend Builder A), which had beaten the market by almost 10 percent annually from 2002 through 2006? Or if you were feeling more daring, JSVAX (Janus Contrarian T), which had invested in some unpopular stocks but had bettered the market by 9 percent annually during this period?
Actually, it wouldn’t have made any difference. When I looked at how these mutual funds performed from 2002 through 2006, and compared it with how they performed over the next five years from 2007 through 2011, there was literally no correlation between them. EVTMX, the best-performing fund from 2002 through 2006, was only average over the next five years. And high-performing JSVAX was 3 percent worse per year than the market average. As Fama found, there was just no consistency in how well a fund did, even over five-year increments. Other studies have identified very modest correlations in mutual fund performance from year to year,28 but it’s so hard to tell them apart (figure 11-3)29 that you’re best off just selecting the one with the cheapest fees—or eschewing them entirely and investing in the market yourself.
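A comparison like the one above comes down to a correlation between two columns of numbers: each fund's performance in the first period and its performance in the second. The function below is a minimal sketch of the standard Pearson correlation; it is an assumption that this is the measure used, and no actual fund data is included here.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two lists around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)
```

Given one list of each fund's excess return in the earlier period and a matching list for the later period, a result near 0 would mean that past performance says nothing about future performance, while a result near 1 would mean that winners keep winning.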
The Misery of the Chartist
Fama reserved his harshest criticism, however, for what he called “chartists”—people who claim to be able to predict the direction of stock prices (as Fama had tried and failed to do) solely on the basis of past statistical patterns, without worrying about whether the company had made a profit or a loss or whether it sold airplanes or hamburgers. (The more polite term for this activity is technical analysis.)
Perhaps we should have some sympathy for the poor chartist: distinguishing the noise from the signal is not always so easy. In figure 11-4, I have presented a series of six stock market charts. Four of them are fakes and were literally generated by telling my computer to flip a coin* (or rather, pick a random series of 1’s and 0’s). The other two are real and depict the actual movement of the Dow Jones Industrial Average over the first 1,000 trading days of the 1970s and 1980s, respectively. Can you tell which is which? It isn’t so easy. (The answer is in the endnotes.30) Investors were looking at stock-price movements like these and were mistaking noise for a signal.
FIGURE 11-4: RANDOM-WALK AND ACTUAL STOCK-MARKET CHARTS
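A fake chart of this kind takes only a few lines to produce. The sketch below is one way to do it, not necessarily how the charts in the figure were made: the starting level, the fixed step size, and the rule that a 1 moves the series up and a 0 moves it down are all assumptions.

```python
import random

def coin_flip_walk(n_days=1000, start=100.0, step=1.0, seed=None):
    """Build a fake price chart from coin flips.

    Each day a random bit is drawn: 1 moves the series up by `step`,
    0 moves it down by `step`. `start` and `step` are arbitrary choices.
    """
    rng = random.Random(seed)
    series = [start]
    for _ in range(n_days):
        bit = rng.randint(0, 1)  # the coin flip
        series.append(series[-1] + (step if bit == 1 else -step))
    return series

fake_chart = coin_flip_walk(seed=42)
```

Plotting `fake_chart` against the day number gives a line that wanders much like a real index, with apparent trends and reversals, even though every move is pure chance.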
Three Forms of Efficient-Market Hypothesis
After looking at enough of this type of data, Fama refined his hypothesis to cover three distinct cases,31 each one making a progressively bolder claim about the predictability of markets.
First, there is the weak form of efficient-market hypothesis. What this claims is that stock-market prices cannot be predicted from analyzing past statistical patterns alone. In other words, the chartist’s techniques are bound to fail.
The semistrong form of efficient-market hypothesis takes things a step further. It argues that fundamental analysis—meaning, actually looking at publicly available information on a company’s financial statements, its business model, macroeconomic conditions and so forth—is also bound to fail and will also not produce returns that consistently beat the market.
Finally, there is the strong form of efficient-market hypothesis, which claims that even private information—insider secrets—will quickly be incorporated into market prices and will not produce above-average returns. This version of efficient-market hypothesis is meant more as the logical extreme of the theory and is not believed literally by most proponents of efficient markets (including Fama.32) There is fairly unambiguous evidence, instead, that insiders make above-average returns. One disturbing example is that members of Congress, who often gain access to inside information about a company while they are lobbied and who also have some ability to influence the fate of companies through legislation, return a profit on their investments that beats market averages by 5 to 10 percent per year,33 a remarkable rate that would make even Bernie Madoff blush.
But the debates over the weak and semistrong forms of the hypothesis have been perhaps the hottest topic in all the social sciences. Almost nine hundred academic papers are published on the efficient-market hypothesis every year,34 and it is now discussed almost as often in financial journals35 as the theory of evolution is discussed in biological ones.36
Efficient-market hypothesis is sometimes mistaken for an excuse for the excesses of Wall Street; whatever else those guys are doing, it seems to assert, at least they’re behaving rationally. A few proponents of the efficient-market hypothesis might interpret it in that way. But as the theory was originally drafted, it really makes just the opposite case: the stock market is fundamentally and profoundly unpredictable. When something is truly unpredictable, nobody from your hairdresser to the investment banker making $2 million per year is able to beat it consistently.
However, as powerful as the theory claims to be, it comes with a few qualifications. The most important is that it pertains to returns on a risk-adjusted basis. Suppose you pursue an investment strategy that entails a 10 percent chance of going broke every year. This is exceptionally foolish—if you followed the strategy over a twenty-year investment horizon, there’s only a 12 percent chance that your money would live to tell about it. But if you are that ballsy, you deserve an excess profit. All versions of the efficient-market hypothesis allow for investors to make an above-average return provided it’s proportionate to the additional risks they are taking on.
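The 12 percent figure follows from compounding the yearly odds of survival. A minimal sketch, assuming that each year's risk of going broke is independent of the others:

```python
def survival_probability(annual_ruin_chance, years):
    """Chance of never going broke, assuming each year is independent."""
    return (1.0 - annual_ruin_chance) ** years

p = survival_probability(0.10, 20)
print(f"{p:.1%}")  # 0.9 multiplied by itself 20 times, roughly 12 percent
```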
Another important qualification is that profits are measured net of the cost of trading. Investors incur transaction costs every time they trade a stock. These costs are fairly small in most circumstances—perhaps 0.25 percent of a trade.37 But they accumulate the more often you trade and can be quite devastating to an overactive trader. This gives efficient-market hypothesis a bit of a buffer zone. Some investment strategies might be a tiny bit profitable in a world where trading was free. But in the real world, a trader needs to earn a large enough profit to cover this additional expense, in somewhat the same way that a winning poker player needs to beat the game by a large enough margin to cover the house’s take.
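How quickly a small cost accumulates can be sketched by compounding it over the number of trades. The function below assumes the cost is a fixed proportion of the whole portfolio on every trade, and the trade counts in the usage lines are hypothetical.

```python
def fraction_left_after_costs(n_trades, cost_per_trade=0.0025):
    """Fraction of capital remaining after paying a proportional cost on every trade."""
    return (1.0 - cost_per_trade) ** n_trades

occasional = fraction_left_after_costs(10)    # a handful of trades
overactive = fraction_left_after_costs(1000)  # trading almost constantly
print(f"{occasional:.3f} {overactive:.3f}")
```

Ten trades cost only a couple of percent of the portfolio; a thousand trades consume more than nine-tenths of it before any investment gains or losses are counted.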
A Statistical Test of Efficient-Market Hypothesis
Opponents of efficient-market hypothesis have two good ways to attempt to disprove it. One is to demonstrate that some investors are consistently beating the stock market. The other is more direct: illustrate predictability in the returns.
One simple way to refute the hypothesis would be to demonstrate that stock-price movements are correlated from one day to the next. If the stock market rises on Tuesday, does that mean it is also more likely to rise on Wednesday? If so, that means that an investor could potentially benefit through a simple strategy of buying stocks each day that the market rises and selling them or shorting them each time that it declines. Depending on how large the investor’s transaction costs were, he might be able to beat the market in this way.
Suppose that we looked at the daily closing price of the Dow Jones Industrial Average in the 10 years between 1966 and 1975—the decade just after Fama had published his thesis. Over this period, the Dow moved in the same direction from day to day—a gain was followed by a gain or a loss by a loss—58 percent of the time. It switched directions just 42 percent of the time. That seems nonrandom and it is: a standard statistical test38 would have claimed that there was only about a 1-in-7 quadrillion possibility (1 chance in 7,000,000,000,000,000) that this resulted from chance alone.
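A test of this kind can be sketched as follows. The specific test behind the figure above is given in the endnote and may differ; this version assumes a normal approximation to a fair coin flip, and it assumes roughly 2,500 day-to-day comparisons (about 250 trading days a year for ten years). Both function names are hypothetical.

```python
import math

def same_direction_share(closes):
    """Share of days on which the move had the same sign as the previous day's move."""
    moves = [b - a for a, b in zip(closes, closes[1:])]
    # Skip flat days, which have no direction.
    pairs = [(m1, m2) for m1, m2 in zip(moves, moves[1:]) if m1 != 0 and m2 != 0]
    if not pairs:
        raise ValueError("need at least two nonzero daily moves in a row")
    same = sum(1 for m1, m2 in pairs if (m1 > 0) == (m2 > 0))
    return same / len(pairs)

def one_sided_p_value(share, n_pairs):
    """Normal-approximation test of share > 0.5 under a fair-coin null."""
    z = (share - 0.5) * math.sqrt(n_pairs) / 0.5
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability
    return z, p

z, p = one_sided_p_value(0.58, 2500)  # 2500 is an assumed sample size
print(z, p)
```

Under these assumptions the 58 percent figure sits about eight standard deviations away from a coin flip, and the chance of seeing that by luck is far below one in a trillion. The exact odds depend on the sample size and on the test used.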
But statistical significance does not always equate to practical significance. An investor could not have profited from this trend.
Suppose that an investor had observed this pattern for ten years—gains tended to be followed by gains and losses by losses. On the morning of January 2, 1976, he decided to invest $10,000 in an index fund39 which tracked the Dow Jones Industrial Average. But he wasn’t going to be a passive investor. Instead he’d pursue what he called a Manic Momentum strategy to exploit the pattern. Every time the stock market declined over the day, he would pull all his money out, avoiding what he anticipated would be another decline the next day. He’d hold his money out of the market until he observed a day that the market rose, and then he would put it all back in. He would pursue this strategy for ten years, until the last trading day of 1985, at which point he would cash out his holdings for good, surely assured of massive profits.
How much money would this investor have at the end of the ten-year period? If you ignore dividends, inflation, and transaction costs, his $10,000 investment in 1976 would have been worth about $25,000 ten years later using the Manic Momentum strategy. By contrast, an investor who had adopted a simple buy-and-hold strategy during the same decade—buy $10,000 in stocks on January 2, 1976, and hold them for ten years, making no changes in the interim—would have only about $18,000 at the end of the period. Manic Momentum seems to have worked! Our investor, using a very basic strategy that exploited a simple statistical relationship in past market prices, substantially beat the market average, seeming to disprove the efficient-market hypothesis in the process.
But there is a catch. We ignored this investor’s transaction costs. This makes an enormous difference. Suppose that the investor had pursued the Manic Momentum strategy as before but that each time he cashed into or out of the market, he paid his broker a commission of 0.25 percent. Since this investor’s strategy requires buying or selling shares hundreds of times during this period, these small costs will nickel-and-dime him to death. If you account for his transaction costs, in fact, the $10,000 investment in the Manic Momentum strategy would have been worth only about $1,100 ten years later, eliminating not only his profit but also almost all the money he put in originally. In this case, there is just a little bit of predictability in stock-market returns—but not nearly enough to make a profit from them, and so efficient-market hypothesis is not violated.
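The strategy is simple enough to sketch as a backtest. This is an illustration under assumptions, not a reconstruction of the calculation above: it assumes the investor starts fully invested, pays no commission on that first purchase or on the final cash-out, and leaves his position unchanged after a flat day. The returns in the usage lines are made up and are not Dow data.

```python
def manic_momentum(daily_returns, start=10_000.0, cost=0.0):
    """Hold the index only on days that follow an up day.

    daily_returns: fractional daily index returns (0.01 means +1 percent).
    cost: proportional commission paid on each move into or out of the market.
    """
    value = start
    invested = True  # assumption: fully invested on day one
    for r in daily_returns:
        if invested:
            value *= 1.0 + r
        # Today's move decides tomorrow's position.
        if r > 0:
            want_invested = True
        elif r < 0:
            want_invested = False
        else:
            want_invested = invested  # flat day: no change (an assumption)
        if want_invested != invested:
            value *= 1.0 - cost  # commission on the switch
            invested = want_invested
    return value

def buy_and_hold(daily_returns, start=10_000.0):
    """Stay fully invested for the whole period."""
    value = start
    for r in daily_returns:
        value *= 1.0 + r
    return value

toy_returns = [0.10, 0.10, -0.10, -0.10, 0.10]  # made-up numbers
free = manic_momentum(toy_returns)
with_costs = manic_momentum(toy_returns, cost=0.0025)
held = buy_and_hold(toy_returns)
print(free, with_costs, held)
```

Run on real daily returns, the same two functions would show the pattern described above: the strategy can look better than buy-and-hold when `cost` is zero, and every switch into or out of the market then shaves a little off the total once the commission is applied.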