Book Read Free

The Black Swan

Page 31

by Nassim Nicholas Taleb


  Chapter Fifteen

  THE BELL CURVE, THAT GREAT INTELLECTUAL FRAUD*

  Not worth a pastis—Quételet’s error—The average man is a monster—Let’s deify it—Yes or no—Not so literary an experiment

  Forget everything you heard in college statistics or probability theory. If you never took such a class, even better. Let us start from the very beginning.

  THE GAUSSIAN AND THE MANDELBROTIAN

  I was transiting through the Frankfurt airport in December 2001, on my way from Oslo to Zurich.

  I had time to kill at the airport and it was a great opportunity for me to buy dark European chocolate, especially since I have managed to successfully convince myself that airport calories don’t count. The cashier handed me, among other things, a ten deutschmark bill, an (illegal) scan of which can be seen on the next page. The deutschmark banknotes were going to be put out of circulation in a matter of days, since Europe was switching to the euro. I kept it as a valedictory. Before the arrival of the euro, Europe had plenty of national currencies, which was good for printers, money changers, and of course currency traders like this (more or less) humble author. As I was eating my dark European chocolate and wistfully looking at the bill, I almost choked. I suddenly noticed, for the first time, that there was something curious about it. The bill bore the portrait of Carl Friedrich Gauss and a picture of his Gaussian bell curve.

  The last ten deutschmark bill, representing Gauss and, to his right, the bell curve of Mediocristan.

  The striking irony here is that the last possible object that can be linked to the German currency is precisely such a curve: the reichsmark (as the currency was previously called) went from four per dollar to four trillion per dollar in the space of a few years during the 1920s, an outcome that tells you that the bell curve is meaningless as a description of the randomness in currency fluctuations. All you need to reject the bell curve is for such a movement to occur once, and only once—just consider the consequences. Yet there was the bell curve, and next to it Herr Professor Doktor Gauss, unprepossessing, a little stern, certainly not someone I’d want to spend time with lounging on a terrace, drinking pastis, and holding a conversation without a subject.

  Shockingly, the bell curve is used as a risk-measurement tool by those regulators and central bankers who wear dark suits and talk in a boring way about currencies.

  The Increase in the Decrease

  The main point of the Gaussian, as I’ve said, is that most observations hover around the mediocre, the average; the odds of a deviation decline faster and faster (exponentially) as you move away from the average. If you must have only one single piece of information, this is the one: the dramatic increase in the speed of decline in the odds as you move away from the center, or the average. Look at the list below for an illustration of this. I am taking an example of a Gaussian quantity, such as height, and simplifying it a bit to make it more illustrative. Assume that the average height (men and women) is 1.67 meters, or 5 feet 7 inches. Consider what I call a unit of deviation here as 10 centimeters. Let us look at increments above 1.67 meters and consider the odds of someone being that tall.*

  10 centimeters taller than the average (i.e., taller than 1.77 m, or 5 feet 10): 1 in 6.3

  20 centimeters taller than the average (i.e., taller than 1.87 m, or 6 feet 2): 1 in 44

  30 centimeters taller than the average (i.e., taller than 1.97 m, or 6 feet 6): 1 in 740

  40 centimeters taller than the average (i.e., taller than 2.07 m, or 6 feet 9): 1 in 32,000

  50 centimeters taller than the average (i.e., taller than 2.17 m, or 7 feet 1): 1 in 3,500,000

  60 centimeters taller than the average (i.e., taller than 2.27 m, or 7 feet 5): 1 in 1,000,000,000

  70 centimeters taller than the average (i.e., taller than 2.37 m, or 7 feet 9): 1 in 780,000,000,000

  80 centimeters taller than the average (i.e., taller than 2.47 m, or 8 feet 1): 1 in 1,600,000,000,000,000

  90 centimeters taller than the average (i.e., taller than 2.57 m, or 8 feet 5): 1 in 8,900,000,000,000,000,000

  100 centimeters taller than the average (i.e., taller than 2.67 m, or 8 feet 9): 1 in 130,000,000,000,000,000,000,000

  … and,

  110 centimeters taller than the average (i.e., taller than 2.77 m, or 9 feet 1): 1 in 36,000,000,000,000,000,000,000,000,000,000,000, 000,000,000,000,000,000,000,000,000,000,000,000,000,000,000, 000,000,000.

  Note that soon after, I believe, 22 deviations, or 220 centimeters taller than the average, the odds reach a googol, which is 1 with 100 zeroes behind it.

  The point of this list is to illustrate the acceleration. Look at the difference in odds between 60 and 70 centimeters taller than average: for a mere increase of four inches, we go from one in 1 billion people to one in 780 billion! As for the jump between 70 and 80 centimeters: an additional 4 inches above the average, we go from one in 780 billion to one in 1.6 million billion!*

  This precipitous decline in the odds of encountering something is what allows you to ignore outliers. Only one curve can deliver this decline, and it is the bell curve (and its nonscalable siblings).

  The Mandelbrotian

  By comparison, look at the odds of being rich in Europe. Assume that wealth there is scalable, i.e., Mandelbrotian. (This is not an accurate description of wealth in Europe; it is simplified to emphasize the logic of scalable distribution.)†

  Scalable Wealth Distribution

  People with a net worth higher than €1 million: 1 in 62.5

  Higher than €2 million: 1 in 250

  Higher than €4 million: 1 in 1,000

  Higher than €8 million: 1 in 4,000

  Higher than €16 million: 1 in 16,000

  Higher than €32 million: 1 in 64,000

  Higher than €320 million: 1 in 6,400,000

  The speed of the decrease here remains constant (or does not decline)! When you double the amount of money you cut the incidence by a factor of four, no matter the level, whether you are at €8 million or €16 million. This, in a nutshell, illustrates the difference between Mediocristan and Extremistan.

  Recall the comparison between the scalable and the nonscalable in Chapter 3. Scalability means that there is no headwind to slow you down.

  Of course, Mandelbrotian Extremistan can take many shapes. Consider wealth in an extremely concentrated version of Extremistan; there, if you double the wealth, you halve the incidence. The result is quantitatively different from the above example, but it obeys the same logic.

  Fractal Wealth Distribution with Large Inequalities

  People with a net worth higher than €1 million: 1 in 63

  Higher than €2 million: 1 in 125

  Higher than €4 million: 1 in 250

  Higher than €8 million: 1 in 500

  Higher than €16 million: 1 in 1,000

  Higher than €32 million: 1 in 2,000

  Higher than €320 million: 1 in 20,000

  Higher than €640 million: 1 in 40,000

  If wealth were Gaussian, we would observe the following divergence away from €1 million.

  Wealth Distribution Assuming a Gaussian Law

  People with a net worth higher than €1 million: 1 in 63

  Higher than €2 million: 1 in 127,000

  Higher than €3 million: 1 in 14,000,000,000

  Higher than €4 million: 1 in 886,000,000,000,000,000

  Higher than €8 million: 1 in 16,000,000,000,000,000,000,000,000,000,000,000

  Higher than €16 million: 1 in … none of my computers is capable of handling the computation.

  What I want to show with these lists is the qualitative difference in the paradigms. As I have said, the second paradigm is scalable; it has no headwind. Note that another term for the scalable is power laws.

  Just knowing that we are in a power-law environment does not tell us much. Why? Because we have to measure the coefficients in real life, which is much harder than with a Gaussian framework. Only the Gaussian yields its properties ra
ther rapidly. The method I propose is a general way of viewing the world rather than a precise solution.

  What to Remember

  Remember this: the Gaussian–bell curve variations face a headwind that makes probabilities drop at a faster and faster rate as you move away from the mean, while “scalables,” or Mandelbrotian variations, do not have such a restriction. That’s pretty much most of what you need to know.*

  Inequality

  Let us look more closely at the nature of inequality. In the Gaussian framework, inequality decreases as the deviations get larger—caused by the increase in the rate of decrease. Not so with the scalable: inequality stays the same throughout. The inequality among the superrich is the same as the inequality among the simply rich—it does not slow down.†

  Consider this effect. Take a random sample of any two people from the U.S. population who jointly earn $1 million per annum. What is the most likely breakdown of their respective incomes? In Mediocristan, the most likely combination is half a million each. In Extremistan, it would be $50,000 and $950,000.

  The situation is even more lopsided with book sales. If I told you that two authors sold a total of a million copies of their books, the most likely combination is 993,000 copies sold for one and 7,000 for the other. This is far more likely than that the books each sold 500,000 copies. For any large total, the breakdown will be more and more asymmetric.

  Why is this so? The height problem provides a comparison. If I told you that the total height of two people is fourteen feet, you would identify the most likely breakdown as seven feet each, not two feet and twelve feet; not even eight feet and six feet! Persons taller than eight feet are so rare that such a combination would be impossible.

  Extremistan and the 80/20 Rule

  Have you ever heard of the 80/20 rule? It is the common signature of a power law—actually it is how it all started, when Vilfredo Pareto made the observation that 80 percent of the land in Italy was owned by 20 percent of the people. Some use the rule to imply that 80 percent of the work is done by 20 percent of the people. Or that 80 percent worth of effort contributes to only 20 percent of results, and vice versa.

  As far as axioms go, this one wasn’t phrased to impress you the most: it could easily be called the 50/01 rule, that is, 50 percent of the work comes from 1 percent of the workers. This formulation makes the world look even more unfair, yet the two formulae are exactly the same. How? Well, if there is inequality, then those who constitute the 20 percent in the 80/20 rule also contribute unequally—only a few of them deliver the lion’s share of the results. This trickles down to about one in a hundred contributing a little more than half the total.

  The 80/20 rule is only metaphorical; it is not a rule, even less a rigid law. In the U.S. book business, the proportions are more like 97/20 (i.e., 97 percent of book sales are made by 20 percent of the authors); it’s even worse if you focus on literary nonfiction (twenty books of close to eight thousand represent half the sales).

  Note here that it is not all uncertainty. In some situations you may have a concentration, of the 80/20 type, with very predictable and tractable properties, which enables clear decision making, because you can identify beforehand where the meaningful 20 percent are. These situations are very easy to control. For instance, Malcolm Gladwell wrote in an article in The New Yorker that most abuse of prisoners is attributable to a very small number of vicious guards. Filter those guards out and your rate of prisoner abuse drops dramatically. (In publishing, on the other hand, you do not know beforehand which book will bring home the bacon. The same with wars, as you do not know beforehand which conflict will kill a portion of the planet’s residents.)

  Grass and Trees

  I’ll summarize here and repeat the arguments previously made throughout the book. Measures of uncertainty that are based on the bell curve simply disregard the possibility, and the impact, of sharp jumps or discontinuities and are, therefore, inapplicable in Extremistan. Using them is like focusing on the grass and missing out on the (gigantic) trees. Although unpredictable large deviations are rare, they cannot be dismissed as outliers because, cumulatively, their impact is so dramatic.

  The traditional Gaussian way of looking at the world begins by focusing on the ordinary, and then deals with exceptions or so-called outliers as ancillaries. But there is a second way, which takes the exceptional as a starting point and treats the ordinary as subordinate.

  I have emphasized that there are two varieties of randomness, qualitatively different, like air and water. One does not care about extremes; the other is severely impacted by them. One does not generate Black Swans; the other does. We cannot use the same techniques to discuss a gas as we would use with a liquid. And if we could, we wouldn’t call the approach “an approximation.” A gas does not “approximate” a liquid.

  We can make good use of the Gaussian approach in variables for which there is a rational reason for the largest not to be too far away from the average. If there is gravity pulling numbers down, or if there are physical limitations preventing very large observations, we end up in Mediocristan. If there are strong forces of equilibrium bringing things back rather rapidly after conditions diverge from equilibrium, then again you can use the Gaussian approach. Otherwise, fuhgedaboudit. This is why much of economics is based on the notion of equilibrium: among other benefits, it allows you to treat economic phenomena as Gaussian.

  Note that I am not telling you that the Mediocristan type of randomness does not allow for some extremes. But it tells you that they are so rare that they do not play a significant role in the total. The effect of such extremes is pitifully small and decreases as your population gets larger.

  To be a little bit more technical here, if you have an assortment of giants and dwarfs, that is, observations several orders of magnitude apart, you could still be in Mediocristan. How? Assume you have a sample of one thousand people, with a large spectrum running from the dwarf to the giant. You are likely to see many giants in your sample, not a rare occasional one. Your average will not be impacted by the occasional additional giant because some of these giants are expected to be part of your sample, and your average is likely to be high. In other words, the largest observation cannot be too far away from the average. The average will always contain both kinds, giants and dwarves, so that neither should be too rare—unless you get a megagiant or a microdwarf on very rare occasion. This would be Mediocristan with a large unit of deviation.

  Note once again the following principle: the rarer the event, the higher the error in our estimation of its probability—even when using the Gaussian.

  Let me show you how the Gaussian bell curve sucks randomness out of life—which is why it is popular. We like it because it allows certainties! How? Through averaging, as I will discuss next.

  How Coffee Drinking Can Be Safe

  Recall from the Mediocristan discussion in Chapter 3 that no single observation will impact your total. This property will be more and more significant as your population increases in size. The averages will become more and more stable, to the point where all samples will look alike.

  I’ve had plenty of cups of coffee in my life (it’s my principal addiction). I have never seen a cup jump two feet from my desk, nor has coffee spilled spontaneously on this manuscript without intervention (even in Russia). Indeed, it will take more than a mild coffee addiction to witness such an event; it would require more lifetimes than is perhaps conceivable—the odds are so small, one in so many zeroes, that it would be impossible for me to write them down in my free time.

  Yet physical reality makes it possible for my coffee cup to jump—very unlikely, but possible. Particles jump around all the time. How come the coffee cup, itself composed of jumping particles, does not? The reason is, simply, that for the cup to jump would require that all of the particles jump in the same direction, and do so in lockstep several times in a row (with a compensating move of the table in the opposite direction). All several trillion particles in my coffee cup
are not going to jump in the same direction; this is not going to happen in the lifetime of this universe. So I can safely put the coffee cup on the edge of my writing table and worry about more serious sources of uncertainty.

  FIGURE 7: How the Law of Large Numbers Works

  In Mediocristan, as your sample size increases, the observed average will present itself with less and less dispersion—as you can see, the distribution will be narrower and narrower. This, in a nutshell, is how everything in statistical theory works (or is supposed to work). Uncertainty in Mediocristan vanishes under averaging. This illustrates the hackneyed “law of large numbers.”

  The safety of my coffee cup illustrates how the randomness of the Gaussian is tamable by averaging. If my cup were one large particle, or acted as one, then its jumping would be a problem. But my cup is the sum of trillions of very small particles.

  Casino operators understand this well, which is why they never (if they do things right) lose money. They simply do not let one gambler make a massive bet, instead preferring to have plenty of gamblers make series of bets of limited size. Gamblers may bet a total of $20 million, but you needn’t worry about the casino’s health: the bets run, say, $20 on average; the casino caps the bets at a maximum that will allow the casino owners to sleep at night. So the variations in the casino’s returns are going to be ridiculously small, no matter the total gambling activity. You will not see anyone leaving the casino with $1 billion—in the lifetime of this universe.

 

‹ Prev