The Signal and the Noise
The debate about predictability began to be carried out on different terms during the Age of Enlightenment and the Industrial Revolution. Isaac Newton’s mechanics had seemed to suggest that the universe was highly orderly and predictable, abiding by relatively simple physical laws. The idea of scientific, technological, and economic progress—which by no means could be taken for granted in the centuries before then—began to emerge, along with the notion that mankind might learn to control its own fate. Predestination was subsumed by a new idea, that of scientific determinism.
The idea takes on various forms, but no one took it further than Pierre-Simon Laplace, a French astronomer and mathematician. In 1814, Laplace made the following postulate, which later came to be known as Laplace’s Demon:
We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.13
Given perfect knowledge of present conditions (“all positions of all items of which nature is composed”), and perfect knowledge of the laws that govern the universe (“all forces that set nature in motion”), we ought to be able to make perfect predictions (“the future just like the past would be present”). The movement of every particle in the universe should be as predictable as that of the balls on a billiard table. Human beings might not be up to the task, Laplace conceded. But if we were smart enough (and if we had fast enough computers) we could predict the weather and everything else—and we would find that nature itself is perfect.
Laplace’s Demon has been controversial for all its two-hundred-year existence. At loggerheads with the determinists are the probabilists, who believe that the conditions of the universe are knowable only with some degree of uncertainty.* Probabilism was, at first, mostly an epistemological paradigm: it avowed that there were limits on man’s ability to come to grips with the universe. More recently, with the discovery of quantum mechanics, scientists and philosophers have asked whether the universe itself behaves probabilistically. The particles Laplace sought to identify begin to behave like waves when you look closely enough—they seem to occupy no fixed position. How can you predict where something is going to go when you don’t know where it is in the first place? You can’t. This is the basis for the theoretical physicist Werner Heisenberg’s famous uncertainty principle.14 Physicists interpret the uncertainty principle in different ways, but it suggests that Laplace’s postulate cannot literally be true. Perfect predictions are impossible if the universe itself is random.
Fortunately, weather does not require quantum mechanics for us to study it. It happens at a molecular (rather than an atomic) level, and molecules are much too large to be discernibly impacted by quantum physics. Moreover, we understand the chemistry and Newtonian physics that govern the weather fairly well, and we have for a long time.
So what about a revised version of Laplace’s Demon? If we knew the position of every molecule in the earth’s atmosphere—a much humbler request than deigning to know the position of every atomic particle in the universe—could we make perfect weather predictions? Or is there a degree of randomness inherent in the weather as well?
The Matrix
Purely statistical predictions about the weather have long been possible. Given that it rained today, what is the probability that it will rain tomorrow? A meteorologist could look up all the past instances of rain in his database and give us an answer about that. Or he could look toward long-term averages: it rains about 35 percent of the time in London in March.15
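A purely statistical forecast of this kind can be sketched in a few lines of Python. The ten-day rain record below is invented for illustration, and the function name `prob_rain_given_rain` is hypothetical. The point is only that the method counts past outcomes and involves no physics at all.

```python
def prob_rain_given_rain(record):
    """Estimate P(rain tomorrow | rain today) from a list of daily booleans."""
    rainy_days = 0   # rainy days that have a following day on record
    rain_next = 0    # of those, the days that were also followed by rain
    for today, tomorrow in zip(record, record[1:]):
        if today:
            rainy_days += 1
            if tomorrow:
                rain_next += 1
    if rainy_days == 0:
        raise ValueError("no rainy days in the record")
    return rain_next / rainy_days

# Made-up record: True means it rained that day.
record = [True, True, False, False, True, False, True, True, True, False]

print(prob_rain_given_rain(record))   # conditional estimate
print(sum(record) / len(record))      # long-term average, ignoring today's weather
```

A real database would hold decades of observations rather than ten days, but the calculation would be the same.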
The problem is that these sorts of predictions aren’t very useful—not precise enough to tell you whether to carry an umbrella, let alone to forecast the path of a hurricane. So meteorologists have been after something else. Instead of a statistical model, they wanted a living and breathing one that simulated the physical processes that govern the weather.
Our ability to compute the weather has long lagged behind our theoretical understanding of it, however. We know which equations to solve and roughly what the right answers are, but we just aren’t fast enough to calculate them for every molecule in the earth’s atmosphere. Instead, we have to make some approximations.
The most intuitive way to do this is to simplify the problem by breaking the atmosphere down into a finite series of pixels—what meteorologists variously refer to as a matrix, a lattice, or a grid. According to Loft, the earliest credible attempt to do this was made in 1916 by Lewis Fry Richardson, a prolific English physicist. Richardson wanted to determine the weather over northern Germany at a particular time: at 1 P.M. on May 20, 1910. This was not, strictly speaking, a prediction, the date being some six years in the past. But Richardson had a lot of data: a series of observations of temperature, barometric pressures and wind speeds that had been gathered by the German government. And he had a lot of time: he was serving in northern France as part of a volunteer ambulance unit and had little to do in between rounds of artillery fire. So Richardson broke Germany down into a series of two-dimensional boxes, each measuring three degrees of latitude (about 210 miles) by three degrees of longitude across. Then he went to work attempting to solve the chemical equations that governed the weather in each square and how they might affect weather in the adjacent ones.
FIGURE 4-1: RICHARDSON’S MATRIX: THE BIRTH OF MODERN WEATHER FORECASTING
Richardson’s experiment, unfortunately, failed miserably16—it “predicted” a dramatic rise in barometric pressure that hadn’t occurred in the real world on the day in question. But he published his results nevertheless. It certainly seemed like the right way to predict the weather—to solve it from first principles, taking advantage of our strong theoretical understanding of how the system behaves, rather than relying on a crude statistical approximation.
The problem was that Richardson’s method required an awful lot of work. Computers were more suitable to the paradigm that he had established. As you’ll see in chapter 9, computers aren’t good at every task we hope they might accomplish and have been far from a panacea for prediction. But computers are very good at computing: at repeating the same arithmetic tasks over and over again and doing so quickly and accurately. Tasks like chess that abide by relatively simple rules, but which are difficult computationally, are right in their wheelhouse. So, potentially, was the weather.
The first computer weather forecast was made in 1950 by the mathematician John von Neumann, who used a machine that could make about 5,000 calculations per second.17 That was a lot faster than Richardson could manage with a pencil and paper in a French hay field. Still, the forecast wasn’t any good, failing to do any better than a more-or-less random guess.
Eventually, by the mid-1960s, computers would start to demonstrate some skill at weather forecasting. And the Bluefire—some 15 billion times faster than the first computer forecast and perhaps a quadrillion times faster than Richardson—displays quite a bit of acumen because of the speed of computation. Weather forecasting is much better today than it was even fifteen or twenty years ago. But, while computing power has improved exponentially in recent decades, progress in the accuracy of weather forecasts has been steady but slow.
There are essentially two reasons for this. One is that the world isn’t one or two dimensional. The most reliable way to improve the accuracy of a weather forecast, getting one step closer to solving for the behavior of each molecule, is to reduce the size of the grid that you use to represent the atmosphere. Richardson’s squares were about two hundred miles by two hundred miles across, providing for at best a highly generalized view of the planet (you could nearly squeeze both New York and Boston, which can have very different weather, into the same two hundred by two hundred square). Suppose you wanted to cut the width of the squares in half, to a resolution of one hundred miles by one hundred miles. That improves the precision of your forecast, but it also increases the number of equations you need to solve. In fact, it would increase this number not twofold but fourfold, since you’re doubling the resolution both lengthwise and widthwise. That means, more or less, that you need four times as much computer power to produce a solution.
But there are more dimensions to worry about than just two. Different patterns can take hold in the upper atmosphere, in the lower atmosphere, in the oceans, and near the earth’s surface. In a three-dimensional universe, a twofold increase in the resolution of our grid will require an eightfold increase in computer power.
And then there is the fourth dimension: time. A meteorological model is no good if it’s static—the idea is to know how the weather is changing from one moment to the next. A thunderstorm moves at about forty miles per hour: if you have a three-dimensional grid that is forty by forty by forty across, you can monitor the storm’s movement by collecting one observation every hour. But if you halve the dimensions of the grid to twenty by twenty by twenty, the storm will now pass through one of the boxes every half hour. That means you need to halve the time parameter as well—again doubling your requirement to sixteen times as much computing power as you had originally.
If this were the only problem it wouldn’t be prohibitive. Although you need, roughly speaking, to get ahold of sixteen times more processing power in order to double the resolution of your weather forecast, processing power has been improving exponentially, doubling about once every two years.18 Sixteen is two to the fourth power, so that means you only need to wait for four doublings, or eight years, for a forecast that should be twice as powerful; this is about the pace, incidentally, at which NCAR has been upgrading its supercomputers.
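The arithmetic of the last few paragraphs can be checked with a short Python sketch. The function names `cost_multiplier` and `years_to_wait` are hypothetical, and the two-year doubling period is simply the rough figure quoted above, not a precise constant.

```python
import math

def cost_multiplier(resolution_factor, spatial_dims=3, include_time=True):
    """Growth in computation when the grid spacing shrinks by resolution_factor."""
    dims = spatial_dims + (1 if include_time else 0)
    return resolution_factor ** dims

def years_to_wait(multiplier, doubling_period_years=2.0):
    """Years until hardware is `multiplier` times faster, assuming steady doubling."""
    return math.log2(multiplier) * doubling_period_years

print(cost_multiplier(2, spatial_dims=2, include_time=False))  # flat grid: 4
print(cost_multiplier(2, spatial_dims=3, include_time=False))  # with altitude: 8
print(cost_multiplier(2))                                      # with time steps: 16
print(years_to_wait(cost_multiplier(2)))                       # 8.0
```

Each added dimension contributes another factor of two, which is why halving the grid spacing costs sixteen times the computing power and not just twice as much.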
Say you’ve solved the laws of fluid dynamics that govern the movement of weather systems. They’re relatively Newtonian: the uncertainty principle—interesting as it might be to physicists—won’t bother you much. You’ve gotten your hands on a state-of-the-art piece of equipment like the Bluefire. You’ve hired Richard Loft to design the computer’s software and to run its simulations. What could possibly go wrong?
How Chaos Theory Is Like Linsanity
What could go wrong? Chaos theory. You may have heard the expression: the flap of a butterfly’s wings in Brazil can set off a tornado in Texas. It comes from the title of a paper19 delivered in 1972 by MIT’s Edward Lorenz, who began his career as a meteorologist. Chaos theory applies to systems in which both of two properties hold:
The systems are dynamic, meaning that the behavior of the system at one point in time influences its behavior in the future;
And they are nonlinear, meaning they abide by exponential rather than additive relationships.
Dynamic systems give forecasters plenty of problems—as I describe in chapter 6, for example, the fact that the American economy is continually evolving in a chain reaction of events is one reason that it is very difficult to predict. So do nonlinear ones: the mortgage-backed securities that triggered the financial crisis were designed in such a way that small changes in macroeconomic conditions could make them exponentially more likely to default.
When you combine these properties, you can have a real mess. Lorenz did not realize just how profound the problems were until, in the tradition of Alexander Fleming and penicillin20 or the New York Knicks and Jeremy Lin, he made a major discovery purely by accident.
Lorenz and his team were working to develop a weather forecasting program on an early computer known as a Royal McBee LGP-30.21 They thought they were getting somewhere until the computer started spitting out erratic results. They began with what they thought was exactly the same data and ran what they thought was exactly the same code—but the program would forecast clear skies over Kansas in one run, and a thunderstorm in the next.
After spending weeks double-checking their hardware and trying to debug their program, Lorenz and his team eventually discovered that their data wasn’t exactly the same: one of their technicians had rounded it to the third decimal place. Instead of having the barometric pressure in one corner of their grid read 29.5168, for example, it might instead read 29.517. Surely this couldn’t make that much difference?
Lorenz realized that it could. The most basic tenet of chaos theory is that a small change in initial conditions—a butterfly flapping its wings in Brazil—can produce a large and unexpected divergence in outcomes—a tornado in Texas. This does not mean that the behavior of the system is random, as the term “chaos” might seem to imply. Nor is chaos theory some modern recitation of Murphy’s Law (“whatever can go wrong will go wrong”). It just means that certain types of systems are very hard to predict.
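This sensitivity can be reproduced with a far simpler system than a weather model. The sketch below uses the logistic map, a standard textbook example of chaos. It is swapped in here purely for illustration and is not Lorenz's program. The starting values 0.5168 and 0.517 merely echo the two pressure readings in the example above.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), returning every value visited."""
    values = [x0]
    for _ in range(steps):
        values.append(r * values[-1] * (1.0 - values[-1]))
    return values

full = logistic_trajectory(0.5168, 50)     # starting value with four decimals
rounded = logistic_trajectory(0.517, 50)   # the same value with three decimals

# Gap between the two runs at each step.
gaps = [abs(a - b) for a, b in zip(full, rounded)]
for step in (0, 5, 10, 15, 20):
    print(step, round(gaps[step], 4))
```

The two runs follow exactly the same rule, with no randomness anywhere in the code. Even so, a starting difference of about two parts in ten thousand grows until the trajectories bear no resemblance to each other.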
The problem begins when there are inaccuracies in our data. (Or inaccuracies in our assumptions, as in the case of mortgage-backed securities.) Imagine that we’re supposed to be taking the sum of 5 and 5, but we keyed in the second number wrong. Instead of adding 5 and 5, we add 5 and 6. That will give us an answer of 11 when what we really want is 10. We’ll be wrong, but not by much: addition, as a linear operation, is pretty forgiving. Exponential operations, however, extract a lot more punishment when there are inaccuracies in our data. If instead of taking 5⁵ (five to the fifth power), which should be 3,125, we take 5⁶ (five to the sixth power), we wind up with an answer of 15,625. That’s way off: our answer is five times what it should be.
This inaccuracy quickly gets worse if the process is dynamic, meaning that our outputs at one stage of the process become our inputs in the next. For instance, say that we’re supposed to take five to the fifth power, and then take whatever result we get and apply it to the fifth power again. If we’d made the error described above, and substituted a 6 for the second 5, our results will now be off by a factor of more than 3,000.22 Our small, seemingly trivial mistake keeps getting larger and larger.
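The arithmetic in the two paragraphs above is easy to verify. Python integers are exact at any size, so the sketch below computes the values directly. The variable names are chosen only for illustration.

```python
# Linear operation: a one-unit typo shifts the answer by one unit.
print(5 + 5, 5 + 6)

# Exponential operation: the same typo in the exponent multiplies the answer.
right = 5 ** 5
wrong = 5 ** 6
print(right, wrong, wrong // right)

# Dynamic process: feed each result back in and raise it to the fifth power again.
right_twice = right ** 5   # equals 5 ** 25
wrong_twice = wrong ** 5   # equals 5 ** 30
print(wrong_twice // right_twice)
```

After one step the mistaken answer is five times the correct one. After the second step the ratio is five to the fifth power, which is where the factor of more than 3,000 comes from.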
The weather is the epitome of a dynamic system, and the equations that govern the movement of atmospheric gases and fluids are nonlinear—mostly differential equations.23 Chaos theory therefore most definitely applies to weather forecasting, making the forecasts highly vulnerable to inaccuracies in our data.
Sometimes these inaccuracies arise as the result of human error. The more fundamental issue is that we can only observe our surroundings with a certain degree of precision. No thermometer is perfect, and if it’s off in even the third or the fourth decimal place, this can have a profound impact on the forecast.
Figure 4-2 shows the output of fifty runs from a European weather model, which was attempting to make a weather forecast for France and Germany on Christmas Eve, 1999. All these simulations are using the same software, and all are making the same assumptions about how the weather behaves. In fact, the models are completely deterministic: they assume that we could forecast the weather perfectly, if only we knew the initial conditions perfectly. But small changes in the input can produce large differences in the output. The European forecast attempted to account for these errors. In one simulation, the barometric pressure in Hanover might be perturbed just slightly. In another, the wind conditions in Stuttgart are perturbed by a fraction of a percent. These small changes might be enough for a huge storm system to develop in Paris in some simulations, while it’s a calm winter evening in others.
FIGURE 4-2: DIVERGENT WEATHER FORECASTS WITH SLIGHTLY DIFFERENT INITIAL CONDITIONS
This is the process by which modern weather forecasts are made. These small changes, introduced intentionally in order to represent the inherent uncertainty in the quality of the observational data, turn the deterministic forecast into a probabilistic one. For instance, if your local weatherman tells you that there’s a 40 percent chance of rain tomorrow, one way to interpret that is that in 40 percent of his simulations, a storm developed, and in the other 60 percent—using just slightly different initial parameters—it did not.
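The idea of turning a deterministic model into a probabilistic forecast can be sketched as follows. Everything here is an assumption made for illustration: `toy_model` is a stand-in (the logistic map again) and not a real weather model, a final value above `threshold` is arbitrarily labeled a storm, and the noise level is invented.

```python
import random

def toy_model(x0, steps=30, r=4.0):
    """Deterministic stand-in for a weather model: iterate the logistic map."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

def ensemble_probability(x0, runs=50, noise=1e-4, threshold=0.6, seed=0):
    """Fraction of perturbed runs whose final state exceeds `threshold`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    storms = 0
    for _ in range(runs):
        perturbed = x0 + rng.uniform(-noise, noise)  # small observational error
        if toy_model(perturbed) > threshold:
            storms += 1
    return storms / runs

print(ensemble_probability(0.5168))
```

Each individual run is fully deterministic. The probability comes entirely from the spread of outcomes across slightly different starting points, which is one way to read a forecast like "a 40 percent chance of rain."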
It is still not quite that simple, however. The programs that meteorologists use to forecast the weather are quite good, but they are not perfect. Instead, the forecasts you actually see reflect a combination of computer and human judgment. Humans can make the computer forecasts better or they can make them worse.
The Vision Thing
The World Weather Building is an ugly, butterscotch-colored, 1970s-era office building in Camp Springs, Maryland, about twenty minutes outside Washington. The building forms the operational headquarters of NOAA—the National Oceanic and Atmospheric Administration—which is the parent of the National Weather Service (NWS) on the government’s organization chart.24 In contrast to NCAR’s facilities in Boulder, which provide for sweeping views of the Front Range of the Rocky Mountains, it reminds one of nothing so much as bureaucracy.
The Weather Service was initially organized under the Department of War by President Ulysses S. Grant, who authorized it in 1870. This was partly because President Grant was convinced that only a culture of military discipline could produce the requisite accuracy in forecasting25 and partly because the whole enterprise was so hopeless that it was only worth bothering with during wartime when you would try almost anything to get an edge.
The public at large became more interested in weather forecasting after the Schoolhouse Blizzard of January 1888. On January 12 that year, initially a relatively warm day in the Great Plains, the temperature dropped almost 30 degrees in a matter of a few hours and a blinding snowstorm came.26 Hundreds of children, leaving school and caught unaware as the blizzard hit, died of hypothermia on their way home. As crude as early weather forecasts were, it was hoped that they might at least be able to provide some warning about an event so severe. So the National Weather Service was moved to the Department of Agriculture and took on a more civilian-facing mission.*