FIGURE 4-6: COMPARISON OF HIGH-TEMPERATURE FORECASTS40
After a little more than a week, Loft told me, chaos theory completely takes over, and the dynamic memory of the atmosphere erases itself. Although the following analogy is somewhat imprecise, it may help to think of the atmosphere as akin to a NASCAR oval, with various weather systems represented by individual cars that are running along the track. For the first couple of dozen laps around the track, knowing the starting order of the cars should allow us to make a pretty good prediction of the order in which they might pass by. Our predictions won’t be perfect—there’ll be crashes, pit stops, and engine failures that we’ve failed to account for—but they will be a lot better than random. Soon, however, the faster cars will start to lap the slower ones, and before long the field will be completely jumbled up. Perhaps the second-placed car is running side by side with the sixteenth-placed one (which is about to get lapped), as well as the one in the twenty-eighth place (which has already been lapped once and is in danger of being lapped again). What we knew of the initial conditions of the race is of almost no value to us. Likewise, once the atmosphere has had enough time to circulate, the weather patterns bear so little resemblance to their starting positions that the models don’t do any good.
Still, Floehr’s finding raises a couple of disturbing questions. It would be one thing if, after seven or eight days, the computer models demonstrated essentially zero skill. But instead, they actually display negative skill: they are worse than what you or I could do sitting around at home and looking up a table of long-term weather averages. How can this be? It is likely because the computer programs, which are hypersensitive to the naturally occurring feedbacks in the weather system, begin to produce feedbacks of their own. It’s not merely that there is no longer a signal amid the noise, but that the noise is being amplified.
The bigger question is why, if these longer-term forecasts aren’t any good, outlets like the Weather Channel (which publishes ten-day forecasts) and AccuWeather (which ups the ante and goes for fifteen) continue to produce them. Dr. Rose took the position that doing so doesn’t really cause any harm; even a forecast based purely on climatology might be of some interest to their consumers.
The statistical reality of accuracy isn’t necessarily the governing paradigm when it comes to commercial weather forecasting. It’s more the perception of accuracy that adds value in the eyes of the consumer.
For instance, the for-profit weather forecasters rarely predict exactly a 50 percent chance of rain, which might seem wishy-washy and indecisive to consumers.41 Instead, they’ll flip a coin and round up to 60, or down to 40, even though this makes the forecasts both less accurate and less honest.42
Floehr also uncovered a more flagrant example of fudging the numbers, something that may be the worst-kept secret in the weather industry. Most commercial weather forecasts are biased, and probably deliberately so. In particular, they are biased toward forecasting more precipitation than will actually occur43—what meteorologists call a “wet bias.” The further you get from the government’s original data, and the more consumer-facing the forecasts, the worse this bias becomes. Forecasts “add value” by subtracting accuracy.
How to Know if Your Forecasts Are All Wet
One of the most important tests of a forecast—I would argue that it is the single most important one44—is called calibration. Out of all the times you said there was a 40 percent chance of rain, how often did rain actually occur? If, over the long run, it really did rain about 40 percent of the time, that means your forecasts were well calibrated. If it wound up raining just 20 percent of the time instead, or 60 percent of the time, they weren’t.
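As a rough illustration of what a calibration check involves, the sketch below (in Python, with invented forecast records used purely for demonstration) groups forecasts by the probability that was issued and compares each stated probability with how often rain actually occurred in that group. A well-calibrated forecaster’s numbers match up; a wet-biased one’s do not.

```python
from collections import defaultdict

# Hypothetical records: (stated probability of rain, whether it rained).
# These values are invented for illustration only.
forecasts = [
    (0.2, False), (0.2, False), (0.2, True), (0.2, False), (0.2, False),
    (0.4, True), (0.4, False), (0.4, False), (0.4, True), (0.4, False),
]

# Group the outcomes by the probability that was issued.
buckets = defaultdict(list)
for stated, rained in forecasts:
    buckets[stated].append(rained)

# For a well-calibrated forecaster, the observed frequency of rain in
# each bucket should roughly equal the stated probability.
for stated in sorted(buckets):
    outcomes = buckets[stated]
    observed = sum(outcomes) / len(outcomes)
    print(f"Said {stated:.0%}: rained {observed:.0%} of the time "
          f"({len(outcomes)} forecasts)")
```

In practice a check like this only becomes meaningful once there are hundreds or thousands of forecasts behind each probability, which is exactly the standard discussed next.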
Calibration is difficult to achieve in many fields. It requires you to think probabilistically, something that most of us (including most “expert” forecasters) are not very good at. It really tends to punish overconfidence—a trait that most forecasters have in spades. It also requires a lot of data to evaluate fully—cases where forecasters have issued hundreds of predictions.*
Meteorologists meet this standard. They’ll forecast the temperatures, and the probability of rain and other precipitation, in hundreds of cities every day. Over the course of a year, they’ll make tens of thousands of forecasts.
This sort of high-frequency forecasting is extremely helpful not just when we want to evaluate a forecast but also to the forecasters themselves—they’ll get lots of feedback on whether they’re doing something wrong and can change course accordingly. Certain computer models, for instance, tend to come out a little wet45—forecasting rain more often than they should. But once you are alert to this bias you can correct for it. Likewise, you will soon learn if your forecasts are overconfident.
The National Weather Service’s forecasts are, it turns out, admirably well calibrated46 (figure 4-7). When they say there is a 20 percent chance of rain, it really does rain 20 percent of the time. They have been making good use of feedback, and their forecasts are honest and accurate.
FIGURE 4-7: NATIONAL WEATHER SERVICE CALIBRATION
The meteorologists at the Weather Channel will fudge a little bit under certain conditions. Historically, for instance, when they say there is a 20 percent chance of rain, it has actually only rained about 5 percent of the time.47 In fact, this is deliberate and is something the Weather Channel is willing to admit to. It has to do with their economic incentives.
People notice one type of mistake—the failure to predict rain—more than another kind, false alarms. If it rains when it isn’t supposed to, they curse the weatherman for ruining their picnic, whereas an unexpectedly sunny day is taken as a serendipitous bonus. It isn’t good science, but as Dr. Rose at the Weather Channel acknowledged to me: “If the forecast was objective, if it has zero bias in precipitation, we’d probably be in trouble.”
Still, the Weather Channel is a relatively buttoned-down organization—many of their customers mistakenly think they are a government agency—and they play it pretty straight most of the time. Their wet bias is limited to slightly exaggerating the probability of rain when it is unlikely to occur—saying there is a 20 percent chance when they know it is really a 5 or 10 percent chance—covering their butts in the case of an unexpected sprinkle. Otherwise, their forecasts are well calibrated (figure 4-8). When they say there is a 70 percent chance of rain, for instance, that number can be taken at face value.
FIGURE 4-8: THE WEATHER CHANNEL CALIBRATION
Where things really go haywire is when weather is presented on the local network news. Here, the bias is very pronounced, with accuracy and honesty paying a major price.
Kansas City ought to be a great market for weather forecasting—it has scorching-hot summers, cold winters, tornadoes, and droughts, and it is large enough to be represented by all the major networks. A man there named J. D. Eggleston began tracking local TV forecasts to help his daughter with a fifth-grade classroom project. Eggleston found the analysis so interesting that he continued it for seven months, posting the results to the Freakonomics blog.48
The TV meteorologists weren’t placing much emphasis on accuracy. Instead, their forecasts were quite a bit worse than those issued by the National Weather Service, which they could have taken for free from the Internet and reported on the air. And they weren’t remotely well calibrated. In Eggleston’s study, when a Kansas City meteorologist said there was a 100 percent chance of rain, it failed to rain about one-third of the time (figure 4-9).
FIGURE 4-9: LOCAL TV METEOROLOGIST CALIBRATION
The weather forecasters did not make any apologies for this. “There’s not an evaluation of accuracy in hiring meteorologists. Presentation takes precedence over accuracy,” one of them told Eggleston. “Accuracy is not a big deal to viewers,” said another. The attitude seems to be that this is all in good fun—who cares if there is a little wet bias, especially if it makes for better television? And since the public doesn’t think our forecasts are any good anyway, why bother with being accurate?
This logic is a little circular. TV weathermen say they aren’t bothering to make accurate forecasts because they figure the public won’t believe them anyway. But the public shouldn’t believe them, because the forecasts aren’t accurate.
This becomes a more serious problem when there is something urgent—something like Hurricane Katrina. Lots of Americans get their weather information from local sources49 rather than directly from the Hurricane Center, so they will still be relying on the goofball on Channel 7 to provide them with accurate information. If there is a mutual distrust between the weather forecaster and the public, the public may not listen when they need to most.
The Cone of Chaos
As Max Mayfield told Congress, he had been preparing for a storm like Katrina to hit New Orleans for most of his sixty-year life.50 Mayfield grew up around severe weather—in Oklahoma, the heart of Tornado Alley—and began his forecasting career in the Air Force, where people took risk very seriously and drew up battle plans to prepare for it. What took him longer to learn was how difficult it would be for the National Hurricane Center to communicate its forecasts to the general public.
“After Hurricane Hugo in 1989,” Mayfield recalled in his Oklahoma drawl, “I was talking to a behavioral scientist from Florida State. He said people don’t respond to hurricane warnings. And I was insulted. Of course they do. But I have learned that he is absolutely right. People don’t respond just to the phrase ‘hurricane warning.’ People respond to what they hear from local officials. You don’t want the forecaster or the TV anchor making decisions on when to open shelters or when to reverse lanes.”
Under Mayfield’s guidance, the National Hurricane Center began to pay much more attention to how it presented its forecasts. In contrast to most government agencies, whose Web sites look as though they haven’t been updated since the days when you got those free AOL CDs in the mail, the Hurricane Center takes great care in the design of its products, producing a series of colorful and attractive charts that convey information intuitively and accurately on everything from wind speed to storm surge.
The Hurricane Center also takes care in how it presents the uncertainty in its forecasts. “Uncertainty is the fundamental component of weather prediction,” Mayfield said. “No forecast is complete without some description of that uncertainty.” Instead of just showing a single track line for a hurricane’s predicted path, for instance, their charts prominently feature a cone of uncertainty—“some people call it a cone of chaos,” Mayfield said. This shows the range of places where the eye of the hurricane is most likely to make landfall.51 Mayfield worries that even this isn’t enough. Significant impacts like flash floods (which are often more deadly than the storm itself) can occur far from the center of the storm and long after peak wind speeds have died down. No people in New York City died from Hurricane Irene in 2011 despite massive media hype surrounding the storm, but three people did from flooding in landlocked Vermont52 once the TV cameras were turned off.
What the Hurricane Center usually does not do is issue policy guidance to local officials, such as whether to evacuate a city. Instead, this function is outsourced to the National Weather Service’s 122 local offices, which communicate with governors and mayors, sheriffs and police chiefs. The official reason for this is that the Hurricane Center figures the local offices will have better working knowledge of the cultures and the people they are dealing with on the ground. The unofficial reason, I came to recognize after speaking with Mayfield, is that the Hurricane Center wants to keep its mission clear. The Hurricane Center and the Hurricane Center alone issues hurricane forecasts, and it needs those forecasts to be as accurate and honest as possible, avoiding any potential distractions.
But that aloof approach just wasn’t going to work in New Orleans. Mayfield needed to pick up the phone.
Evacuation decisions are not easy, in part because evacuations themselves can be deadly; a bus carrying hospital evacuees from another 2005 storm, Hurricane Rita, burst into flames while leaving Houston, killing twenty-three elderly passengers.53 “This is really tough with these local managers,” Mayfield says. “They look at this probabilistic information and they’ve got to translate that into a decision. A go, no-go. A yes-or-no decision. They have to take a probabilistic decision and turn it into something deterministic.”
In this case, however, the need for an evacuation was crystal clear, and the message wasn’t getting through.
“We have a young man at the hurricane center named Matthew Green. Exceptional young man. Has a degree in meteorology. Coordinates warnings with the transit folks. His mother lived in New Orleans. For whatever reason, she was not leaving. Here’s a guy who knows about hurricanes and emergency management and he couldn’t get his own mother to evacuate.”
So the Hurricane Center started calling local officials up and down the Gulf Coast. On Saturday, August 27—after the forecast had taken a turn for the worse but still two days before Katrina hit—Mayfield spoke with Governor Haley Barbour of Mississippi, who ordered a mandatory evacuation for its most vulnerable areas almost immediately,54 and Governor Kathleen Blanco of Louisiana, who had already declared a state of emergency. Blanco told Mayfield that he needed to call Ray Nagin, the mayor of New Orleans, who had been much slower to respond.
Nagin missed Mayfield’s call but phoned him back. “I don’t remember exactly what I said,” Mayfield told me. “We had tons of interviews over those two or three days. But I’m absolutely positive that I told him, You’ve got some tough decisions and some potential for a large loss of life.” Mayfield told Nagin that he needed to issue a mandatory evacuation order, and to do so as soon as possible.
Nagin dallied, issuing a voluntary evacuation order instead. In the Big Easy, that was code for “take it easy”; only a mandatory evacuation order would convey the full force of the threat.55 Most New Orleanians had not been alive when the last catastrophic storm, Hurricane Betsy, had hit the city in 1965. And those who had been, by definition, had survived it. “If I survived Hurricane Betsy, I can survive that one, too. We all ride the hurricanes, you know,” an elderly resident who stayed in the city later told public officials.56 Responses like these were typical. Studies from Katrina and other storms have found that having survived a hurricane makes one less likely to evacuate the next time one comes.57
The reason for Nagin’s delay in issuing the evacuation order is a matter of some dispute—he may have been concerned that hotel owners might sue the city if their business was disrupted.58 Either way, he did not call for a mandatory evacuation until Sunday at 11 A.M.59—and by that point the residents who had not gotten the message yet were thoroughly confused. One study found that about a third of residents who declined to evacuate the city had not heard the evacuation order at all. Another third heard it but said it did not give clear instructions.60 Surveys of disaster victims are not always reliable—it is difficult for people to articulate why they behaved the way they did under significant emotional strain,61 and a small percentage of the population will say they never heard an evacuation order even when it is issued early and often. But in this case, Nagin was responsible for much of the confusion.
There is, of course, plenty of blame to go around for Katrina—certainly to FEMA in addition to Nagin. There is also credit to apportion—most people did evacuate, in part because of the Hurricane Center’s accurate forecast. Had Betsy topped the levees in 1965, before reliable hurricane forecasts were possible, the death toll would probably have been even greater than it was in Katrina.
One lesson from Katrina, however, is that accuracy is the best policy for a forecaster. It is forecasting’s original sin to put politics, personal glory, or economic benefit before the truth of the forecast. Sometimes it is done with good intentions, but it always makes the forecast worse. The Hurricane Center works as hard as it can to avoid letting these things compromise its forecasts. It may not be a coincidence that, in contrast to all the forecasting failures in this book, theirs have become 350 percent more accurate in the past twenty-five years alone.
“The role of a forecaster is to produce the best forecast possible,” Mayfield says. It’s so simple—and yet forecasters in so many fields routinely get it wrong.
5
DESPERATELY SEEKING SIGNAL
Just as the residents of L’Aquila, Italy, were preparing for bed on a chilly Sunday evening in April 2009, they felt a pair of tremors, each barely more perceptible than the rumbling of a distant freight train. The first earthquake, which occurred just before 11 P.M. local time, measured 3.9 on the magnitude scale,* strong enough to rattle nerves and loosen objects but little else. The second was even weaker, a magnitude 3.5; it would not have been powerful enough to awaken a sound sleeper.
But L’Aquila was on edge about earthquakes. The town, which sits in the foothills of the Apennine Mountains and is known for its ski resorts and medieval walls, had been experiencing an unusually large number of them—the two that Sunday were the seventh and eighth of at least magnitude 3 in the span of about a week. Small earthquakes are not uncommon in this part of the world, but the rate is normally much less—about one such earthquake every two or three months. These were coming almost one hundred times as often.
Meanwhile, the citizens of a town a mountain pass away, Sulmona, had just survived an earthquake scare of their own. A technician named Giampaolo Giuliani, who worked at Italy’s National Institute of Nuclear Physics, claimed to have detected unusually high levels of radon in the area. He theorized this might be a precursor to an earthquake and went so far as to tell Sulmona’s mayor that an earthquake would strike the town on the afternoon of March 29. The mayor, impressed by the prediction, ordered vans carrying loudspeakers to drive about town, warning residents of the threat.1