My favorite dataset, Google searches, can give us some clues as to what people find most addictive. According to Google, most addictions remain the ones people have struggled with for many decades—drugs, sex, and alcohol, for example. But the internet is starting to make its presence felt on the list—with “porn” and “Facebook” now among the top ten reported addictions.
TOP ADDICTIONS REPORTED TO GOOGLE, 2016
Drugs
Sex
Porn
Alcohol
Sugar
Love
Gambling
Facebook
A/B testing may be playing a role in making the internet so darn addictive.
Tristan Harris, a “design ethicist,” was quoted in Irresistible explaining why people have such a hard time resisting certain sites on the internet: “There are a thousand people on the other side of the screen whose job it is to break down the self-regulation you have.”
And these people are using A/B testing.
Through testing, Facebook may figure out that making a particular button a particular color gets people to come back to their site more often. So they change the button to that color. Then they may figure out that a particular font gets people to come back to their site more often. So they change the text to that font. Then they may figure out that emailing people at a certain time gets them coming back to their site more often. So they email people at that time.
Pretty soon, Facebook becomes a site optimized to maximize how much time people spend on Facebook. In other words, find enough winners of A/B tests and you have an addictive site. It is the type of feedback that cigarette companies never had.
A/B testing is increasingly a tool of the gaming industry. As Alter discusses, World of Warcraft A/B-tests various versions of its game. One mission might ask you to kill someone. Another might ask you to save something. Game designers can give different samples of players different missions and then see which ones keep more people playing. They might find, for example, that the mission that asked you to save a person got people to return 30 percent more often. If they test many, many missions, they start finding more and more winners. These 30 percent wins add up, until they have a game that keeps many adult men holed up in their parents’ basement.
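For readers who want to see the mechanics, the mission comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any game studio actually runs its tests: the player counts are made up, and the function name `ab_test` is hypothetical. It computes each group's return rate, the relative lift, and a standard two-proportion z-test to gauge whether the gap could be chance.

```python
import math

def ab_test(returned_a, total_a, returned_b, total_b):
    """Compare the return rates of two player groups (hypothetical helper)."""
    rate_a = returned_a / total_a
    rate_b = returned_b / total_b
    # Relative lift: how much more often group B returns than group A.
    lift = (rate_b - rate_a) / rate_a
    # Two-proportion z-test using the pooled return rate.
    pooled = (returned_a + returned_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"rate_a": rate_a, "rate_b": rate_b, "lift": lift,
            "z": z, "p_value": p_value}

# Made-up numbers: 1,000 players saw the "kill" mission (A),
# 1,000 saw the "save" mission (B).
result = ab_test(returned_a=400, total_a=1000, returned_b=520, total_b=1000)
print(f"lift: {result['lift']:.0%}, p-value: {result['p_value']:.2g}")
```

With these invented counts, the "save" mission brings players back 30 percent more often, the kind of win described above.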
If you are a little disturbed by this, I am with you. And we will talk a bit more about the ethical implications of this and other aspects of Big Data near the end of this book. But for better or worse, experimentation is now a crucial tool in the data scientists’ tool kit. And there is another form of experimentation sitting in that tool kit. It has been used to ask a variety of questions, including whether TV ads really work.
NATURE’S CRUEL—BUT ENLIGHTENING—EXPERIMENTS
It’s January 22, 2012, and the New England Patriots are playing the Baltimore Ravens in the AFC Championship game.
There’s a minute left in the game. The Ravens are down, but they’ve got the ball. The next sixty seconds will determine which team will play in the Super Bowl. The next sixty seconds will help seal players’ legacies. And the last minute of this game will do something that, for an economist, is far more profound: the last sixty seconds will help finally tell us, once and for all, Do advertisements work?
The notion that ads improve sales is obviously crucial to our economy. But it is maddeningly hard to prove. In fact, this is a textbook example of exactly how difficult it is to distinguish between correlation and causation.
There’s no doubt that products that advertise the most also have the highest sales. Twentieth Century Fox spent $150 million marketing the movie Avatar, which became the highest-grossing film of all time. But how much of the $2.7 billion in Avatar ticket sales was due to the heavy marketing? Part of the reason 20th Century Fox spent so much money on promotion was presumably that they knew they had a desirable product.
Firms believe they know how effective their ads are. Economists are skeptical they really do. University of Chicago economics professor Steven Levitt, while collaborating with an electronics company, was underwhelmed when the firm tried to convince him they knew how much their ads worked. How, Levitt wondered, could they be so confident?
The company explained that, every year, in the days preceding Father’s Day, they ramp up their TV ad spending. Sure enough, every year, before Father’s Day, they have the highest sales. Uh, maybe that’s just because a lot of kids buy electronics for their dads, particularly for Father’s Day gifts, regardless of advertising.
“They got the causality completely backwards,” says Levitt in a lecture. At least they might have. We don’t know. “It’s a really hard problem,” Levitt adds.
As important as this problem is to solve, firms are reluctant to conduct rigorous experiments. Levitt tried to convince the electronics company to perform a randomized, controlled experiment to precisely learn how effective their TV ads were. Since A/B testing isn’t possible on television yet, this would require seeing what happens without advertising in some areas.
Here’s how the firm responded: “Are you crazy? We can’t not advertise in twenty markets. The CEO would kill us.” That ended Levitt’s collaboration with the company.
Which brings us back to this Patriots-Ravens game. How can the results of a football game help us determine the causal effects of advertising? Well, it can’t tell us the effects of a particular ad campaign from a particular company. But it can give evidence on the average effects of advertisements from many large campaigns.
It turns out, there is a hidden advertising experiment in games like this. Here’s how it works. By the time these championship games are played, companies have purchased, and produced, their Super Bowl advertisements. When businesses decide which ads to run, they don’t know which teams will play in the game.
But the results of the playoffs will have a huge impact on who actually watches the Super Bowl. The two teams that ultimately qualify will bring with them an enormous number of viewers. If New England, which plays near Boston, wins, far more people in Boston will watch the Super Bowl than folks in Baltimore. And vice versa.
To the firms, it is the equivalent of a coin flip to determine whether tens of thousands of extra people in Baltimore or Boston will be exposed to their advertisement, a flip that will happen after their spots are purchased and produced.
Now, back to the field, where Jim Nantz on CBS is announcing the final results of this experiment.
Here comes Billy Cundiff, to tie this game, and, in all likelihood, send it to overtime. The last two years, sixteen of sixteen on field goals. Thirty-two yards to tie it. And the kick. Look out! Look out! It’s no good. . . . And the Patriots take the knee and will now take the journey to Indianapolis. They’re heading to Super Bowl Forty-Six.
Two weeks later, Super Bowl XLVI would score a 60.3 audience share in Boston and a 50.2 share in Baltimore. Sixty thousand more people in Boston would watch the 2012 advertisements.
The next year, the same two teams would meet for the AFC Championship. This time, Baltimore would win. The extra ad exposures for the 2013 Super Bowl advertisements would be seen in Baltimore.
Hal Varian, chief economist at Google; Michael D. Smith, economist at Carnegie Mellon; and I used these two games and all the other Super Bowls from 2004 to 2013 to test whether, and if so how much, Super Bowl ads work. Specifically, we looked at whether a company that advertises a movie during the Super Bowl sees a big jump in ticket sales in the cities that had higher viewership for the game.
They indeed do. People in cities of teams that qualify for the Super Bowl attend movies that were advertised during the Super Bowl at a significantly higher rate than do those in cities of teams that just missed qualifying. More people in those cities saw the ad. More people in those cities decided to go to the film.
One alternative explanation might be that having a team in the Super Bowl makes you more likely to go see movies. However, we tested a group of movies that had similar budgets and were released at similar times but that did not advertise in the Super Bowl. There was no increased attendance in the cities of the Super Bowl teams.
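The logic of this check can be sketched as a comparison of two gaps. The numbers below are invented for illustration and are not the study's data, and the helper `city_gap` is hypothetical. The idea is to measure the gap in ticket sales between qualifying-team cities and near-miss cities for advertised movies, then subtract the same gap for similar movies that did not advertise.

```python
from statistics import mean

def city_gap(sales_qualified, sales_near_miss):
    """Average sales in qualifying-team cities minus near-miss cities."""
    return mean(sales_qualified) - mean(sales_near_miss)

# Made-up opening-weekend tickets per 1,000 residents.
# Movies that advertised during the Super Bowl:
advertised_gap = city_gap([12.0, 11.5, 12.5], [10.0, 10.5, 9.5])
# Similar movies that did NOT advertise (the comparison group):
placebo_gap = city_gap([8.0, 8.2, 7.8], [8.1, 7.9, 8.0])

# If simply having a team in the game drove moviegoing, the placebo gap
# would be positive too. Subtracting it isolates the ad's contribution.
ad_effect = advertised_gap - placebo_gap
print(f"advertised gap: {advertised_gap:.2f}, placebo gap: {placebo_gap:.2f}")
```

In this toy example the placebo gap is essentially zero, so the whole gap for advertised movies is attributed to the extra ad exposure.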
Okay, as you might have guessed, advertisements work. This isn’t too surprising.
But it’s not just that they work. The ads were incredibly effective. In fact, when we first saw the results, we double- and triple- and quadruple-checked them to make sure they were right—because the returns were so large. The average movie in our sample paid about $3 million for a Super Bowl ad slot. They got $8.3 million in increased ticket sales, a 2.8-to-1 return on their investment.
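The return figure is simple division, shown here as a quick check. The two inputs are the averages quoted above. The variable names are mine, and treating "about $3 million" as exactly 3.0 is an assumption.

```python
# Averages quoted in the text, in millions of dollars.
ad_cost_millions = 3.0        # "about $3 million" per ad slot (assumed exact)
extra_sales_millions = 8.3    # increased ticket sales

# Extra ticket sales generated per dollar spent on the ad.
return_ratio = extra_sales_millions / ad_cost_millions
label = f"{return_ratio:.1f}-to-1"
print(label)  # prints "2.8-to-1"
```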
This result was confirmed by two other economists, Wesley R. Hartmann and Daniel Klapper, who independently and earlier came up with a similar idea. These economists studied beer and soft drink ads run during the Super Bowl, while also utilizing the increased ad exposures in the cities of teams that qualify. They found a 2.5-to-1 return on investment. As expensive as these Super Bowl ads are, our results and theirs suggest they are so effective in upping demand that companies are actually dramatically underpaying for them.
And what does all of this mean for our friends back at the electronics company Levitt had worked with? It’s possible that Super Bowl ads are more cost-effective than other forms of advertising. But at the very least our study does suggest that all that Father’s Day advertising is probably a good idea.
One virtue of the Super Bowl experiment is that it wasn’t necessary to intentionally assign anyone to treatment or control groups. It happened based on the lucky bounces in a football game. It happened, in other words, naturally. Why is that an advantage? Because nonnatural, randomized controlled experiments, while super-powerful and easier to do in the digital age, still are not always possible.
Sometimes we can’t get our act together in time. Sometimes, as with that electronics company that didn’t want to run an experiment on its ad campaign, we are too invested in the answer to test it.
Sometimes experiments are impossible. Suppose you are interested in how a country responds to losing a leader. Does it go to war? Does its economy stop functioning? Does nothing much change? Obviously, we can’t just kill a significant number of presidents and prime ministers and see what happens. That would be not only impossible but immoral. Universities have built up, over many decades, institutional review boards (IRBs) that determine if a proposed experiment is ethical.
So if we want to know causal effects in a certain scenario and it is unethical or otherwise unfeasible to do an experiment, what can we do? We can utilize what economists—defining nature broadly enough to include football games—call natural experiments.
For better or worse (okay, clearly worse), there is a huge random component to life. Nobody knows for sure what or who is in charge of the universe. But one thing is clear: whoever is running the show—the laws of quantum mechanics, God, a pimply kid in his underwear simulating the universe on his computer—they, She, or he is not going through IRB approval.
Nature experiments on us all the time. Two people get shot. One bullet stops just short of a vital organ. The other doesn’t. These bad breaks are what make life unfair. But, if it is any consolation, the bad breaks do make life a little easier for economists to study. Economists use the arbitrariness of life to test for causal effects.
Of forty-three American presidents, sixteen have been victims of serious assassination attempts, and four have been killed. The reasons that some lived were essentially random.
Compare John F. Kennedy and Ronald Reagan. Both men had bullets headed directly for their most vulnerable body parts. JFK’s bullet exploded his brain, killing him shortly afterward. Reagan’s bullet stopped centimeters short of his heart, allowing doctors to save his life. Reagan lived, while JFK died, with no rhyme or reason—just luck.
These attempts on leaders’ lives, and the arbitrariness with which the leaders live or die, occur throughout the world. Compare Akhmad Kadyrov, of Chechnya, and Adolf Hitler, of Germany. Both men were inches away from a fully functioning bomb. Kadyrov died. Hitler had changed his schedule, wound up leaving the booby-trapped room a few minutes early to catch a train, and thus survived.
And we can use nature’s cold randomness—killing Kennedy but not Reagan—to see what happens, on average, when a country’s leader is assassinated. Two economists, Benjamin F. Jones and Benjamin A. Olken, did just that. The control group here is any country in the years immediately after a near-miss assassination—for example, the United States in the mid-1980s. The treatment group is any country in the years immediately after a completed assassination—for example, the United States in the mid-1960s.
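A treatment-versus-control comparison of this kind can be sketched with a simple permutation test. To be clear, this is not Jones and Olken's actual method or data. The outcomes below are invented, the function `permutation_test` is a hypothetical helper, and the 0/1 coding ("did the country change course?") is an assumption made for illustration.

```python
import random
from statistics import mean

def permutation_test(treated, control, n_shuffles=2000, seed=0):
    """Difference in means, plus the share of random relabelings
    that produce a gap at least as large (a two-sided p-value)."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_shuffles):
        # Randomly reassign the "assassinated" label across all cases.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_shuffles

# Invented outcomes: 1 = country changed course afterward, 0 = it did not.
after_assassination = [1, 1, 0, 1, 1, 0, 1, 1]   # treatment group
after_near_miss = [0, 1, 0, 0, 1, 0, 0, 0]       # control group

observed, p_value = permutation_test(after_assassination, after_near_miss)
print(f"difference in rates: {observed:.2f}, p-value: {p_value:.3f}")
```

Because luck alone decides who lands in which group, a large gap that random relabeling rarely reproduces can be read as the effect of losing the leader.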
What, then, is the effect of having your leader murdered? Jones and Olken found that successful assassinations dramatically alter world history, taking countries on radically different paths. A new leader causes previously peaceful countries to go to war and previously warring countries to achieve peace. A new leader causes economically booming countries to start busting and economically busting countries to start booming.
In fact, the results of this assassination-based natural experiment overthrew a few decades of conventional wisdom on how countries function. Many economists previously leaned toward the view that leaders largely were impotent figureheads pushed around by external forces. Not so, according to Jones and Olken’s analysis of nature’s experiment.
Many would not consider this examination of assassination attempts on world leaders an example of Big Data. The number of assassinated or almost assassinated leaders in the study was certainly small—as was the number of wars that did or did not result. The economic datasets necessary to characterize the trajectory of an economy were large but for the most part predate digitalization.
Nonetheless, such natural experiments—though now used almost exclusively by economists—are powerful and will take on increasing importance in an era with more, better, and larger datasets. This is a tool that data scientists will not long forgo.
And yes, as should be clear by now, economists are playing a major role in the development of data science. At least I’d like to think so, since that was my training.
Where else can we find natural experiments—in other words, situations where the random course of events places people in treatment and control groups?
The clearest example is a lottery, which is why economists love them—not playing them, which we find irrational, but studying them. If a Ping-Pong ball with a three on it rises to the top, Mr. Jones will be rich. If it’s a ball with a six instead, Mr. Johnson will be.
To test the causal effects of monetary windfalls, economists compare those who win lotteries to those who buy tickets but lose. These studies have generally found that winning the lottery does not make you happy in the short run but does in the long run.*
Economists can also utilize the randomness of lotteries to see how one’s life changes when a neighbor gets rich. The data shows that your neighbor winning the lottery can have an impact on your own life. If your neighbor wins the lottery, for example, you are more likely to buy an expensive car, such as a BMW. Why? Almost certainly, economists maintain, the cause is jealousy after your richer neighbor purchased his own expensive car. Chalk it up to human nature. If Mr. Johnson sees Mr. Jones driving a brand-new BMW, Mr. Johnson wants one, too.
Unfortunately, Mr. Johnson often can’t afford this BMW, which is why economists found that neighbors of lottery winners are significantly more likely to go bankrupt. Keeping up with the Joneses, in this instance, is impossible.
But natural experiments don’t have to be explicitly random, like lotteries. Once you start looking for randomness, you see it everywhere—and can use it to understand how our world works.
Doctors are part of a natural experiment. Every once in a while, the government, for essentially arbitrary reasons, changes the formula it uses to reimburse physicians for Medicare patients. Doctors in some counties see their fees for certain procedures rise. Doctors in other counties see their fees drop.
Two economists, Jeffrey Clemens and Joshua Gottlieb (a former classmate of mine), tested the effects of this arbitrary change. Do doctors always give patients the same care, the care they deem most necessary? Or are they driven by financial incentives?
The data clearly shows that doctors can be motivated by monetary incentives. In counties with higher reimbursements, some doctors order substantially more of the better-reimbursed procedures—more cataract surgeries, colonoscopies, and MRIs, for example.
And then, the big question: do their patients fare better after getting all this extra care? Clemens and Gottlieb reported only “small health impacts.” The authors found no statistically significant impact on mortality. Give stronger financial incentives to doctors to order certain procedures, this natural experiment suggests, and some will order more procedures that don’t make much difference for patients’ health and don’t seem to prolong their lives.
Natural experiments can help answer life-or-death questions. They can also help with questions that, to some young people, feel like life-or-death.
Stuyvesant High School (known as “Stuy”) is housed in a ten-floor, $150 million tan, brick building overlooking the Hudson River, a few blocks from the World Trade Center, in lower Manhattan. Stuy is, in a word, impressive. It offers fifty-five Advanced Placement (AP) classes, seven languages, and electives in Jewish history, science fiction, and Asian-American literature. Roughly one-quarter of its graduates are accepted to an Ivy League or similarly prestigious college. Stuyvesant trained Harvard physics professor Lisa Randall, Obama strategist David Axelrod, Academy Award–winning actor Tim Robbins, and novelist Gary Shteyngart. Its commencement speakers have included Bill Clinton, Kofi Annan, and Conan O’Brien.
The only thing more remarkable than Stuyvesant’s offerings and graduates is its cost: zero dollars. It is a public high school and probably the country’s best. Indeed, a recent study used 27 million reviews by 300,000 students and parents to rank every public high school in the United States. Stuy ranked number one. It is no wonder, then, that ambitious, middle-class New York parents and their equally ambitious progeny can become obsessed with Stuy’s brand.
Everybody Lies Page 18