In total, we tracked more than 74 million of these diffusion chains initiated by more than 1.6 million users, over a two-month interval in late 2009. For each event, we counted how many times the URL in question was retweeted—first by the original “seed” user’s immediate followers, then by their followers, and their followers’ followers, and so on—thereby tracing out the full “cascade” of retweets triggered by each original tweet. As the figure on this page shows, some of these cascades were broad and shallow, while others were narrow and deep. Others still were very large, with complex structure, starting out small and trickling along before gaining momentum somewhere else in the network. Most of all, however, we found that the vast majority of attempted cascades—roughly 98 percent of the total—didn’t actually spread at all.
[Figure: Cascades on Twitter]
This result is important because, as I’ll discuss in more detail in the next chapter, if you want to understand why some things “go viral”—those occasional YouTube videos that attract millions of downloads, or funny messages that circulate wildly through e-mail or on Facebook—it’s a mistake to consider only the rare few that actually succeed. In most settings, unfortunately, it is only possible to study the “successes” for the simple reason that nobody bothers to keep track of all the failures, which have a tendency to get swept under the rug. On Twitter, however, we can keep track of every single event, no matter how small, thereby enabling us to learn who is influential, how much more influential than average they really are, and whether or not it is possible to tell the differences between individuals in a way that could potentially be exploited.
The way we went about this exercise was to imitate what a hypothetical marketer might try to do—that is, using everything known about the attributes and past performance of a million or so individuals, to predict how influential each of them will be in the future. Based on these predictions, the marketer could then “sponsor” some group of individuals to tweet whatever information it is trying to disseminate, thereby generating a series of cascades. The better the marketer can predict how large a cascade any particular individual can trigger, the more efficiently it can allocate its budget for sponsored tweets. Actually running such an experiment is still extremely difficult in practice, so we instead did our best to approximate it using the data we had already collected. Specifically, we divided our data in two, artificially setting the first month of our time period as our “history” and the second month as the “future.” We then fed all our “historical” data into a statistical model, including how many followers each user had, how many others they were following, how frequently they tweeted, when they had joined, and how successful they had been at triggering cascades during this period. Finally, we used the model to “predict” how influential each user would be in our “future” data and checked the model’s performance against what actually transpired.
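The history/future split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (user, day, retweets), the function names, and the use of each user's past average as the "prediction" are all hypothetical stand-ins, and they are far simpler than the statistical model used in the study.

```python
from collections import defaultdict

def split_by_time(events, cutoff_day):
    """Divide (user, day, retweets) records into 'history' and 'future'."""
    history = [e for e in events if e[1] < cutoff_day]
    future = [e for e in events if e[1] >= cutoff_day]
    return history, future

def mean_by_user(events):
    """Average retweets per tweet for each user in the given records."""
    totals = defaultdict(lambda: [0, 0])  # user -> [retweet sum, tweet count]
    for user, _day, retweets in events:
        totals[user][0] += retweets
        totals[user][1] += 1
    return {user: total / count for user, (total, count) in totals.items()}

# Hypothetical records, invented for illustration: (user, day, retweets).
events = [("a", 3, 0), ("a", 40, 2), ("b", 10, 5), ("b", 50, 0), ("a", 20, 4)]

history, future = split_by_time(events, cutoff_day=30)
predicted = mean_by_user(history)  # naive prediction: past average
actual = mean_by_user(future)      # what "actually transpired"

if __name__ == "__main__":
    for user in sorted(predicted):
        print(user, "predicted", predicted[user], "actual", actual.get(user))
```

In this toy data, user "b" looks more influential than user "a" in the history but triggers nothing in the future, which is the kind of individual-level fluctuation such a comparison is meant to expose.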
In a nutshell, what we found was that individual-level predictions are extremely noisy. Even though it was the case that on average, individuals with many followers who had been successful at triggering cascades of retweets in the past were more likely to be successful in the future, individual cases fluctuated wildly at random. Just as with the Mona Lisa, for every individual who exhibited the attributes of a successful influencer, there were many other users with indistinguishable attributes who were not successful. Nor did this uncertainty arise simply because we weren’t able to measure the right attributes—in reality we had more data than any marketer would normally have—or to measure them accurately. Rather, the problem was that, like the simulations above, much of what drives successful diffusion depends on factors outside the control of the individual seeds. What this result suggests, in other words, is that marketing strategies that focus on targeting a few “special” individuals are bound to be unreliable. Like responsible financial managers, therefore, marketers should adopt a “portfolio” approach, targeting a large number of potential influencers and harnessing their average effect, thereby effectively reducing the individual-level randomness.24
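The logic of the portfolio approach can be illustrated with a small simulation. It is a sketch, not a reconstruction of the study: the only number taken from the text is the roughly 98 percent of attempts that do not spread, while the size range for the remaining cascades, the function names, and the trial counts are assumptions chosen for illustration.

```python
import random
import statistics

def cascade_size(rng):
    # Assumption: about 98 percent of attempts do not spread at all (the share
    # reported in the text); the rest get an arbitrary size between 1 and 100.
    if rng.random() < 0.98:
        return 0
    return rng.randint(1, 100)

def portfolio_average(rng, n_seeds):
    # Average cascade size across a portfolio of n_seeds sponsored users.
    return sum(cascade_size(rng) for _ in range(n_seeds)) / n_seeds

def spread_of_outcomes(n_seeds, trials=500, seed=0):
    # Standard deviation of the portfolio average across repeated campaigns.
    rng = random.Random(seed)
    averages = [portfolio_average(rng, n_seeds) for _ in range(trials)]
    return statistics.pstdev(averages)

if __name__ == "__main__":
    for n in (1, 100, 1000):
        print(n, "seeds: spread of average outcome =", round(spread_of_outcomes(n), 3))
```

Under these assumptions the average result per seed is the same for every portfolio size, but the campaign-to-campaign variation shrinks as more seeds are included, which is the sense in which a portfolio reduces individual-level randomness.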
Although promising in theory, a portfolio approach also raises a new issue, of cost effectiveness. To illustrate the point, consider a recent story in the New York Times that claimed that Kim Kardashian, the reality TV actress, was getting paid $10,000 per tweet by various sponsors who wanted her to mention their products. Kardashian at the time had well over a million followers, so it seems plausible that paying someone like her would generate more attention than paying some ordinary person with only a few hundred followers. But how did they come up with that particular figure? Ordinary people, that is, might be prepared to tweet about their products for much less than $10,000. Assuming, therefore, that more visible individuals “cost” more than less visible ones, should marketers be targeting a relatively small number of more influential, more expensive, individuals or a larger number of less influential, less expensive individuals? Better yet, how should one strike the optimal balance?25
Ultimately, the answer to this question will depend on the specifics of how much different Twitter users would charge prospective marketers to sponsor their tweets—if indeed, they would agree to such an arrangement at all. Nevertheless, as a speculative exercise, we tested a range of plausible assumptions, each corresponding to a different hypothetical “influencer-based” marketing campaign, and measured their return on investment using the same statistical model as before. What we found was surprising even to us: Even though the Kim Kardashians of the world were indeed more influential than average, they were so much more expensive that they did not provide the best value for the money. Rather, it was what we called ordinary influencers, meaning individuals who exhibit average or even less-than-average influence, who often proved to be the most cost-effective means to disseminate information.
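The trade-off can be made concrete with some back-of-the-envelope arithmetic. In the sketch below, only the $10,000 price per tweet comes from the text; the price for an ordinary user and both expected retweet counts are invented purely for illustration, and the function name is hypothetical.

```python
def campaign_return(budget, cost_per_tweet, expected_retweets_per_tweet):
    """Expected retweets from spending the budget on one type of user."""
    tweets_bought = budget // cost_per_tweet  # whole sponsored tweets only
    return tweets_bought * expected_retweets_per_tweet

# Hypothetical inputs, invented for illustration only.
BUDGET = 10_000
celebrity = campaign_return(BUDGET, cost_per_tweet=10_000,
                            expected_retweets_per_tweet=500)
ordinary = campaign_return(BUDGET, cost_per_tweet=10,
                           expected_retweets_per_tweet=1)

if __name__ == "__main__":
    print("one celebrity tweet:", celebrity, "expected retweets")
    print("many ordinary tweets:", ordinary, "expected retweets")
```

With these made-up numbers, the many inexpensive tweets come out ahead even though each one is far less influential. The ranking depends entirely on the assumed prices and retweet counts, which is why a range of assumptions has to be tested rather than a single one.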
CIRCULAR REASONING AGAIN
Before you rush out to short stock in Kim Kardashian, I should emphasize that we didn’t actually run the experiment that we imagined. Even though we were studying data from the real world, not a computer simulation, our statistical models still made a lot of assumptions. Assuming, for example, that our hypothetical marketer could persuade a few thousand ordinary influencers to tweet about their product, it is not at all obvious that their followers would respond as favorably as they do to normal tweets. As anyone whose friend has tried to sell them on Amway products would know, there is something a little icky about a sales message embedded in a personal communication. People who follow Kim Kardashian, however, might have no such concerns; thus she may be far more effective in real life than our study could determine. Or perhaps our measure of influence—the number of retweets—was the wrong measure. We measured retweets because that’s what we could measure, and that was definitely better than nothing. But presumably what you really care about is how many people click through to a story, or donate money to a charitable cause, or buy your product. Possibly Kardashian followers act on her tweets even when they don’t retweet them to their friends—in which case, once again, we would have underestimated her influence.
Then again, we may not have. In the end, we simply don’t know who is influential or what influencers, however defined, can accomplish. Until it is possible to measure influence with respect to some outcome that we actually care about, and until someone runs the real-world experiments that can measure the influence of different individuals, every result—including ours—ought to be taken with a grain of salt. Nevertheless, the findings I have discussed—from the small-world experiment, from the simulation studies of influence spreading on networks, and from the Twitter study—ought to raise some serious doubts about claims like the law of the few that explain social epidemics as the work of a tiny minority of special people.
It’s not even clear, in fact, that social epidemics are the right way to think about social change to begin with. Although our Twitter study found that epidemic-like events do occur, we also found that they are incredibly rare. Of 74 million events in our data, only a few dozen generated even a thousand retweets, and only one or two got to ten thousand. In a network of tens of millions of users, ten thousand retweets doesn’t seem like that big a number, but what our data showed is that even that is almost impossible to achieve. For practical purposes, therefore, it may be better to forget about the large cascades altogether and instead try to generate lots of small ones. And for that purpose, ordinary influencers may work just fine. They don’t accomplish anything dramatic, so you may need a lot of them, but in harnessing many such individuals, you can also average out much of the randomness, generating a consistently positive effect.
Finally, and quite apart from any specific findings, these studies help us to see a major shortcoming of commonsense thinking. It is ironic in a way that the law of the few is portrayed as a counterintuitive idea because in fact we’re so used to thinking in terms of special people that the claim that a few special people do the bulk of the work is actually extremely natural. We think that by acknowledging the importance of interpersonal influence and social networks, we have somehow moved beyond the circular claim from the previous chapter that “X happened because that’s what people wanted.” But when we try to imagine how a complex network of millions of people is connected—or worse still, how influence propagates through it—our intuition is immediately defeated. By effectively concentrating all the agency into the hands of a few individuals, “special people” arguments like the law of the few reduce the problem of understanding how network structure affects outcomes to the much simpler problem of understanding what it is that motivates the special people. As with all commonsense explanations, it sounds reasonable and it might be right. But in claiming that “X happened because a few special people made it happen,” we have effectively replaced one piece of circular reasoning with another.
CHAPTER 5
History, the Fickle Teacher
The message of the previous three chapters is that commonsense explanations are often characterized by circular reasoning. Teachers cheated on their students’ tests because that’s what their incentives led them to do. The Mona Lisa is the most famous painting in the world because it has all the attributes of the Mona Lisa. People have stopped buying gas-guzzling SUVs because social norms now dictate that people shouldn’t buy gas-guzzling SUVs. And a few special people revived the fortunes of the Hush Puppies shoe brand because a few people started buying Hush Puppies before everyone else did. All of these statements may be true, but all they are really telling us is that what we know happened, happened, and not something else. Because they can only be constructed after we know the outcome itself, we can never be sure how much these explanations really explain, versus simply describe.
What’s curious about this problem, however, is that even once you see the inherent circularity of commonsense explanations, it’s still not obvious what’s wrong with them. After all, in science we don’t necessarily know why things happen either, but we can often figure it out by doing experiments in a lab or by observing systematic regularities in the world. Why can’t we learn from history the same way? That is, think of history as a series of experiments in which certain general “laws” of cause and effect determine the outcomes that we observe. By systematically piecing together the regularities in our observations, can we not infer these laws just as we do in science? For example, imagine that the contest for attention between great works of art is an experiment designed to identify the attributes of great art. Even if it’s true that prior to the twentieth century, it might not have been obvious that the Mona Lisa was going to become the most famous painting in the world, we have now run the experiment, and we have the answer. We may still not be able to say what it is about the Mona Lisa that makes it uniquely great, but we do at least have some data. Even if our commonsense explanations have a tendency to conflate what happened with why it happened, are we not simply doing our best to act like good experimentalists?1
In a sense, the answer is yes. We probably are doing our best, and under the right circumstances learning from observation and experience can work pretty well. But there’s a catch: In order to be able to infer that “A causes B,” we need to be able to run the experiment many times. Let’s say, for example, that A is a new drug to reduce “bad” cholesterol and B is a patient’s chance of developing heart disease in the next ten years. If the manufacturer can show that a patient who receives drug A is significantly less likely to develop heart disease than one who doesn’t, they’re allowed to claim that the drug can help prevent heart disease; otherwise they can’t. But because any one person can only either receive the drug or not receive it, the only way to show that the drug is causing anything is to run the “experiment” many times, where each person’s experience counts as a single run. A drug trial therefore requires many participants, each of whom is randomly assigned either to receive the treatment or not. The effect of the drug is then measured as the difference in outcomes between the “treatment” and the “control” groups, where the smaller the effect, the larger the trial needs to be in order to rule out random chance as the explanation.
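The relationship between effect size and trial size can be sketched with a simulated trial. The risk figures below (a 10 percent baseline risk reduced to 8 percent by the drug), the function names, and the group sizes are assumptions made up for illustration; they do not describe any real drug.

```python
import math
import random

def run_trial(n_per_group, base_risk, treated_risk, rng):
    """Randomly assign participants, then return control rate minus treatment rate."""
    participants = list(range(2 * n_per_group))
    rng.shuffle(participants)                    # random assignment
    treatment = set(participants[:n_per_group])  # first half gets the drug
    cases = {"treatment": 0, "control": 0}
    for person in participants:
        if person in treatment:
            cases["treatment"] += rng.random() < treated_risk
        else:
            cases["control"] += rng.random() < base_risk
    return (cases["control"] - cases["treatment"]) / n_per_group

def standard_error(n_per_group, base_risk, treated_risk):
    """Typical size of chance fluctuations in the measured difference."""
    return math.sqrt(base_risk * (1 - base_risk) / n_per_group
                     + treated_risk * (1 - treated_risk) / n_per_group)

if __name__ == "__main__":
    rng = random.Random(1)
    for n in (100, 10_000):
        estimate = run_trial(n, base_risk=0.10, treated_risk=0.08, rng=rng)
        noise = standard_error(n, 0.10, 0.08)
        print(n, "per group: measured effect", round(estimate, 4),
              "chance fluctuation about", round(noise, 4))
```

Under these assumptions the true effect is 0.02. With 100 participants per group the chance fluctuation is about 0.04, larger than the effect itself, so a single small trial cannot distinguish the drug from luck; with 10,000 per group the fluctuation falls to about 0.004 and the effect becomes visible.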
In certain everyday problem-solving situations, where we encounter more or less similar circumstances over and over again, we can get pretty close to imitating the conditions of the drug trial. Driving home from work every day, for example, we can experiment with different routes or with different departure times. By repeating these variations many times, and assuming that traffic on any given day is more or less like traffic on any other day, we can effectively bypass all the complex cause-and-effect relationships simply by observing which route results in the shortest commute time, on average. Likewise, the kind of experience-based expertise that derives from professional training, whether in medicine, engineering, or the military, works in the same way—by repeatedly exposing trainees to situations that are as similar as possible to those they will be expected to deal with in their eventual careers.2
HISTORY IS ONLY RUN ONCE
Given how well this quasi-experimental approach to learning works in everyday situations and professional training, it’s perhaps not surprising that our commonsense explanations implicitly apply the same reasoning to explain economic, political, and cultural events as well. By now, however, you probably suspect where this is heading. For problems of economics, politics, and culture—problems that involve many people interacting over time—the combination of the frame problem and the micro-macro problem means that every situation is in some important respect different from the situations we have seen before. Thus, we never really get to run the same experiment more than once. At some level, we understand this problem. Nobody really thinks that the war in Iraq is directly comparable to the Vietnam War or even the war in Afghanistan, and one must therefore be cautious in applying the lessons from one to another. Likewise, nobody thinks that by studying the success of the Mona Lisa we can realistically expect to understand much about the success and failure of contemporary artists. Nevertheless, we do still expect to learn some lessons from history, and it is all too easy to persuade ourselves that we have learned more than we really have.
For example, did the so-called surge in Iraq in the fall of 2007 cause the subsequent drop in violence in the summer of 2008? Intuitively the answer seems to be yes—not only did the drop in violence take place reasonably soon after the surge was implemented, but the surge was specifically intended to have that effect. The combination of intentionality and timing strongly suggests causality, as did the often-repeated claims of an administration looking for something good to take credit for. But many other things happened between the fall of 2007 and the summer of 2008 as well. Sunni resistance fighters, seeing an even greater menace from hard-core terrorist organizations like Al Qaeda than from American soldiers, began to cooperate with their erstwhile occupiers. The Shiite militia—most importantly Moktada Sadr’s Mahdi Army—also began to experience a backlash from their grassroots, possibly leading them to moderate their behavior. And the Iraqi Army and police forces, finally displaying sufficient competence to take on the militias, began to assert themselves, as did the Iraqi government. Any one of these other factors might have been at least as responsible for the drop in violence as the surge. Or perhaps it was some combination. Or perhaps it was something else entirely. How are we to know?
One way to be sure would be to “rerun” history many times, much as we did in the Music Lab experiment, and see what would have happened both in the presence and also the absence of the surge. If across all of these alternate versions of history, violence drops whenever there is a surge and doesn’t drop whenever there isn’t, then we can say with some confidence that the surge is causing the drop. And if instead we find that most of the time we have a surge, nothing happens to the level of violence, or alternatively we find that violence drops whether we have a surge or not, then whatever it is that is causing the drop, clearly it isn’t the surge. In reality, of course, this experiment got run only once, and so we never got to see all the other versions of it that may or may not have turned out differently. As a result, we can’t ever really be sure what caused the drop in violence. But rather than producing doubt, the absence of “counterfactual” versions of history tends to have the opposite effect—namely that we tend to perceive what actually happened as having been inevitable.