Everybody Lies

by Seth Stephens-Davidowitz


  Two months after that original speech, Obama gave another televised speech on Islamophobia, this time at a mosque. Perhaps someone in the president’s office had read Soltas’s and my Times column, which discussed what had worked and what hadn’t. For the content of this speech was noticeably different.

  Obama spent little time insisting on the value of tolerance. Instead, he focused overwhelmingly on provoking people’s curiosity and changing their perceptions of Muslim Americans. Many of the slaves from Africa were Muslim, Obama told us; Thomas Jefferson and John Adams had their own copies of the Koran; the first mosque on U.S. soil was in North Dakota; a Muslim American designed skyscrapers in Chicago. Obama again spoke of Muslim athletes and armed service members but also talked of Muslim police officers and firefighters, teachers and doctors.

  And my analysis of the Google searches suggests this speech was more successful than the previous one. Many of the hateful, rageful searches against Muslims dropped in the hours after the president’s address.

  There are other potential ways to use search data to learn what causes, or reduces, hate. For example, we might look at how racist searches change after a black quarterback is drafted in a city or how sexist searches change after a woman is elected to office. We might see how racism responds to community policing or how sexism responds to new sexual harassment laws.

  Learning of our subconscious prejudices can also be useful. For example, we might all make an extra effort to delight in little girls’ minds and show less concern with their appearance. Google search data and other wellsprings of truth on the internet give us an unprecedented look into the darkest corners of the human psyche. This is at times, I admit, difficult to face. But it can also be empowering. We can use the data to fight the darkness. Collecting rich data on the world’s problems is the first step toward fixing them.

  5

  ZOOMING IN

  My brother, Noah, is four years younger than I. Most people, upon first meeting us, find us eerily similar. We both talk too loudly, are balding in the same way, and have great difficulty keeping our apartments tidy.

  But there are differences: I count pennies. Noah buys the best. I love Leonard Cohen and Bob Dylan. For Noah, it’s Cake and Beck.

  Perhaps the most notable difference between us is our attitude toward baseball. I am obsessed with baseball and, in particular, my love of the New York Mets has always been a core part of my identity. Noah finds baseball impossibly boring, and his hatred of the sport has long been a core part of his identity.*

  Seth Stephens-Davidowitz: Baseball-o-Phile

  Noah Stephens-Davidowitz: Baseball-o-Phobe

  How can two guys with such similar genes, raised by the same parents, in the same town, have such opposite feelings about baseball? What determines the adults we become? More fundamentally, what’s wrong with Noah? There’s a growing field within developmental psychology that mines massive adult databases and correlates them with key childhood events. It can help us tackle this and related questions. We might call this increasing use of Big Data to answer psychological questions Big Psych.

  To see how this works, let’s consider a study I conducted on how childhood experiences influence which baseball team you support—or whether you support any team at all. For this study, I used Facebook data on “likes” of baseball teams. (In the previous chapter I noted that Facebook data can be deeply misleading on sensitive topics. With this study, I am assuming that nobody, not even a Phillies fan, is embarrassed to acknowledge a rooting interest in a particular team on Facebook.)

  To begin with, I downloaded the number of males of every age who “like” each of New York’s two baseball teams. Here is the percentage who are Mets fans, by year of birth.

  The higher the point, the more Mets fans. The popularity of the team rises and falls then rises and falls again, with the Mets being very popular among those born in 1962 and 1978. I’m guessing baseball fans might have an idea as to what’s going on here. The Mets have won just two World Series: in 1969 and 1986. These men were roughly seven to eight years old when the Mets won. Thus a huge predictor of Mets fandom, for boys at least, is whether the Mets won a World Series when they were around the age of seven or eight.
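  The arithmetic behind that observation can be sketched in a few lines of Python. The birth years and championship years are the ones given in the text; the function and variable names are illustrative only, and this is not the code used for the original study.

```python
# Birth years with peak Mets fandom and Mets World Series wins, as stated in the text.
PEAK_BIRTH_YEARS = [1962, 1978]
METS_TITLE_YEARS = [1969, 1986]


def age_at_title(birth_year: int, title_year: int) -> int:
    """Approximate age of someone born in birth_year during title_year."""
    return title_year - birth_year


def ages_at_nearest_title(birth_years, title_years):
    """For each birth year, find the age at the first title won after birth."""
    ages = {}
    for born in birth_years:
        later_titles = [t for t in title_years if t >= born]
        if later_titles:
            ages[born] = age_at_title(born, min(later_titles))
    return ages


if __name__ == "__main__":
    # Prints the age each peak cohort had reached when the Mets next won.
    print(ages_at_nearest_title(PEAK_BIRTH_YEARS, METS_TITLE_YEARS))
```

  Run as a script, this reports ages of seven and eight for the two peak cohorts, matching the "roughly seven to eight years old" in the text.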

  In fact, we can extend this analysis. I downloaded information on Facebook showing how many fans of every age “like” every one of a comprehensive selection of Major League Baseball teams.

  I found that there are also an unusually high number of male Baltimore Orioles fans born in 1962 and male Pittsburgh Pirates fans born in 1963. Those men were eight-year-old boys when these teams were champions. Indeed, calculating the age of peak fandom for all the teams I studied, then figuring out how old these fans would have been, gave me this chart:

  Once again we see that the most important year in a man’s life, for the purposes of cementing his favorite baseball team as an adult, is when he is more or less eight years old. Overall, five to fifteen is the key period to win over a boy. Winning when a man is nineteen or twenty is about one-eighth as important in determining who he will root for as winning when he is eight. By then, he will already either love a team for life or he won’t.

  You might be asking, what about women baseball fans? The patterns are much less sharp, but the peak age appears to be twenty-two years old.

  This is my favorite study. It relates to two of my most beloved topics: baseball and the sources of my adult discontent. I was firmly hooked in 1986 and have been suffering along—rooting for the Mets—ever since. Noah had the good sense to be born four years later and was spared this pain.

  Now, baseball is not the most important topic in the world, or so my Ph.D. advisors repeatedly told me. But this methodology might help us tackle similar questions, including how people develop their political preferences, sexual proclivities, musical taste, and financial habits. (I would be particularly interested in the origins of my brother’s wacky ideas on the latter two subjects.) My prediction is that we will find that many of our adult behaviors and interests, even those that we consider fundamental to who we are, can be explained by the arbitrary facts of when we were born and what was going on in certain key years while we were young.

  Indeed, some work has already been done on the origin of political preferences. Yair Ghitza, chief scientist at Catalist, a data analysis company, and Andrew Gelman, a political scientist and statistician at Columbia University, tried to test the conventional idea that most people start out liberal and become increasingly conservative as they age. This is the view expressed in a famous quote often attributed to Winston Churchill: “Any man who is under 30, and is not a liberal, has no heart; and any man who is over 30, and is not a conservative, has no brains.”

  Ghitza and Gelman pored over sixty years of survey data, taking advantage of more than 300,000 observations on voting preferences. They found, contrary to Churchill’s claim, that teenagers sometimes tilt liberal and sometimes tilt conservative. As do the middle-aged and the elderly.

  These researchers discovered that political views actually form in a way not dissimilar to the way our sports team preferences do. There is a crucial period that imprints on people for life. Between the key ages of fourteen and twenty-four, numerous Americans will form their views based on the popularity of the current president. A popular Republican or unpopular Democrat will influence many young adults to become Republicans. An unpopular Republican or popular Democrat puts this impressionable group in the Democratic column.

  And these views, in these key years, will, on average, last a lifetime.

  To see how this works, compare Americans born in 1941 and those born a decade later.

  Those in the first group came of age during the presidency of Dwight D. Eisenhower, a popular Republican. In the early 1960s, despite being under thirty, this generation strongly tilted toward the Republican Party. And members of this generation have consistently tilted Republican as they have aged.

  Americans born ten years later (baby boomers) came of age during the presidencies of John F. Kennedy, an extremely popular Democrat; Lyndon B. Johnson, an initially popular Democrat; and Richard M. Nixon, a Republican who eventually resigned in disgrace. Members of this generation have tilted liberal their entire lives.

  With all this data, the researchers were able to determine the single most important year for developing political views: age eighteen.

  And they found that these imprint effects are substantial. Their model estimates that the Eisenhower experience resulted in about a 10 percentage point lifetime boost for Republicans among Americans born in 1941. The Kennedy, Johnson, and Nixon experience gave Democrats a 7 percentage point advantage among Americans born in 1952.

  I’ve made it clear that I am skeptical of survey data, but I am impressed with the large number of responses examined here. In fact, this study could not have been done with one small survey. The researchers needed the hundreds of thousands of observations, aggregated from many surveys, to see how preferences change as people age.

  Data size was also crucial for my baseball study. I needed to zoom in not only on fans of each team but on people of every age. Millions of observations are required to do this and Facebook and other digital sources routinely offer such numbers.

  This is where the bigness of Big Data really comes into play. You need a lot of pixels in a photo in order to be able to zoom in with clarity on one small portion of it. Similarly, you need a lot of observations in a dataset in order to be able to zoom in with clarity on one small subset of that data—for example, how popular the Mets are among men born in 1978. A small survey of a couple of thousand people won’t have a large enough sample of such men.
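  A back-of-the-envelope sketch makes the pixel analogy concrete. The survey size echoes the "couple of thousand people" above; the 100 million figure and all three shares are made-up placeholder values chosen only for illustration, not numbers from the study.

```python
def expected_subgroup_size(sample_size: int, *shares: float) -> float:
    """Expected number of respondents who fall into every nested subgroup.

    Each share is the assumed fraction of the remaining group that
    belongs to the next, narrower subgroup.
    """
    expected = float(sample_size)
    for share in shares:
        expected *= share
    return expected


# Hypothetical shares, for illustration only.
SHARE_MALE = 0.5            # assumed fraction of respondents who are men
SHARE_BORN_1978 = 1 / 60    # assumed fraction born in one particular year
SHARE_NY_FAN = 0.05         # assumed fraction who like a New York baseball team

small_survey = expected_subgroup_size(2_000, SHARE_MALE, SHARE_BORN_1978, SHARE_NY_FAN)
big_dataset = expected_subgroup_size(100_000_000, SHARE_MALE, SHARE_BORN_1978, SHARE_NY_FAN)

print(f"Survey of 2,000: about {small_survey:.2f} such men")
print(f"Dataset of 100 million: about {big_dataset:,.0f} such men")
```

  Under these assumed shares, the small survey expects fewer than one such respondent, while the large dataset expects tens of thousands. That gap is what makes zooming in possible.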

  This is the third power of Big Data: Big Data allows us to meaningfully zoom in on small segments of a dataset to gain new insights on who we are. And we can zoom in on other dimensions besides age. If we have enough data, we can see how people in particular towns and cities behave. And we can see how people carry on hour-by-hour or even minute-by-minute.

  In this chapter, human behavior gets its close-up.

  WHAT’S REALLY GOING ON IN OUR COUNTIES, CITIES, AND TOWNS?

  In hindsight it’s surprising. But when Raj Chetty, then a professor at Harvard, and a small research team first got a hold of a rather large dataset—all Americans’ tax records since 1996—they were not certain anything would come of it. The IRS had handed over the data because they thought the researchers might be able to use it to help clarify the effects of tax policy.

  The initial attempts Chetty and his team made to use this Big Data led, in fact, to numerous dead ends. Their investigations of the consequences of state and federal tax policies reached mostly the same conclusions everybody else had just by using surveys. Perhaps Chetty’s answers, using the hundreds of millions of IRS data points, were a bit more precise. But getting the same answers as everybody else, with a little more precision, is not a major social science accomplishment. It is not the type of work that top journals are eager to publish.

  Plus, organizing and analyzing all the IRS data was time-consuming. Chetty and his team—drowning in data—were taking more time than everybody else to find the same answers as everybody else.

  It was beginning to look like the Big Data skeptics were right. You didn’t need data for hundreds of millions of Americans to understand tax policy; a survey of ten thousand people was plenty. Chetty and his team were understandably discouraged.

  And then, finally, the researchers realized their mistake. “Big Data is not just about doing the same thing you would have done with surveys except with more data,” Chetty explains. They were asking little data questions of the massive collection of data they had been handed. “Big Data really should allow you to use completely different designs than what you would have with a survey,” Chetty adds. “You can, for example, zoom in on geographies.”

  In other words, with data on hundreds of millions of people, Chetty and his team could spot patterns among cities, towns, and neighborhoods, large and small.

  As a graduate student at Harvard, I was in a seminar room when Chetty presented his initial results using the tax records of every American. Social scientists refer in their work to observations—how many data points they have. If a social scientist is working with a survey of eight hundred people, he would say, “We have eight hundred observations.” If he is working with a laboratory experiment with seventy people, he would say, “We have seventy observations.”

  “We have one-point-two billion observations,” Chetty said, straight-faced. The audience giggled nervously.

  Chetty and his coauthors began, in that seminar room and then in a series of papers, to give us important new insights into how America works.

  Consider this question: is America a land of opportunity? Do you have a shot, if your parents are not rich, to become rich yourself?

  The traditional way to answer this question is to look at a representative sample of Americans and compare this to similar data from other countries.

  Here is the data for a variety of countries on equality of opportunity. The question asked: what is the chance that a person with parents in the bottom 20 percent of the income distribution reaches the top 20 percent of the income distribution?

  CHANCES A PERSON WITH POOR PARENTS WILL BECOME RICH (SELECTED COUNTRIES)

  United States      7.5%
  United Kingdom     9.0%
  Denmark           11.7%
  Canada            13.5%

  As you can see, America does not score well.

  But this simple analysis misses the real story. Chetty’s team zoomed in on geography. They found the odds differ a huge amount depending on where in the United States you were born.

  CHANCES A PERSON WITH POOR PARENTS WILL BECOME RICH (SELECTED PARTS OF THE UNITED STATES)

  San Jose, CA             12.9%
  Washington, DC           10.5%
  United States Average     7.5%
  Chicago, IL               6.5%
  Charlotte, NC             4.4%

  In some parts of the United States, the chance of a poor kid succeeding is as high as in any developed country in the world. In other parts of the United States, the chance of a poor kid succeeding is lower than in any developed country in the world.

  These patterns would never be seen in a small survey, which might only include a few people in Charlotte and San Jose, and which therefore would prevent you from zooming in like this.
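  The calculation behind figures like these can be sketched as follows: among children whose parents were in the bottom 20 percent of incomes, count the share who reach the top 20 percent, separately for each place. The records, city names, and function name below are hypothetical and exist only to show the mechanics; they are not drawn from the tax data.

```python
from collections import defaultdict

# Hypothetical records: (city, parent_income_quintile, child_income_quintile),
# where quintile 1 is the bottom 20 percent and 5 is the top 20 percent.
RECORDS = [
    ("City A", 1, 5),
    ("City A", 1, 2),
    ("City A", 1, 1),
    ("City A", 1, 5),
    ("City A", 3, 4),
    ("City B", 1, 1),
    ("City B", 1, 2),
    ("City B", 1, 1),
    ("City B", 1, 5),
    ("City B", 5, 5),
]


def bottom_to_top_rate(records):
    """Per city: share of children with bottom-quintile parents who reach the top quintile."""
    poor_parents = defaultdict(int)   # children whose parents were in quintile 1
    reached_top = defaultdict(int)    # of those, children who ended up in quintile 5
    for city, parent_q, child_q in records:
        if parent_q == 1:
            poor_parents[city] += 1
            if child_q == 5:
                reached_top[city] += 1
    return {city: reached_top[city] / n for city, n in poor_parents.items()}


rates = bottom_to_top_rate(RECORDS)
for city, rate in sorted(rates.items()):
    print(f"{city}: {rate:.1%}")
```

  With only four qualifying children per city, as in this toy example, a single record swings the rate by 25 percentage points. Stable city-level estimates need many qualifying records in every city, which is what a dataset covering every American provides.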

  In fact, Chetty’s team could zoom in even further. Because they had so much data—data on every single American—they could even zoom in on the small groups of people who moved from city to city to see how that might have affected their prospects: those who moved from New York City to Los Angeles, Milwaukee to Atlanta, San Jose to Charlotte. This allowed them to test for causation, not just correlation (a distinction I’ll discuss in the next chapter). And, yes, moving to the right city in one’s formative years made a significant difference.

  So is America a “land of opportunity”?

  The answer is neither yes nor no. The answer is: some parts are, and some parts aren’t.

  As the authors write, “The U.S. is better described as a collection of societies, some of which are ‘lands of opportunity’ with high rates of mobility across generations, and others in which few children escape poverty.”

  So what is it about parts of the United States where there is high income mobility? What makes some places better at leveling the playing field, at allowing a poor kid to have a pretty good life? Areas that spend more on education provide a better chance to poor kids. Places with more religious people and lower crime do better. Places with more black people do worse. Interestingly, this has an effect on not just the black kids but on the white kids living there as well. Places with lots of single mothers do worse. This effect too holds not just for kids of single mothers but for kids of married parents living in places with lots of single mothers. Some of these results suggest that a poor kid’s peers matter. If his friends have a difficult background and little opportunity, he may struggle more to escape poverty.

  The data tells us that some parts of America are better at giving kids a chance to escape poverty. So what places are best at giving people a chance to escape the grim reaper?

  We like to think of death as the great equalizer. Nobody, after all, can avoid it. Not the pauper nor the king, the homeless man nor Mark Zuckerberg. Everybody dies.

  But if the wealthy can’t avoid death, data tells us that they can now delay it. American women in the top 1 percent of income live, on average, ten years longer than American women in the bottom 1 percent of income. For men, the gap is fifteen years.

  How do these patterns vary in different parts of the United States? Does your life expectancy vary based on where you live? Is this variation different for rich and poor people? Again, by zooming in on geography, Raj Chetty’s team found the answers.

  Interestingly, for the wealthiest Americans, life expectancy is hardly affected by where they live. If you have excesses of money, you can expect to make it roughly eighty-nine years as a woman and about eighty-seven years as a man. Rich people everywhere tend to develop healthier habits—on average, they exercise more, eat better, smoke less, and are less likely to suffer from obesity. Rich people can afford the treadmill, the organic avocados, the yoga classes. And they can buy these things in any corner of the United States.

  For the poor, the story is different. For the poorest Americans, life expectancy varies tremendously depending on where they live. In fact, living in the right place can add five years to a poor person’s life expectancy.

  So why do some places seem to allow the impoverished to live so much longer? What attributes do cities where poor people live the longest share?

 
