Stern, Howard, 157
   stock market
   data for, 55–56
   and examples of Big Data searches, 22
   Summers-Stephens-Davidowitz attempt to predict the, 245–48, 251–52
   Stone, Oliver, 185
   Stoneham, James, 266, 269
   Storegard, Adam, 99–101
   stories
   categories/types of, 91–92
   viral, 22, 92
   and zooming in, 205–6
   See also specific story
   Stormfront (website), 7, 14, 18, 137–40
   stretch marks, and pregnancy, 188–89
   Stuyvesant High School (New York City), 231–37, 238, 240
   suburban areas, and origins of notable Americans, 183–84
   successful/notable Americans
   factors that drive, 185–86
   zooming in on, 180–86
   suffering, and benefits of digital truth serum, 161
   suicide, and danger of empowered government, 266, 267–68
   Summers, Lawrence
   and Obama-racism study, 243–44
   and predicting the stock market, 245, 246, 251–52
   Stephens-Davidowitz’s meeting with, 243–45
   Sunstein, Cass, 140
   Super Bowl games, advertising during, 221–25, 239
   Super Crunchers (Ayres), 264
   Supreme Court, and abortion, 147
   Surowiecki, James, 203
   surveys
   in-person, 108
   internet, 108
   and lying, 105–7, 108, 108n
   and pictures as data, 97
   skepticism about, 171
   telephone, 108
   and truth about sex, 113, 116
   and zooming in on hours and minutes, 193
   See also specific survey or topic
   Syrian refugees, 131
   Taleb, Nassim, 17
   Tartt, Donna, 283
   TaskRabbit, 212
   taxes
   cheating on, 22, 178–80, 206
   and examples of Big Data searches, 22
   and lying, 180
   and self-employed people, 178–80
   and words as data, 93–95
   zooming in on, 172–73, 178–80, 206
   teachers, using tests to judge, 253–54
   teenagers
   adopted, 108n
   as gay, 114, 116
   lying by, 108n
   and origins of political preferences, 169
   and truth about sex, 114, 116
   See also children
   television
   and A/B testing, 222
   advertising on, 221–26
   Terabyte, 264
   terrorism, 18, 129–31
   tests/testing
   of high school students, 231–37, 253–54
   and judging teachers, 253–54
   and obsessive infatuations with numbers, 253–54
   online behavior as supplement to, 278
   and small data, 255–56
   See also specific test or study
   Thiel, Peter, 155
   Think Progress (website), 130
   Thinking, Fast and Slow (Kahneman), 283
   Thome, Jim, 200
   Tourangeau, Roger, 107, 108
   towns, zooming in on, 172–90
   Toy Story (movie), 192
   Trump, Donald
   elections of 2012 and, 7
   and ignoring what people tell you, 157
   and immigration, 184
   issues propagated by, 7
   and origins of notable Americans, 184
   polls about, 1
   predictions about, 11–14
   and racism, 8, 9, 11, 12, 14, 133, 139
   See also elections, 2016
   truth
   benefits of knowing, 158–63
   handling the, 158–63
   See also digital truth serum; lying; specific topic
   Tuskegee University, 183
   Twentieth Century Fox, 221–22
   Twitter, 151–52, 160–61n, 201–3
   typing errors by searchers, 48–50
   The Unbearable Lightness of Being (Kundera), 233
   Uncharted (Aiden and Michel), 78–79
   unemployment
   and child abuse, 145–47
   data about, 56–57, 58–59
   unintended consequences, 197
   United States
   and Civil War, 79
   as united or divided, 78–79
   University of California, Berkeley, racism in 2008 election study at, 2
   University of Maryland, survey of graduates of, 106–7
   urban areas
   and life expectancy, 177
   and origins of notable Americans, 183–84, 186
   vagina, smells of, 19, 126–27, 161
   Varian, Hal, 57–58, 224
   Vikingmaiden88, 136–37, 140–41, 145
   violence
   and real science, 273
   zooming in on, 190–97
   See also murder
   voter registration, 106
   voter turnout, 9–10, 109–10
   voting behavior, and lying, 106, 107, 109–10
   Vox, 202
   Walmart, 71–72
   Washington Post, and words as data, 75, 94
   Washington Times, and words as data, 75, 94–95
   wealth
   and life expectancy, 176–77
   See also income distribution
   weather, and predictions about wine, 73–74
   Weil, David N., 99–101
   Weiner, Anthony, 234n
   white nationalism, 137–40, 145. See also Stormfront
   Whitepride26, 139
   Wikipedia, 14, 180–86
   wine, predictions about, 72–74
   wives
   and descriptions of husbands, 160–61, 160–61n
   and suspicions about gayness of husbands, 116–17
   women
   breasts of, 125, 126
   butt of, 125–26
   genitals of, 126–27
   violence against, 121–22
   See also girls; wives; specific topic
   words
   and bias, 74–76, 93–97
   and categories/types of stories, 91–92
   as data, 74–97
   and dating, 80–86
   and digital revolution, 278
   and digitalization of books, 77, 79
   and gay marriage, 74–76
   and sentiment analysis, 87–92
   and U.S. as united or divided, 78–79
   workers’ rights, 93, 94
   World Bank, 102
   World of Warcraft (game), 220
   Wrenn, Doug, 39–40, 41
   Yahoo News, 140, 143
   yearbooks, high school, 98–99
   Yelp, 265
   Yilmaz, Ahmed (alias), 231–33, 234, 234n
   YouTube, 152
   Zayat, Ahmed, 63–64, 65
   Zero to One (Thiel), 155
   zooming in
   on baseball, 165–69, 165–66n, 171, 197–200, 200n, 203, 206, 239
   benefits of, 205–6
   on counties, cities, and towns, 172–90, 239–40
   and data size, 171, 172–73
   on doppelgangers, 197–205
   on equality of opportunity, 173–75
   on gambling, 263–65
   on health, 203–5, 275
   on income distribution, 174–76, 185
   and influence of childhood experiences, 165–71, 165–66n, 206
   on life expectancy, 176–78
   on minutes and hours, 190–97
   and natural experiments, 239–40
   and origin of political preferences, 169–71
   on pregnancy, 187–90
   stories from, 205–6
   on successful/notable Americans, 180–86
   on taxes, 172–73, 178–80, 206
   Zuckerberg, Mark, 154–56, 157, 158, 238–39
   ABOUT THE AUTHOR
   Seth Stephens-Davidowitz is a New York Times op-ed contributor, a visiting lecturer at The Wharton School, and a former Google data scientist. He received a BA in philosophy from Stanford, where he graduated Phi Beta Kappa, and a PhD in economics from Harvard. His research—which uses new, big data sources to uncover hidden behaviors and attitudes—has appeared in the Journal of Public Economics and other prestigious publications. He lives in New York City.
   COPYRIGHT
   EVERYBODY LIES. Copyright © 2017 by Seth Stephens-Davidowitz. All rights reserved under International and Pan-American Copyright Conventions. By payment of the required fees, you have been granted the nonexclusive, nontransferable right to access and read the text of this e-book on-screen. No part of this text may be reproduced, transmitted, downloaded, decompiled, reverse-engineered, or stored in or introduced into any information storage and retrieval system, in any form or by any means, whether electronic or mechanical, now known or hereafter invented, without the express written permission of HarperCollins e-books.
   FIRST EDITION
   Cover design by Lisa Amoroso
   Cover photograph of elephant/zebra © Visuals Unlimited, Inc./Victor Habbick
   Other zebras © Shutterstock/Aaron Amat
   ISBN 978-0-06-239085-1
   EPub Edition May 2017 ISBN 9780062390875
   ABOUT THE PUBLISHER
   Australia
   HarperCollins Publishers (Australia) Pty. Ltd.
   Level 13, 201 Elizabeth Street
   Sydney, NSW 2000, Australia
   www.harpercollins.com.au
   Canada
   HarperCollins Canada
   2 Bloor Street East - 20th Floor
   Toronto, ON M4W 1A8, Canada
   www.harpercollins.ca
   New Zealand
   HarperCollins Publishers New Zealand
   Unit D1, 63 Apollo Drive
   Rosedale 0632
   Auckland, New Zealand
   www.harpercollins.co.nz
   United Kingdom
   HarperCollins Publishers Ltd.
   1 London Bridge Street
   London SE1 9GF, UK
   www.harpercollins.co.uk
   United States
   HarperCollins Publishers Inc.
   195 Broadway
   New York, NY 10007
   www.harpercollins.com
   * Google Trends has been a source of much of my data. However, since it only allows you to compare the relative frequency of different searches but does not report the absolute number of any particular search, I have usually supplemented it with Google AdWords, which reports exactly how frequently every search is made. In most cases I have also been able to sharpen the picture with the help of my own Trends-based algorithm, which I describe in my dissertation, “Essays Using Google Data,” and in my Journal of Public Economics paper, “The Cost of Racial Animus on a Black Candidate: Evidence Using Google Search Data.” The dissertation, a link to the paper, and a complete explanation of the data and code used in all the original research presented in this book are available on my website, sethsd.com.
   * Full disclosure: Shortly after I completed this study, I moved from California to New York. Using data to learn what you should do is often easy. Actually doing it is tough.
   * While the initial version of Google Flu had significant flaws, researchers have recently recalibrated the model, with more success.
   * In 1998, if you searched “cars” on a popular pre-Google search engine, you were inundated with porn sites. These porn sites had written the word “cars” frequently in white letters on a white background to trick the search engine. They then got a few extra clicks from people who meant to buy a car but got distracted by the porn.
   * One theory I am working on: Big Data just confirms everything the late Leonard Cohen ever said. For example, Leonard Cohen once gave his nephew the following advice for wooing women: “Listen well. Then listen some more. And when you think you are done listening, listen some more.” That seems to be roughly similar to what these scientists found.
   * Another reason for lying is simply to mess with surveys. This is a huge problem for any research regarding teenagers, fundamentally complicating our ability to understand this age group. Researchers originally found a correlation between a teenager’s being adopted and a variety of negative behaviors, such as using drugs, drinking alcohol, and skipping school. In subsequent research, they found this correlation was entirely explained by the 19 percent of self-reported adopted teenagers who weren’t actually adopted. Follow-up research has found that a meaningful percent of teenagers tell surveys they are more than seven feet tall, weigh more than four hundred pounds, or have three children. One survey found 99 percent of students who reported having an artificial limb to academic researchers were kidding.
   * Some may find it offensive that I associate a male preference for Judy Garland with a preference for having sex with men, even in jest. And I certainly don’t mean to imply that all—or even most—gay men have a fascination with divas. But search data demonstrates that there is something to the stereotype. I estimate that a man who searches for information about Judy Garland is three times more likely to search for gay porn than straight porn. Some stereotypes, Big Data tells us, are true.
   * I think this data also has implications for one’s optimal dating strategy. Clearly, one should put oneself out there, get rejected a lot, and not take rejection personally. This process will allow you, eventually, to find the mate who is most attracted to someone like you. Again, no matter what you look like, these people exist. Trust me.
   * I wanted to call this book How Big Is My Penis? What Google Searches Teach Us About Human Nature, but my editor warned me that would be a tough sell, that people might be too embarrassed to buy a book with that title in an airport bookstore. Do you agree?
   * To further test the hypothesis that parents treat kids of different genders differently, I am working on obtaining data from parenting websites. This would include a much larger number of parents than those who make these particular, specific searches.
   * I analyzed Twitter data. I thank Emma Pierson for help downloading this. I did not include descriptors of what one’s husband is doing right now, which are prevalent on social media but wouldn’t really make sense on search. Even these descriptions tilt toward the favorable. The top ways to describe what a husband is doing right now on social media are “working” and “cooking.”
   * Full disclosure: When I was fact-checking this book, Noah denied that his hatred of America’s pastime is a key part of his personality. He does admit to hating baseball, but he believes his kindness, love of children, and intelligence are the core elements of his personality—and that his attitudes about baseball would not even make the top ten. However, I concluded that it’s sometimes hard to see one’s own identity objectively and, as an outside observer, I am able to see that hating baseball is indeed fundamental to who Noah is, whether he’s able to recognize it or not. So I left it in.
   * This story shows how things that seem bad may be good if they prevent something worse. Ed McCaffrey, a Stanford-educated former wide receiver, uses this argument to justify letting all four of his sons play football: “These guys have energy. And, so, if they’re not playing football, they’re skateboarding, they’re climbing trees, they’re playing tag in the backyard, they’re doing paintball. I mean, they’re not going to sit there and do nothing. And, so, the way I look at it is, hey, at least there’s rules within the sport of football. . . . My kids have been to the emergency room for falling off decks, getting in bike crashes, skateboarding, falling out of trees. I mean, you name it . . . Yea, it’s a violent collision sport. But, also, my guys just have the personality, where, at least they’re not squirrel-jumping off mountains and doing crazy stuff like that. So, it’s organized aggression, I guess.” McCaffrey’s argument, made in an interview on The Herd with Colin Cowherd, is one I had never heard before. After reading the Dahl/DellaVigna paper, I take the argument seriously. An advantage of huge real-world datasets, rather than laboratory data, is that they can pick up these kinds of effects.
   * You can probably tell by this part of the book I tend to be cynical about good stories. I wanted one feel-good story in here, so I am leaving my cynicism to a footnote. I suspect PECOTA just found out that Ortiz was a steroid user who stopped using steroids and would start using them again. From the standpoint of prediction, it is actually pretty cool if PECOTA was able to detect that—but it makes it a less moving story.
   * A famous 1978 paper that claimed that winning the lottery does not make you happy has largely been debunked.
   * I have changed his name and a few details.
   * In looking for people like Yilmaz who scored near the cutoff, I was blown away by the number of people—in their twenties through their fifties—who remember this test-taking experience from their early teens and speak about missing a cutoff in dramatic terms. This includes former congressman and New York City mayoral candidate Anthony Weiner, who says he missed Stuy by a single point. “They didn’t want me,” he told me, in a phone interview.
   * Since everybody lies, you should question much of this story. Maybe I’m not an obsessive worker. Maybe I didn’t work extraordinarily hard on this book. Maybe I, like lots of people, can exaggerate just how much I work. Maybe my thirteen months of “hard work” included full months in which I did no work at all. Maybe I didn’t live as a hermit. Maybe, if you checked my Facebook profile, you’d see pictures of me out with friends during this supposed hermit period. Or maybe I was a hermit, but it was not self-imposed. Maybe I spent many nights alone, unable to work, hoping in vain that someone would contact me. Maybe nobody e-vites me to anything. Maybe nobody messages me on Bumble. Everybody lies. Every narrator is unreliable.
   
 