Dataclysm: Who We Are (When We Think No One's Looking)
musiq soulchild portugal. the man raising my children gravitate toward
neyo camera obscura a better life for brussels
2ne1 rancid associates degree in toronto
esperanza yo la tengo curly hair and march madness
mangas paddle boarding madea cambridge
zane armin im a single mom adventures of kavalier
n.e.r.d santa cruz mexican and italian food creole
coldest winter ever ecuador i’m a country girl meetup
mines ccr ellen hopkins parentheses
ratchet the dog park people notice my eyes arbor
aventura bbqing my name is ashley curl up with a
malcolm x origami brittany for my next meal
asians handshake at a daycare singer songwriters
carne gabriela my family my cell ann arbor
hw line is it anyway want a man that raleigh
earphones sunblock me and my son interpreter of maladies
I’ve talked about race a lot so far, and I’ve done so, as I’ve said, because it’s something rarely addressed analytically. And the data I have is ideal for tackling taboos. But sex is the single most important grouping that humanity has. It’s existed forever, even stretching back to when we were just one people, and perhaps because of those deep-time roots, gender roles are more universal and more stubborn than any other. It’s easy to forget, given how ineradicable the color line can seem, that ideas of race are a product of time and place. The Irish and eastern Europeans weren’t considered “white” until the 1900s; in Mexico, the indigenous Mayans and the mestizos with Spanish blood have been distinct ethnic groups (and political opponents) for centuries. Yet to most people from the United States, they’re both just “Hispanic.” But sexual division is a given in human culture—every culture, every time.
Paradoxically, OkCupid isn’t the best place to explore the differences between men and women, at least through the method we’ve developed here. Your sex is built into how you use a dating site, so, for example, the most salient thing you find about (straight) women from their profile text is that they’re looking for men, and so on. Sex and profile text are inextricable, and analysis gets you little more than tautologies. The ideal source for analyzing gender difference is instead one where a user’s gender is nominally irrelevant, where it doesn’t matter if the person is a man or woman. I chose Twitter as that neutral ground. The lists below were made using the same math as the OkCupid lists above, but they use the text from users’ tweets.
most typical words for …
men             women
good bro        my nails done
ps4             my sissy
james harden    mani pedi
mark sanchez    my makeup
my beard        my purse
cp3             girls night
in 2k           my hair for
bynum           prom dress
the squad       girls day
bro we          retail therapy
manziel         thanks girl
in nba          my future husband
year deal       to dye
iverson         dress shopping
yeah bro        too girl
kyrie           happy girl
hoopin          bobby pins
free agent      wanelo
tim duncan      my boyfriend and
scorer          my belly button
offseason       my roomie
hof             girlies
xbox one        dying my
david stern     cute texts
yds             girl crush
fantasy team    my boyfriends
gameplay        eyebrows done
gasol           curl my
lbj             my hubby
bro u           us girls
This gives you the distilled essences of men and women—read and grow stupider. Remember, before you get depressed, that the method is designed to find what’s unique about each group, find the things they don’t have in common and bring them to the fore. It’s the mathematical version of the guy at the state fair: caricature by algorithm instead of airbrush.
These are the words at the extremes, but for men and women, as for the ethnic groups before, the essential vocabulary (“the,” “pizza,” and so on) is shared. In fact, there’s a growing consensus among psychologists that men and women are fundamentally very similar, despite the popular cosmology that has them on different planets. Researchers at the University of Rochester recently pronounced “Men Are from Mars Earth, Women Are from Venus Earth” (with “Mars” and “Venus” struck through in the title), concluding:
From empathy and sexuality to science inclination and extroversion, statistical analysis of 122 different characteristics involving 13,301 individuals shows that men and women, by and large, do not fall into different groups.
And yet, though my method is built to tease out differences, it’s hard to imagine two more opposite sets of interests than the ones listed above. I can’t tell which side to root for here—on the one hand, it’s surely a worse world where women fixate on their appearance and men live the beef jerky lifestyle. On the other hand, if men and women were exactly alike, life wouldn’t be much fun. Same goes for the by-race lists above. Cultural differences, even if they’re occasionally laughable, make the world a richer place.
The Mars/Venus thing, metaphor though it is, reminds me that the heavens are an ancient reference point for science. Aristotle looked to the emptiness overhead to verify his aether. Newton confirmed his law of inverse squares through the motion of Mars. Even Einstein wasn’t truly Einstein until the sun and moon said so, in a 1919 eclipse that confirmed the theory of General Relativity. Even though we’re working on nothing so grand as all that here, I have to say I hope that paper’s snarky strikeout typeface is premature, at least for the things we like and talk about and the ways we spend our time. Look at it this way: if there were no planet out there but Earth, it would be a very boring universe.
1 Another, much more famous, example is: $e^{\pi i} + 1 = 0$. Here, astoundingly, the five most important values in mathematics form a single equation. It’s called the Euler Identity, by the way. He was a slacker.
2 This example is adapted from “Zipf’s Law and Vocabulary,” by C. Joseph Sorell, Victoria University of Wellington. Like any empirical law, Zipf’s is a very good (and time-tested) descriptive framework, but as you can see there is some variance in observed outcomes. It’s like knowing that a fair coin comes up heads half the time. Nonetheless, even after a thousand flips, it’s very unlikely that exactly half of them will have been heads.
3 The algorithm converted all words to lowercase and so I present them like that here.
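The coin-flip remark in note 2 is easy to check directly. Here is a minimal sketch, assuming a perfectly fair coin and independent flips; the function name prob_exact_heads is mine, for illustration only.

```python
from math import comb


def prob_exact_heads(flips: int, heads: int) -> float:
    """Probability of exactly `heads` heads in `flips` fair, independent flips."""
    # Count the sequences with exactly `heads` heads, then divide by
    # the total number of equally likely sequences (2 ** flips).
    return comb(flips, heads) / 2 ** flips


# Chance that a thousand flips land on exactly 500 heads.
p = prob_exact_heads(1000, 500)
print(f"{p:.4f}")
```

The result comes out to roughly 2.5 percent: half heads is the single most likely outcome, and still an unlikely one.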
11.
Ever Fallen in Love?
A few years ago a couple of MIT students, as a class project, used Facebook’s data to create a working “gaydar.” It was a simple piece of software that behaved a lot like any human trying to make an educated guess about somebody: it looked at who the person’s friends were. The program quickly learned to recognize that a certain balance of gays and straights in a guy’s social circle reliably indicated his sexuality; it didn’t need to know anything directly about him at all. As the Boston Globe put it at the time, “People may be effectively ‘outing’ themselves just by the virtual company they keep.” After the students had trained it on known profiles, the software was able to correctly predict if a man was gay 78 percent of the time, just from the nature of his social graph. That’s a highly robust result when you consider that the expected success rate, if the program were just guessing blind, would’ve been only … uh, like … 10 percent? 2 percent? 8? π/2?
That’s just the thing (and part of the reason the kids made a program to guess in the first place): nobody really knows how many gay people there are. Past estimates vary wildly, as past estimates are wont to do.¹ The Kinsey Report in 1948 was one of the first scientific attempts to get a real number; it drew many brows together over horn-rimmed glasses by suggesting that 10 percent of men and 6 percent of women were gay. Later studies, many politically motivated and all using either survey data or contrived setups in laboratories, have put the number as low as 1 percent and as high as 15 percent.² We are now able to get a better guess by a different route, and improving the accuracy here is important because, as one study blandly put it, “This work can usefully inform public policy.” All but four presidential elections since 1952 would’ve flipped had 5 percent of the electorate changed their minds, so the question of whether a group makes up 1 percent or 5 percent or 10 percent of the country is of primal interest to the political calculus. Although the number of gay people carries no moral weight (even if there were just one in the whole United States, he or she would deserve the same rights as everyone else), it’s a simple practical reality that policy decisions depend on the actual size of the population.
Also, for a group historically so stigmatized, a well-supported number speaks up where the individual cannot. It says: I am here. Gay people are a somewhat unusual minority, in that they can seem straight, at least superficially, if they decide they must. This surely involves a painful choice between self-preservation and self-expression that few other people ever have to weigh. But aside from the clear cost to the individual, “the closet” costs our society, too, as secrecy allows old attitudes to go unchallenged—and prejudice unchallenged is prejudice perpetuated. By forcing people to hide, intolerance creates its own cynical logic: when a large portion of a group goes unrecognized, it only makes marginalizing the whole easier. Visibility, on the other hand, creates acceptance. Even at lower estimates, homosexuality is no more unusual than naturally blond hair—which something like 2 percent of humanity is born with. In fact, being gay appears to be much more common than that. It’s just less accepted and therefore much more often forced from view. Think about that the next time you pick up a celebrity magazine.
Turning to the data, Google Trends again shows its power to reveal what people feel they cannot say. According to Stephens-Davidowitz, the Google researcher, 5 percent of searches for porn in the United States are looking for what he calls “depictions of gay men”—that’s a catchall that includes straightforward queries like “gay porn” and related searches like “rocket tube,” a popular gay portal. What’s more, that 1 in 20 ratio is consistent from state to state, meaning that same-sex desire is unaffected by a man’s political and religious milieu. This evenness has a few powerful implications. First, it frustrates the argument that homosexuality is anything but genetic. If men from such different environments as Mississippi and Massachusetts are looking for gay porn at equal rates, that’s strong evidence that supposed external forces have little effect on same-sex attraction.
The second implication of the state-by-state sameness in the data—that is, what it reveals not so much about gay people but about intolerance—needs a little time to unfurl. In early 2013, when he was still covering politics for the Times, Nate Silver applied his famous poll-modeling technique to same-sex-marriage ballot initiatives across the country. As he had done in the presidential elections, he aggregated data to get a snapshot of public opinion in each state, and then he performed some forward-looking analysis to guess how those attitudes might evolve. Silver estimated that gay marriage will be legal in forty-four states by 2020.
An interesting thing about Silver’s work on the question, which was based on political polls, is how it relates to another data source: what people in each state told Gallup about their own sexuality. Here are those self-reported numbers graphed against Silver’s most current projections for the acceptance of gay marriage, state-by-state. I’ve coded each state by its legal treatment of gay marriage and labeled a few of the outliers, as well.
On the horizontal, you see that, per Silver, Mississippi is the least tolerant state and Rhode Island is the most. On the vertical axis, Gallup’s numbers range from 1.7 percent in North Dakota, to 5.1 percent in Hawaii. And, as you see from the slant of the trend line, the more accepting a state is of homosexuality, the higher its self-reported gay population. Remarkably, if you walk that dotted line out to 100 percent support of gay marriage (statistically imagining a future world of perfect tolerance), you find it implies that roughly 5 percent of the population would say they are gay, absent social pressure not to be. That’s the same number implied by Google Search, where the lack of social pressure isn’t just theoretical.
Furthermore, that trend line isn’t a function of folks simply living where they’re more welcome. The state-to-state steadiness in searches for gay porn provides evidence of this and so does mobility data from Facebook. Comparing the hometowns of gay users to their current residences you find that relocation explains only a small fraction of the variance in Gallup’s rates of homosexuality above. Gay people do not disproportionately move to more tolerant places. On the one hand, this is a testament to the strength of home ties, upbringing, and simple inertia. On the other, it means that for every person picking up and moving to a San Francisco or a New York City to live life fully, there are likely dozens still living in self-negation.
If you accept these two independent estimates of 5 percent, arrived at using three of the biggest forces in modern data (Nate Silver, Google, and Facebook, with an assist from that standby of old-school polling, Gallup), you begin to see those self-reported numbers in a different light. When Gallup tells us that, for example, 1.7 percent of North Dakotans are gay, then perhaps something like 3.3 percent of the state is gay and unwilling to acknowledge it. In New York, about 4 percent of the population is openly gay, leaving maybe 1 percent gay and silent. And likewise for every state. Against the steadiness of the data, the ups and downs in self-reported gay populations take on a new meaning: they show a nation of Americans leading secret lives. This adds specific wisdom to the broad poetry often attributed to Thoreau: “most men lead lives of quiet desperation and go to the grave with the song still in them.” These are refugees of the soul, and we see it in the data.
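The arithmetic behind those state figures is a simple subtraction, sketched here under one loud assumption: that the true rate really is a flat 5 percent everywhere, as the two estimates suggest. The function name and the choice to floor the gap at zero are mine, not part of the original analysis; the reported rates are the Gallup numbers quoted in the text.

```python
# Assumption: a uniform 5 percent underlying rate in every state.
ESTIMATED_TRUE_PCT = 5.0


def estimate_unreported(reported_pct: float,
                        true_pct: float = ESTIMATED_TRUE_PCT) -> float:
    """Percentage points of a state's population estimated to be gay but unreported."""
    # Floor at zero: a state reporting above the assumed rate has no implied gap.
    return max(true_pct - reported_pct, 0.0)


# Self-reported rates quoted in the text (New York's is "about 4 percent").
reported = {"North Dakota": 1.7, "New York": 4.0, "Hawaii": 5.1}

for state, pct in reported.items():
    gap = round(estimate_unreported(pct), 1)
    print(f"{state}: {pct} percent reported, about {gap} percent silent")
```

Hawaii, at 5.1 percent, sits slightly above the assumed rate, which is a reminder that 5 percent is a rough estimate and not a ceiling.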
Data even gives us a picture of the collateral damage. Here’s Stephens-Davidowitz again:
In the United States, of all Google searches that begin “Is my husband …,” the most common word to follow is “gay.” “Gay” is 10 percent more common in such searches than the second-place word, “cheating.” It is 8 times more common than “an alcoholic” and 10 times more common than “depressed.”
And those questioning searches are most common where repression is at its highest: South Carolina and Louisiana, for example, have the highest rates, and acceptance of gay marriage is below the national average in 21 of the 25 states where this search is most frequent. One wonders what the people so intent on driving homosexuality underground (or “curing” it) make of this data, and of the sexless marriages and children with unhappy parents their efforts so clearly create. Again, this isn’t rhetoric; it’s numbers. The old economic “misery index” is inflation + unemployment. I suggest the social version is the fraction of the population living in places where they can’t be themselves. It’s a situation that serves no end but suffering.³
Unfortunately, Google Search is ineffective for estimating the number of lesbians in the country. The many straight men looking for women-with-women porn garble the data. However, we can see shadows of Silver’s acceptance estimates in OkCupid’s data, with some interesting twists. I estimate that more than a quarter of the country’s dating gay population used OkCupid in 2013.⁴ Gay online daters generally should be more open than average about their sexuality; after all, they’re putting up profiles on a website. However, recognizing that many people would rather not broadcast their sexual identity Internet-wide, OkCupid gives its gay users the option to “hide” their profile from everyone except other gay users. Fifty-nine percent of gay men and 53 percent of gay women take advantage of the option. In this data too, the correlation between a state’s tolerance and openness is visible, though more so for women, whom I’ve plotted below.
After you get past questions of “outness,” gay users look a lot like everyone else on OkCupid. In the match questions, the site’s gay users show the same rates of drug use, racial prejudice, and horniness as the straights, and gays want the same types of relationships. In fact, for sexual attitudes, if any group is an outlier, it’s straight women. They’re comparative prudes: 6.1 percent of straight men, 6.9 percent of gay men, and 7.0 percent of lesbians are on OkCupid explicitly looking for casual sex. Only 0.8 percent of straight women are, which probably says more about the taboo against sexual forwardness in (straight) females than anything else.⁵
The number of reported lifetime sex partners among all four groups is essentially the same. The median for gay men and straight women is four; for lesbians and straight men it’s five, but just barely.⁶ If there is a significant difference in sexual behavior, it’s at the extreme end: there we find a stereotype partially fulfilled. Highly promiscuous gay men (the cohort reporting twenty-five or more partners) outnumber their straight male counterparts 2 to 1. Funnily enough, in sex, as in wealth and language, we have an inequality problem. According to this data, the top 2 percent of gay men are having about 28 percent of the total gay sex.
To see how identities are formed around the labels “gay” and “straight,” we can apply the “word rank square” method from the last chapter to investigate personal self-descriptions. As before, profile essays give us a sense of what makes each group unique versus the others: what’s special about lesbians, what makes gay men different from straight, and so on, and the method puts everything in the users’ own words. The behavioral data above shows that how we love isn’t all that different, but below we see that who we love, of course, is. The math forces up the vocabulary most typical of each group:
most typical words for …
gay men             gay women         straight men            straight women
first wives         i am gay          knows what she wants    honest man
velvet rage         old lesbian       i have no kids          man to share
tales of the city   i’m a lesbian     treat a woman           to meet a man
you’re a nice guy   i am a lesbian    care of herself         a man who knows