

  Users who are likely to move soon

  Users who are interested in the Olympics, fall football, cricket, or Ramadan

  Users who travel frequently, for work or pleasure

  Users who commute to work

  Types of vacations user tends to go on

  Users who recently returned from a trip

  Users who recently used a travel app

  Users who participate in a timeshare

  Much of this data is not directly provided to Facebook by users. Facebook gathers information on us even when we aren’t explicitly providing it. In addition to what it collects from profiles, pictures, and clicks, Facebook correlates its information with what it obtains from third-party sources—such as car registrations, residential information, and corporate information.

  For each of its users (as well as for many people who don’t use Facebook), Facebook creates a detailed shadow, and it profits by presenting these shadows as valuable targets for the marketing of goods, services, and ideas. Facebook’s marketing partners, whether local businesses or scam artists or shady political operatives, use this data not just for marketing but for their own surveillance of human behavior. Taken together, the politics, habits, demographics, hobbies, and personal relationships of a Facebook user allow for a greater degree of persuasion—and potential control—than ever before. Health information is protected by law, for example, but given a person’s drugstore purchases, alcohol habits, and life events, any company could make some very educated guesses as to the state of their health. Collect enough public information, shake it vigorously, and private information will fall out of it.

  The advertiser categories Facebook has assigned to me. Some are correct; some are not.

  * * *


  Facebook silently classifies users by “Ethnic Affinity” in order to target ads, a category that it infers from the entirety of a user’s profile—tastes, friends, location, habits, etc. Such categories can be wrong, and most users will never even see the mistake. Last year, Facebook told me my ethnic affinity was Asian. This year, it thinks I’m African American (or, at least, someone “whose activity on Facebook aligns with African American multicultural affinity”). And even if you don’t mark your political orientation (a spectrum from “Very Conservative” to “Very Liberal”), Facebook makes a guess at that too.

  To Facebook, targeting is a matter of profit and loss rather than life and death. Yet its behavior has an enormous impact on our lives. Facebook’s ethnic affinities cover only a handful of the countless ethnicities in the world—the only options are Asian American, African American, and Hispanic—yet it’s Facebook’s categories by which you are identified to advertisers worldwide.*4 Products have been targeted toward particular ethnicities for decades. But it’s one thing to appeal broadly to a demographic, and another to target individual consumers by race, which means classifying each individual as belonging to one “ethnic affinity” or another—let’s just call it race. As we see with other taxonomies, it is very hard to make such a classification without smuggling in all sorts of assumptions and biases. Here is one way to encode racial categories, following the taxonomy of the 2010 U.S. Census:

  enum Race {
      White           = 0,
      Black           = 1,
      NativeAmerican  = 2,
      Asian           = 3,
      PacificIslander = 4,
      Other           = 5,
  };

  This is controversial territory. The reasons for choosing these five “races” aren’t apparent and owe as much to historical accident as to any purported biology. For comparison, here is an encoding for the 1939 classification created by respected American anthropologist Carleton S. Coon:

  enum Race {
      Caucasoid  = 0,
      Mongoloid  = 1,
      Negroid    = 2,
      Capoid     = 3,
      Australoid = 4,
  };

  It’s just a number to a computer, but a wholly different classification to us. Even if there is some scientific concept that deserves to be termed “race” (a question that seems, at the least, unresolved), it does not match up with any degree of rigor to anything that is popularly called “race”—and our shifting, amorphous notion of “race” is a social construct. It functions in much the same way as Myers-Briggs or the DSM, except with consequences that reach much further and logic that is much weaker: the census offers no options for those with two parents of different “races” other than a nebulous “Other” checkbox. These classifications, then, are akin to what DSM psychiatrist Allen Frances referred to as “temporarily useful diagnostic constructs.”*5 With race, however, the labels stick earlier and more permanently. We are slotted into categories before we are born.

  To a computer, this data is neutral: the first set of racial labels I gave above are represented as the numbers 0 through 5, the second as 0 through 4. Computers are ignorant of the moiré of shifting meanings imposed on the labels assigned to those numbers, as well as whether those meanings are at all just.
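
  To make that indifference concrete, here is a minimal sketch, with the enums renamed only so the two taxonomies can coexist in one program; the point is that both collapse to the same small integers.

  #include <stdio.h>

  /* The 2010 Census taxonomy and Coon's 1939 taxonomy, renamed here only
     so that both can live in one program. To the machine they are nothing
     but small integers. */
  enum CensusRace { White, Black, NativeAmerican, Asian, PacificIslander, Other };
  enum CoonRace   { Caucasoid, Mongoloid, Negroid, Capoid, Australoid };

  int main(void) {
      enum CensusRace a = Asian;    /* stored as 3 */
      enum CoonRace   b = Capoid;   /* also stored as 3 */

      /* The computer sees only the numbers; the shifting human meanings
         attached to these labels are invisible to it. */
      printf("%d %d\n", (int)a, (int)b);   /* prints: 3 3 */
      return 0;
  }

  The same bit pattern, 3, names “Asian” under one taxonomy and “Capoid” under the other; nothing in the machine marks the difference.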

  Such a classification becomes an ontology to us, a way of seeing and carving up the world. Computers chronically reinforce the ontologies fed into them. When we select “Black” or “White” on a form, whether on the census or on Facebook, we render the underlying taxonomy real. We turn it into an ontology. When Facebook intuits a person’s race from his or her interests and posts, it is stereotyping—in my case, incorrectly. Since these categories are as prescriptive as they are descriptive, labels and associations are reinforced without any particular consideration as to what is being reinforced.

  In the case of the racial categories of the U.S. Census listed above, what would my own “mixed-race” children select? There’s the catchall “Other,” but that merely signifies that the categories need to be revised to be more comprehensive and precise. But computers don’t deal well with shifting ontologies. Once they have an ontology, computers reify it. If we classify people in black and white, we bias ourselves to ignore all the factors that get lost in the cracks.

  On Facebook, gender is one of the primary factors that determine what kinds of ads we see. Despite Facebook’s seeming embrace of gender multiplicity (57 gender options in the United States, 71 in the UK), advertisers choose from only three options: male, female, and all. As a consequence, men and women have very different experiences on Facebook. Women will see ads for beauty products, home supplies, feminine hygiene, and other products targeted at female-dominant demographics. Men will see ads for technology, sports, and cars. This is bias. Facebook’s algorithms classify ads and people by the numbers given to them. The meanings of these classifications emerge implicitly from what data gets shunted under one label or the other. The effect is that men and women are encouraged to keep consuming along a firm divide. Crossover is discouraged. It’s hard to gauge the severity of such segregation’s impact,*6 but I’m telling my daughter to use an ad-blocker.
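
  The machinery behind this need not be elaborate. Here is a toy sketch, emphatically not Facebook’s actual code, of how a single numeric gender field partitions what two users will ever be shown.

  #include <stdio.h>

  /* A toy ad-targeting filter, not Facebook's code: a sketch of how one
     numeric gender field splits what users are ever shown. */
  enum Gender { ALL = 0, MALE = 1, FEMALE = 2 };

  struct Ad {
      const char *name;
      enum Gender target;   /* the advertiser picks one of three numbers */
  };

  static void show_ads(enum Gender user, const struct Ad *ads, int n) {
      for (int i = 0; i < n; i++)
          if (ads[i].target == ALL || ads[i].target == user)
              printf("  %s\n", ads[i].name);
  }

  int main(void) {
      struct Ad ads[] = {
          { "Beauty products",  FEMALE },
          { "Sports equipment", MALE   },
          { "Phone plan",       ALL    },
      };
      int n = (int)(sizeof ads / sizeof ads[0]);

      printf("A 'female' user sees:\n"); show_ads(FEMALE, ads, n);
      printf("A 'male' user sees:\n");   show_ads(MALE, ads, n);
      return 0;
  }

  Nothing in the code weighs what “Beauty products” or “Sports equipment” means; the divide is enforced by a comparison of integers, and the crossover ad is the exception rather than the rule.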

  Language, and our use of language, inevitably carry bias. Computer code itself lacks any such bias. But the computer data our software processes reflects life, and so it reflects our blind spots and prejudices. Once a computer starts to speak the language of humans and human practices, it plays out our linguistic biases. To be labeled is to be prejudged. By standardizing our social classifications and making them explicit, computers have amplified the gaps and biases in our concepts to their breaking point.

  Bad Labels

  I have spent much of my life turning away from the scripts given to me, in China and in America; my refusal to be defined by the will of others is my one and only political statement.

  —YIYUN LI, Dear Friend, from My Life I Write to You in Your Life

  Facebook’s targeting categories are cases of labels being applied to data objects—in Facebook’s instance, people. What happens, however, when a computer tries to label an object without knowing what it is?

  In 2015, Google rolled out a new feature called photo categorization, which sorted users’ photos into folders based on their subject matter—what’s pictured in the photos. Photo categorization assigned labels to photos, then sorted the photos by label. With this program, Google stumbled into a minefield. I heard about the first issue when my wife sent me a link with the email subject “very bad machine learning false positive.” The link took me to tweets by a Haitian American software developer, who had taken a series of photos of himself with a female friend and found that Google Photos had placed them into a folder tagged “gorillas,” because Google’s machine learning algorithms decided—wrongly—that there were gorillas in the photos.

  Image recognition is an inexact science, based on “training” persistent machine learning networks through feedback mechanisms. Google’s systems made a horrendous error because the label assignment invoked a legacy of racism and dehumanization. Google apologized and quickly rolled out a fix, but didn’t explain what had happened and didn’t assuage concerns about the possibility of the error happening again.

  There were three problems. First, the word “gorillas” was not being regulated as a potentially sensitive or offensive word. Google had surely dealt with sensitivities around that word at some point in the past, and so this error seems genuinely sloppy. They quickly restricted use of the word, which put a stop to the problem. But there likely still remain other loaded words that are similarly unrestricted.
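
  The fix, in principle, is unglamorous: a guard that suppresses a handful of sensitive labels no matter how confident the classifier is. The sketch below is hypothetical, not Google’s code, and its point is in the final comment: a blocklist is only as good as the list.

  #include <string.h>

  /* Hypothetical guard of the kind that was evidently missing: refuse to
     attach certain labels to photos regardless of classifier confidence. */
  static const char *restricted_labels[] = { "gorilla" /* , other loaded terms */ };

  static int label_is_restricted(const char *label) {
      size_t n = sizeof restricted_labels / sizeof restricted_labels[0];
      for (size_t i = 0; i < n; i++)
          if (strcmp(label, restricted_labels[i]) == 0)
              return 1;
      return 0;
  }

  /* Called before any label is attached to a user's photo. Every loaded
     word that never made it onto the list sails straight through. */
  static int may_assign_label(const char *label) {
      return !label_is_restricted(label);
  }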

  The second problem was the miscategorization of the friends’ faces as animals. A higher-order algorithm failed to recognize that the photo contained human faces. Facial recognition algorithms such as those used by the FBI are frequently distinct from general object recognition algorithms, as faces are distinguished by variations quite specific to human physiology. But before a facial recognition subsystem can kick in, the general image recognition algorithm must determine whether a particular photo contains a human face.

  Google trained its image recognition networks on hundreds of millions of faces; what caused them to produce a false negative? It could well have been the subjects’ skin color. Color plays a large part in what is termed “skin detection,” which in turn plays a large part in face detection. (Image search engines use this approach to detect porn: safety filters that restrict image results to work-safe images determine which pictures show human skin, and then whether that skin belongs to certain body parts.) “Marginal” cases, or people with skin tones deemed too far from the overall average, are most likely to generate wrong results. It’s a problem when those marginal cases happen to be subjects who are also culturally marginalized by race. The fatality that occurred when one of Uber’s self-driving cars struck a pedestrian in Arizona made clear the life-or-death stakes of computer vision algorithms. Self-driving cars must avoid humans even at the cost of hitting another object or an animal. But if a self-driving car must avoid two humans, and only one of them registers to the car as a human, the consequences could be disastrous. And if an error comes about due to a failure to recognize humanness based on skin color, then that algorithm would deserve to be called racist.
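
  To see how a fixed notion of “skin” produces exactly this kind of false negative, here is a rule-of-thumb RGB skin test of the sort found in the older computer-vision literature. The specific thresholds are illustrative rather than anyone’s production code; what matters is the structure: hard cutoffs tuned to a presumed typical skin tone.

  #include <stdlib.h>

  /* A classic rule-based RGB skin test of the kind described in the older
     computer-vision literature. The thresholds are illustrative; the point
     is structural: fixed cutoffs tuned to a presumed "typical" skin tone
     will reject tones far from that average. */
  static int looks_like_skin(unsigned char r, unsigned char g, unsigned char b) {
      int max = r > g ? (r > b ? r : b) : (g > b ? g : b);
      int min = r < g ? (r < b ? r : b) : (g < b ? g : b);
      return r > 95 && g > 40 && b > 20 &&    /* brightness floors            */
             (max - min) > 15 &&              /* enough spread across channels */
             abs((int)r - (int)g) > 15 &&     /* reddish cast assumed          */
             r > g && r > b;
  }

  /* A darker tone such as (70, 45, 35) fails the r > 95 floor and is
     classified as "not skin," even though it plainly is. */

  A face detector that leans on a test like this will quietly miss the faces of people whose skin falls outside the tuned range, and the failure will look, from the outside, like the system simply not seeing them.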

  The final problem is the intrinsic difficulty of labeling images in the absence of cultural and semantic context. Also in 2015, Flickr launched a similar auto-tagging feature. One user posted a photo of the gates of the Dachau concentration camp, only to be given the suggestions “Dachau” and “jungle gym.” How is Flickr to detect such a distasteful (if not exactly racist) association? People can be hired to identify specific images as culturally sensitive. But for computers to do this, they would need cross-cultural knowledge beyond the possession of any single person or community. Humans have a great talent for drawing implicit associations. We also have a great talent for making some of those associations unpleasant and offensive. But there is no repository of that knowledge from which computers can learn. So computers move naively, Candide-like, and make missteps that only we can correct.

  Compounding this problem is the fact that there are often bad actors who inject prejudice and bigotry into rich data. Google faced a much bigger problem when its Maps product started labeling geographical locations with terms used on the web to describe those locations. As Google’s Jen Fitzpatrick put it, “Certain offensive search terms were triggering unexpected maps results, typically because people had used the offensive term in online discussions of the place.” In practice, this equated to Obama-era searches for “n—ga house” returning the White House and searches for “n—r university” returning Howard University. What had happened was that enough users had, in discussing the White House and Howard University, referred to them in such ways, and in its sweep of the web, Google’s knowledge engine had picked up the offensive labels and found them to be common enough to associate with those locations. While Google filtered these terms out of the info boxes in the map results, Google Maps neglected to prevent searches on the problematic terms from resolving to those locations.
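
  The underlying mechanism is mundane. Here is a sketch, not Google’s code, of frequency-driven alias harvesting, with invented counts: any term that co-occurs with a place often enough becomes a searchable name for it, and filtering at display time does nothing to stop the term from working as a query.

  #include <stdio.h>

  /* A sketch, not Google's code, of frequency-driven alias harvesting:
     whatever the web calls a place often enough becomes a query that
     resolves to it. The counts below are invented for illustration. */
  struct Alias {
      const char *term;   /* a phrase scraped from web discussions of the place */
      int count;          /* how often it co-occurred with the place */
  };

  static void index_place(const char *place, const struct Alias *a, int n,
                          int threshold) {
      for (int i = 0; i < n; i++)
          if (a[i].count >= threshold)   /* frequency is the only test applied */
              printf("query \"%s\" -> %s\n", a[i].term, place);
  }

  int main(void) {
      struct Alias white_house[] = {
          { "white house",        90210 },
          { "executive mansion",    512 },
          { "an offensive epithet",  97 },   /* common enough to clear the bar */
      };
      index_place("The White House", white_house, 3, 50);
      return 0;
  }

  Filtering the offensive term out of the info box changes what is displayed, but as long as the alias survives in the index, typing it into the search box still lands on the place.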

  In a subtler case, a 2016 paper by two electrical engineering professors specializing in image processing claimed to be able to group faces as “criminal” or “non-criminal” based on a machine learning analysis of facial pictures. Analyzing 1,126 internet photos of Chinese men and 730 photos of Chinese male criminals obtained from the Chinese government, they made assertions about the link between physiognomy and criminal behavior such as this:

  the angle θ from nose tip to two mouth corners is on average 19.6% smaller for criminals than for non-criminals and has a larger variance. Also, the upper lip curvature ρ is on average 23.4% larger for criminals than for non-criminals. On the other hand, the distance d between two eye inner corners for criminals is slightly narrower (5.6%) than for non-criminals.

  This analysis assumed that the photographic subjects all held the same neutral facial expression, but criminal mug shots are likely to be taken under conditions that produce rather different facial expressions than ordinary photographs do. And the physiological differences, particularly that final 5.6 percent figure, are not broken down enough in the paper to establish whether they are statistically significant.*7 The authors concluded:

  Although criminals are a small minority in total population, they have appreciably greater variations in facial appearance than general public. This coincides with the fact that all law-biding [sic] citizens share many common social attributes, whereas criminals tend to have very different characteristics and circumstances, some of which are quite unique of the individual’s own.

  There are no grounds for the “fact” that criminals’ personalities vary more than noncriminals’, nor that this correlates with a greater variance in facial features. It’s not clear—and seems unlikely—that the criminals were sampled from the same population in the same manner as the noncriminals. Finally, what of the noncriminals? Are internet photos a truly representative sampling of people’s faces, or are they more likely to skew toward humans’ preferences for more attractive faces, or faces shot in ways that make them seem more attractive? In any event, it’s reasonable to expect a divergence between a set of mug shots and a set of photos culled from the internet, and it’s also reasonable to doubt that this divergence reflects any real physiological difference. The real classification happened at the onset of the study, with the division into criminal and noncriminal. The analysis begged the question.
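
  Concretely, to judge whether the reported 5.6 percent difference in means is statistically significant, one needs the within-group spreads as well as the group sizes, and the paper reports only the percentages. Here is a sketch of the missing calculation; the sample sizes are the paper’s, but the means and standard deviations are invented placeholders.

  #include <stdio.h>
  #include <math.h>

  /* Welch's two-sample statistic, the calculation the reported percentages
     alone cannot support. The sample sizes come from the paper (1,126
     non-criminal photos, 730 criminal photos); the means and standard
     deviations are invented placeholders, since the paper omits the spreads. */
  int main(void) {
      double n1 = 1126.0, n2 = 730.0;
      double mean1 = 1.000, mean2 = 0.944;   /* a 5.6% relative difference */
      double sd1 = 0.20, sd2 = 0.20;         /* placeholders, not real data */

      double se = sqrt(sd1 * sd1 / n1 + sd2 * sd2 / n2);
      double t  = (mean1 - mean2) / se;

      /* With tight spreads the difference looks enormous; with wide spreads
         it vanishes into noise. Without the spreads, the claim is untestable. */
      printf("standard error = %.4f, t = %.2f\n", se, t);
      return 0;
  }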

  Anthropologist and eugenicist Francis Galton’s classification of facial types of criminals and noncriminals, 1879

  In this research, machine learning served to exaggerate certain feature differences and manufacture a classification. The mystique of data and “impartial” computation gives a scientific veneer to this modern phrenology. The researchers boast that they use computers to neutralize human bias:

  Relatively little study has been done on the accuracy of character inference based solely on still face images. This is probably due to, aside from the historical controversies surrounding the inquiry and stigmas associated with social Darwinism, the difficulty to neutralize all possible prejudice and preconditioning of human experimenters and subjects when assessing the accuracy of face-induced inference on socially charged matters such as criminality. In this work, we adopt the approach of data-driven machine learning to fully automate the assessment process, and purposefully take any subtle human factors out of the assessment process.

  They have it backward. “Data laundering,” where human biases and predispositions are fed into algorithms in order to make them look “objective,” doesn’t remove those “subtle human factors.” Rather, it disguises and then amplifies them into harmful, binding taxonomies. This application of machine learning threatens to predetermine whether people are criminals simply because such a classification must be performed, whether genuinely differentiating factors are found or not.

  Thus, people are stigmatized by computers. Sociologist Erving Goffman describes the function of stigma as follows:

  By definition, of course, we believe the person with a stigma is not quite human. On this assumption we exercise varieties of discrimination, through which we effectively, if often unthinkingly, reduce his life chances. We construct a stigma theory, an ideology to explain his inferiority and account for the danger he represents, sometimes rationalizing an animosity based on other differences, such as those of social class. We use specific stigma terms such as cripple, bastard, moron in our daily discourse as a source of metaphor and imagery, typically without giving thought to the original meaning.

 
