enum TRAFFICLIGHT_STATE {
    RED = 1,
    YELLOW = 2,
    GREEN = 3
};
The computer has no conception of what it’s doing when a particular spot in its memory shifts from one (RED) to three (GREEN), but if that computer is hooked up to a traffic light controller, such a change would be meaningful to us as designating the change of a traffic light from red to green.
That gap between seeing data as numbers and seeing data for what it represents is the fundamental difference between computer language and human language. It is the difference between a rote encoding and a meaningful conversational tool. As computers increasingly facilitate and even dominate the socioeconomic fabric of our world, closing the gap between those two kinds of languages has become an increasingly urgent concern of computer science and software engineering. If computers could bridge the gap and understand the world the way we do, we could off-load our own mental labor onto them, and perhaps leave computers to do a better job of managing large-scale economic and organizational problems. But it is easy to underestimate that gap if you are a techno-optimist—just as easy as it is to think it is unbridgeable if you are a techno-pessimist.
What’s the secret to creating a language in which computers do understand what their data represents? In a nutshell, this is the problem of artificial intelligence.
We ourselves are discovering that we do not quite understand how we talk about things. Our brains are hard-wired to learn human language. We are born with an innate capacity for it, which other animals lack. Humans have also learned how to speak the codes of machines, that is, mathematics and logic. As we try to drag computers into the realm of human language and clumsily try to teach them a capacity for understanding, they are teaching us to think in their codes.
Much of the nuts and bolts of data processing today is done not in the arcane and ingenious algorithms that analyze data, but in the labeling required for those algorithms to be useful to us. The process of labeling is more accurately a process of encoding: assigning discrete identifiers to the hazy concepts that permit us to function in the world. We do not yet know how to program a computer to understand our conception of a thing like “dog.” A concept like “dog” is a loose-knit collection of facts and images that varies from person to person. All the computer knows, to begin with, is the word itself. We encode this concept in three letters: “DOG.” A computer would represent it as three numbers for the ASCII equivalent of those letters—68 79 71—or perhaps it would encode it in some other numerical scheme in a specific program: say a pet-classifying program, where dogs are 1, cats are 2, turtles are 3, and so on. Regardless, the computer has no conception of a dog, just the encoding for the label assigned to the concept. We need to encode the world before computers can process it.
The word “code” can mean either a series of programming instructions or the labels that an encoding contains. To keep things clear, I will use the word “label” to refer to the elements of an encoding rather than “code” or “encoding.”
Computers have us thinking so much in labels today that we must remember we were the ones who assigned those labels, rather arbitrarily, in the first place. We take them for granted, but they are merely a matter of agreed-upon—or disputed—convention. Today, the origins of a complex encoding such as Western musical notation are obscured, the notation taken for granted. Even the Western alphabet, which we take to be fixed, arrived at its current stable form only because too many people were using it for local changes to remain feasible. The histories of most encodings are so gnarled that it is difficult to see past their complexities to understand how labels shape life. So consider two of the oldest and most universal labels of all.
Male and Female
The words come rolling up to us, we must be careful not to get run over.
—ALFRED DÖBLIN, Berlin Alexanderplatz
Some questions have more definite answers than others. There are questions whose answers lie in some underlying fact of the matter, like “Does the Earth go around the Sun or vice versa?” There are questions like “How many planets are there in our solar system?” whose answers change as criteria and terminology evolve. And there are questions where it’s not clear which facts are relevant, like “When did the Roman Empire fall?” (which yields answers ranging from “476 CE” to “It never did”) and “Is it proper English grammar to wantonly split infinitives?” (where some will say “Of course” and others will say “It sounds awful and déclassé”). We make answers to these last kinds of questions, arguing over them until there is either an authoritative consensus or people agree to disagree.
Computers, generally, do not distinguish between these types of questions. Their data rarely contains epistemological caveats. They present it back to us exactly how it was stored, which is usually with stark certainty. We tend not to turn to computers for that last sort of question, but the ubiquity of computers forces us to answer some of those questions before they can process our data. Here is one such question: “How many genders are there?”
Western societies increasingly recognize divisions between gender and biological sex. That is, one’s gender may not have any necessary relation to one’s physiology or biology, even though it frequently correlates. But what are the criteria for gender? There is no consensus.*2 The debate around this question has split the term “gender” into many subcategories: gender identity, gender role, gender presentation, gender expression. Yet we are so incessantly identified by our sex and/or gender that we are all forced to pick one label under which we will be classified. Most people are assigned the label of male or female at birth and stick with it. Some do not. Outside the two dominant categories, today there are various flavors of transgenders, genderqueers, agenders, androgynes, pomosexuals, and others—to say nothing of intersex people, who despite being classified by physiology are often spoken of as being their own gender. Science does not provide definitive answers here. If, as the World Health Organization states, gender is a social construct that varies by culture and era, there is no scientific fact as to how many genders there are. There are only people’s opinions as to how many there ought to be.
Computers can easily handle the addition of new categories, simply by extending the encodings of sex or gender to encompass whatever labels we think they should contain. When Facebook changed its list of gender choices from two to fifty-one in 2014, then subsequently added seven more (twenty more in the UK), Facebook wasn’t trying to prescribe a new taxonomy of gender. But it did just that. With approximately a billion users for whom it sets the rules, Facebook can’t avoid being prescriptive. That’s why activist groups lobbied Facebook to, first, add additional genders and then to let people type in their own. Google, whose social network does not hold much sway in our lives, merely offers “Male,” “Female,” and “Other,” but Facebook at the time of this writing offers:
Agender
Androgyne
Androgynous
Bigender
Cis
Cis Female
Cis Male
Cis Man
Cis Woman
Cisgender
Cisgender Female
Cisgender Male
Cisgender Man
Cisgender Woman
Female
Female to Male
FTM
Gender Fluid
Gender Nonconforming
Gender Questioning
Gender Variant
Genderqueer
Intersex
Male to Female
Male
MTF
Neither
Neutrois
Non-binary
Other
Pangender
Trans
Trans*
Transsexual
Transsexual Female
Transsexual Male
Transsexual Man
Transsexual Person
Transsexual Woman
Trans Female
Trans* Female
Trans Male
Trans* Male
Trans Man
Trans* Man
Trans Person
Trans* Person
Trans Woman
Trans* Woman
Transfeminine
Transgender Female
Transgender Male
Transgender Man
Transgender Person
Transgender Woman
Transmasculine
Two-spirit
Facebook originally forced users to select exclusively from this list, but subsequently let users type in whatever terms they want, though the interface strongly discourages doing so. (It took me a few minutes to figure out how to do it.) Users may choose up to ten terms to describe themselves, with no restrictions on contradictions. I’ve set mine to “Male, Female, Neither, and None.” By offering a strictly defined set of two, fifty-one, fifty-seven, seventy-one, or any other number of genders and mandating that every user choose one, Facebook ensures that its data analysis will be neat. Without a default set, Facebook would be left with a long tail of rare, sometimes unique, and unstandardized genders. The data would be messier and harder to analyze.
Facebook does not, however, encourage going outside the dominant two categories. Users must choose male or female when they sign up for an account; only afterward can they change their gender on their profile pages. And Facebook asks users to choose a preferred pronoun, for which there are only three options: male, female, and “neutral” (“they”). As we’ll see later, for all the praise and criticism that Facebook’s expansion has received from activists on both sides of the fence, Facebook’s ultimate commitment to these new labels is rather superficial indeed.
Facebook tried to accommodate an increasingly flexible gender taxonomy. Computers are not terribly good with flexibility. Yet there is a feedback mechanism at work here. The demand to explicitly categorize one’s self within a particular typology, whether to Facebook or to corporate management, is apt to foster dissatisfaction with that typology. Gender is a construct that is continually reified—but also continually criticized.
The danger of reinforcing one specific typology is that typologies come to look antiquated and prejudiced in the long run. Our racial categories may very well evolve from what they are right now, and a lot of current thinking about race may be considered outdated and ignorant in a few years’ time. “Conservative” and “Liberal,” labels that Facebook secretly assigns to its users, have very different meanings today than they did twenty-five years ago; their definitions will undoubtedly continue to evolve. In such a dynamic world, computers paradoxically enable us to revise and refine our categorizations even as they insist that we continue to make those classifications. To be described to a computer is to be described by labels. To be described by labels is to make a selection from categories. The implicit is not so much disallowed as it is overshadowed. Data prefers to be explicit.
Some of these typologies are obscured from users, who are quietly classified without their knowledge. But like gender, many of these labels are unstable. Facebook deems me a member of Generation X, which it defines as people born between 1961 and 1981. A quick glance at the literature shows the end date as early as 1964 and as late as 1984. Growing up, I was told I was too young to be Generation X, but at some point, the collective hivebrain of marketers decided differently. I was not Generation Y (later redubbed millennials); I was Generation X after all. If they change their minds again, will Facebook catch up? More likely, Facebook itself will help select the particular date range by virtue of its sheer market power. It’s not the truth; it’s just the force of Facebook’s version of the truth. Facebook is now trying to determine the difference between “fake news” and “real news,” and between “political information” and “political propaganda.” Those distinctions, too, are Facebook’s versions of the truth, defined not only by Facebook but by third parties trying to shape and influence users through Facebook advertising and applications.
Moreover, discrete categories ignore variation within those categories. Here’s an example. I am one of the proud 10 percent of left-handed people, as are both my siblings and one of my children. Both of my parents are right-handed, though my father, like many of his generation, was forced to write (rather poorly) with his right hand in school, so we believe that he is latently left-handed. I associate being left-handed with smearing ink and graphite on my pinky finger and with uncomfortable right-handed desk-chairs.*3 Let us say that Facebook requires us to specify our handedness, and let us say that this is something I care about. I would certainly select left-handed, but maybe I would then want to elaborate and explain that I do not think the dominance is strong. My right arm and hand are stronger than my left. Whereas my left hand suggests dominance by my right brain, my right eye is my dominant eye, governed by my left brain. In first grade, frustrated with the terrible single pair of left-handed scissors in the classroom, I forced myself to use the crisp, new right-handed scissors they had, and I never went back. I use a mouse with my right hand but tend to use the phone with my left. None of this changes my self-classification and societal classification as left-handed. But if society were, for whatever reason, to start caring about and refining those labels, as it has with gender and race, those nuances would be impossible to capture on Facebook.
Cultural change drives the revision of labels. A stagnant culture allows for static (and stagnant) labels. A fluid culture permits the current classification to be torn up and replaced by the next, equally contingent classification. Stasis gives the illusion of permanence. And the more a society reinforces particular taxonomies, the more inertia these taxonomies create against social change.
As a consequence of mass media and global interconnectedness, today’s societies are far more dynamic than their predecessors. Subcultures meet, exchange ideas, and evolve at a rate that was unknown before the invention of the railroad, and that was subsequently amplified by the automobile and airplane. The internet is only the latest upshift to the speed of cultural exchange. We are constantly inundated with new cultural material that obscures the fact that a great deal of humanity’s social heritage emerged from vastly more static societies. These older heritages will not adapt to our current high-velocity culture without severe modifications.
People may not be using terms like “pangender” or “biracial” in fifty years, much less two thousand. But a label like “Confucian,” which calmly escorted the regimented bureaucracy of the Chinese empire, managed to retain a complex conceptual meaning for centuries.*4 Is it possible for anything to have that sort of longevity today? Since computers demand that we label ourselves, our beliefs, and everything in the world around us, how will technology impact the labels that we use today, and how durable will those labels prove to be?
Masterminds and Crackpots
In some sciences, the endeavor to discover a universal principle may often be just as fruitless as the endeavor of a mineralogist to discover some primary universal element through the compounding of which all minerals arose. Nature creates neither genera nor species, but individua, and in our shortsightedness, we must seek out similarities to be able to retain many things simultaneously. These concepts become more and more inaccurate the broader the categories are which we create.
—GEORG CHRISTOPH LICHTENBERG
My name is David, and I am a green. I discovered this at a corporate training retreat where the hosts asked me and my coworkers to take a test that classified each of us as one of four personality types. They were identified by color: blue, green, gold, orange. There are many variations on this sort of classification, but this test described the colors more or less like this:
GREEN: Analytical, logical, theoretical, introverted engineer
BLUE: Harmony-oriented, compassionate peacemaker
GOLD: Traditional, conventional, structure-loving bureaucrat
ORANGE: Outgoing, creative, upbeat, visionary leader