Bitwise


by David Auerbach


  9

  BIG HUMAN

  The Vacuum Cleaner

  Technology’s primary effect is to amplify human forces.

  —KENTARO TOYAMA, Geek Heresy

  IF THE 2000S were the decade of Google, the 2010s have been the decade of Facebook—and social media more generally. The internet’s infancy of unstructured information gave way to an adolescence of clumsy, crude social interaction. Facebook’s revenues ($40 billion in 2017) remain only a third of Google’s ($110 billion in 2017) and half of Microsoft’s ($89 billion), less than a quarter of Amazon’s ($178 billion), and all pale next to Apple’s $229 billion. Yet Facebook has had a more powerful effect on transforming the web in the last ten years than any of those other companies. The web is becoming less centered on pages and more centered on people—more specifically, computational representations of people.

  In retrospect, it was inevitable. Friendster was so hapless that it seemed like only a novelty, and MySpace was popular but ugly and underdeveloped, yet from today’s vantage it seems impossible that a Facebook would not come along to colonize the web and the world alike. I never worked at Facebook, and so my attitude toward it is a bit like that of a land-based dinosaur observing the birds, scrawny yet with the power of flight, with a mixture of curiosity and condescension. I’m not sure that Facebook itself will outlast Google or Apple, but Facebook deployed something that will be with us for a very long time: the datafication of humans and, more significantly, our identification with our digital representations.

  In the 1990s, anonymity ruled the web. Online celebrity was still an oxymoron. Any biographical details added to a personal web page were frequently cursory and likely to never be noticed. The early web was better suited to the spread of information than to making social connections. For many years I blogged pseudonymously, enjoying the ability to create and write in whatever voice I chose.*1 Many others did the same, whether anonymously, pseudonymously, or under their real—but frequently unknown—names. Beginning in 2000, the tireless culture blogger Mark Woods maintained a daily commonplace book of art and literature on his wood s lot blog until he passed away in 2017, revealing little of himself beyond his name. On his Frequently Asked Questions page, there was only a single blurry photo of him on the beach, and an extract from Wallace Stevens’s “Peter Quince at the Clavier.” For the few thousands of us who read him, he was someone we knew. We related not through personal details or life events, but by the sharing of our enthusiasms. Conversation was often more implicit than explicit. I too wrote my blog as an outpost away from the world, not as a reflection of it.

  By 2010, people were beginning to trump content. Information became increasingly centralized on large sites like Wikipedia, Amazon, and Google, and digital identity was the new game: how to tie this diffuse, online information to identifiable individuals. Google’s link graph, which mapped the influence relationships between web pages, was succeeded by Friendster, MySpace, and Facebook’s friend graph, which charted the social relationships between human beings. Facebook, Twitter, and Google+ were not only social networks. They were identity services, attempts to bind individuals permanently to public or semipublic online identities that would be managed by corporations.

  Having left Google in 2008 to return to being a digital civilian, I was wary of these new developments. I dislike declaring any affiliation or affinity, lest I be held to it. But I seemed to be in the minority. Millions, particularly the young, signed up on MySpace, Facebook, Instagram, and elsewhere in order to publish their lives, demographics, and tastes, much as Google had classified web pages by their words and Amazon had classified products by who bought them.

  The categorization and taxonomizing of human beings was not itself a new trend. Throughout the twentieth century, critics of modernity, industrialization, and capitalism, from Georg Simmel to Lewis Mumford to Jane Jacobs, had bemoaned that society was boxing people and organizing them by the work that they did. Simmel observed in his 1900 book The Philosophy of Money that the interchangeability of labor was generic to any large-scale economy, capitalist or not. Socialism, he suggested, required an even more generic treatment of labor, because a centrally planned economy would be less able to accommodate individual variation than a decentralized one.*2 The fear of automation began with the industrial revolution and accelerated with the introduction of computers, as more and more kinds of human labor began to be performed by machines. Yet computers are uniquely able to track the variations among hundreds of millions of people. This was only a distant vision, since meaningfully analyzing data on billions of people remains an extremely difficult task, but computers enabled the possibility of a digitally micromanaged society. While the industrial revolution and the advent of the assembly line generated coarse-grained classifications broken down by job requirements, the emergence of mass computation in the latter part of the twentieth century enabled large-scale, centralized classification of individuals.

  The computerized shift toward micromanagement was driven by national defense and advertising. Advertising refined and enhanced its demographic analysis of consumer segments,*3 while governments initiated the trend toward computational representation and analysis of individuals on a mass scale. The National Security Agency, in the wake of September 11, 2001, initiated its “vacuum cleaner” approach by amassing as much data as it could in order to weed out any and all national security threats. It ended up with an ever-growing haystack containing a handful of real needles and a tremendous number of fake needles. The NSA’s vacuum cleaner anticipated what would come to be called the quantified self: track everything in the hopes of learning something.

  The NSA, alongside the FBI and the CIA and the TSA, assembled profiles as part of the No-Fly List, the Terrorist Screening Database, the Computer-Assisted Passenger Prescreening System, and others. It built these lists with the aid of programs like PRISM, XKeyscore, and MUSCULAR, which were designed to surveil and assemble information about specific people. This was hardly a new approach for intelligence agencies trying to identify people of interest, but these programs enabled unprecedented scope. Before the computer age, it was not feasible for organizations to compile dossiers on unlikely or irrelevant targets, because it was difficult to track such large quantities of data on paper. Storage would have been a problem, as would trying to search through the data. Consequently, the selection of what was important and what was not had to take place before collection. But the dawn of big data removed any practical physical limitations on storage, and amassing lots of data was far easier than analyzing it. At the time of the September 11 attacks, the FBI was unable to search its databases for multiple words: it could search on “flight” and “school,” but not for “flight school.” This crippled the utility of their data.

  In Foreign Policy, Shane Harris described the approach of the director of the NSA under Presidents George W. Bush and Barack Obama, Keith Alexander:

  Alexander wants as much data as he can get. And he wants to hang on to it for as long as he can. To prevent the next terrorist attack, he thinks he needs to be able to see entire networks of communications and also go “back in time,” as he has said publicly, to study how terrorists and their networks evolve. To find the needle in the haystack, he needs the entire haystack.

  “Alexander’s strategy is the same as Google’s: I need to get all of the data,” says a former administration official who worked with the general. “If he becomes the repository for all that data, he thinks the resources and authorities will follow.”

  From an NSA document on the MUSCULAR program. According to the Washington Post, “Two engineers with close ties to Google exploded in profanity when they saw the drawing.”

  Alexander was wrong. Having the data was not enough. A 2010 Washington Post exposé, “Top Secret America,” revealed just how unprepared the NSA was to do the much harder job of analyzing their petabytes of data for the right signals. Half-baked tools like the “automatic ingestion manager” designed by NSA advisor and “mad scientist” James Heath were ill equipped to figure out the who and the what of the data, much less the how and why. Later documents leaked by Edward Snowden showed that NSA analysts actually begged the agency to stop collecting so much useless data. A 2010 UK report on MI5’s digital intelligence capabilities concluded the same thing: there was too much data and too little analysis.

  The Obama administration continued the trend of its predecessor, endorsing Alexander’s approach, amassing even more data while still remaining unable to process it all. James Bamford wrote in Foreign Policy, summing up the Obama years:

  Alexander asked, “Why can’t we collect all the signals all the time?” He applied this approach in Iraq, pulling intelligence from phone interceptions, planes, drones, satellites, and other sensors into a powerful computer analysis system known as the Real Time Regional Gateway. He also ran the NSA’s massive metadata surveillance program, which involved secretly keeping track of every phone in the United States: what numbers were called, from where, and exactly when—billions of communications each year….

  Privacy hasn’t been traded for security, but for the government hoarding more data than it knows how to handle. Kinne, the former intercept operator, described her work as “just like searching blindly through all these cuts to see what the hell was what.”

  Tracking “every phone in the United States” was impossible before the era of big data. In the twenty-first century, it was not just possible, but aggressively embraced. Companies like Facebook and Google still need users to bring their data to them, to a point. But government organizations show us where corporations are headed, as data consolidation continues apace.

  Once data is collected, there are no intrinsic restrictions on its use. The original purpose of this collection was to intercept communications and prevent terror, but because the data dragnet was so total, other uses opened up.

  In a top-secret memo dated Oct. 3, 2012, Alexander raised the possibility of using vulnerabilities discovered in mass data—“viewing sexually explicit material online,” for instance—to damage reputations. The agency could, say, smear individuals it believed were radicalizing others in an effort to diminish their influence.

  When government agencies—or corporations, for that matter—combine this aggressive anti-privacy stance with the inevitable mistakes profiling systems make, the potential for chaotic and ubiquitous abuse balloons.

  Profiles

  I was left with this surrogate mirror

  I thought: Who created this monster?

  —THE FALL, “Surrogate Mirage”

  In order to identify terror suspects, the NSA needed to classify everybody as a likely terrorist or an unlikely terrorist. In other words, they applied labels. Data collection and storage have become cheap, and the vacuum cleaner approach has been adopted by private corporations as well. Consumer profiling has long been a staple of marketing data providers like Experian and Acxiom, but Facebook has become a centralization point for the collection of personal information in order to target individual consumers. Facebook has sorted its users into a large number of categories and buckets, assigning them advertiser-friendly demographic labels. Here are some of the axes on which Facebook allows advertisers, data analysts, and other third parties to “microtarget” users:

  Location

  Age

  Generation

  Gender

  Language

  Education level

  Field of study

  School

  Ethnic affinity

  Income and net worth

  Home ownership and type

  Home value

  Property size

  Square footage of home

  Year home was built

  Household composition

  Users who have an anniversary within thirty days

  Users who are away from family or hometown

  Users who are friends with someone who has an anniversary, is newly married or engaged, recently moved, or has an upcoming birthday

  Users in long-distance relationships

  Users in new relationships

  Users who have new jobs

  Users who are newly engaged

  Users who are newly married

  Users who have recently moved

  Users who have birthdays soon

  Parents

  Expectant parents

  Mothers, divided by “type” (soccer, trendy, etc.)

  Users who are likely to engage in politics

  Conservatives and liberals

  Relationship status

  Employer

  Industry

  Job title

  Office type

  Interests

  Users who own motorcycles

  Users who plan to buy a car (and what kind/brand of car, and how soon)

  Users who bought auto parts or accessories recently

  Users who are likely to need auto parts or services

  Style and brand of car you drive

  Year car was bought

  Age of car

  How much money user is likely to spend on next car

  Where user is likely to buy next car

  How many employees your company has

  Users who own small businesses

  Users who work in management or are executives

  Users who have donated to charity (divided by type)

  Operating system

  Users who play canvas games

  Users who own a gaming console

  Users who have created a Facebook event

  Users who have used Facebook Payments

  Users who have spent more than average on Facebook Payments

  Users who administer a Facebook page

  Users who have recently uploaded photos to Facebook

  Internet browser

  Email service

  Early/late adopters of technology

  Expats (divided by what country they are from originally)

  Users who belong to a credit union, national bank, or regional bank

  Users who invest (divided by investment type)

  Number of credit lines

  Users who are active credit card users

  Credit card type

  Users who have a debit card

  Users who carry a balance on their credit card

  Users who listen to the radio

  Preference in TV shows

  Users who use a mobile device (divided by what brand they use)

  Internet connection type

  Users who recently acquired a smartphone or tablet

  Users who access the internet through a smartphone or tablet

  Users who use coupons

  Types of clothing user’s household buys

  Time of year user’s household shops most

  Users who are “heavy” buyers of beer, wine, or spirits

  Users who buy groceries (and what kinds)

  Users who buy beauty products

  Users who buy allergy medications, cough/cold medications, pain relief products, and over-the-counter meds

  Users who spend money on household products

  Users who spend money on products for kids or pets, and what kinds of pets

  Users whose household makes more purchases than is average

  Users who tend to shop online (or off)

  Types of restaurants user eats at

  Kinds of stores user shops at

  Users who are “receptive” to offers from companies offering online auto insurance, higher education, or mortgages, and prepaid debit cards/satellite TV

  Length of time user has lived in house

 
