The Information


by James Gleick


  Is it signaling, like telegraphs? Is it Zen poetry? Is it jokes scribbled on the washroom wall? Is it John Hearts Mary carved on a tree? Let’s just say it’s communication, and communication is something human beings like to do.♦

  Shortly thereafter, the Library of Congress, having been founded to collect every book, decided to preserve every tweet, too. Possibly undignified, and probably redundant, but you never know. It is human communication.

  And the network has learned a few things that no individual could ever know.

  It identifies CDs of recorded music by looking at the lengths of their individual tracks and consulting a vast database, formed by accretion over years, by the shared contributions of millions of anonymous users. In 2007 this database revealed something that had eluded distinguished critics and listeners: that more than one hundred recordings released by the late English pianist Joyce Hatto—music by Chopin, Beethoven, Mozart, Liszt, and others—were actually stolen performances by other pianists. MIT established a Center for Collective Intelligence, devoted to finding group wisdom and “harnessing” it. It remains difficult to know when and how much to trust the wisdom of crowds—the title of a 2004 book by James Surowiecki, to be distinguished from the madness of crowds as chronicled in 1841 by Charles Mackay, who declared that people “go mad in herds, while they recover their senses slowly, and one by one.”♦ Crowds turn all too quickly into mobs, with their time-honored manifestations: manias, bubbles, lynch mobs, flash mobs, crusades, mass hysteria, herd mentality, goose-stepping, conformity, groupthink—all potentially magnified by network effects and studied under the rubric of information cascades. Collective judgment has appealing possibilities; collective self-deception and collective evil have already left a cataclysmic record. But knowledge in the network is different from group decision making based on copying and parroting. It seems to develop by accretion; it can give full weight to quirks and exceptions; the challenge is to recognize it and gain access to it. In 2008, Google created an early warning system for regional flu trends based on data no firmer than the incidence of Web searches for the word flu; the system apparently discovered outbreaks a week sooner than the Centers for Disease Control and Prevention. This was Google’s way: it approached classic hard problems of artificial intelligence—machine translation and voice recognition—not with human experts, not with dictionaries and linguists, but with its voracious data mining of trillions of words in more than three hundred languages. For that matter, its initial approach to searching the Internet relied on the harnessing of collective knowledge.

  Here is how the state of search looked in 1994. Nicholson Baker—in a later decade a Wikipedia obsessive; back then the world’s leading advocate for the preservation of card catalogues, old newspapers, and other apparently obsolete paper—sat at a terminal in a University of California library and typed, BROWSE SU[BJECT] CENSORSHIP.♦ He received an error message,

  LONG SEARCH: Your search consists of one or more very common words, which will retrieve over 800 headings and take a long time to complete,

  and a knuckle rapping:

  Long searches slow the system down for everyone on the catalog and often do not produce useful results. Please type HELP or see a reference librarian for assistance.

  All too typical. Baker mastered the syntax needed for Boolean searches with complexes of ANDs and ORs and NOTs, to little avail. He cited research on screen fatigue and search failure and information overload and admired a theory that electronic catalogues were “in effect, conducting a program of ‘aversive operant conditioning’ ” against online search.

  Here is how the state of search looked two years later, in 1996. The volume of Internet traffic had grown by a factor of ten each year, from 20 terabytes a month worldwide in 1994 to 200 terabytes a month in 1995, to 2 petabytes in 1996. Software engineers at the Digital Equipment Corporation’s research laboratory in Palo Alto, California, had just opened to the public a new kind of search engine, named AltaVista, continually building and revising an index to every page it could find on the Internet—at that point, tens of millions of them. A search for the phrase truth universally acknowledged and the name Darcy produced four thousand matches. Among them:

  The complete if not reliable text of Pride and Prejudice, in several versions, stored on computers in Japan, Sweden, and elsewhere, downloadable free or, in one case, for a fee of $2.25.

  More than one hundred answers to the question, “Why did the chicken cross the road?” including “Jane Austen: Because it is a truth universally acknowledged that a single chicken, being possessed of a good fortune and presented with a good road, must be desirous of crossing.”

  The statement of purpose of the Princeton Pacific Asia Review: “The strategic importance of the Asia Pacific is a truth universally acknowledged …”

  An article about barbecue from the Vegetarian Society UK: “It is a truth universally acknowledged among meat-eaters that …”

  The home page of Kevin Darcy, Ireland. The home page of Darcy Cremer, Wisconsin. The home page and boating pictures of Darcy Morse. The vital statistics of Tim Darcy, Australian footballer. The résumé of Darcy Hughes, a fourteen-year-old yard worker and babysitter in British Columbia.

  Trivia did not daunt the compilers of this ever-evolving index. They were acutely aware of the difference between making a library catalogue—its target fixed, known, and finite—and searching a world of information without boundaries or limits. They thought they were onto something grand. “We have a lexicon of the current language of the world,”♦ said the project manager, Allan Jennings.

  Then came Google. Brin and Page moved their fledgling company from their Stanford dorm rooms into offices in 1998. Their idea was that cyberspace possessed a form of self-knowledge, inherent in the links from one page to another, and that a search engine could exploit this knowledge. As other scientists had done before, they visualized the Internet as a graph, with nodes and links: by early 1998, 150 million nodes joined by almost 2 billion links. They considered each link as an expression of value—a recommendation. And they recognized that all links are not equal. They invented a recursive way of reckoning value: the rank of a page depends on the value of its incoming links; the value of a link depends on the rank of its containing page. Not only did they invent it, they published it. Letting the Internet know how Google worked did not hurt Google’s ability to leverage the Internet’s knowledge.
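
  The recursive reckoning described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Google's production method: the function name `page_rank`, the toy graph, the damping value of 0.85, and the fixed iteration count are all choices made for this example. Each page repeatedly passes its current rank along its outgoing links, so a link counts for more when it comes from a page that itself ranks highly.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Estimate page ranks by repeated redistribution along links.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A link's value depends on the rank of its containing page,
                # divided among all of that page's outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page web: a, b, and d all recommend c; c recommends a.
    toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, value in sorted(page_rank(toy_web).items()):
        print(page, round(value, 3))
```

  In the toy graph, page a has a single incoming link and still outranks b and d, because that one link comes from the highly ranked page c. All links are not equal.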

  At the same time, the rise of this network of all networks was inspiring new theoretical work on the topology of interconnectedness in very large systems. The science of networks had many origins and evolved along many paths, from pure mathematics to sociology, but it crystallized in the summer of 1998, with the publication of a letter to Nature from Duncan Watts and Steven Strogatz. The letter had three things that combined to make it a sensation: a vivid catchphrase, a nice result, and a surprising assortment of applications. It helped that one of the applications was All the World’s People. The catchphrase was small world. When two strangers discover that they have a mutual friend—an unexpected connection—they may say, “It’s a small world,” and it was in this sense that Watts and Strogatz talked about small-world networks.

  The defining quality of a small-world network is the one unforgettably captured by John Guare in his 1990 play, Six Degrees of Separation. The canonical explanation is this:

  I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation. Between us and everyone else on this planet. The President of the United States. A gondolier in Venice. Fill in the names.♦

  The idea can be traced back to a 1967 social-networking experiment by the Harvard psychologist Stanley Milgram and, even further, to a 1929 short story by a Hungarian writer, Frigyes Karinthy, titled “Láncszemek”—Chains.♦ Watts and Strogatz took it seriously: it seems to be true, and it is counterintuitive, because in the kinds of networks they studied, nodes tended to be highly clustered. They are cliquish. You may know many people, but they tend to be your neighbors—in a social space, if not literally—and they tend to know mostly the same people. In the real world, clustering is ubiquitous in complex networks: neurons in the brain, epidemics of infectious disease, electric power grids, fractures and channels in oil-bearing rock. Clustering alone means fragmentation: the oil does not flow, the epidemics sputter out. Faraway strangers remain estranged.

  But some nodes may have distant links, and some nodes may have an exceptional degree of connectivity. What Watts and Strogatz discovered in their mathematical models is that it takes astonishingly few of these exceptions—just a few distant links, even in a tightly clustered network—to collapse the average separation to almost nothing and create a small world.♦ One of their test cases was a global epidemic: “Infectious diseases are predicted to spread much more easily and quickly in a small world; the alarming and less obvious point is how few short cuts are needed to make the world small.”♦ A few sexually active flight attendants might be enough.
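
  The effect of a few distant links can be checked on a toy model. The sketch below is not the Watts and Strogatz procedure itself, which rewires existing links at random; as a simplifying assumption it starts from a clustered ring of neighbors and adds a handful of hand-picked long-range links, then measures the average separation with breadth-first search. The function names, the network size, and the chosen shortcuts are all hypothetical.

```python
from collections import deque


def ring_lattice(n, k):
    """Build a clustered ring: each node is linked to its k nearest
    neighbors on either side."""
    graph = {node: set() for node in range(n)}
    for node in range(n):
        for step in range(1, k + 1):
            neighbor = (node + step) % n
            graph[node].add(neighbor)
            graph[neighbor].add(node)
    return graph


def add_shortcuts(graph, shortcuts):
    """Add a few distant links, given as (a, b) pairs, in place."""
    for a, b in shortcuts:
        graph[a].add(b)
        graph[b].add(a)


def average_separation(graph):
    """Mean shortest-path length over all reachable pairs of nodes."""
    total = 0
    pairs = 0
    for source in graph:
        # Breadth-first search gives hop counts from this source.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
        pairs += len(distance) - 1
    return total / pairs


if __name__ == "__main__":
    world = ring_lattice(200, 2)
    print("clustered ring:", average_separation(world))
    # Four long-range links among 400 local ones.
    add_shortcuts(world, [(0, 100), (50, 150), (25, 125), (75, 175)])
    print("with shortcuts:", average_separation(world))
```

  Running it prints the average separation before and after, so the size of the drop can be compared against how few links were added.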

  In cyberspace, almost everything lies in the shadows. Almost everything is connected, too, and the connectedness comes from a relatively few nodes, especially well linked or especially well trusted. However, it is one thing to prove that every node is close to every other node; that does not provide a way of finding the path between them. If the gondolier in Venice cannot find his way to the president of the United States, the mathematical existence of their connection may be small comfort. John Guare understood this, too; the next part of his Six Degrees of Separation explanation is less often quoted:

  I find that A) tremendously comforting that we’re so close, and B) like Chinese water torture that we’re so close. Because you have to find the right six people to make the connection.

  There is not necessarily an algorithm for that.

  The network has a structure, and that structure stands upon a paradox. Everything is close, and everything is far, at the same time. This is why cyberspace can feel not just crowded but lonely. You can drop a stone into a well and never hear a splash.

  No deus ex machina waits in the wings; no man behind the curtain. We have no Maxwell’s demon to help us filter and search. “We want the Demon, you see,” wrote Stanislaw Lem, “to extract from the dance of atoms only information that is genuine, like mathematical theorems, fashion magazines, blueprints, historical chronicles, or a recipe for ion crumpets, or how to clean and iron a suit of asbestos, and poetry too, and scientific advice, and almanacs, and calendars, and secret documents, and everything that ever appeared in any newspaper in the Universe, and telephone books of the future.”♦ As ever, it is the choice that informs us (in the original sense of that word). Selecting the genuine takes work; then forgetting takes even more work. This is the curse of omniscience: the answer to any question may arrive at the fingertips—via Google or Wikipedia or IMDb or YouTube or Epicurious or the National DNA Database or any of their natural heirs and successors—and still we wonder what we know.

  We are all patrons of the Library of Babel now, and we are the librarians, too. We veer from elation to dismay and back. “When it was proclaimed that the Library contained all books,” Borges tells us, “the first impression was one of extravagant happiness. All men felt themselves to be the masters of an intact and secret treasure. There was no personal or world problem whose eloquent solution did not exist in some hexagon. The universe was justified.”♦ Then come the lamentations. What good are the precious books that cannot be found? What good is complete knowledge, in its immobile perfection? Borges worries: “The certitude that everything has been written negates us or turns us into phantoms.” To which, John Donne had replied long before, “He that desires to print a book, should much more desire, to be a book.”♦

  The library will endure; it is the universe. As for us, everything has not been written; we are not turning into phantoms. We walk the corridors, searching the shelves and rearranging them, looking for lines of meaning amid leagues of cacophony and incoherence, reading the history of the past and of the future, collecting our thoughts and collecting the thoughts of others, and every so often glimpsing mirrors, in which we may recognize creatures of the information.

  Acknowledgments

  I am indebted and grateful to Charles H. Bennett, Gregory J. Chaitin, Neil J. A. Sloane, Susanna Cuyler, Betty Shannon, Norma Barzman, John Simpson, Peter Gilliver, Jimmy Wales, Joseph Straus, Craig Townsend, Janna Levin, Katherine Bouton, Dan Menaker, Esther Schor, Siobhan Roberts, Douglas Hofstadter, Martin Seligman, Christopher Fuchs, the late John Archibald Wheeler, Carol Hutchins, and Betty Alexandra Toole; also my agent, Michael Carlisle, and, as always, for his brilliance and his patience, my editor, Dan Frank.

  Notes

  PROLOGUE

  ♦ MY MIND WANDERS AROUND: Robert Price, “A Conversation with Claude Shannon: One Man’s Approach to Problem Solving,” IEEE Communications Magazine 22 (1984): 126.

  ♦ TRANSISTOR … BIT: The committee got transistor from John R. Pierce; Shannon got bit from John W. Tukey.

  ♦ SHANNON SUPPOSEDLY BELONGED: Interview, Mary Elizabeth Shannon, 25 July 2006.

  ♦ BY 1948 MORE THAN 125 MILLION: Statistical Abstract of the United States 1950. More exactly: 3,186 radio and television broadcasting stations, 15,000 newspapers and periodicals, 500 million books and pamphlets, and 40 billion pieces of mail.

  ♦ CAMPBELL’S SOLUTION: George A. Campbell, “On Loaded Lines in Telephonic Transmission,” Philosophical Magazine 5 (1903): 313.

  ♦ “THEORIES PERMIT CONSCIOUSNESS TO ‘JUMP OVER ITS OWN SHADOW’ ”: Hermann Weyl, “The Current Epistemological Situation in Mathematics” (1925), quoted in John L. Bell, “Hermann Weyl on Intuition and the Continuum,” Philosophia Mathematica 8, no. 3 (2000): 261.

  ♦ “SHANNON WANTS TO FEED NOT JUST DATA”: Andrew Hodges, Alan Turing: The Enigma (London: Vintage, 1992), 251.

  ♦ “OFF AND ON … I HAVE BEEN WORKING”: Letter, Shannon to Vannevar Bush, 16 February 1939, in Claude Elwood Shannon, Collected Papers, ed. N. J. A. Sloane and Aaron D. Wyner (New York: IEEE Press, 1993), 455.

  ♦ “NOWE USED FOR AN ELEGANT WORDE”: Thomas Elyot, The Boke Named The Governour (1531), III: xxiv.

  ♦ “MAN THE FOOD-GATHERER REAPPEARS”: Marshall McLuhan, Understanding Media: The Extensions of Man (New York: McGraw-Hill, 1965), 302.

  ♦ “WHAT LIES AT THE HEART OF EVERY LIVING THING”: Richard Dawkins, The Blind Watchmaker (New York: Norton, 1986), 112.

  ♦ “THE INFORMATION CIRCLE BECOMES THE UNIT OF LIFE”: Werner R. Loewenstein, The Touchstone of Life: Molecular Information, Cell Communication, and the Foundations of Life (New York: Oxford University Press, 1999), xvi.

  ♦ “EVERY IT—EVERY PARTICLE, EVERY FIELD OF FORCE”: John Archibald Wheeler, “It from Bit,” in At Home in the Universe (New York: American Institute of Physics, 1994), 296.

  ♦ “THE BIT COUNT OF THE COSMOS”: John Archibald Wheeler, “The Search for Links,” in Anthony J. G. Hey, ed., Feynman and Computation (Boulder, Colo.: Westview Press, 2002), 321.

  ♦ “NO MORE THAN 10^120 OPS”: Seth Lloyd, “Computational Capacity of the Universe,” Physical Review Letters 88, no. 23 (2002).

  ♦ “TOMORROW … WE WILL HAVE LEARNED TO UNDERSTAND”: John Archibald Wheeler, “It from Bit,” 298.

  ♦ “IT IS HARD TO PICTURE THE WORLD BEFORE SHANNON”: John R. Pierce, “The Early Days of Information Theory,” IEEE Transactions on Information Theory 19, no. 1 (1973): 4.

  ♦ “NUMBERS TOO, CHIEFEST OF SCIENCES”: Aeschylus, Prometheus Bound, trans. H. Smyth, 460–61.

  ♦ “THE INVENTION OF PRINTING, THOUGH INGENIOUS”: Thomas Hobbes, Leviathan (London: Andrew Crooke, 1660), ch. 4.

  1. DRUMS THAT TALK

  ♦ “ACROSS THE DARK CONTINENT SOUND”: Irma Wassall, “Black Drums,” Phylon Quarterly 4 (1943): 38.

  ♦ “MAKE YOUR FEET COME BACK”: Walter J. Ong, Interfaces of the Word (Ithaca, N.Y.: Cornell University Press, 1977), 105.

  ♦ IN 1730 FRANCIS MOORE SAILED EASTWARD: Francis Moore, Travels into the Inland Parts of Africa (London: J. Knox, 1767).

  ♦ “SUDDENLY HE BECAME TOTALLY ABSTRACTED”: William Allen and Thomas R. H. Thompson, A Narrative of the Expedition to the River Niger in 1841, vol. 2 (London: Richard Bentley, 1848), 393.

  ♦ A MISSIONARY, ROGER T. CLARKE: Roger T. Clarke, “The Drum Language of the Tumba People,” American Journal of Sociology 40, no. 1 (1934): 34–48.

  ♦ “VERY OFTEN ARRIVING BEFORE THE MESSENGERS”: G. Suetonius Tranquillus, The Lives of the Caesars, trans. John C. Rolfe (Cambridge, Mass.: Harvard University Press, 1998), 87.

  ♦ “YET WHO SO SWIFT COULD SPEED THE MESSAGE”: Aeschylus, Agamemnon, trans. Charles W. Eliot, 335.

  ♦ A GERMAN HISTORIAN, RICHARD HENNIG: Gerard J. Holzmann and Björn Pehrson, The Early History of Data Networks (Washington, D.C.: IEEE Computer Society, 1995), 17.

  ♦ A “CONCEIT … WHISPERED THOROW THE WORLD”: Thomas Browne, Pseudodoxia Epidemica: Or, Enquiries Into Very Many Received Tenents, and Commonly Presumed Truths, 3rd ed. (London: Nath. Ekins, 1658), 59.

  ♦ IN ITALY A MAN TRIED TO SELL GALILEO: Galileo Galilei, Dialogue Concerning the Two Chief World Systems: Ptolemaic and Copernican, trans. Stillman Drake (Berkeley, Calif.: University of California Press, 1967), 95.

  ♦ “A SYSTEM OF SIGNS FOR LETTERS”: Samuel F. B. Morse: His Letters and Journals, vol. 2, ed. Edward Lind Morse (Boston: Houghton Mifflin, 1914), 12.

  ♦ “THE DICTIONARY OR VOCABULARY CONSISTS OF WORDS”: U. S. Patent 1647, 20 June 1840, 6.

  ♦ “THE SUPERIORITY OF THE ALPHABETIC MODE”: Samuel F. B. Morse, letter to Leonard D. Gale, in Samuel F. B. Morse: His Letters and Journals, vol. 2, 65.

 
