by Jon Gertner
“The fundamental problem of communication,” Shannon’s paper explained, “is that of reproducing at one point either exactly or approximately a message selected at another point.” Perhaps that seemed obvious, but Shannon went on to show why it was profound. If “universal connectivity” remained the goal at Bell Labs—if indeed the telecommunications systems of the future, as Kelly saw it, would be “more like the biological systems of man’s brain and nervous system”—then the realization of those dreams didn’t only depend on the hardware of new technologies, such as the transistor. A mathematical guide for the system’s engineers, a blueprint for how to move data around with optimal efficiency, which was what Shannon offered, would be crucial, too. Shannon maintained that all communications systems could be thought of in the same way, regardless of whether they involved a lunchroom conversation, a postmarked letter, a phone call, or a radio or television transmission. Messages all followed the same fairly simple pattern:
[Shannon’s schematic of a general communication system: information source → transmitter → signal → noisy channel (noise source) → received signal → receiver → destination]
All messages, as they traveled from the information source to the destination, faced the problem of noise. This could be the background clatter of a cafeteria, or it could be static (on the radio) or snow (on television). Noise interfered with the accurate delivery of the message. And every channel that carried a message was, to some extent, a noisy channel.
To a non-engineer, Shannon’s drawing seemed sensible but didn’t necessarily explain anything. His larger point, however, which his mathematical proofs established, was that there were ways to make sure messages got where they were supposed to go, clearly and reliably. The first place to start, Shannon suggested, was to think about the information within a message. The semantic aspects of communication were irrelevant to the engineering problem, he wrote. Or to say it another way: One shouldn’t necessarily think of information in terms of meaning. Rather, one might think of it in terms of its ability to resolve uncertainty. Information provided a recipient with something that was not previously known, was not predictable, was not redundant. “We take the essence of information as the irreducible, fundamental underlying uncertainty that is removed by its receipt,” a Bell Labs executive named Bob Lucky explained some years later.30 If you send a message, you are merely choosing from a range of possible messages. The less the recipient knows about what part of the message comes next, the more information you are sending. Some language choices, Shannon’s research suggested, occur with certain probabilities, and some messages are more frequent than others; this fact made it possible to calculate precisely how much information these words or messages contained.31 (Shannon’s favorite example was that one might need to know that the word “quality” begins with q, but not that a u follows. The u gives a recipient no information if they already have the q, since u always follows q; it can be filled in by the recipient.)
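To make that arithmetic concrete: by Shannon’s measure, an outcome of probability p resolves log2(1/p) bits of uncertainty, so a certain outcome resolves none at all. The brief Python sketch below is purely illustrative; the function name and figures are ours, not Shannon’s:

import math

def information_bits(probability: float) -> float:
    # Shannon's measure: an outcome of probability p resolves log2(1/p) bits of uncertainty.
    return math.log2(1 / probability)

print(information_bits(0.5))    # a fair coin flip: 1.0 bit
print(information_bits(1.0))    # a certainty, like the u after a q: 0.0 bits
print(information_bits(1 / 8))  # a rarer outcome carries more: 3.0 bits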
Shannon suggested it was most useful to calculate a message’s information content and rate in a unit that he proposed engineers call “bits”—a word that had never before appeared in print with this meaning. Shannon had borrowed it from his Bell Labs math colleague John Tukey as an abbreviation of “binary digits.” The bit, Shannon explained, “corresponds to the information produced when a choice is made from two equally likely possibilities. If I toss a coin and tell you that it came down heads, I am giving you one bit of information about this event.” All of this could be summed up in a few points that might seem unsurprising to those living in the twenty-first century but were in fact startling—“a bolt from the blue,” as one of Shannon’s colleagues put it32—to those just getting over the Second World War: (1) All communications could be thought of in terms of information; (2) all information could be measured in bits; (3) all the measurable bits of information could be thought of, and indeed should be thought of, digitally. This could mean dots or dashes, heads or tails, or the on/off pulses that made up PCM. Or it could simply be a string of, say, five or six 1s and 0s, each grouping of numerical bits representing a letter or punctuation mark in the alphabet. For instance, in the American Standard Code for Information Interchange (ASCII), which was worked out several years after Shannon’s theory, the binary representation for FACT IS STRANGER THAN FICTION would be as follows:
010001100100000101000011010101000010000001001001010
1001100100000010100110101010001010010010000010100111
001000111010001010101001000100000010101000100100001
000001010011100010000001000110010010010100001101010
100010010010100111101001110
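Those digits can be produced mechanically. As an illustrative sketch (assuming the standard 8-bit ASCII codes shown above), a few lines of Python write out the same string:

def to_ascii_bits(message: str) -> str:
    # Concatenate each character's 8-bit ASCII code into one long bit string.
    return "".join(format(ord(ch), "08b") for ch in message)

bits = to_ascii_bits("FACT IS STRANGER THAN FICTION")
print(bits)       # the 232-bit string printed above
print(len(bits))  # 232 bits: 29 characters, 8 bits apiece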
Thus Shannon was suggesting that all information, at least from the view of someone trying to move it from place to place, was the same, whether it was a phone call or a microwave transmission or a television signal.
This was a philosophical argument, in many respects, and one that would only infiltrate the thinking of the country’s technologists slowly over the next few decades. To the engineers at the Labs, the practical mathematical arguments Shannon was also laying out made a more immediate impression. His calculations showed that the information content of a message could not exceed the capacity of the channel through which you were sending it. In much the same way that a pipe could carry only so many gallons of water per second and no more, a transmission channel could carry only so many bits of information per second and no more. Anything beyond that would reduce the quality of your transmission. The upshot was that by measuring the information capacity of your channel and by measuring the information content of your message you could know how fast, and how well, you could send your message. Engineers could now try to align the two—capacity and information content. For anyone who actually designed communications systems with wires or cables or microwave transmitters, Shannon had handed them not only an idea, but a new kind of yardstick.
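That yardstick would later be written as a formula, C = B log2(1 + S/N), relating a channel’s bandwidth and signal-to-noise ratio to the greatest number of bits it can carry per second. A small Python sketch, with illustrative figures that are assumptions rather than anything taken from Shannon’s paper, shows the calculation:

import math

def channel_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    # Shannon-Hartley limit: C = B * log2(1 + S/N), in bits per second.
    snr = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr)

# Rough figures for a voice-grade telephone line: about 3 kHz of bandwidth, 30 dB signal-to-noise.
print(round(channel_capacity_bps(3000, 30)))  # roughly 30,000 bits per second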
SHANNON’S PAPER contained a claim so surprising that it seemed impossible to many at the time, and yet it would soon be proven true. He showed that any digital message could be sent with virtual perfection, even along the noisiest wire, as long as you included error-correcting codes—essentially extra bits of information, formulated as additional 1s and 0s—with the original message. In his earlier paper on cryptography, Shannon had already shown that by reducing redundancy you could compress a message to transmit its content more efficiently. Now he was also demonstrating something like the opposite: that in some situations you could increase the redundancy of a message to transmit it more accurately. One might think of the simplest case: sending the message FCTSSTRNGRTHNFCTN FCTSSTRNGRTHNFCTN FCTSSTRNGRTHNFCTN—an intentional tripling—to improve the chances that the message was clearly received over a noisy channel. But Shannon showed that there were myriad and far more elegant ways to fashion these kinds of self-correcting codes. Some mathematicians at the Labs would spend their careers working out the details, which Shannon was largely uninterested in doing.33
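The tripling trick is the crudest member of this family: a repetition code, decoded by majority vote. The minimal Python sketch below is illustrative only, and nothing like the elegant codes that followed, but it shows why redundancy buys accuracy:

from collections import Counter

def encode_repeat(bits: str, copies: int = 3) -> str:
    # Send every bit three times.
    return "".join(bit * copies for bit in bits)

def decode_repeat(received: str, copies: int = 3) -> str:
    # Recover each original bit by majority vote over its copies.
    chunks = (received[i:i + copies] for i in range(0, len(received), copies))
    return "".join(Counter(chunk).most_common(1)[0][0] for chunk in chunks)

sent = encode_repeat("1011")    # '111000111111'
garbled = "110000111011"        # noise has flipped two of the twelve transmitted bits
print(decode_repeat(garbled))   # '1011' -- the original message survives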
All the error-correcting codes were meant to ensure that the information on the receiving end was exactly or very close to the message that left the transmission end. Even fifty years later, this idea would leave many engineers slack-jawed. “To make the chance of error as small as you wish?” Robert Fano, a friend and colleague of Shannon’s, later pointed out. “How he got that insight, how he even came to believe such a thing, I don’t know.” All modern communications engineering, from cell phone transmissions to compact discs and deep space communications, is based upon this insight.34
For the time being, knowledge of Shannon’s work largely remained confined to the mathematics and engineering community. But in 1949, a year after the communications paper appeared, it was republished in book form for a wider audience. Shortly thereafter communication theory became more commonly known as information theory. And eventually it would be recognized as the astounding achievement it was. Shannon’s Bell Labs colleagues came to describe it as “one of the great intellectual achievements of the twentieth century”; some years later, Shannon’s disciple Bob Lucky wrote, “I know of no greater work of genius in the annals of technological thought.”35 A few observers, in time, would place the work on a par with Einstein’s. Shannon himself refused to see the theory in such grand terms; he even urged a magazine journalist, in a genial prepublication letter in the early 1950s, to strip from a story any comparison of his theory to Einstein’s ideas on relativity: “Much as I wish it were,” he lamented, “communication theory is certainly not in the same league with relativity and quantum mechanics.”36
In any case, Shannon was ready to move on. He later quipped that at this point in time he was becoming highly interested in “useless” things—games, robots, and small machines he called automata. To some observers, Shannon’s new interests seemed to suggest that rather than advance his extraordinary insights on communications, he preferred to squander his genius on silly gadgetry or pointless questions, such as whether playing tic-tac-toe on an infinite sheet of graph paper would reveal patterns about tie games. Perhaps this was missing the point. Some of Shannon’s more curious pursuits were far less useless than they appeared, but he was growing increasingly reluctant to explain himself. People could see what he was doing; they could think whatever they wanted.
AFTER THE PUBLICATION of information theory, no one could dare tell Shannon what he should work on. Often he went into his office just like any other Bell Labs employee. But beginning in the early 1950s sometimes he didn’t go in at all, or he wandered in late and played chess the rest of the day, much to the consternation, and sometimes the envy, of his colleagues.37 When he was in the office at Murray Hill he would often work with his door closed, something virtually unheard of at Bell Labs. And yet that breach, too, was permissible for Claude Shannon. “He couldn’t have been in any other department successfully,” Brock McMillan recalls. “But then, there weren’t many other departments where people just sat and thought.”
Increasingly, Shannon began to arouse interest not only for the quality of his ideas but for his eccentricity—“for being a very odd man in so many ways,” as his math department colleague David Slepian puts it. “He was not an unfriendly person,” Slepian adds, “and he was very modest,” but those who knew him and extended a hand of friendship realized that he would inevitably fail to reciprocate. They had to seek him out. And in calling on him (knocking on his door, writing, visiting) one had to penetrate his shyness or elusiveness or—in the case of expecting a reply to a letter—his intractable habits of procrastination and his unwillingness to do anything that bored him.
Shortly after the information paper was published, he did make one significant social connection, marrying a young woman named Betty Moore who worked in the Labs’ computational department. Sometimes he and Betty would go to Labs functions—dinner at Jim Fisk’s house, where Bill Shockley and his wife would also be guests. But mainly the math department, a social group both in and outside of the office, would extend invitations to Shannon and he would politely refuse.
In the 1940s and 1950s, the members of Bell Labs’ math department liked to play a game after lunch. “I don’t know who invented this,” Brock McMillan recalls, “but it was called ‘Convergence in Webster.’ Someone was supposed to compose in his head a four-word sentence. And people would try to guess letters and words.” As the men would guess, the creator of the mystery sentence would stand before a blackboard filling in blank spaces (as in Hangman) and telling them whether their guesses fell alphabetically before or after his words. Thus they would gradually converge on the right words, as one might home in on a word he was seeking in Webster’s Dictionary. Fifty years later, the men could still recall their favorites. One time a supervisor dropped in, offered to take a turn as the leader, and drew blanks on the blackboard for what eventually turned out to be You Are All Fired.38 The mathematicians thought it hilarious.
Shannon was in this world but not of it. Just as he had stood in the doorway when he met Norma at the college party—the one where she had thrown popcorn at him to get his attention, but rather than enter her life he had brought her back to his—he would wander by his colleagues’ offices some afternoons after lunch, and if he saw “Convergence in Webster” in progress, he would lean against the doorjamb and watch.
In a math department that thrived on its collective intelligence—where members of the staff were encouraged to work on papers together rather than alone—this set him apart. But in some respects his solitude was interesting, too, for it had become a matter of some consideration at the Labs whether the key to invention was a matter of individual genius or collaboration. To those trying to routinize the process of innovation—the lifelong goal of Mervin Kelly, the Labs’ leader—there was evidence both for and against the primacy of the group. So many of the wartime and postwar breakthroughs—the Manhattan Project, radar, the transistor—were clearly group efforts, a compilation of the ideas and inventions of individuals bound together with common purposes and complementary talents. And the phone system, with its almost unfathomable complexity, was by definition a group effort. It was also the case, as Shockley would later point out, that by the middle of the twentieth century the process of innovation in electronics had progressed to the point that a vast amount of multidisciplinary expertise was needed to bring any given project to fruition. “Things are much more complex than they were probably when Mendel was breeding peas, in which case you would put them in a pot and collect the fruits, and then cover up the blossoms and have that suffice,” Shockley said, referring to the nineteenth-century scientist whose work provided the foundation for modern genetics. An effective solid-state group, for example, required researchers with material processing skills, chemical skills, electrical measurement skills, theoretical physics skills, and so forth. It was exceedingly unlikely to find all those talents in a single person.39
And yet Kelly would say at one point, “With all the needed emphasis on leadership, organization and teamwork, the individual has remained supreme—of paramount importance. It is in the mind of a single person that creative ideas and concepts are born.”40 There was an essential truth to this, too—John Bardeen suddenly suggesting to the solid-state group that they should consider working on the hard-to-penetrate surface states on semiconductors, for instance. Or Shockley, mad with envy, sitting in his Chicago hotel room and laying the groundwork for the junction transistor. Or Bill Pfann, who took a nap after lunch and awoke, as if from an edifying dream, with a new method for purifying germanium.
Of course, these two philosophies—that individuals as well as groups were necessary for innovation—weren’t mutually exclusive. It was the individual from whom all ideas originated, and the group (or the multiple groups) to which the ideas, and eventually the innovation responsibilities, were transferred. The handoffs often proceeded in logical progression: from the scientist pursuing with his colleagues a basic understanding in the research department, to the applied scientist working with a team on the development side, to a group of engineers working on production at Western Electric. What’s more, in the right environment, a group or a wise colleague could provoke an individual toward an insight, too. In the midst of Shannon’s career, some lawyers in the patent department at Bell Labs decided to study whether there was an organizing principle that could explain why certain individuals at the Labs were more productive than others. They discerned only one common thread: Workers with the most patents often shared lunch or breakfast with a Bell Labs electrical engineer named Harry Nyquist. It wasn’t the case that Nyquist gave them specific ideas. Rather, as one scientist recalled, “he drew people out, got them thinking.” More than anything, Nyquist asked good questions.41
Shannon knew Nyquist, too. And though Shannon worked alone, he would later tell an interviewer that the institution of Bell Labs (its intellectual environment, its people, its freedom, and, most important, the Bell System’s myriad technical challenges) deserved a fair amount of credit for his information theory.42
What the Labs couldn’t take credit for, though, was how Shannon, whether by provocation or intuition, seemed to anticipate a different era altogether. His genius was roughly equivalent to prescience. There was little doubt, even among the transistor’s inventors, that if Shockley’s team at Bell Labs had not gotten to the transistor first, someone else in the United States or in Europe would have soon after. A couple of years, at most.43 With Shannon’s startling ideas on information, it was one of the rare moments in history, an academic would later point out, “where somebody founded a field, stated all the major results, and proved most of them all pretty much at once.”44 Eventually, mathematicians would debate not whether Shannon was ahead of his contemporaries. They would debate whether he was twenty, or thirty, or fifty years ahead.
Eight
MAN AND MACHINE
Early in his Bell Labs career, Shannon had begun to conceive of his employer’s system—especially its vast arrangement of relays and switches that automatically connected callers—as more than a communications network. He saw it as an immense computer that was transforming and organizing society. This was not yet a conventional view, though it was one that Shockley, too, would soon adopt. As Shannon put it, the system, with its automatic switching mechanisms, was “a really beautiful example of a highly complex machine. This is in many ways the most complex machine that man has ever attempted, and in many ways also a most reliable one.” He was also intrigued by the fact that the phone system was built to be efficient and tremendously broad in its sweep but was not built to think in depth. In connecting callers, it did innumerable simple tasks over and over and over again. But he knew that other machines could be built for a contrary purpose, to be deep rather than broad, and he began thinking about how to do so. Soon after Shannon’s information theory was published, he started working on a computer program and a scientific paper for playing chess. “He was a very good chess player,” Shannon’s colleague Brock McMillan recalls. “He clobbered all the rest of us.” But there seemed little point in studying chess at an industrial lab for communications. While preparing for a radio interview around that time in Morristown, New Jersey—the town where Shannon and Betty had just moved into an apartment—he scribbled down some responses to the questioner: