Shannon, consummate applied mathematician and engineering theorist though he was, was also intrigued by what computers could do. He was interested in the computer as far more than an instrument of numeric computation. He envisioned machines that would design circuits, route telephone calls, perform symbolic (not numeric) mathematical computations, translate from one language to another, make strategic military decisions, reason logically, orchestrate melodies.50 In other words, he was interested in the computer as an intelligent being in the most eclectic sense of “intelligence.” He believed that such computers were not only theoretically possible, but also economically viable—that is, they were feasible from an engineering point of view.51
Shannon ruminated on these ideas in 1950. Notice that these possibilities formed the other face of the natural–artificial wall. Craik and von Neumann had dwelled on the possibility of regarding human neural activity and thought in computational terms. Shannon was envisioning computers as capable of the kind of neural processing that produces human intelligent thought and action. If Craik and von Neumann contemplated human cognition in mechanical terms, Shannon was imagining machines in human cognitive terms. But both perspectives converged on a common theme: computing (in some broad sense) and mind–brain processing were related.
The great French philosopher René Descartes (1596–1650) had famously uttered the dictum cogito ergo sum, “I think, therefore I am”—meaning, basically, that the very act of thinking is proof of one’s mental existence. Craik, von Neumann, and Shannon were among the first to claim that I compute, therefore I am. The “I” here could as well be the computer as a human being.
However, Shannon was not merely speculating on the possibility of machine intelligence. He was an applied mathematician above all else and, as such, he was interested in specific problems and their solutions. Thus, in November 1949, the editor of the venerable British Philosophical Magazine (with a pedigree reaching back to the 18th century) received a manuscript authored by Shannon titled “Programming a Computer for Playing Chess.”52 The article was published in March 1950.53 Less than a year earlier, the EDSAC and the Manchester Mark I had run their first programs.
VII
Chess is, of course, among the most sophisticated board games. As Shannon pointed out, the “problem” of chess playing is, on the one hand, extremely well defined in terms of the rules that determine legal chess moves and the goal of the game (to checkmate the opponent); on the other hand, achieving that goal is neither too trivial nor too difficult. To play well demands considerable thought.
These very characteristics had prompted others, long before Shannon, to try to design chess-playing machines. One such designer was Leonardo Torres y Quevedo, the Spanish engineer–inventor whose Essays in Automatics (1914) we encountered in Chapter 3, Section IX. Evidently, among the “thinking automata” Torres y Quevedo had envisioned was a chess-playing machine he designed in 1914 for playing an end game of king and rook against king. Torres y Quevedo’s machine played the side with king and rook, and could checkmate the human opponent in a few moves regardless of how the latter played.54
Shannon would have liked to design a special-purpose chess computer. Rather wistfully, he rejected the idea because of cost. So he began with the assumption that he would have at his disposal a stored-program digital computer along the lines available at the time. The challenge was to simulate a chess-playing machine on what was an “arithmetic computer.” Or, in present-centered language, to design a virtual chess machine that could be implemented on a typical stored-program computer.
Chess, unlike many other board or card games, has no chance element in it. Moreover, it is a game of “perfect information” in that each player can see all the pieces on the board at all times. Shannon refers to the monumental and seminal book on the interdisciplinary field of game theory invented in 1944 by the ever-fertile von Neumann and German-American economist Oskar Morgenstern (1902–1977).55 Along with cybernetics, information theory, and the computer, game theory was yet another of the extraordinary intellectual creations of the mid 1940s. Referring to the von Neumann–Morgenstern book, Shannon noted that any particular position (that is, configuration of the pieces on a chess board) at any time would lead to one of three possibilities: White winning, Black winning, or a draw. However, there is no algorithm that can determine, in a practical way, which of these situations prevails in any given position. Indeed, as Shannon noted, if that were the case, chess would not be at all interesting as a game.56
In principle, in any position, the following algorithm would work. The machine considers all possible next moves for itself. For each such next move it then considers all possible moves by its opponent. Then, for each of those, it considers all possible moves for itself—and so on until the end is reached. The outcome in each end situation will be a win, loss, or draw. By working backward from the end, the machine would determine whether the current position would force a win, loss, or draw.
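Shannon’s “in principle” procedure is ordinary backward induction over the game tree. It can be sketched on a toy stand-in for chess (a simple take-away game, not anything Shannon himself wrote): each player removes one or two stones from a pile, and whoever takes the last stone wins.

```python
def game_value(stones, white_to_move):
    """Exhaustive backward induction: +1 if White can force a win,
    -1 if Black can, assuming perfect play on both sides."""
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if white_to_move else +1
    values = [game_value(stones - k, not white_to_move)
              for k in (1, 2) if stones - k >= 0]
    # White chooses the maximizing move, Black the minimizing one.
    return max(values) if white_to_move else min(values)
```

With one or two stones, White (to move) wins outright; with three, every move loses. In chess the same recursion would also have to return a draw value, and, as Shannon observed, it is hopeless in practice.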
In current language, this strategy is known as exhaustive search or brute-force search.57 And, to repeat, this strategy would work in principle. In practice, however, even using a high-speed electronic computer, the amount of computation required would be unimaginably large. Shannon, referring to the pioneering experimental studies of chess play conducted by Dutch psychologist and chess master Adriaan de Groot (1914–2006) in the mid 1940s, noted that, in typical positions, there are about 30 possible legal moves.58 Shannon estimated that, assuming a typical game to last about 40 moves to resignation of one of the players, something like 10¹²⁰ alternatives would have to be considered from the initial position. Assuming the computer could ascertain one alternate position each microsecond, he calculated that something like 10⁹⁰ years would be required to compute an optimal move.59
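The order of magnitude is easy to reproduce from Shannon’s own figures (a branching factor of about 30 legal moves, and games of about 40 moves per player, that is, 80 half-moves). The arithmetic below is a back-of-envelope sketch, not his calculation:

```python
import math

BRANCHING = 30    # Shannon's estimate of legal moves per position
HALF_MOVES = 80   # ~40 moves by each player

# Exponent of the number of distinct lines of play: log10(30^80).
variations_exponent = HALF_MOVES * math.log10(BRANCHING)
print(round(variations_exponent))   # 118, i.e. on the order of 10^120
```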
So exhaustive search was ruled out. More practical strategies were required. Rather than dream of a “perfect chess” machine, the aim should be to produce a machine that could perform at the level of a “good” human chess player.60 This would involve a strategy that evaluated the “promise” of a position P using some appropriate “evaluation function” f(P), with the “promise” depending on such considerations as the overall positions of the pieces on the board at any particular stage of the game, the number of Black and White pieces on the board, and so on.61
Shannon gave an example of an evaluation function f(P) for a position P that assigns different “weights” (measures of importance) to the various types of chess pieces on the board. Assuming that the machine explores only one move deep—that is, it explores only its own next move—a possible strategy would suppose that M1, M2, …, Mn are the possible moves that can be made by the machine in a position P. If M1P, M2P, and so forth, signify the resulting positions when M1, M2, and so on, are made, then the machine chooses the move Mq that maximizes the evaluation function f(MqP).62
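A one-move-deep version of this strategy can be sketched as follows. The piece weights are of the kind Shannon used (queen 9, rook 5, bishop and knight 3, pawn 1); the position representation and the move generator are hypothetical placeholders, not his code:

```python
# Material weights of the kind Shannon used in his example.
WEIGHTS = {'Q': 9, 'R': 5, 'B': 3, 'N': 3, 'P': 1}

def f(position):
    """Evaluation function f(P): the machine's material minus the
    opponent's. A position here is a list of (piece, is_machine) pairs."""
    return sum(WEIGHTS.get(piece, 0) * (1 if mine else -1)
               for piece, mine in position)

def best_move(position, legal_moves, apply_move):
    """One ply deep: choose the move Mq that maximizes f(Mq P)."""
    return max(legal_moves(position),
               key=lambda m: f(apply_move(m, position)))
```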
A “deeper” strategy would consider the opponent’s response—that is, the strategy would “look ahead” to the opponent’s possible moves. However, if the machine is playing White, Black’s reply to a White move would endeavor to minimize f(P). So if White plays Mi, Black would choose the move Mij such that f(MijMiP) is a minimum. White should therefore choose his first move such that f is a maximum after Black makes his minimizing (that is, best) reply.
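That two-ply look-ahead (maximize, over the machine’s moves, the minimum over the opponent’s replies) can be sketched generically; the move generator, position update, and evaluation function f are hypothetical parameters:

```python
def best_move_two_ply(position, legal_moves, apply_move, f):
    """Choose White's move Mi maximizing, over Black's replies Mij,
    the minimum of f(Mij Mi P)."""
    def value_after_black_reply(mi):
        p1 = apply_move(mi, position)
        replies = legal_moves(p1)
        if not replies:               # no legal reply: score p1 itself
            return f(p1)
        return min(f(apply_move(mj, p1)) for mj in replies)
    return max(legal_moves(position), key=value_after_black_reply)
```

On a tiny two-level tree whose leaf values are {A: (3, 7), B: (5, 6)}, White correctly prefers B: its guaranteed value of 5 beats A’s guaranteed 3, even though A contains the largest single leaf.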
Shannon described a simple version of what would later be called a minimax strategy in game play.63 He did not actually write a program to play chess; rather, he explored possible strategies that would lead to the development of a practical, “virtual” chess-playing machine to play what in chess is called the middle game.64
It is worth noting that, in deciding on a move to make using some kind of minimax strategy, Shannon was advocating the construction of a plan that would consider alternative moves, anticipate the opponent’s moves in response up to a certain “depth” of look-ahead, and then decide on the actual move to play. This was precisely what Craik had envisioned in his discussion on the nature of thought in The Nature of Explanation (1943) (see Section II, this chapter).
VIII
Shannon’s 1950 article was (probably) the first publication on the possibility of a chess-playing program. In addition, it is fair to say that the article marked the beginning of a distinct branch of the emerging computer science later called artificial intelligence.
But why should a computer capable of playing chess at the same level of skill as a good human chess player be deemed “intelligent” (in the ordinary human sense), in contrast to a computer capable of solving differential equations of the sort Stanley Gill explored for his PhD dissertation (see Chapter 10, Section IV)? After all, the rules of chess are quite simple, and most people can learn to play chess, at least at a basic level, whereas one must have considerable mathematical acumen to learn to solve differential equations. In what significant sense is the problem of playing chess (or other similar board games, such as checkers) a greater mark of intelligence than the problem of solving differential equations?
Shannon addressed this issue, albeit briefly. For the kinds of problems he identified at the beginning of his 1950 article—machines that could design, translate, make decisions, perform logical deductions, and so on—the procedures performed entailed making judgments, trying out something to see if it works, and trying something else if it does not. The solutions of such problems were never just right or wrong, but rather spanned a spectrum of possibilities, from the very best to the very worst and several shades in between. Thus, a solution might be one that is acceptably good rather than the very best.65
These are significant insights, anticipating much that will follow in this story. What strikes us most immediately is that problems such as chess entail ingredients of ordinary thought that humans negotiate on an everyday basis—with all the uncertainties, propensity for error, limited rationality, and subjectivity attendant on such thinking. These were the challenges Shannon broached—and that Alan Turing would boldly confront.
IX
Perhaps it was no coincidence that among the possible things Shannon believed the computer could be programmed to do was create a machine that could translate between languages.66 Automatic translation was much in the mind of Shannon’s coauthor on information theory, Warren Weaver. In 1949, Weaver wrote a memorandum simply titled Translation, in which, referring to himself somewhat archly in the third person, he remembered how his wartime experience with computing machines had led him to think about automatic translation.67
Even more than Shannon’s deliberations on a chess-playing machine, Weaver’s memorandum reveals the optimism, bordering on brashness, that attended the thinking of early scientists concerned with the application of the digital computer. In the realm of translation, we have previously witnessed Herman Goldstine and von Neumann grapple with the problem of coding (programming) as an act of translation—mapping from a mathematical culture to machine culture—and producing a computational text from mathematical text (see Chapter 9, Section III). We have witnessed David Wheeler invent an artificial symbolic language (“assembly language”) to write programs in and a means (an “assembler”) to translate such programs to machine-executable form (“machine language”; see Chapter 9, Section VI).
What Weaver was contemplating was of a different qualitative order altogether—translating text from one natural language to another using the computer. As it turned out, Weaver’s memorandum of 1949 marked the beginning of a discipline straddling linguistics and computing called machine translation.68
We have also previously noted that translating literary text is a creative process involving not only the conversion of words in one language to another, but a mapping of one linguistic culture to another (see Chapter 9, Section III), wherein the translation enacts a complicated interweaving of understanding and interpretation.69 The prospect of machine translation, in the literary sense of translation, thus seems still more formidable.
But Weaver was not alone in this contemplation. Across the Atlantic, at Birkbeck College, London University, Andrew D. Booth (1918–2009), a physicist working in Desmond Bernal’s famed laboratory—and, like many scientists of the time, drawn into computers and computing through his particular brand of research—was also dwelling on the possibility of machine translation.70 However, Booth (who may have influenced Weaver, who visited the Englishman in 194871) was, at the time, concerned with mechanizing a dictionary.72 Weaver had more vaulting ambitions, in which such language issues as polysemy (that is, the phenomenon of multiple meanings of words) and word order would enter the frame.
Weaver was neither a linguist nor a literary translator. As a mathematician turned science policy shaper (employed by the Rockefeller Foundation) and as Shannon’s collaborator on a mathematical theory of communication, Weaver was drawn to cryptography as a source of analogical insight into the problem of machine translation. Airily and rather extraordinarily, he confessed to being “tempted” to propose that a book written in Chinese is nothing but a book written in English but using Chinese code.73 For Weaver, the act of translation became a problem of deciphering—a position that would surely make translators, translation theorists, and literary scholars wince. Possible methods of cryptology would become, he said, when properly interpreted, “useful methods of translation.”74
These “useful methods of translation” used in cryptography could have an interesting attribute. Deciphering a message was a process that made use of “frequencies of letters, letter combinations, interval between letters and letter combinations, letter patterns, etc.”75 So translation, from this point of view, was a process that entailed finding certain statistical regularities in texts. But these regularities, Weaver continued, were—broadly speaking—language independent.76 This meant, according to him, that among all the languages invented by humankind, there were certain properties that were statistically invariant across the languages.77
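The simplest of the regularities Weaver had in mind are letter frequencies. A minimal sketch (the sample sentence below is a placeholder):

```python
from collections import Counter

def letter_frequencies(text):
    """Relative frequencies of letters in a text, the sort of
    statistical regularity cryptanalysts exploit and that Weaver
    suggested was broadly invariant across languages."""
    letters = [c for c in text.lower() if c.isalpha()]
    return {c: n / len(letters) for c, n in Counter(letters).items()}

freqs = letter_frequencies("Statistical regularities in ordinary text.")
```

Applied to any sizable English text, such a tally puts e, t, and a near the top, and the frequencies of letter pairs (th, he, in) show equally stable patterns.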
Languages, in other words (as Weaver saw them), had, deep down, certain invariant characteristics. This suggested to Weaver a way of tackling the problem of machine translation. Rather than translate directly from one natural language to another—say, from Russian to Portuguese—perhaps the proper strategy is to translate “down” from the source language to the shared base language, to an as-yet-undiscovered “universal language,” and then translate “up” from the latter to the target language.78
Weaver’s conception of a “universal language” is striking because it seems to anticipate, in a general sort of way, the idea of “linguistic universals”—linguistic features common to (most) languages, such as the presence of the categories nouns and verbs, patterns of word order—which would be discussed by linguists during the 1960s.79 At any rate, it is at the level of a universal language that Weaver believed that human communication takes place, and it would be at this level that the process of machine translation should begin. How this would happen he did not elaborate. He recognized that much work on the logical structure of language would have to be done before automatic translation could be tackled effectively. However, in any case, regardless of whether the approach he was advocating led to success in machine translation, it would surely produce “much useful knowledge about the general nature of communication.”80
Machine translation has to do with natural language, the most human of human characteristics, one that separates humans from nonhumans. Yet, strangely enough, machine translation never quite penetrated into the innards of artificial intelligence in the way computer chess would; rather, it became an enterprise and a research tradition of its own. But, like the computer chess project, the machine translation enterprise as imagined by Weaver turned out to be a far more difficult problem than early machine translation researchers had anticipated. In any case, within a decade of the Weaver memorandum, the study of the structure of language itself would be turned topsy-turvy by a young linguist named Noam Chomsky.
X
Shannon was by no means the only person at the time who was thinking about thinking machines. Indeed, soon after the ENIAC seeped into public consciousness, the term electronic brain began to appear in the popular press. No less a personage than Lord Louis Mountbatten (1900–1979) used the term to describe the ENIAC in a talk he delivered to the (British) Institute of Radio Engineers (IRE) in 1946.81 As Sir Maurice Wilkes recollected in 1985, this reference to computers as electronic brains excited much debate in the British press.82 An American participant in the early development of the commercial computer, Edmund C. Berkeley (1909–1988), a founding member of the ACM in 1947 (see Chapter 8, Section XVI) and its first secretary, published in 1949 a book called Giant Brains, or Machines That Think.83 So the climate was already in place for serious discussions of thinking machines in the later 1940s.
Indeed, well before Shannon’s manuscript on computer chess was submitted for publication, Turing had dwelt on the topic. Like Shannon, he was much interested in the uses of the digital computer. As early as 1945, in his definitive report on the ACE being developed at the NPL in Teddington (see Chapter 8, Section IX), Turing asked whether a machine could play chess.84 Three years later, he submitted a report to the NPL, then still his employer, titled Intelligent Machinery.85
However, Turing’s thoughts on thinking machines came into public notice most famously—insofar as an article in one of England’s most august philosophical journals could be said to excite “public” notice—with an article titled “Computing Machinery and Intelligence” in the October 1950 issue of Mind.86
XI
Turing began with the question: Can machines think? But, wishing to avoid the pitfalls of defining such terms as machine and think, he proposed a thought experiment that he called the “imitation game.” Imagine, first, he wrote, a man (M), a woman (W), and an interrogator (I) who may be of either sex. I is placed in a room separate from the room occupied by M and W. The purpose of the game is for I to put questions to M and W (without knowing which one of them is the man and which is the woman) in such a fashion that I can determine, from the answers, which is the man and which is the woman. The answers, of course, from M and W must be given in written or typewritten form to mask the sex of the answerer.
It Began with Babbage Page 25