by David Shenk
For Turing during World War II, chess was also particularly attractive as just about the only part of his intellectual life that was not top secret. Turing and his Bletchley Park colleagues could discuss chess problems anytime and anywhere without compromising their military work. One emerging thread of their discussions was the possibility of building a chess-playing machine, which would allow them to test their ideas about the mechanization of thought. They considered chess an excellent model for such a test. Among other attributes, it was an elegant example of what Princeton mathematics guru John von Neumann (a mentor of Turing) called games of perfect information, meaning that all variables were always known to all players. Nothing about the game was hidden. The same was true of less complex games like checkers, tic-tac-toe, and others—all of these games stood in contrast to poker, for example, where cards are concealed and players can bluff. In his work, von Neumann had established that each game of perfect information has an ideal “pure strategy”—a set of rules that will suit every possible contingency. Theoretically at least, the perfectly designed computer could play the perfect game of chess.
In 1946, as part of an exploration of what computers could potentially do, Turing became perhaps the first person to seriously broach the concept of machine intelligence. Chess was his vehicle for conveying the notion. “Can [a] machine play chess?” he asked, and then offered an answer:
It could fairly easily be made to play a rather bad game. It would be bad because chess requires intelligence…. There are indications however that it is possible to make the machine display intelligence at the risk of its making occasional serious mistakes. By following up this aspect the machine could probably be made to play very good chess.
Today the words fall flat on the page. Sixty years ago, they were revolutionary. The most startling word of all was intelligence, which Turing did not use casually. He was not merely talking about the ability to follow complex instructions. “What we want,” Turing explained, “is a machine that can learn from experience…[the] possibility of letting the machine alter its own instructions.” It was a stunning prognostication, and Turing is today revered for his vision. For someone surrounded at the time by machines not much smarter than a light switch to imagine a machine that could someday learn from mistakes and alter its own code was like an eighteenth-century stagecoach driver envisioning a sporty car with a hybrid engine and satellite navigation.
Two years later, in 1948, Turing and his colleague David Champernowne built a computer chess program called “Turochamp.” Compared to later such programs, it was extremely primitive. But at the time their program was too complex for the available hardware. Of the few actual computers in existence at that time, none of them was even remotely powerful enough to execute their software. So Turing himself became a machine: in a game against Champernowne, Turing followed the Turochamp instruction code as if he were the computer, making the computations by hand and moving his pieces accordingly. It took Turing about thirty minutes to calculate each move. Not surprisingly, the program lost to the experienced human chess player. But subsequently, it managed to beat Champernowne’s wife, a chess novice. Chess computing—and artificial intelligence (AI) itself—had taken its first baby step forward.
VERY EARLY ON, AI pioneers in the United States and Britain hit on a conceptual quandary: should they design machines to actually think like human beings—incorporating experience, recognizing patterns, and formulating and executing plans—or should they play to machines’ more obvious strength, their ability to conduct brute-force mathematical calculations? In his 1947 essay on machine intelligence, Turing had suggested that he would pursue the former—“letting the machine alter its own instructions.” But from a practical standpoint, he focused on the latter. Turing’s counterparts across the Atlantic, including MIT’s Claude Shannon, independently came to the same way of thinking. Like Turing, Shannon was fascinated by chess’s potential in the pursuit of what he called “mechanized thinking.” But he became convinced that computer chess and other AI pursuits should not be modeled on human thought. Unlike human brains, computers did not have scores of different specialized components that could read information, contextualize it, prioritize it, store it in different forms, recall it in a variety of ways, and then decide how to apply it; computers, at least as they were understood then, could calculate very quickly, following programmed instructions. This particular strength—and limitation—of computers suggested a different route for AI, a new sort of quasi-intelligence based on mathematical computation.
Chess would be a central proving ground for this new type of intelligence. Theoretically, at least, the game could be fully converted into one long mathematical formula. The board could be represented as a numerical map, pieces weighted according to their relative value, moves and board positions scored according to the numerical gain or loss that each would bring. But the scope of computation was immense—too much for the earliest computers to handle. One of the first was John von Neumann’s Maniac I at Los Alamos, New Mexico, built to help refine the American hydrogen bomb arsenal and set to playing chess in 1956. With 2,400 vacuum tubes, the machine could process a staggering ten thousand instructions per second. It could not, though, handle a full-scale chessboard. Playing a simplified version of chess on a board six squares by six with no Bishops, no castling, and no double-square first Pawn move, Maniac I required twelve minutes to look just two full moves ahead. (With Bishops, it would have needed three hours.) The machine did go on to help design potent nuclear warheads, but as a chess player it was pretty hopeless.
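To make the idea of a “numerical map” concrete, here is a minimal sketch in Python of a material-count evaluation in the spirit described above. The board encoding, the piece codes, and the exact weights are assumptions for illustration, not the scheme any early program actually used.

```python
# Illustrative sketch: pieces weighted by relative value, positions scored
# as a single number.  Encoding and weights are assumptions for demonstration.

PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def evaluate(board):
    """Score a position from White's point of view.

    `board` is assumed to be an iterable of piece codes such as "wP"
    (White pawn) or "bQ" (Black queen); empty squares are simply absent.
    """
    score = 0
    for piece in board:
        color, kind = piece[0], piece[1]
        value = PIECE_VALUES[kind]
        score += value if color == "w" else -value
    return score

# Example: White has an extra rook, so the evaluation is +5.
print(evaluate(["wK", "wR", "bK"]))  # 5
```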
The problem, put simply, was time. Even as programmers devised increasingly elegant equations to represent quasi-intelligent chess decisions, the computers still had to evaluate an overwhelming number of positions in order to play competently. Human experts could get around this problem using intuition to ferret out most of the silly moves and narrow their decision down to a few strong possibilities. But machines couldn’t do this. They had to consider all of the choices equally, and to look ahead as many moves as possible to see how each choice would play out. That amount of calculation took a lot of time. With an average of thirty-five or so options per turn, and then thirty-five different subsequent responses for each possible move, and so on, geometric progression quickly made the large number of calculations untenable for even a speedy computer. To look ahead a mere two moves each, the computer would have to evaluate 35 × 35 × 35 × 35 = 1,500,625 positions. Three full moves required analysis of 1,838,265,625 (nearly two billion) positions. Four moves: 2,251,875,390,625 (over two trillion) positions.
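The arithmetic can be checked directly. A throwaway Python sketch, assuming a constant branching factor of 35, reproduces the figures above:

```python
# Positions to examine when looking d full moves (2*d plies) ahead,
# assuming a constant branching factor of 35 moves per turn.
BRANCHING = 35

for full_moves in (2, 3, 4):
    positions = BRANCHING ** (2 * full_moves)
    print(full_moves, f"{positions:,}")

# 2 1,500,625
# 3 1,838,265,625
# 4 2,251,875,390,625
```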
This was not a short-term problem for engineers. Even as computers got faster and faster, they would not come remotely close to truly managing chess’s near-infinity. In fact, it seemed safe to predict that no machine would be able to overcome this problem for many centuries, if ever. Computer scientists estimated, for example, that a future computer examining moves five times faster than Deep Blue’s supercomputing 200 million positions per second would still need some 3.3 billion years to consider all of the possibilities for ten moves by each player.
In Bletchley Park during the war, Turing realized that the single most essential tool for the mathematizing of chess would be a technique developed by his Princeton mentor John von Neumann called minimax. Short for “minimizing the expected maximum loss,” minimax was essentially a method for choosing the least bad move. It came out of the logical recognition that, when competing in a game where one player’s success is another player’s failure (also called a zero-sum game), Player A will always try to choose the moves that are best for him and worst for Player B. It therefore behooves Player B to identify not his absolute ideal series of moves—since no worthy opponent will allow that course—but rather the moves which will deprive Player A of his best moves. Player B, in fact, wants to move in such a way as to leave Player A with his least best option, his minimum maximum.
The minimax logic applied to any game where all the information was known by every player—chess, checkers, tic-tac-toe, and so on. In practice, it required placing all game decisions onto an enormous “game tree,” with a single trunk (the current board position), some primary branches (all of White’s next possible moves), a larger number of secondary branches (all of Black’s possible responses to each of White’s next possible moves), and on and on, eventually ending with an abundance of “leaves” (representing the deepest level being analyzed at the time).
Imagine, for example, that each one of the following tiny squares is a chessboard with an array of pieces. This artificially simple chess game tree represents three possible moves by White, each one of which is followed by three possible moves by Black, each of which is followed by two possible moves by White:
The object is to determine the best of the possible moves by White at the beginning of the tree. Using the minimax procedure, the computer first scores each of the boards on the last level of the tree—the leaves. Imagine that, in the following diagram, a computer has examined each of the leaves and has scored each board according to the relative advantage for White. The highest numbers represent the better positions for White:
Now, according to minimax logic, the computer assigns scores to branches higher up on the tree—the moves happening earlier. First it determines the best board position from each of the leaf choices, and assigns those values to the next level up.
Then the computer considers how Black will move. Black will want the least advantaged position for White—the lowest score. So it moves those low scores up to the top level:
Now White must decide how to move. White will choose the move with the highest score—the board represented by the score of seven.
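The whole procedure fits in a few lines of code. Below is a minimal Python sketch of minimax over a toy tree shaped like the one described above: three White moves, three Black replies to each, and two White follow-ups. The leaf scores are invented for illustration, chosen so that the best score at the top works out to seven, as in the walkthrough:

```python
# Minimax over a toy game tree.  A leaf is just a number: the evaluation of
# that board for White (higher is better for White).  An inner node is a list
# of child positions.

def minimax(node, white_to_move):
    if isinstance(node, (int, float)):
        return node  # leaf: return its evaluation
    child_values = [minimax(child, not white_to_move) for child in node]
    # White picks the child with the highest score, Black the lowest.
    return max(child_values) if white_to_move else min(child_values)

# Root position, White to move, with three candidate moves (A, B, C).
tree = [
    [[3, 7], [9, 8], [7, 10]],   # move A: Black's three replies, two leaves each
    [[2, 5], [6, 4], [1, 3]],    # move B
    [[8, 6], [5, 2], [4, 9]],    # move C
]

values = [minimax(branch, white_to_move=False) for branch in tree]
print(values)                           # [7, 3, 5]
print(minimax(tree, white_to_move=True))  # 7: White plays move A
```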
Needless to say, if chess trees were anywhere near as simple as the demonstration above, they wouldn’t be necessary in the first place. Minimax enabled computers to sort through chess trees with millions of branches on the fourth and fifth branch levels, billions on the sixth and seventh levels, and trillions on the eighth level. In tackling such logical complexity, the technique emerged as far more than a computer chess tool. It became a foundation stone of modern game theory, applicable to war gaming, economics, coalition building, and a wide variety of other systems involving two or more competitive parties. It helped jump-start artificial intelligence research, and has since enabled us to look scientifically at human endeavors such as voter participation. Minimax enabled social scientists to mathematize large chunks of public life.
In the nearer term, it also quickly put computer chess programmers in a serious bind. By opening up chess to a nearly endless series of calculations, minimax made chess computing both possible and impossible. The equations could be continually improved to make better and better chess decisions, but it simply took too long for computers to analyze all the possibilities. Then, in 1956, American computer scientist and AI pioneer John McCarthy—he had actually coined the term artificial intelligence one year earlier—came up with an ingenious revision of minimax called alpha-beta pruning that allowed a computer to ignore certain leaves on a tree whose evaluations wouldn’t make a difference in the final result. Like the minimax concept, the idea wasn’t based on any particular insight into chess, but was a simple matter of logic: certain leaf evaluations are irrelevant if other leaf values on that same branch have already taken that particular branch out of contention. A computer instructed not to bother calculating such nonactionable leaves could accomplish its work in much less time.
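A rough sketch of how the cutoff works, again in Python and over the same toy tree as in the minimax sketch; this is a generic textbook formulation of alpha-beta pruning, not McCarthy’s original procedure. The counter simply shows how many of the eighteen leaves actually get evaluated:

```python
# Alpha-beta pruning: the same minimax answer, but siblings that cannot
# change the final choice are never examined.

def alphabeta(node, white_to_move, alpha=float("-inf"), beta=float("inf"), visited=None):
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)        # count leaves actually evaluated
        return node
    if white_to_move:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:           # the opponent already has a better line:
                break                   # the remaining siblings are pruned
        return best
    else:
        best = float("inf")
        for child in node:
            best = min(best, alphabeta(child, True, alpha, beta, visited))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

tree = [                                # same toy tree as in the minimax sketch
    [[3, 7], [9, 8], [7, 10]],
    [[2, 5], [6, 4], [1, 3]],
    [[8, 6], [5, 2], [4, 9]],
]

visited = []
print(alphabeta(tree, white_to_move=True, visited=visited))  # 7, same as minimax
print(len(visited), "of 18 leaves evaluated")
```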
Alpha-beta pruning was, in a sense, the first true piece of artificial intelligence: an algorithm (a step-by-step procedure for solving a problem) that helped machines logically rule out certain options in a crude analogue to how humans do the same thing through intuition. An expert human player could tell with a quick glance and a split-second subconscious memory sweep which moves could be ignored and which should be prominently considered. Alpha-beta pruning was the computer’s way of setting similar priorities.
In 1966, MIT’s Richard Greenblatt introduced yet another critical innovation called transposition tables. This technique allowed a computer to cache (temporarily remember) the value of certain chess positions so that when they came up again they wouldn’t have to be fully reevaluated. It took advantage of the observation that identical chess positions were often produced by moves in different sequence. These two separate openings, for example, produce exactly the same board position, since they are the same moves in different order:
1. e4 e5 2. Nf3 Nc6
1. Nf3 Nc6 2. e4 e5
Such duplications of positions are called transpositions. A program that could recognize and remember transpositions could trim the number of necessary calculations. This was the earliest glimpse of what Turing had imagined as machine learning, since a cached position would enable the computer to remember certain truths as a game progressed—and potentially even from game to game. With this new way for computers to remember game positions, it would henceforth no longer be possible, for example, for a human player to beat a computer exactly the same way twice in a row. (How many players wish they could say as much for themselves?) Computers could essentially remember their mistakes.
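A toy sketch of the caching idea in Python follows. Real programs key the table on a compact hash of the board position (Zobrist hashing, for instance); here a set of the moves played stands in for “the resulting position,” which is enough to show two different move orders hitting the same cache entry:

```python
# Transposition-table sketch (illustrative only): cache evaluations keyed by
# the resulting position, so a position reached by a different move order is
# not re-evaluated.

transposition_table = {}
evaluations_computed = 0

def evaluate(position):
    """Stand-in for an expensive full evaluation of a position."""
    global evaluations_computed
    evaluations_computed += 1
    return 0.0  # dummy score

def score(moves):
    position = frozenset(moves)          # same moves in any order -> same key
    if position not in transposition_table:
        transposition_table[position] = evaluate(position)
    return transposition_table[position]

score(["e4", "e5", "Nf3", "Nc6"])        # computed and cached
score(["Nf3", "Nc6", "e4", "e5"])        # transposition: cache hit, no recomputation
print(evaluations_computed)              # 1
```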
None of these improvements in searching and pattern recognition were restricted to chess, of course. But for many years, from the 1970s through the mid-1990s, chess continued to be a choice vehicle for AI research and experimentation. At engineering schools like MIT and Carnegie Mellon, graduate students married software advances with faster and faster processors in a competitive race to build the ultimate chess computer—a machine that could beat the world chess champion. Teams on campuses formed around one chess project or another, with students custom-designing, redesigning, and re-redesigning tiny silicon “chess chips.” The progress was slower than many had hoped for, but it was steady. In 1978 the leading machine of the time, known as Chess 4.7, developed by David Slate and Larry Atkin at Northwestern University, forced Scottish master player David Levy into a draw—a first. A few years later, the leading computers started winning the occasional game against expert players, and in 1988 the Carnegie Mellon machine HiTech became the first computer to be ranked as a grandmaster. When Garry Kasparov first played and defeated Deep Blue in 1996, beating the machine with three wins, one loss, and two draws, the world champion reported that he had detected stirrings of genuine thought in the machine. The following year, after some major processor and programming upgrades, Deep Blue defeated Kasparov, with two wins, one loss, and three draws. The result sent a quick chill around the world. Much soul searching began.
Was this the end of chess? The end of us? What did Deep Blue’s victory mean?
Some were quick to point out that the stunning achievement was limited to a mere board game. Deep Blue didn’t know to stop at a red light, and couldn’t string two words together or offer anything else in the way of even simulated intelligence. Others didn’t think that even the chess win was so amazing. MIT linguist Noam Chomsky scoffed that a computer beating a grandmaster at chess was about as momentous “as the fact that a bulldozer can lift more than some weight lifter.” It was simply another case in the long history of technology, he argued, of humans inventing machines that could perform highly specialized tasks with great efficiency. Specialization did not intelligence make.
Chomsky seemed to have a point. Deep Blue was no Hal. Over the course of many decades, chess computing had not actually enabled computers to think very much like humans at all. “Turing’s expectation was that chess-programming would contribute to the study of how human beings think,” says Jack Copeland, director of the Turing Archive for the History of Computing at the University of Canterbury. “In fact, little or nothing about human thought processes has been learned from the series of projects that culminated in Deep Blue.”
Thinking like humans, though, had never really been the intention of the AI community. That had been Turing’s original dream, but the practical consensus from the very beginning was to suss out a new kind of intelligence. And in fact, they had done just that. As the twenty-first century began, machines were able to perform all sorts of intelligent actions that went far beyond mere calculation. “There are today hundreds of examples of narrow AI deeply integrated into our information-based economy,” explains Ray Kurzweil, author of The Age of Spiritual Machines. “Routing emails and cell phone calls, automatically diagnosing electrocardiograms and blood cell images, directing cruise missiles and weapon systems, automatically landing airplanes, conducting pattern-recognition-based financial transactions, detecting credit card fraud, and a myriad of other automated tasks are all successful examples of AI in use today.”
Add to that list: speech recognition, hazardous-duty robots, swimming pool antidrowning detectors, the Mars Sojourner explorer vehicle, and bits and pieces of most contemporary cars, televisions, and word processors. Under the hood, machine-based intelligence may look entirely different from human intelligence, but it is intelligence, proponents argue. “Believe me, Fritz is intelligent,” Frederic Friedel, cofounder of ChessBase software, says of one of his company’s most popular programs. “It is a kind of intelligence. If you look at anyone playing against a computer, within minutes they say things like, ‘Oh God, he’s trying to trap my Queen,’ and ‘Tricky little bloke,’ and ‘Ah, he saw that.’ They’re talking about it as if it is a human being. And it is behaving exactly like someone who’s trying to trick you, trying to trap your Queen. It seems to smell the danger.”