
The Perfect Bet


by Adam Kucharski


  Turing did not think only about the mathematical theory of games. He also wondered how games could be used to investigate artificial intelligence. According to Turing, it did not make sense to ask “can machines think?” He said the question was too vague, the range of answers too ambiguous. Rather, we should ask whether a machine is capable of behaving in a way that is indistinguishable from a (thinking) human. Can a computer trick someone into believing it is human?

  To test whether an artificial being could pass for a real person, Turing proposed a game. It would need to be a fair contest, an activity that both humans and machines could succeed at. “We do not wish to penalise the machine for its inability to shine in beauty competitions,” Turing said, “nor to penalise a man for losing in a race against an aeroplane.”

  Turing suggested the following setup. A human interviewer would talk with two unseen interviewees, one of them human and the other a machine. The interviewer would then try to guess which was which. Turing called it the “imitation game.” To avoid the participants’ voices or handwriting influencing things, Turing suggested that all messages be typed. While the human would be trying to help the interviewer by giving honest answers, the machine would be out to deceive its interrogator. Such a game would require a number of different skills. Players would need to process information and respond appropriately. They would have to learn about the interviewer and remember what had been said. They might be asked to perform calculations, recall facts, or tackle puzzles.

  At first glance, Watson appears to fit the job description well. While playing Jeopardy! the machine had to decipher clues, gather knowledge, and solve problems. But there is a crucial difference. Watson did not have to play like a human to win Jeopardy! It played like a supercomputer, using its faster reaction times and vast databases to beat its opponents. It did not show nerves or frustration, and it didn’t have to. Watson wasn’t there to persuade people that it was human; it was there to win.

  The same was true of Deep Blue. When it played chess against Garry Kasparov, it played in a machine-like way. It used vast amounts of computer power to search far into the future, examining potential moves and evaluating possible strategies. Kasparov pointed out that this “brute force” approach did not reveal much about the nature of intelligence. “Instead of a computer that thought and played chess like a human, with human creativity and intuition,” he later said, “they got one that played like a machine.” Kasparov has suggested that poker might be different. With its blend of probability, psychology, and risk, the game should be less vulnerable to brute-force methods. Perhaps it could even be the sort of game that chess and checkers never could be: a game that needed to be learned rather than solved.

  Turing saw learning as a crucial part of artificial intelligence. To win the imitation game, a machine would need to be advanced enough to pass convincingly as a human adult. Yet it did not make sense to focus only on the polished final creation. To create a working mind, it was important to understand where a mind comes from. “Instead of trying to produce a programme to simulate the adult mind,” Turing said, “why not rather try to produce one which simulates the child’s?” He compared the process to filling a notebook. Rather than attempting to write everything out manually, it would be easier to start with an empty notebook and let the computer work out how it should be filled.

  IN 2011, A NEW type of game started appearing among the slot machines and roulette tables of Las Vegas casinos. It was an artificial version of Texas hold’em poker: the chips reduced to two dimensions, the cards dealt on a screen. Players faced a single computer opponent in a two-player format commonly known as “heads-up” poker.

  Ever since von Neumann looked at simplified two-player games, heads-up poker has been a favorite target of researchers. This is mainly because it is much easier to analyze a game involving a pair of players than one with several opponents. The “size” of the game—measured by counting the total possible sequences of actions a player could make—is considerably smaller with only two players. This makes it much easier to develop a successful bot. In fact, when it comes to the “limit” version of heads-up poker, in which maximum bets are capped, the Vegas machines are better than most human players.
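
  To see why the player count matters so much, consider a toy betting round (the rules below are invented for illustration and are far simpler than real hold’em, which also involves cards and several betting streets). A few lines of Python can enumerate every possible action sequence:

```python
def count_sequences(num_players, max_raises=2):
    """Count the action sequences in one toy betting round.

    Players act in turn and may fold, call, or (while raises remain)
    raise. The round ends when only one player is left or when every
    remaining player has matched the last bet.
    """
    def rec(active, to_act, raises_left):
        if len(active) == 1 or to_act == 0:
            return 1  # one complete sequence of actions
        total = 0
        player, rest = active[0], active[1:]
        total += rec(rest, to_act - 1, raises_left)               # fold
        total += rec(rest + (player,), to_act - 1, raises_left)   # call
        if raises_left > 0:  # raise: everyone else must act again
            total += rec(rest + (player,), len(rest), raises_left - 1)
        return total

    return rec(tuple(range(num_players)), num_players, max_raises)

for n in (2, 3, 4, 5):
    print(n, "players:", count_sequences(n))
```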

  In 2013, journalist Michael Kaplan traced the origin of the machines in an article for the New York Times. It turned out that the poker bots owed much to a piece of software created by Norwegian computer scientist Fredrik Dahl. While studying computer science at the University of Oslo, Dahl had become interested in backgammon. To hone his skills, he created a computer program that could search for successful strategies. The program was so good that he ended up putting it on floppy disks, which he sold for $250 apiece.

  Having created a skilled backgammon bot, Dahl turned his attention to the far more ambitious project of building an artificial poker player. Because poker involved incomplete information, it would be much harder for a computer to find successful tactics. To win, the machine would have to learn how to deal with uncertainty. It would have to read its opponent and weigh large numbers of options. In other words, it would need a brain.

  IN A GAME LIKE poker, an action might require several decision-making steps. An artificial brain may therefore need multiple linked neurons. One neuron might evaluate the cards on display. Another might consider the amount of money on the table; a third might examine other players’ bets. These neurons won’t necessarily lead directly to the final decision. The results might flow into a second layer of neurons, which combine the first round of decision making in a more detailed way. The internal neurons are known as “hidden layers” because they lie between the two visible chunks of information: what goes into the neural network and what comes out.

  FIGURE 7.1. Illustration of a simple neural network.
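
  A minimal sketch of such a network, written in Python, might look like the following. The inputs and weights here are invented placeholders (in a real bot the weights would be learned from play), but the structure matches the description above: inputs feed a hidden layer, which feeds a final decision.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Inputs a poker neuron might see (values invented for illustration):
# hand strength, pot size, size of the opponent's bet.
inputs = np.array([0.8, 0.4, 0.6])

# One hidden layer of four neurons. The weights and biases would
# normally be learned; here they are random placeholders.
w_hidden = rng.normal(size=(4, 3))
b_hidden = rng.normal(size=4)
hidden = sigmoid(w_hidden @ inputs + b_hidden)  # the "hidden layer"

# Output layer: one score each for fold / call / raise.
w_out = rng.normal(size=(3, 4))
b_out = rng.normal(size=3)
scores = w_out @ hidden + b_out

actions = ["fold", "call", "raise"]
print("chosen action:", actions[int(np.argmax(scores))])
```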

  Neural networks are not a new idea; the basic theory for an artificial neuron was outlined in the 1940s. However, the increased availability of data and computing power means that they are now capable of some impressive feats. As well as enabling bots to learn to play games, they are helping computers to recognize patterns with remarkable accuracy.

  In autumn 2013, Facebook announced an AI team that would specialize in developing intelligent algorithms. At the time, Facebook users were uploading over 350 million new photos every day. The company had previously introduced a variety of features to deal with this avalanche of information. One of them was facial recognition: the company wanted to give users the option to automatically detect—and identify—faces in their photos. In spring 2014, the Facebook AI team reported a substantial improvement in the company’s facial recognition software, known as DeepFace.

  The artificial brain behind DeepFace consists of nine layers of neurons. The initial layers do the groundwork, identifying where the face is in the picture and centering the image. Subsequent layers then pick out features that give a lot of clues about identity, such as the area between the eyes and eyebrows. The final neurons pull together all the separate measurements, from eye shape to mouth position, and use them to label the face. The Facebook team trained the neural network using over four million photos of four thousand different people. It was the largest facial data set ever assembled; on average, there were over a thousand pictures of each face.

  With the training finished, it was time to test the program. To see how well DeepFace performed when given new faces, the team asked it to identify photos taken from “Labeled Faces in the Wild,” a database containing thousands of human faces in everyday situations. The photos are a good test of facial recognition ability; the lighting isn’t always the same, the camera focus varies, and the faces aren’t necessarily in the same position. Even so, humans appear to be very good at spotting whether two faces are the same: in an online experiment, participants correctly matched the faces 99 percent of the time.

  But DeepFace was not far behind. It had trained for so long, and had its artificial neurons rewired so many times, that it could spot whether two photos were of the same person with over 97 percent accuracy. Even when the algorithm had to analyze stills from YouTube videos—which are often smaller and blurrier—it still managed over 90 percent accuracy.

  Dahl’s poker program also took a long time to build experience. To train his software, Dahl set up lots of bots and got them to face off against each other in game after game. The computer programs sat through billions of hands, betting and bluffing, their artificial brains developing while they played. As the bots improved, Dahl found that they began to do some surprising things.
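
  Kaplan’s article does not spell out Dahl’s training algorithm, but the flavor of self-play learning can be conveyed with regret matching, a standard technique from the poker-bot literature (its use here is an assumption, not a description of Dahl’s code). In this sketch, two bots repeatedly play a made-up two-action, zero-sum game, shifting their strategies toward the actions they regret not having played:

```python
import numpy as np

# Payoff matrix for a toy zero-sum game, from player 1's point of view.
# Rows are player 1's actions, columns player 2's. Numbers are invented.
payoff = np.array([[ 1.0, -1.0],
                   [-0.5,  0.5]])

def strategy(regrets):
    """Regret matching: mix over actions in proportion to positive regret."""
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    return np.full(len(regrets), 1.0 / len(regrets))

r1 = np.zeros(2)      # cumulative regrets for each player
r2 = np.zeros(2)
s1_sum = np.zeros(2)  # running strategy totals, used for averaging
s2_sum = np.zeros(2)
rng = np.random.default_rng(1)

for _ in range(100_000):
    s1, s2 = strategy(r1), strategy(r2)
    s1_sum += s1
    s2_sum += s2
    a1 = rng.choice(2, p=s1)
    a2 = rng.choice(2, p=s2)
    # Regret: how much better each alternative action would have done.
    r1 += payoff[:, a2] - payoff[a1, a2]
    r2 += payoff[a1, a2] - payoff[a1, :]

print("player 1 average strategy:", s1_sum / s1_sum.sum())
print("player 2 average strategy:", s2_sum / s2_sum.sum())
```

  For this particular payoff table, the optimal mix works out to one-third/two-thirds for the first player and fifty-fifty for the second, which is roughly what the average strategies converge to.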

  IN HIS LANDMARK 1950 paper “Computing Machinery and Intelligence,” Turing pointed out that many people were skeptical about the possibility of artificial intelligence. One criticism, put forward by mathematician Ada Lovelace in the nineteenth century, was that machines could not create anything original. They could only do what they were told, which meant a machine would never take us by surprise.

  Turing disagreed with Lovelace, noting that “machines take me by surprise with great frequency.” He generally put these surprises down to oversight. Perhaps he’d made a hurried calculation or a careless assumption while constructing a program. From rudimentary computers to high-frequency financial algorithms, this is a common problem. As we’ve seen, erroneous algorithms can often lead to unexpected negative outcomes.

  Sometimes the error can work to a computer’s advantage, however. Early in the chess match between Deep Blue and Kasparov, the machine produced a move so puzzling, so subtle, and so—well—intelligent that it threw Kasparov. Rather than grab a vulnerable pawn, Deep Blue moved its rook into a defensive position. Kasparov had no idea why it would do that. By all accounts, the move influenced the rest of the match, persuading the Russian grandmaster that he was facing an opponent far beyond anything he’d played before.

  In fact, Deep Blue had no reason for choosing that particular move. Having eventually run into a situation for which it had no rules—as predicted by Gödel’s incompleteness theorem—the computer had acted randomly instead. Deep Blue’s game-changing show of strategy was not an ingenious move; it was simple good luck.

  Turing admitted that such surprises are still the result of human actions, with the outcomes coming from rules humans have defined (or failed to define). But Dahl’s poker bots did not produce surprising actions because of human oversight. Rather, the surprises were the result of the programs’ learning process. During the training games, Dahl noticed that one of the bots was using a tactic known as “floating.” After the three flop cards are shown, a floating player calls the opponent’s bets but does not raise them. The floating player loiters, playing out the round without influencing the stakes. Once the fourth card, the “turn,” is revealed, the player makes a move and raises aggressively, in the hope of scaring the opponent into folding. Dahl had not come across such a technique before, but the strategy is familiar to most good poker players. It also requires a lot of skill to pull off successfully. Not only do players need to judge the cards on display, they also need to read their opponents correctly. Some are easier to scare off than others; the last thing a floating player wants is to raise aggressively and then end up in a showdown.

  At first glance, such skills seem inherently human. How could a bot teach itself a strategy like this? The answer is that such a play relies more on cold logic than we might think. It was just as von Neumann found with bluffing: the strategy was not a mere quirk of human psychology; it was a necessary tactic when following an optimal poker strategy.
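
  Von Neumann’s full analysis takes some mathematics to set out, but a stripped-down simulation makes the point. In the toy game below (every rule and number is invented for illustration), a player holding a near-worthless hand does better by betting it than by checking it against an opponent who only calls with a decent hand:

```python
import random

def play(hand, bluff, call_threshold=0.5, trials=200_000):
    """Average profit for a player holding `hand` in a toy one-street
    game: both players ante 1; our player bets 1 or checks; the
    opponent (holding a uniform random hand) calls a bet only when
    its own hand is above `call_threshold`."""
    total = 0.0
    for _ in range(trials):
        opp = random.random()
        if bluff:  # bet 1 with this hand
            if opp > call_threshold:           # opponent calls
                total += 2 if hand > opp else -2
            else:                              # opponent folds
                total += 1                     # we take the antes
        else:      # check: go to showdown for the antes
            total += 1 if hand > opp else -1
    return total / trials

worst_hand = 0.05
print("check the worst hand:", round(play(worst_hand, bluff=False), 3))
print("bluff the worst hand:", round(play(worst_hand, bluff=True), 3))
```

  The bluff profits not from winning showdowns but from the times the opponent folds. No psychology is needed to see its value.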

  In his New York Times article, Kaplan mentions that people often refer to Dahl’s machine in human terms. They give it nicknames. They call it him. They even admit to talking to the metal box as if it’s a real player, as if there’s a person sitting behind the glass. When it comes to Texas hold’em, it appears that the bot has succeeded in making people forget that it’s a computer program. If Turing’s test involved a game of poker rather than a series of questions, Dahl’s machine would surely pass.

  Perhaps it is not particularly strange that people tend to treat poker bots as independent characters rather than viewing them as the property of the people who programmed them. After all, the best computer players are generally much better than their creators. Because the computer does all the learning, the bot doesn’t need to be handed much information initially. Its human maker can therefore be relatively ignorant about game strategies, yet still end up with a strong bot. “You can do amazing things with very little knowledge,” as Jonathan Schaeffer put it. In fact, despite having some of the best poker bots in the world, the Alberta poker group has limited human talent when it comes to the game. “Most of our group aren’t poker players at all,” researcher Michael Johanson said.

  Although Dahl had created a bot that could learn to beat most players at limit poker, there was a catch. Las Vegas gaming rules stipulate that machines have to behave the same way against all players. They can’t tailor their playing style for opponents who are skilled or inexperienced. The rules meant that Dahl’s bot had to sacrifice some of its cunning before it was allowed on the casino floor. From a bot’s point of view, having to follow a fixed strategy can make things more difficult. Having a rigid adult brain—rather than the flexible one of a child—prevents the machine from learning how to exploit weaknesses. This removes a big advantage, because it turns out that humans have plenty of flaws that can be exploited.

  IN 2010, AN ONLINE version of rock-paper-scissors appeared on the New York Times website. It’s still there if you want to try it. You’ll be able to play against a very strong computer program. Even after a few games, most people find that the computer is pretty hard to beat; play lots of games, and the computer will generally end up in the lead.

  Game theory suggests that if you follow the optimal strategy for rock-paper-scissors, and choose randomly between the three available options, you should expect to come out even. But when it comes to rock-paper-scissors, it seems that humans aren’t very good at doing what’s optimal. In 2014, Zhijian Wang and colleagues at Zhejiang University in China reported that people tend to follow certain behavior patterns during games of rock-paper-scissors. The researchers recruited 360 students, divided them into groups, and asked each group to play three hundred rounds of rock-paper-scissors against each other. During the games, the researchers found that many students adopted what they called a “win-stay lose-shift” strategy. Players who’d just won a round would often stick with the same action in the next round, while the losing players had a habit of switching to the option that beat them. They would swap rock for paper, for instance, or scissors for rock. Over many rounds, the players generally chose the three different options a similar number of times, but it was clear they weren’t playing randomly.
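
  That predictability is exploitable. The simulation below caricatures the finding: a player who follows win-stay lose-shift to the letter (the behavior after a tie is our assumption; the study describes wins and losses) loses every single round to an opponent who knows the rule. Real players are only statistically biased rather than perfectly predictable, but the same idea is what gives an adaptive bot its edge:

```python
import random

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
BEATEN_BY = {v: k for k, v in BEATS.items()}  # what beats each move

def wsls_next(last_move, last_result):
    """Win-stay lose-shift: repeat a winning move; after a loss,
    switch to the move that just beat you. (We assume 'stay' on a tie.)"""
    if last_result == "lose":
        return BEATEN_BY[last_move]
    return last_move

player = random.choice(list(BEATS))
result = "tie"
wins = losses = 0
for _ in range(100_000):
    # The exploiter knows the rule, predicts the move, and counters it.
    predicted = wsls_next(player, result)
    exploiter = BEATEN_BY[predicted]
    player = predicted
    if BEATS[player] == exploiter:
        result, wins = "win", wins + 1
    elif BEATS[exploiter] == player:
        result, losses = "lose", losses + 1
    else:
        result = "tie"

print(f"win-stay lose-shift player: {wins} wins, {losses} losses")
```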

  The irony is that even truly random sequences can contain seemingly nonrandom patterns. Remember those lazy journalists in Monte Carlo who made up the roulette numbers? There were a lot of obstacles they’d have had to overcome to create results that appeared random. First, they would have had to make sure black and red came up similarly often in the results. The journalists actually managed to get this bit right, which meant the data passed the initial round of Karl Pearson’s “Is it random?” test. However, the reporters came unstuck when it came to runs of colors, because they switched between red and black more often than a truly random sequence would.
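
  The run-of-colors check boils down to a simple statistic: in a truly random red-and-black sequence, the color should change from one spin to the next about half the time. A short simulation shows how over-alternating gives the game away even when the overall red-black balance looks right (the 70 percent switching rate for the “fabricated” data is invented for illustration):

```python
import random

def switch_rate(seq):
    """Fraction of consecutive pairs where the color changes."""
    switches = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    return switches / (len(seq) - 1)

random.seed(2)
true_random = [random.choice("RB") for _ in range(10_000)]

# A crude stand-in for the journalists' made-up results: the same
# 50/50 color balance overall, but switching color 70% of the time.
faked = ["R"]
for _ in range(9_999):
    if random.random() < 0.7:
        faked.append("B" if faked[-1] == "R" else "R")
    else:
        faked.append(faked[-1])

print("truly random switch rate:", round(switch_rate(true_random), 3))
print("fabricated switch rate:  ", round(switch_rate(faked), 3))
```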

  Even if you know how randomness should look, and try to alternate between colors—or rock, paper, scissors—correctly, your ability to generate random patterns will be limited by your memory. If you had to read a list of numbers and immediately recite them, how many could you manage? Half a dozen? Ten? Twenty?

  In the 1950s, cognitive psychologist George Miller noted that most young adults could learn and recite around seven numbers at a time. Try memorizing one local phone number and you’ll probably be fine; attempt to remember two, and it gets tricky. This can be problematic if you’re trying to generate random moves in a game; how can you ensure you use all options equally often if you can only remember the last few moves? In 1972, Dutch psychologist Willem Wagenaar observed that people’s brains tend to concentrate on a moving “window” of about six to seven previous responses. Over this interval, people could alternate between options reasonably “randomly.” However, they were not so good at switching between options over longer time intervals. The size of the window, around six to seven events long, could well be a consequence of Miller’s earlier observation.

  In the years since Miller published his work, researchers have delved further into human memory capacity. It turns out that the value Miller jokingly referred to as the “magical number seven” is not so magical after all. Miller himself noted that when people had to remember only binary digits—zeros and ones—they could recite a sequence of about eight of them. In fact, the size of the data “chunks” humans can remember depends on the complexity of the information. People might be able to recall seven numbers, but there is evidence they can recite only six letters or so, and five one-syllable words.

  In some cases, people have learned to increase the amount of information they can recall. In memory championships, the best competitors can memorize over a thousand playing cards in an hour. They do this by changing the format of the data chunks they remember; rather than thinking in terms of raw numbers, they try to memorize images as part of a journey. Cards become celebrities or objects; the sequence becomes a series of events in which their card characters feature. This helps the competitors’ brains shelve and retrieve the information more efficiently. As discussed in the previous chapter, memorizing cards also helps in blackjack, with card counters “bucketing” information to reduce the amount they have to store. Such storage problems have interested researchers looking at artificial minds as well as those working on human ones. Nick Metropolis said Stanislaw Ulam “often mused about the nature of memory and how it was implemented in the brain.”

 
