Before building a Jeopardy machine, Ferrucci and his team had to carry this vision one step further: They had to make a case that a market existed outside the rarefied world of Jeopardy for advanced question-answering technology. IBM’s biggest division, after all, was Global Services, which included one of the world’s largest consultancies. It sold technical and strategic advice to corporations all over the world. Could the consultants bundle this technology into their offerings? Would this type of machine soon be popping up in offices and answering customers’ questions on the phone?
Ferrucci envisioned a Jeopardy machine spawning a host of specialized know-it-alls. With the right training, a technology that could understand everyday language and retrieve answers in a matter of seconds could fit just about anywhere. Its first job would likely be in call centers. It could answer tax questions, provide details about bus schedules, ask about the symptoms of a laptop on the fritz and walk a customer through a software update. That stuff was obvious. But there were plenty of other jobs. Consider publicly traded companies, Ferrucci said. They had to comply with a dizzying assortment of rules and regulations, everything from leaks of inside information in e-mails to the timely disclosure of earnings surprises or product failures to regulators and investors. A machine with Watson’s skills could stay on top of these compliance matters, pointing to possible infractions and answering questions posed in ordinary English. A law firm could call on such a machine to track down the legal precedent for every imaginable crime, complaint, or trademark.
Perhaps the most intriguing opportunity was in medicine. While IBM was creating the Jeopardy machine, one of the top medical shows on television featured a nasty genius named Gregory House. In the beginning of most episodes a character would collapse, tumbling to the ground during a dance performance, a lovers’ spat, or a kindergarten class. Each one suffered from a different set of symptoms, many of them gruesome. In the course of the following hour, amid the medical team’s social and sexual dramas, House and his colleagues would review the patient’s worsening condition. There had to be a pattern. Who could find it and match it to a disease, ideally before the patient died? Drawing from their own experience, the doctors each mastered a diverse set of data. The challenge was to correlate that information to the ever-changing list of symptoms on the white board in House’s office. Toward the end of the show, House would often notice some detail—perhaps a lyric in a song or an unlikely bruise. And that would lead his magnificent mind straight to a case he remembered or a research paper he’d read about bee stings or tribal rites in New Guinea. By the end of the show, the patient was headed toward recovery.
An advanced question-answering machine could serve as a bionic Dr. House. Unlike humans, it could stay on top of the tens of thousands of medical research papers published every year. And, just as in Jeopardy, it could come up with lists of potential answers, or diagnoses, for each patient’s ills. It could also direct doctors toward the evidence it had considered and provide its reasoning. The machine, lacking common sense, would be far from perfect. Just as the Jeopardy computer was certain to botch a fair number of clues, the diagnoses coming from a digital Dr. House would sometimes be silly. So people would still run the show, but they’d be assisted by a powerful analytical tool.
In those early days, only a handful of researchers took part in the Jeopardy project at IBM. They could fit easily into Ferrucci’s office at the research center in Hawthorne, New York, about thirty-five miles north of New York City (and a fifteen-minute drive from corporate headquarters, in Armonk). But to build a knowledge machine, Ferrucci knew, would require extensive research and development. In a sense, a Jeopardy machine would represent an entire section of the human brain. To build it, he would need specialists in many aspects of cognition. Some would be experts in language, others in the retrieval of information. Some would attempt to program the machine with judgment, writing algorithms to steer it toward answers. Others would guide it in so-called machine learning, so that it could train itself to pick the most statistically promising combinations of words and pay more attention to trustworthy sources. Experts in hardware, meanwhile, would have to build a massive computer, or a network of them, to process all of this work. Assembling these efforts on a three-year timetable amounted to a daunting management challenge. The cost of failure would be humiliation, for both the researchers and their company.
Other complications came from the West Coast, specifically the Robert Young building on the Sony lot in Culver City, a neighborhood just south of Hollywood. Unlike chess, a treasure we all share, the Jeopardy franchise belonged to Sony Pictures Entertainment, an arm of the Japanese consumer electronics giant. The Jeopardy executives, led by a canny negotiator named Harry Friedman, weren’t about to let IBM use their golden franchise and their millions of viewers on its own terms. Over the years, the two companies jousted over the terms of the game, the placement of logos, access to stars such as Ken Jennings and Brad Rutter, and the writing of Jeopardy clues. They even haggled over the computer’s speed on the buzzer and whether IBM should take measures to slow it to a human level. These disagreements echoed until the eve of the match. At one point, only months before the showdown, Jeopardy’s executives appeared to be on the verge of pulling the plug on the entire venture. That would have left IBM’s answering computer, the product of three intense years of research, scrounging for another game to play. This particular disagreement was resolved. But the often conflicting dictates of promotion, branding, science, and entertainment forged a fragile and uneasy alliance.
The Jeopardy project also faced harsh critics within IBM’s own scientific community. This was to be expected in a field—Artificial Intelligence—where the different beliefs about knowledge, intelligence, and the primacy of the human brain bordered on the theological. How could there be any consensus in a discipline so vast? While researchers in one lab laboriously taught machines the various meanings of the verb “to do,” futurists just down the hall insisted that computers would outrace human intelligence in a couple of decades, controlling the species. Beyond its myriad approaches and outlooks, the field could be divided into two camps, idealists and pragmatists. The idealists debated the nature of intelligence and aspired to build computers that could think conceptually, like human beings, perhaps surpassing us. The pragmatists created machines to carry out tasks. Ferrucci, who had promised to have a television-ready computer by early 2011, fell firmly into the second camp—and his team attracted plenty of barbs for it. The Jeopardy machine would sidestep the complex architecture of the brain and contrive to answer questions without truly understanding them. “It’s just another gimmick,” said Sajit Rao, a professor in computer science at MIT who’s attempting to teach computers to conceptualize forty-eight different verbs. “It’s not addressing any fundamental problems.” But as Ferrucci countered, teaching a machine to answer complex questions on a broad range of subjects would represent a notable advance, whatever the method.
IBM’s computer would indeed come to answer a dizzying variety of questions—and would raise one of its own. With machines like this in our future, what do we need to store in our own heads? This question, of course, has been recurring since the dawn of the Internet, the arrival of the calculator, and even earlier. With each advance, people have made internal adjustments and assigned ever larger quantities of memory, math, geography, and more to manmade tools. It makes sense. Why not use the resources at hand? In the coming age, it seems, forgoing an effective answering tool will be like volunteering for a lobotomy.
In a sense, many of us living through this information revolution share something with the medieval monks who were ambushed by the last one. They spent years of their lives memorizing sacred texts that would soon be spilling off newfangled printing presses. They could have saved lots of time, and presumably freed up loads of capacity, by archiving those texts on shelves. (No need to discuss here whether the monks were eager for “free time,” a concept dangerously close to Sloth, the fourth of the Seven Deadly Sins.) In the same way, much of the knowledge we have stuffed into our heads over the years has been rendered superfluous by new machinery.
So what does this say about Ken Jennings and Brad Rutter, the humans preparing to wage cognitive war with Watson? Are they relics? Sure, they might win this round. But the long-term prognosis is grim. Garry Kasparov, the chess master who fell to IBM’s Deep Blue, recently wrote that the golden age of man-machine battles in chess lasted from 1994 to 2004. Before that decade, machines were too dumb; after it, the roles were reversed. While knowledge tools, including Watson, relentlessly advance, our flesh-and-blood brains, some argue, have stayed more or less the same for forty thousand years, treading evolutionary water from the Cro-Magnon cave painters to Quentin Tarantino.
A few decades ago, know-it-alls like Ken Jennings seemed to be the model of human intelligence. They aced exams. They had dozens of facts at their fingertips. In one quiz show that predated Jeopardy, College Bowl, teams of the brainiest students would battle one another for the honor of their universities. Later in life, people turned to them in boardrooms, university halls, and cocktail parties for answers. Public education has been designed, in large part, to equip millions with a ready supply of factual answers. But if Watson can top them, what is this kind of intelligence worth?
Physical strength has suffered a similar downgrade. Not so long ago, a man with superhuman strength played a valuable role in society. He was a formidable soldier. When villagers needed boulders moved or metal bent, he got the call. After the invention of steam engines and hydraulic pumps, however, archetypal strongmen were shunted to jobs outside the productive economy. They turned to bending metal in circuses or playing noseguard on Sunday afternoons. For many of us, physical strength, once so vital, has become little more than a fashion statement. Modern males now display muscles as mating attire, much the way peacocks fan their otherwise useless feathers.
It would be all too easy to dismiss human foes of the IBM machine as cognitive versions of circus strongmen: trivia wunderkinds. But from the very beginning, Ferrucci saw that the game required far more than the simple regurgitation of facts. It involved strategy, decision making, pattern recognition, and a knack for nuance in the language of the clues. As the computer grew from a whimsical idea into a Jeopardy behemoth, it underwent an entire education, triumphing in some areas, floundering in others. Its struggles, whether in untangling language or grappling with abstract ideas, highlighted the areas in which humans maintain an edge. It is in the story of Watson’s development that we catch a glimpse of the future of human as well as machine intelligence.
The secret is wrapped up in the nature of knowledge itself. What is it? For humans, knowledge is an entire universe, a welter of sensations and memories, desires, facts, skills, songs and images, words, hopes, fears, and regrets, not to mention love. But for those hoping to build intelligent machines, it has to be simpler. Broadly speaking, it falls into three categories: sensory input, ideas, and symbols. Consider the color blue. Sensory perception is the raw material of knowledge. It’s something that computers and people alike can perceive, each in their own fashion. Now think of the word “sky.” Those three letters are a symbol for the biggest piece of blue in our world. Computers can handle such symbols. They can find references to “sky” in documents and, when programmed, correlate it with others, such as “blue,” “clouds,” and “heaven.” A computer can master both sensory data and symbols. It can count, categorize, search, and store them. But how about this snippet from Lord Byron: “Friendship is love without his wings.” That sentence represents the third realm of knowledge: ideas. How can a machine make sense of them? In these early years of the twenty-first century, ideas remain the dominion of people—and the frontier for thinking machines.
David Ferrucci’s mission was to explore that frontier. Like many in his profession, Ferrucci grew up watching Star Trek on television. The characters on the show, humans and pointy-eared Vulcans alike, spoke to their computer as if it were one of them. No formatting was necessary, no key words, no programming language. They spoke English. The computer understood the meaning and context of the questions. It consulted vast databases and came back with an immediate answer. True, it might not produce original ideas. But it was an extravagantly well-informed shipmate. That was the computer Ferrucci wanted to build.
As he served the last drops of his wine, Ferrucci was talking about the world he was busy creating, one in which people and their machines often appeared to switch roles. He didn’t know, he said, whether engineers would ever be able to “create a sentient being.” But when he looked at his fellow humans through the eyes of a computer scientist, he saw patterns of behaviors that often appeared to be programmed. He mentioned the zombielike commutes, the retreat to the same chair, the hand reaching for the TV remote, and the near-identical routines, from toothbrushing to feeding the animals. “It’s more interesting,” he said, “when humans delve inside themselves and say, ‘Why am I doing this? And why is it relevant and important to be human?’” His machine would nudge people toward that line of inquiry. Even with an avatar for a face and a robotic voice, the Jeopardy machine would invite comparisons to the other two contestants on the stage. This was inevitable. And whether it won or lost on a winter evening in 2011, the computer might lead millions of spectators to reflect on the nature, and probe the potential, of their own humanity.
1. The Germ of the Jeopardy Machine
THE JEOPARDY MACHINE’S birthplace—if a computer can stake such a claim—was the sprawling headquarters of the global research division named after its flesh-and-blood ancestor, IBM’s founder, Thomas J. Watson. In 1957, when IBM presided over the rest of the infant computer industry, the company cleared woods on a hill in Yorktown Heights, New York, about forty miles north of midtown Manhattan, and hired the Finnish-American architect Eero Saarinen to design a lab. If computing was the future, as seemed inevitable, it was on this hill that a good part of it would be dreamed up, modeled mathematically, and prototyped. Saarinen was a natural choice to express this sparkling future in glass and rock. A year earlier, he had designed the winged TWA Terminal for the new Idlewild Airport (later called JFK). Before that, he’d drawn up the majestic Gateway Arch that would loom over St. Louis. In Yorktown, it was as if he had laid the Gateway Arch on its side. The building, with three stories of glass walls, curved along the top of the hill. For visitors strolling the wide corridors decades later, the combination of the structure’s rough stone and the broad vistas of rolling hills still delivered just the right message of wealth, vision, and permanence.
The idea for a Jeopardy machine, at least according to one version of the story, dates back to an autumn day in 2004. For several years, top executives at the company had been pushing researchers to come up with the next Grand Challenge. In the ’90s, the challenge had been to build a computer that would beat a grand champion in chess. This produced Deep Blue. Its 1997 victory over Garry Kasparov turned into a global event and fortified IBM’s reputation as a giant in cutting-edge computing. (This grew more important as consumer and Web companies, from Microsoft to Yahoo!, threatened to steal the spotlight—and the young brainpower. Google was still just a couple of grad students at Stanford.) Later, in another Grand Challenge in the first years of the new century, IBM produced Blue Gene, the world’s fastest supercomputer.
What would the next challenge be? On that fall day, a senior manager at IBM Research named Charles Lickel drove north from his lab, up the Hudson, to the town of Poughkeepsie, and spent the day with a small team he managed. That evening, the group went to the Sapore Steakhouse in nearby Fishkill, where they could order venison, elk, or buffalo, or split a whopping fifty-two-ounce porterhouse steak for two. There, something strange happened. At seven o’clock, many of the diners stood up from their tables, their food untouched, and filed into the bar, which had a television set. “The dining room emptied,” Lickel said. People were packed in there, three rows deep, to see whether Ken Jennings, who had won more than fifty straight matches on Jeopardy, would win again. He did. A half hour later, the crowd returned to their food, raving about the question-answering phenom. As Lickel noted, their steaks had to have been stone cold.
Though he hadn’t watched much Jeopardy since he was a kid, that scene in the bar gave him an idea for the next Grand Challenge. What if an IBM computer could beat Ken Jennings? (Other accounts have it that the vision for a Jeopardy computer was already circulating along the corridors of the Yorktown lab. The original idea, it turns out, is tough to trace.)
In any event, Lickel pushed the idea. In the first meeting, it provoked plenty of dissent. Chess was nearly as clean and timeless as mathematics itself, a cerebral treasure handed down through the ages. Jeopardy, by contrast, looked questionable from the get-go. Produced by a publicly traded company, Sony, and subject to ratings and advertisers, it was in the business of making money and pleasing investors. It was Hollywood, for crying out loud. “There was a lot of doubt in the room,” Lickel said. “People wanted something more obviously scientific.” A second argument was perhaps more compelling: people playing Jeopardy would in all likelihood annihilate an IBM machine. “They all grabbed me after the meeting,” Lickel recalled, “and said, ‘Charles, you’re going to regret this.’”
In the end, it was up to Paul Horn. A former professor of physics at the University of Chicago, Horn had headed IBM’s three-thousand-person research arm since 1996. “If you think about smart machines,” he later said, “Blue Gene by some measures had the raw computing power of the human brain, at least within an order of magnitude or two.” Horn discussed those early days in his sun-splashed office at New York University, where he took up residence after his 2008 retirement from IBM. He had a black beard, and a tiny ponytail poked out from the back of his head.