by John Markoff
In the hothouse community of technical workers that is Silicon Valley, however, it is difficult to keep secrets. It was obvious that something was afoot. Within a year after the final DARPA Grand Challenge event in 2007, Sebastian Thrun had taken a leave from Stanford and gone to work full-time at Google. His departure was never publicly announced, or even mentioned in the press, but among the Valley’s digerati, Thrun’s change of venue was noted with intense interest. A year later, while he was with colleagues in a bar in Alaska at an artificial intelligence conference, he spilled out a few tantalizing comments. Those words circulated back in Silicon Valley and made people wonder.
In the end, however, it was a high school friend of one of the low-paid drivers the company had hired to babysit its robotic Prius fleet who inadvertently spilled the beans. “One of the kids I went to high school with is being paid fifteen dollars an hour by Google to sit in a car while it drives itself!” a young college student blurted to me. At that point the secret became impossible to contain. The company was parking its self-driving cars in the open lots on the Google campus.
The Google engineers had made no effort to conceal the sensors attached to the roof of the ungainly-looking creatures, which looked even odder than their predecessor, Stanford’s Stanley. Rather than an array of sensors mounted above the windshield, each Prius had a single 360-degree lidar, mounted a foot above the center of the car’s roof. The coffee-can-sized mechanical laser, made by Velodyne, a local high-tech company, made it possible to easily create a real-time map of the surrounding environment for several hundred feet in all directions. It wasn’t cheap—at the time the lidar alone added $70,000 to the vehicle cost.
How did the odd-looking Toyotas, also equipped with less obtrusive radars, cameras, GPS, and inertial guidance sensors, escape discovery for as long as they did? There were several reasons. The cars were frequently driven at night, and the people who saw them confused them with a ubiquitous fleet of Google Street View cars, which had a large camera on a mast above the roof taking photographs that were used to build a visual map of the surrounding street as the car drove. (They also recorded people’s Wi-Fi network locations, which then could be used as beacons to improve the precision in locating Google’s Android smartphones.)
The Street View assumption usually hid the cars in plain sight, but not always. The Google engineer who had the pleasure of the first encounter with law enforcement was James Kuffner, a former CMU roboticist who had been one of the first members of the team. Kuffner had made a name for himself at Carnegie Mellon working on both navigation and a variety of humanoid robot projects. His expertise was in motion planning, figuring out how to teach machines to navigate in the real world. He was bitten by the robot car bug as part of Red Whittaker’s DARPA Grand Challenge team, and when key members of that group began to disappear into a secret Google project code-named Chauffeur, he jumped at the chance.
Late one night they were testing the robotic Prius in Carmel, one of the not-quite-urban driving areas they were focusing on closely. They were testing the system late at night because they were anxious to build detailed maps with centimeter accuracy, and it was easier to get baseline maps of the streets when no one was around. After passing through town several times with their distinctive lidar prominently displayed, Kuffner was sitting in the driver’s seat when the Prius was pulled over by a local policeman suspicious about the robot’s repeated passes.
“What is this?” he asked, pointing to the roof.
Kuffner, like all of the Google drivers, had been given strict instructions how to respond to this inevitable confrontation. He reached behind him and handed a prewritten document to the officer. The police officer’s eyes widened as he read it. Then he grew increasingly excited and kept the Google engineers chatting late into the night about the future of transportation.
The incident did not lead to public disclosure, but once I discovered the cars in the company’s parking lots while reporting for the New York Times, the Google car engineers relented and offered me a ride.
From a backseat vantage point it was immediately clear that in the space of just three years, Google had made a significant leap past the cars of the Grand Challenge. The Google Prius replicated much of the original DARPA technology, but with more polish. Engaging the autopilot made a whooshing Star Trek sound. Technically, the ride was a remarkable tour de force. A test drive began with the car casually gliding away from Google’s campus on Mountain View city streets. Within a few blocks, the car had stopped at both stop signs and stoplights and then merged into rush-hour traffic on the 101 freeway. At the next exit the car then drove itself off the freeway onto a flyover overpass that curved gracefully over the 101. What was most striking to the first-time passenger was the car’s ability to steer around the curve exactly as a human being might. There was absolutely nothing robotic about the AI’s driving behavior.
When the New York Times published the story, the Google car struck Detroit like a thunderbolt. The automobile industry had been adding computer technology and sensors to cars at a maddeningly slow pace. Even though cruise control had been standard for decades, intelligent cruise control—using sensors to keep pace with traffic automatically—was still basically an exotic feature in 2010. A number of automobile manufacturers had outposts in Silicon Valley, but in the wake of the publicity surrounding the Google car, the remaining carmakers rushed to build labs close by. Nobody wanted to see a repeat of what happened to personal computer hardware makers when Microsoft Windows became an industry standard and hardware manufacturers found that their products were increasingly low-margin commodities while much of the profit in the industry flowed to Microsoft. The automotive industry now realized that it was facing the same threat.
At the same time, the popular reaction to the Google car was mixed. There had long been a rich science-fiction tradition of Jetsons-like futuristic robot cars. They had even been the stuff of TV series like Knight Rider, a 1980s show featuring a crime fighter assisted by an artificially intelligent car. There was also a dark-side vision of automated driving, perhaps best expressed in Daniel Suarez’s 2009 sci-fi thriller Daemon, in which AI-controlled cars not only drove themselves, but ran people down as well. Still, the general perception was a deep well of skepticism about whether driverless cars would ever become a reality. However, Sebastian Thrun had made his point abundantly clear that humans are terrible drivers, largely the consequence of human fallibility and inattention. By the time his project was discovered, Google cars had driven more than a hundred thousand miles without an accident, and over the next several years that number would rise above a half-million miles. A young Google engineer, Anthony Levandowski, routinely commuted from Berkeley to Mountain View, a distance of fifty miles, in one of the Priuses, and Thrun himself would let a Google car drive him from Mountain View to his vacation home in Lake Tahoe on weekends.
Today, partially autonomous cars are already appearing on the market, and they offer two paths toward the future of transportation—one with smarter and safer human drivers and one in which humans will become passengers.
Google had not disclosed how it planned to commercialize its research, but by the end of 2013 more than a half-dozen automakers had already publicly stated their intent to offer autonomous vehicles. Indeed, 2014 was the year that the line was first crossed commercially when a handful of European car manufacturers including BMW, Mercedes, Volvo, and Audi announced an optional feature—traffic jam assist, the first baby step toward autonomous driving. In Audi’s case, while on the highway, the car will drive autonomously when traffic is moving at less than forty miles per hour, staying in its lane and requiring driver intervention only as dictated by lawyers fearful that passengers might go to sleep or otherwise distract themselves. In late 2014 Tesla announced that it would begin to offer an “autopilot” system for its Model S, making the car self-driving in some highway situations.
The autonomous car will sharpen the dilemma raised by the AI versus IA dichotomy. While there is a growing debate over the liability issue (who will pay when the first human is killed by a robot car), the bar that the cars must pass to improve safety is actually incredibly low. In 2012 a National Highway Traffic Safety Administration study estimated that the deployment of electronic stability control (ESC) systems in light vehicles alone would save almost ten thousand lives and prevent almost a quarter million injuries.6 Driving, it would seem, might be one area of life where humans should be taken out of the loop to the greatest degree possible. Even unimpaired humans are not particularly good drivers, and we are worse when distracted by the gadgets that increasingly surround us. We will be saved from ourselves by a generation of cheap cameras, radars, and lidars that, when coupled with pattern-sensing computers, will wrap an all-seeing eye around our cars, whether we are driving or are being driven.
For Amnon Shashua, the aha moment came while seated in a university library as a young computer science undergraduate in Jerusalem. Reading an article written in Hebrew by Shimon Ullman, who had been the first Ph.D. student under David Marr, a pioneer in vision research, he was thrilled to discover that the human retina was in many ways a computer. Ullman was a computer scientist who specialized in studying vision in both humans and machines. The realization that computing was going on inside the eye fascinated Shashua and he decided to follow in Ullman’s footsteps.
He arrived at MIT in 1996 to study artificial intelligence when the field was still recovering from an earlier cycle of boom-and-bust. Companies had tried to build commercial expert systems based on the rules and logic approach of early artificial intelligence pioneers like Ed Feigenbaum and John McCarthy. In the heady early days of AI it had seemed that it would be straightforward to simply bottle the knowledge of a human expert, but the programs were fragile and failed in the marketplace, leading to the collapse of a number of ambitious start-ups. Now the AI world was rebounding. Progress in AI, which had been relatively stagnant for its first three decades, finally took off during the 1990s because statistical techniques made classification and decision-making practical. AI experiments hadn’t yet seen great results because the computers of the era were still relatively underpowered for the data at hand. The new ideas, however, were in the air.
As a graduate student Shashua would focus on a promising approach to visually recognizing objects based on imaging them from multiple views to capture their geometry. The approach was derived from the world of computer graphics, where Martin Newell had pioneered a new modeling approach as a graduate student at the University of Utah—which was where much of computer graphics was invented during the 1970s. A real Melitta teapot found in his kitchen inspired Newell’s approach. One day, as he was discussing the challenges of modeling objects with his wife over tea, she suggested that he model that teapot, which thereafter became an iconic image in the early days of computer graphics research.
At MIT, Shashua studied under computer vision scientists Tommy Poggio and Eric Grimson. Poggio was a scientist who stood between the worlds of computing and neuroscience and Grimson was a computer scientist who would later become MIT’s chancellor. At the time there seemed to be a straight path from capturing shapes to recognizing them, but programming the recognition software would actually prove daunting. Even today the holy grail of “scene understanding”—for example, not only identifying a figure as a woman but also identifying what she might be doing—is still largely beyond reach, and significant progress has been made only in niche industries. For example, many cars can now identify pedestrians or bicyclists in time to automatically slow before a collision.
Shashua would become one of the masters in pragmatically carving out those niches. In an academic world where brain scientists debated computational scientists, he would ally himself with a group who took the position that “just because airplanes don’t flap their wings, it doesn’t mean they can’t fly.” After graduate school he moved back to Israel. He had already founded a successful company, Cognitens, using vision modeling to create incredibly accurate three-dimensional models of parts for industrial applications. The images, accurate to hair-thin tolerances, gave manufacturers ranging from automotive to aerospace the ability to create digital models of existing parts, enabling checking their fit and finish. The company was quickly sold.
Looking around for another project, Shashua heard from a former automotive industry customer about an automaker searching for stereovision technology for computer-assisted driving. They knew about Shashua’s work in multiple-view geometry and asked if he had ideas for stereovision. He responded, “Well, that’s fine but you don’t need a stereo system, you can do it with a single camera.” Humans can tell distances with one eye shut under some circumstances, he pointed out.
The entrepreneurial Shashua persuaded General Motors to invest $200,000 to develop demonstration software. He immediately called a businessman friend, Ziv Aviram, and proposed that they start a new company. “There is an opportunity,” he told his friend. “This is going to be a huge field and everybody is thinking about it in the wrong way and we already have a customer, somebody who is willing to pay money.” They called the new company Mobileye and Shashua wrote software for the demonstration on a desktop computer, soon showing one-camera machine vision that seemed like science fiction to the automakers at that time.
Six months after starting the project, Shashua heard from a large auto industry supplier that General Motors was about to offer a competitive bid for a way to warn drivers that the vehicle was straying out of its lane. Until then Mobileye had been focusing on far-out problems like vehicle and pedestrian detection that the industry thought weren’t solvable. However, the parts supplier advised Shashua, “You should do something now. It’s important to get some real estate inside the vehicle, then you can build more later.”
The strategy made sense to Shashua, and so he put one of his Hebrew University students on the project for a couple of months. The lane-keeping software demonstration wasn’t awful, but he realized it probably wasn’t as good as what companies who’d started earlier could show, so there was virtually no way that the fledgling company would win.
Then he had a bright idea. He added vehicle detection to the software, but he told GM that the capability was a bug and that they shouldn’t pay attention. “It will be taken out in the next version, so ignore it,” he said. That was enough. GM was ecstatic about the safety advance that would be made possible by the ability to detect other vehicles at low cost. The automaker immediately canceled the bidding and committed to fund the novice firm’s project developments. Vehicle detection would facilitate a new generation of safety features that didn’t replace drivers, but rather augmented them with an invisible sensor and computer safety net. Technologies like lane departure warning, adaptive cruise control, forward collision warning, and anticollision braking are now rapidly moving toward becoming standard safety systems on cars.
Mobileye would grow into one of the largest international suppliers of AI vision technology for the automotive industry, but Shashua had bigger ideas. After creating Cognitens and Mobileye, he took a postdoctoral year at Stanford in 2001 and shared an office with Sebastian Thrun. Both men would eventually pioneer autonomous driving. Shashua would pursue the same technologies as Thrun, but with a more pragmatic, less “moon shot” approach. He had been deeply influenced by Poggio, who pursued biological approaches to vision, which were alternatives to using the brute force of increasingly powerful computers to recognize objects.
The statistical approach to computing would ultimately work best when both powerful clusters of computers, such as Google’s cloud, and big data sets were available. But what if you didn’t have those resources? This is where Shashua would excel. Mobileye had grown to become a uniquely Israeli technology firm, located in Jerusalem, close to Hebrew University, where Shashua teaches computer science. A Mobileye-equipped Audi served as a rolling research platform. Unlike the Google car, festooned with sensors, from the outside the Mobileye Audi looked normal, apart from a single video camera mounted unobtrusively just in front of the rearview mirror in the center of the windshield. The task at hand, automatic driving, required powerful computers, hidden in the car’s trunk, with some room left over for luggage.
Like Google, Mobileye has significant ambitions that are still only partially realized. On a spring afternoon in 2013, two Mobileye engineers, Gaby Hayon and Eyal Bagon, drove me several miles east of Jerusalem on Highway 1 until they pulled off at a nondescript turnout where another employee waited in a shiny white Audi A7. As we got in the A7 and prepared for a test drive, Gaby and Eyal apologized to me. The car was a work in progress, they explained. Today Mobileye supplies computer vision technology to automakers like BMW, Volvo, Ford, and GM for safety applications. The company’s third-generation technology is touted as being able to detect pedestrians and cyclists. Recently, Nissan gave a hint of things to come, demonstrating a car that automatically swerved to avoid a pedestrian walking out from behind a parked car.
Like Google, the Israelis are intent on going further, developing the technology necessary for autonomous driving. But while Google might decide to compete with the automobile industry by partnering with an upstart like Tesla, Shashua is exquisitely sensitive to the industry culture exemplified by its current customers. That means that his vision system designs must cost no more than several hundred dollars for even a premium vehicle and less than a hundred for a standard Chevy.