by Paul Scharre
The ability to do target identification is the key missing link in building a DIY autonomous weapon. An autonomous weapon is one that can search for, decide to engage, and engage targets. That requires three abilities: the ability to maneuver intelligently through the environment to search; the ability to discriminate among potential targets to identify the correct ones; and the ability to engage targets, presumably through force. The last element has already been demonstrated—people have armed drones on their own. The first element, the ability to autonomously navigate and search an area, is already available outdoors and is coming soon indoors. Target identification is the only piece remaining, the only obstacle to someone making an autonomous weapon in their garage. Unfortunately, that technology is not far off. In fact, as I stood in the basement of the building watching Shield AI’s quadcopter autonomously navigate from room to room, autonomous target recognition was literally being demonstrated right outside, just above my head.
DEEP LEARNING
The research group asked that they not be named, because the technology was new and untested. They didn’t want to give the impression that it was good enough—that the error rate was low enough—to be used for military applications. Nor, it was clear, were military applications their primary intention in designing the system. They were engineers, simply trying to see if they could solve a tough problem with technology. Could they send a small drone out entirely on its own to autonomously find a crashed helicopter and report its location back to the human?
The answer, it turns out, is yes. To understand how they did it, we need to go deep.
Deep learning neural networks, first mentioned in chapter 5 as one potential solution to improving military automatic target recognition in DARPA’s TRACE program, have been the driving force behind astounding gains in AI in the past few years. Deep neural networks have learned to play Atari, beat the world’s reigning champion at go, and have been behind dramatic improvements in speech recognition and visual object recognition. Neural networks are also behind the “fully automated combat module” that Russian arms manufacturer Kalashnikov claims to have built. Unlike traditional computer algorithms that operate based on a script of instructions, neural networks work by learning from large amounts of data. They are an extremely powerful tool for handling tricky problems that can’t be easily solved by prescribing a set of rules to follow.
Let’s say, for example, that you wanted to write down a rule set for how to visually distinguish an apple from a tomato without touching, tasting, or smelling. Both are round. Both are red and shiny. Both have a green stem on top. They look different, but the differences are subtle and evade easy description. Yet a three-year-old child can immediately tell the difference. This is a tricky problem with a rules-based approach. What neural networks do is sidestep that problem entirely. Instead, they learn from vast amounts of data—tens of thousands or millions of pieces of data. As the network churns through the data, it continually adapts its internal structure until it optimizes to achieve the correct programmer-specified goal. The goal could be distinguishing an apple from a tomato, playing an Atari game, or some other task.
In one of the most powerful examples of how neural networks can be used to solve difficult problems, the Alphabet (formerly Google) AI company DeepMind trained a neural network to play go, a Chinese strategy game akin to chess, better than any human player. Go is an excellent game for a learning machine because the sheer complexity of the game makes it very difficult to program a computer to play at the level of a professional human player based on a rules-based strategy alone.
The rules of go are simple, but from these rules flows vast complexity. Go is played on a grid of 19 by 19 lines and players take turns placing stones—black for one player and white for the other—on the intersection points of the grid. The objective is to use one’s stones to encircle areas of the board. The player who controls more territory on the board wins. From these simple rules comes an almost unimaginably large number of possibilities. There are more possible positions in go than there are atoms in the known universe, making go 10^100 (one followed by a hundred zeroes) times—literally a googol—more complex than chess.
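A rough back-of-the-envelope check on that scale, sketched in Python. Each of the 361 intersections can be empty, black, or white, so 3^361 is an upper bound on board configurations; many of those are not legal positions, so it overcounts. The figure of roughly 10^80 atoms in the observable universe is a commonly cited order-of-magnitude estimate and is an assumption here, not a number from the text.

```python
# Upper bound on go board configurations: each of the 19 x 19 = 361
# intersections is empty, black, or white. Many of these arrangements are
# not legal positions, so this overcounts; it only illustrates the scale.
BOARD_SIZE = 19
intersections = BOARD_SIZE * BOARD_SIZE
upper_bound = 3 ** intersections

# Assumption: about 10^80 atoms in the observable universe (a commonly
# cited order-of-magnitude estimate, not a figure from the text).
ATOMS_ESTIMATE = 10 ** 80

digits = len(str(upper_bound))
print(f"intersections on the board: {intersections}")
print(f"3^{intersections} has {digits} decimal digits")
print(f"exceeds the atom estimate: {upper_bound > ATOMS_ESTIMATE}")
```

Python integers have arbitrary precision, so the bound is computed exactly rather than approximated with floating point.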
Humans at the professional level play go based on intuition and feel. Go takes a lifetime to master. Prior to DeepMind, attempts to build go-playing AI software had fallen woefully short of human professional players. To craft its AI, called AlphaGo, DeepMind took a different approach. They built an AI composed of deep neural networks and fed it data from 30 million games of go. As explained in a DeepMind blog post, “These neural networks take a description of the Go board as an input and process it through 12 different network layers containing millions of neuron-like connections.” Once the neural network was trained on human games of go, DeepMind then took the network to the next level by having it play itself. “Our goal is to beat the best human players, not just mimic them,” as explained in the post. “To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks, and adjusting the connections using a trial-and-error process known as reinforcement learning.” AlphaGo used the 30 million human games of go as a starting point, but by playing against itself could reach levels of game play beyond even the best human players.
This superhuman game play was demonstrated in the 4–1 victory AlphaGo delivered over the world’s top-ranked human go player, Lee Sedol, in March 2016. AlphaGo won the first game solidly, but in game 2 demonstrated its virtuosity. Partway through game 2, on move 37, AlphaGo made a move so surprising, so un-human, that it stunned professional players watching the match. Seemingly ignoring a contest between white and black stones that was under way in one corner of the board, AlphaGo played a black stone far away in a nearly empty part of the board. It was a surprising move not seen in professional games, so much so that one commentator remarked, “I thought it was a mistake.” Lee Sedol was similarly so taken by surprise he got up and left the room. After he returned, he took fifteen minutes to formulate his response. AlphaGo’s move wasn’t a mistake. European go champion Fan Hui, who had lost to AlphaGo a few months earlier in a closed-door match, said at first the move surprised him as well, and then he saw its merit. “It’s not a human move,” he said. “I’ve never seen a human play this move. So beautiful.” Not only did the move feel like a move no human player would ever make, it was a move no human player probably would ever have made. AlphaGo rated the odds that a human would have made that move as 1 in 10,000. Yet AlphaGo made the move anyway. AlphaGo went on to win game 2 and afterward Lee Sedol said, “I really feel that AlphaGo played the near perfect game.” After losing game 3, thus giving AlphaGo the win for the match, Lee Sedol told the audience at a press conference, “I kind of felt powerless.”
AlphaGo’s triumph over Lee Sedol has implications far beyond the game of go. More than just another realm of competition in which AIs now top humans, the way DeepMind trained AlphaGo is what really matters. As explained in the DeepMind blog post, “AlphaGo isn’t just an ‘expert’ system built with hand-crafted rules; instead it uses general machine learning techniques to figure out for itself how to win at Go.” DeepMind didn’t program rules for how to win at go. They simply fed a neural network massive amounts of data and let it learn all on its own, and some of the things it learned were surprising.
In 2017, DeepMind surpassed their earlier success with a new version of AlphaGo. With an updated algorithm, AlphaGo Zero learned to play go without any human data to start. With only access to the board and the rules of the game, AlphaGo Zero taught itself to play. Within a mere three days of self-play, AlphaGo Zero had eclipsed the previous version that had beaten Lee Sedol, defeating it 100 games to 0.
These deep learning techniques can solve a variety of other problems. In 2015, even before DeepMind debuted AlphaGo, DeepMind trained a neural network to play Atari games. Given only the pixels on the screen and the game score as input and told to maximize the score, the neural network was able to learn to play Atari games at the level of a professional human video game tester. Most importantly, the same neural network architecture could be applied across a vast array of Atari games—forty-nine games in all. Each game had to be individually learned, but the same neural network architecture applied to any game; the researchers didn’t need to create a customized network design for each game.
The AIs being developed for go or Atari are still narrow AI systems. Once trained, the AIs are purpose-built tools to solve narrow problems. AlphaGo can beat any human at go, but it can’t play a different game, drive a car, or make a cup of coffee. Still, the tools used to train AlphaGo are generalizable tools that can be used to build any number of special-purpose narrow AIs to solve various problems. Deep neural networks have been used to solve other thorny problems that have bedeviled the AI community for years, notably speech recognition and visual object recognition.
A deep neural network was the tool used by the research team I witnessed autonomously find the crashed helicopter. The researcher on the project explained that he had taken an existing neural network that had already been trained on object recognition, stripped off the top few layers, then retrained the network to identify helicopters, which hadn’t originally been in its image dataset. The neural network he was using was running off of a laptop connected to the drone, but it could just as easily have been running off of a Raspberry Pi, a $40 credit-card sized processor, riding on board the drone itself.
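The retraining step the researcher described can be illustrated with a toy sketch in plain Python. This is not his code and it uses no real image model: `frozen_base` is a hypothetical stand-in for the already-trained layers (here just a fixed feature function that is never updated), and only a new output layer is fitted to a new task. The function names, the ring-shaped dataset, and the training settings are all assumptions made for illustration.

```python
import math
import random


def frozen_base(point):
    """Stand-in for pretrained layers: a fixed feature map, never updated."""
    x, y = point
    return [x, y, x * x, y * y, x * y]


def make_ring_data(n_per_class, seed=0):
    """New task: label 1 for points near the origin, 0 for an outer ring."""
    rng = random.Random(seed)
    data = []
    for label, (r_min, r_max) in ((1, (0.0, 0.8)), (0, (1.2, 2.0))):
        for _ in range(n_per_class):
            radius = rng.uniform(r_min, r_max)
            angle = rng.uniform(0.0, 2.0 * math.pi)
            point = (radius * math.cos(angle), radius * math.sin(angle))
            data.append((point, label))
    return data


def head_predict(head_weights, head_bias, features):
    """New output layer: one sigmoid unit on top of the frozen features."""
    z = sum(w * f for w, f in zip(head_weights, features)) + head_bias
    return 1.0 / (1.0 + math.exp(-z))


def train_new_head(data, epochs=500, learning_rate=0.5):
    """Fit only the new head by gradient descent; the base stays fixed."""
    # The base is frozen, so its features can be computed once up front.
    featurized = [(frozen_base(point), label) for point, label in data]
    n_features = len(featurized[0][0])
    head_weights, head_bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in featurized:
            error = head_predict(head_weights, head_bias, features) - label
            for i, value in enumerate(features):
                grad_w[i] += error * value
            grad_b += error
        n = len(featurized)
        head_weights = [w - learning_rate * g / n
                        for w, g in zip(head_weights, grad_w)]
        head_bias -= learning_rate * grad_b / n
    return head_weights, head_bias


def head_accuracy(head_weights, head_bias, data):
    """Fraction of points whose thresholded prediction matches the label."""
    correct = 0
    for point, label in data:
        prob = head_predict(head_weights, head_bias, frozen_base(point))
        correct += int((prob >= 0.5) == (label == 1))
    return correct / len(data)


new_head_weights, new_head_bias = train_new_head(make_ring_data(100, seed=0))
held_out = make_ring_data(100, seed=1)
print(f"held-out accuracy: "
      f"{head_accuracy(new_head_weights, new_head_bias, held_out):.2f}")
```

In a real system the frozen base would be the lower layers of a downloaded image network rather than a hand-written function. The division of labor is the same: the reused layers supply features, and only the small new layer on top is learned from the new examples.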
All of these technologies are coming from outside the defense sector. They are being developed at places like Google, Microsoft, IBM, and university research labs. In fact, programs like DARPA’s TRACE are not necessarily intended to invent new machine learning techniques, but rather import existing techniques into the defense sector and apply them to military problems. These methods are widely available to those who know how to use them. I asked the researcher behind the helicopter-hunting drone: Where did he get the initial neural network that he started with, the one that was already trained to recognize other images that weren’t helicopters? He looked at me like I was either half-crazy or stupid. He got it online, of course.
NEURAL NETS FOR EVERYONE
I feel I should confess that I’m not a technologist. In my job as a defense analyst, I research military technology to make recommendations about where the U.S. military should invest to keep its edge on the battlefield, but I don’t build things. My undergraduate degree was in science and engineering, but I’ve done nothing even remotely close to engineering since then. To claim my programming skills were rusty would be to imply that at one point in time they existed. The extent of my computer programming knowledge is a one-semester introductory course in C++ in college.
Nevertheless, I went online to check out the open-source software database the researcher pointed me to: TensorFlow. TensorFlow is an open-source AI library developed by Google AI researchers. With TensorFlow, Google researchers have taken what they have been learning with deep neural networks and passed it on to the rest of the world. On TensorFlow, not only can you download already trained neural networks and software for building your own, there are reams of tutorials on how to teach yourself deep learning techniques. For users new to machine learning, there are basic tutorials on classic machine learning problems. These tools make neural networks accessible to computer programmers with little to no experience in machine learning. TensorFlow makes neural networks easy, even fun. A tutorial called Playground (playground.tensorflow.org) allows users to modify and train a neural network through a point-and-click interface in the browser. No programming skills are required at all.
Once I got into Playground, I was hooked. Reading about what neural networks could do was one thing. Building your own and training it on data was entirely another. Hours of time evaporated as I tinkered with the simple network in my browser. The first challenge was training the network to learn to predict the simple datasets used in Playground—patterns of orange and blue dots across a two-dimensional grid. Once I’d mastered that, I worked to make the leanest network I could, composed of the fewest neurons in the fewest number of layers that could still accurately make predictions. (Reader challenge: once you’ve mastered the easy datasets, try the spiral.)
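For readers who would rather see the idea in code than in the browser, here is a minimal sketch of about the leanest possible network: a single artificial neuron trained to separate two clusters of dots on a two-dimensional grid. It is written in plain Python rather than with TensorFlow, and the dataset, function names, and training settings are assumptions chosen for illustration, not Playground’s actual implementation.

```python
import math
import random


def make_clusters(n_per_class, seed=0):
    """Two blobs of dots: label 1 around (2, 2), label 0 around (-2, -2)."""
    rng = random.Random(seed)
    points = []
    for label, center in ((1, 2.0), (0, -2.0)):
        for _ in range(n_per_class):
            x = rng.gauss(center, 0.7)
            y = rng.gauss(center, 0.7)
            points.append(((x, y), label))
    return points


def predict(weights, bias, point):
    """Single sigmoid neuron: estimated probability the dot has label 1."""
    z = weights[0] * point[0] + weights[1] * point[1] + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=200, learning_rate=0.1):
    """Adjust the neuron's two weights and bias by gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for point, label in data:
            error = predict(weights, bias, point) - label
            grad_w[0] += error * point[0]
            grad_w[1] += error * point[1]
            grad_b += error
        weights[0] -= learning_rate * grad_w[0] / n
        weights[1] -= learning_rate * grad_w[1] / n
        bias -= learning_rate * grad_b / n
    return weights, bias


def accuracy(weights, bias, data):
    """Fraction of dots whose thresholded prediction matches the label."""
    correct = 0
    for point, label in data:
        correct += int((predict(weights, bias, point) >= 0.5) == (label == 1))
    return correct / len(data)


train_data = make_clusters(100, seed=0)
weights, bias = train(train_data)
test_data = make_clusters(100, seed=1)
print(f"learned weights: {weights}, bias: {bias}")
print(f"held-out accuracy: {accuracy(weights, bias, test_data):.2f}")
```

No rule for telling the two clusters apart is ever written down. The neuron starts with all-zero parameters and is nudged, pass after pass, toward values that reduce its error on the training dots. Harder patterns such as the spiral need more neurons and layers than this.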
With the Playground tutorial, the concept of neural nets becomes accessible to someone with no programming skills at all. Using Playground is no more complicated than solving an easy-level Sudoku puzzle and within the range of an average seven-year-old. Playground won’t let the user build a custom neural net to solve novel problems. It’s an illustration of what neural nets can do to help users see their potential. Within other parts of TensorFlow, though, lie more powerful tools to use existing neural networks or design custom ones, all within reach of a reasonably competent programmer in Python or C++.
TensorFlow includes extensive tutorials on convolutional neural nets, the particular type of neural network used for computer vision. In short order, I found a neural network available for download that was already trained to recognize images. The neural network Inception-v3 is trained on the ImageNet dataset, a standard database of images used by programmers. Inception-v3 can classify images into one of 1,000 categories, such as “gazelle,” “canoe,” or “volcano.” As it turns out, none of the categories Inception-v3 is trained on are those that could be used to identify people, such as “human,” “person,” “man,” or “woman.” So one could not, strictly speaking, use this particular neural network to power an autonomous weapon that targets people. Still, I found this to be little consolation. ImageNet isn’t the only visual object classification database used for machine learning online and others, such as the Pascal Visual Object Classes database, include “person” as a category. It took me all of about ten seconds on Google to find trained neural networks available for download that could find human faces, determine age and gender, or label human emotions. All of the tools to build an autonomous weapon that could target people on its own were readily available online.
This was, inevitably, one of the consequences of the AI revolution. AI technology was powerful. It could be used for good purposes or bad purposes; that was up to the people using it. Much of the technology behind AI was software, which meant it could be copied practically for free. It could be downloaded at the click of a button and could cross borders in an instant. Trying to contain software would be pointless. Pandora’s box has already been opened.
ROBOTS EVERYWHERE
Just because the tools needed to make an autonomous weapon were widely available didn’t tell me how easy or hard it would be for someone to actually do it. What I wanted to understand was how widespread the technological know-how was to build a homemade robot that could harness state-of-the-art techniques in deep learning computer vision. Was this within reach of a DIY drone hobbyist or did these techniques require a PhD in computer science?
There is a burgeoning world of robot competitions among high school students, and this seemed like a great place to get a sense of what an amateur robot enthusiast could do. The FIRST Robotics Competition is one such competition that includes 75,000 students organized in over 3,000 teams across twenty-four countries. To get a handle on what these kids might be able to do, I headed to my local high school.
Less than a mile from my house is Thomas Jefferson High School for Science and Technology—“TJ,” for short. TJ is a math and science magnet school; kids have to apply to get in, and they are afforded opportunities above and beyond what most high school students have access to. But they’re still high school students—not world-class hackers or DARPA whizzes.
In the Automation and Robotics Lab at TJ, students get hands-on experience building and programming robots. When I visited, two dozen students sat at workbenches hunched over circuit boards or silently tapping away at computers. Behind them on the edges of the workshop lay discarded pieces of robots, like archeological relics of students’ projects from semesters prior. On a shelf sat “Roby Feliks,” the Rubik’s Cube solving robot. Nearby, a Raspberry Pi processor sat atop a plastic musical recorder, wires running from the circuit board to the instrument like some musical cyborg. Somewhat randomly in the center of the floor sat a half-disassembled robot, the remnants of TJ’s admission to the FIRST competition that year. Charles Dela Cuesta, the teacher in charge of the lab, apologized for the mess, but it was exactly what I imagined a robot lab should look like.
Dela Cuesta came across as the kind of teacher you pray your own children have. Laid back and approachable, he seemed more like a lovable assistant coach than an aloof disciplinarian. The robotics lab had the feel of a place where students learn by doing, rather than sitting and copying down equations from a whiteboard.
Which isn’t to say that there wasn’t a whiteboard. There was. It sat in a corner amid a pile of other robotic projects, with circuit boards and wires draped over it. Students were designing an automatic whiteboard with a robot arm that could zip across the surface and sketch out designs from a computer. On the whiteboard were a series of inhumanly straight lines sketched out by the robot. It was at this point that I wanted to quit my job and sign up for a robotics class at TJ.
Dela Cuesta explained that all students at TJ must complete a robotics project in their freshman year as part of their required coursework. “Every student in the building has had to design a small robot that is capable of navigating a maze and performing some sort of obstacle avoidance,” he said. Students are given a schematic of what the maze looks like so they get to choose how to solve the problem, whether to preprogram the robot’s moves or take the harder path of designing an autonomous robot that can figure it out on its own. After this required class, TJ offers two additional semesters of robotics electives, which can be complemented with up to five computer science courses in which students learn Java, C++, and Python. These are vital programming tools for using robot control systems, like the Raspberry Pi processor, which runs on Linux and takes commands in Python. Dela Cuesta explained that even though most students come into TJ with no programming experience, many learn fast and some even take computer science courses over the summer to get ahead. “They can pretty much program in anything—Java, Python. . . . They’re just all over the place,” he said.
Their senior year, all students at TJ must complete a senior project in an area of their choosing. Some of the most impressive robotics projects are those done by seniors who choose to make robotics their area of focus. Next to the whiteboard stood a bicycle propped up on its kickstand. A large blue box sat inside the frame, wires snaking out of it to the gear shifters. Dela Cuesta explained it was an automatic gear shifter for the bike. The box senses when it is time to shift and does so automatically, like an automatic transmission on a car.