You Look Like a Thing and I Love You


by Janelle Shane


  The first thing we do is come up with the bits that the evolutionary algorithm can vary, deciding what about our robot we want to be constant and what the algorithm is free to play with. We could make these variable elements very limited, with a fixed body design, and just allow the program to change the way the robot moves around. Or we could allow the algorithm to build a body design completely from scratch, starting from random blobs. Let’s say that the owners of this building are insisting on a humanlike robot design for sci-fi-aesthetic reasons. No messy jumble of crawling blocks (which is what an evolutionary algorithm’s creatures tend to look like, given absolute freedom). Within a basic humanlike form, there’s still a lot we could vary, but let’s keep it simple and say that the algorithm will be allowed to vary the size and shape of a few basic body parts, with each one having a simple range of motion. In evolutionary terms, this is the robot’s genome.
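
  If you're curious what a genome like this might look like in code, here's a minimal Python sketch. Every part name and numeric range in it is invented for illustration:

```python
import random

# Hypothetical genome: each body part gets a size and a range of motion.
# The part names and numeric ranges are invented for illustration.
PARTS = ["torso", "left_arm", "right_arm", "left_leg", "right_leg", "head"]

def random_genome():
    """One robot's genome: a dictionary of per-part parameters."""
    return {
        part: {
            "length": random.uniform(0.2, 1.0),     # meters
            "width": random.uniform(0.05, 0.3),     # meters
            "joint_range": random.uniform(10, 170),  # degrees of motion
        }
        for part in PARTS
    }
```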

  The next thing we need to do is define the problem we’re trying to solve in such a way that there’s a single number we can optimize. In evolutionary terms, this number is the fitness function: a single number that will describe how fit an individual robot is for our task. Since we’re trying to build a robot that can direct humans down one hallway or the other, let’s say that we’re trying to minimize the number of humans that take the left-hand fork. The closer that number is to zero, the higher the fitness.
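
  In code, the fitness function can be tiny. A sketch, assuming a hypothetical simulate_hallway function that stands in for the simulator described next and reports how many simulated people took the left-hand fork:

```python
def fitness(genome, simulate_hallway, num_walkers=100):
    """Higher is fitter. simulate_hallway is a hypothetical stand-in
    for the simulator described below: it builds the robot from its
    genome, runs the hallway, and returns how many simulated people
    took the left-hand fork."""
    left_count = simulate_hallway(genome, num_walkers)
    return -left_count  # fewer people on the left = higher fitness
```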

  We’ll also need a simulation, because there’s no way we’re building thousands of robots to order or hiring people to walk down a hall thousands of times. (Not using real humans is also a safety consideration—for reasons that will be clear later.) So let’s say it’s a simulated hall in a world with simulated gravity and friction and other simulated physics. And of course we need simulated people with simulated behaviors, including walking, lines of sight, crowding, and various phobias, motivations, and levels of cooperativeness. The simulation itself is a really hard problem, so let’s just say we’ve solved it already. (Note: in actual machine learning, it’s never this easy.)

  One handy way of getting a ready-made simulation that can train an AI is to use video games. That’s partly why there are so many researchers training AIs to play Super Mario Bros. or old Atari games—these old video games are small, quick-to-run programs that can test various problem-solving skills. Just like human video-game players, though, AIs tend to find and exploit bugs in the games. More about this in chapter 5.

  We let the algorithm randomly create our first generation of robots. They’re… very random. A typical generation produces hundreds of robots, each with a different body design.

  Then we test each robot individually in our simulated hallway. They don’t do well. People walk right past them as they flop on the ground and flail ineffectually. Maybe one of them falls a bit more to the left than the others and blocks that hallway slightly, and a few of the more timid humans decide to take the right hallway instead. It scores slightly better than the other robots.

  Now it’s time to build the next generation of robots. First, we’ll choose which robots are going to survive to reproduce. We could save just the very best robot, but that would make the population pretty uniform, and we wouldn’t get to try out some other robot designs that might end up being better if evolution gets a chance to tweak them. So we’ll save some of the best robots and throw out the rest.

  Next, we have lots of choices about how the surviving robots are going to reproduce. They can’t simply make identical copies of themselves, because we want them to be evolving toward something better. One option we have is mutation: pick a random robot and randomly vary something about it.
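
  As a sketch, reusing the dictionary genome from earlier (the mutation size is just an invented knob):

```python
import copy
import random

def mutate(genome, sigma=0.1):
    """Copy a parent, then randomly nudge one random parameter
    of one random body part."""
    child = copy.deepcopy(genome)
    part = random.choice(list(child))
    param = random.choice(list(child[part]))
    child[part][param] *= 1 + random.gauss(0, sigma)
    return child
```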

  Another option we might decide to use is crossover: two robots produce offspring that are random combinations of the two parents.
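
  Here's a sketch of one common flavor of crossover, in which each body part is inherited wholesale from one randomly chosen parent:

```python
import copy
import random

def crossover(mom, dad):
    """Each body part comes, intact, from a randomly chosen parent."""
    return {part: copy.deepcopy(random.choice([mom, dad])[part])
            for part in mom}
```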

  We also have to decide how many offspring each robot can have (should the most successful robots have the most offspring?), which robots can cross with which other robots (or if we use crossover at all), and whether we’re going to replace all the dead robots with offspring or with a few randomly generated robots. Tweaking all these options is a big part of building an evolutionary algorithm, and sometimes it’s hard to guess which options—which hyperparameters—are going to work best.
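
  Put together, one generation cycle might look like the sketch below. Every hyperparameter value here is an invented guess, which is rather the point; fitness is assumed to be a one-argument scoring function with the simulator already baked in (say, via functools.partial):

```python
import random

# Invented hyperparameters; finding good values is mostly trial and error.
POPULATION_SIZE = 200
SURVIVOR_FRACTION = 0.2    # keep the best 20 percent
CROSSOVER_RATE = 0.5       # half the offspring get two parents
FRESH_RANDOM_PER_GEN = 5   # plus a few brand-new random robots

def next_generation(population, fitness, random_genome, mutate, crossover):
    """One cycle: score everyone, keep the best, refill with offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[:int(SURVIVOR_FRACTION * len(ranked))]
    children = [random_genome() for _ in range(FRESH_RANDOM_PER_GEN)]
    while len(survivors) + len(children) < POPULATION_SIZE:
        if random.random() < CROSSOVER_RATE:
            mom, dad = random.sample(survivors, 2)
            children.append(crossover(mom, dad))
        else:
            children.append(mutate(random.choice(survivors)))
    return survivors + children
```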

  Once we’ve built the new generation of robots, the cycle begins again as we test their crowd-controlling abilities in the simulation. More of them are now flopping over to the left because they’re descended from that first marginally successful robot.

  After many more generations of robots, some distinct crowd-control strategies start to emerge. Once the robots learn to stand up, the original “fall to the left and be kinda in the way” strategy has evolved into a “stand in the left hallway and be even more annoying” strategy. Another strategy also emerges—the “point vigorously to the right” strategy. But none of the strategies is perfectly solving our problem yet: each robot is still letting plenty of people leak into the left hallway.

  After many more generations, a robot emerges that is very good at preventing people from entering the left hallway. Unfortunately, by a stroke of bad luck, it just so happens that the solution it found was “murder everyone.” Technically that solution works because all we told it to do was minimize the number of people entering the left hallway.

  Because of a problem with our fitness function, evolution directed the algorithm toward a solution that we hadn’t anticipated. Unfortunate shortcuts happen in machine learning all the time, although not usually this dramatically. (Fortunately for us, in real life, “kill all humans” is usually very impractical. The message here: don’t give autonomous algorithms deadly weapons.) Still, this is why we used simulated humans rather than real humans in our thought experiment.

  We’ll have to start over again, this time with a fitness function that, rather than minimizing the number of humans in the left-hand hallway, maximizes the number of humans who take the right-hand hallway.

  Actually, we can take a (somewhat gory) shortcut and just change the fitness function rather than completely starting over. After all, our robots have learned many useful skills besides murdering people. They’ve learned to stand, detect people, and move their arms in a scary manner. Once our fitness function changes to maximizing the number of survivors who enter the right-hand hallway, the robots should quickly learn to forsake their murdering ways. (Recall that this strategy of reusing a solution from a different but related problem is called transfer learning.)
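
  In the sketch's terms, the swap is tiny: we keep the evolved population and change only the scoring. Here the hypothetical simulate_hallway is assumed to report counts for both hallways:

```python
def fitness_v2(genome, simulate_hallway, num_walkers=100):
    """New objective: maximize people who make it (alive) into the
    right-hand hallway, rather than minimizing traffic on the left."""
    counts = simulate_hallway(genome, num_walkers)
    return counts["right_hallway_survivors"]

# Transfer learning, evolution-style: reuse the evolved population and
# keep calling next_generation(), just with fitness_v2 as the scorer.
```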

  So we start with the group of murdering robots and sneakily swap the fitness function on them. Suddenly, murdering isn’t working very well at all, and they don’t know why. In fact, the robot that was the worst at murdering is now at the top of the heap, because some of its screaming victims managed to escape down the right-hand hallway. Over the next few generations, the robots quickly become ever worse at murdering.

  Eventually, maybe they only look like they might want to murder you, which would scare most humans into entering the right-hand hallway. By starting with a population of murderbots, we do restrict the path that evolution is likely to take. Had we started over instead, we might have evolved robots that stood at the end of the right-hand hallway and beckoned people, or even robots whose hands evolved into signs that said FREE COOKIES. (The “free cookies” robot would be hard to evolve, though, because getting the sign only partially right wouldn’t work at all, and it would be hard to reward a solution that was merely getting close. In other words, it’s a needle-in-the-haystack solution.)

  All murderbots aside, the most likely path that evolution would have taken is the “fall down and be in the way” robot getting ever more annoyingly in the way. (Falling down is pretty easy to do, so if an evolved robot can solve a problem by falling down, it will tend to do that.) Through that path we may arrive at a robot that solves the problem perfectly by causing 100 percent of humans to enter the right-hand hallway (murdering none of them in the process). The robot looks like this:

  Yes, we have evolved: a door.

  That’s the other thing about AI. It can sometimes be a needlessly complicated substitute for a commonsense understanding of the problem.

  Evolutionary algorithms are used to evolve all kinds of designs, not just robots. Car bumpers that dissipate force when they crumple, proteins that bind to other medically useful proteins, flywheels that spin just so—these are all problems that people have used evolutionary algorithms to solve. The algorithm doesn’t have to stick to a genome that describes a physical object, either. We could have a car or bicycle with a fixed design and a control program that evolves. I mentioned earlier that the genome can even be the weights of a neural network or the arrangement of a decision tree. Different kinds of machine learning algorithms are often combined like this, each playing to its strength.

  When we consider the huge array of life that has arisen on our planet via evolution, we get an idea of the magnitude of possibility that’s available to us by using virtual evolution at a massively accelerated speed. Just as real-life evolution has managed to produce wonderfully complex creatures and allow them to take advantage of the weirdest, most specific food sources, evolutionary algorithms continue to surprise and delight us with their ingenuity. Of course, sometimes evolutionary algorithms can be a little too creative—as we’ll see in chapter 5.

  GENERATIVE ADVERSARIAL NETWORKS (GANS)

  AIs can do amazing things with images, turning a summer scene into a winter one, generating faces of imaginary people, or changing a photo of someone’s cat into a cubist painting. These showy image-generating, image-remixing, and image-filtering tools are usually the work of GANs (generative adversarial networks). They’re a subvariety of neural networks, but they deserve their own mention. Unlike the other kinds of machine learning in this chapter, GANs haven’t been around very long—they were only introduced by Ian Goodfellow and other Université de Montréal researchers in 2014.10

  The key thing about GANs is they’re really two algorithms in one—two adversaries that learn by testing each other. One, the generator, tries to imitate the input dataset. The other, the discriminator, tries to tell the difference between the generator’s imitation and the real thing.

  To see why this is a helpful way of training an image generator, let’s go through a hypothetical example. Suppose we want to train a GAN to generate images of horses.

  The first thing we’ll need is lots of example pictures of horses. If they all show the same horse in the same pose (maybe we’re obsessed with that particular horse), the GAN will learn more quickly than if we give it a huge variety of colors and angles and lighting conditions. We can also simplify things by using a plain, consistent background. Otherwise the GAN will spend a long time trying to learn when and how to draw fences, grass, and parades. Most of the GANs that can generate photorealistic faces, flowers, and foods were given very limited, consistent datasets—pictures of just cat faces, for example, or bowls of ramen photographed only from the top. A GAN trained just on photos of tulip heads may produce very convincing tulips but will have no idea about other kinds of flowers or even any concept that tulips have leaves or bulbs. A GAN that can generate photorealistic human head shots won’t know what’s below the neck, what’s on the back of the head, or even that human eyes can close. So this is all to say that if we’re going to make a horse-generating GAN, we’ll have better success if we make its world a very simple one and only give it pictures of horses photographed from the side against a plain white background. (Conveniently, this is also about the extent of my drawing ability.)

  Now that we have our dataset (or, in our case, now that we’ve imagined one), we’re ready to start training the two parts of the GAN, the generator and the discriminator. We want the generator to look at our set of horse pictures and figure out some rules that will let it make pictures similar to them. Technically, what we are asking the generator to do is warp random noise into pictures of horses—that way, we can get it to generate not just one single horse picture but a different horse for every random noise pattern.

  At the beginning of the training, though, the generator hasn’t learned any rules about drawing horses. It starts with our random noise and does something random to it. As far as it knows, that is how you draw a horse.

  How can we give the generator useful feedback on its terrible drawings? Since this is an algorithm, it needs feedback in the form of a number, some kind of quantitative rating that the generator can work on improving. One useful metric would be the percentage of instances in which it makes a drawing that’s so good that it looks just like a real horse. A human could easily judge this—we’re pretty good at telling the difference between a smear of fur and a horse. But the training process is going to require many thousands of drawings, so it’s impractical to have a human judge rate them all. And a human judge would be too harsh at this stage—they would look at two of the generator’s scribbles and rate them both as “not horse,” even if one of them is actually ever so imperceptibly more horselike than the other. If we give the generator feedback on how often it manages to fool a human into thinking one of its drawings is real, then it will never know if it’s making progress because it will never fool the human.

  This is where the discriminator comes in. The discriminator’s job is to look at the drawings and decide if they’re real horses from the training set. At the beginning of training, the discriminator is just about as awful at its job as the generator is: it can barely tell the difference between the generator’s scribbles and the real thing. The generator’s almost imperceptibly horselike scribbles might actually succeed in fooling the discriminator.

  Through trial and error, both the generator and the discriminator get better.
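
  That trial and error can be written down as a short training loop. Here's a minimal sketch in PyTorch; the network shapes and sizes are invented (64-by-64 grayscale horses, flattened into vectors), and real_images stands in for a batch from our horse dataset:

```python
import torch
from torch import nn

NOISE_DIM, IMG_DIM = 100, 64 * 64   # invented sizes

generator = nn.Sequential(          # warps noise into a "horse"
    nn.Linear(NOISE_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh())
discriminator = nn.Sequential(      # outputs P(image is a real horse)
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_images):
    batch = real_images.size(0)
    fakes = generator(torch.randn(batch, NOISE_DIM))

    # Discriminator's turn: call the real horses real, the fakes fake.
    d_opt.zero_grad()
    d_loss = (loss_fn(discriminator(real_images), torch.ones(batch, 1)) +
              loss_fn(discriminator(fakes.detach()), torch.zeros(batch, 1)))
    d_loss.backward()
    d_opt.step()

    # Generator's turn: try to get its fakes labeled as real.
    g_opt.zero_grad()
    g_loss = loss_fn(discriminator(fakes), torch.ones(batch, 1))
    g_loss.backward()
    g_opt.step()
```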

  The GAN is, in a way, using its generator and discriminator to perform a Turing test in which it is both judge and contestant. The hope is that by the time training is over, it’s generating horses that would fool a human judge as well.

  Sometimes people will design GANs that don’t try to match the input dataset exactly but instead try to make something “similar but different.” For example, some researchers designed a GAN to produce abstract art, but they wanted art that wasn’t a boring knockoff of the art in the training data. They set up the discriminator to judge whether the art was like the training data yet not identifiable as belonging to any particular category. With these two somewhat contradictory goals, the GAN managed to straddle the line between conformity and innovation.11 And consequently, its images were popular—human judges even rated the GAN’s images more highly than human-painted images.

  MIXING, MATCHING, AND WORKING TOGETHER

  We learned that GANs work by combining two algorithms—one that generates images and one that classifies images—to reach a goal.

  In fact, a lot of AIs are made of combinations of other, more specialized machine learning algorithms.

  Microsoft’s Seeing AI app, for example, is designed for people with vision impairments. Depending on which “channel” a user selects, the app can do things like

  • recognize what’s in a scene and describe it aloud,

  • read text held up to a smartphone’s camera,

  • read denominations of currency,

  • identify people and their emotions, and

  • locate and scan bar codes.

  Each one of these functions—including its crucial text-to-speech function—is likely powered by an individually trained AI.

  Artist Gregory Chatonsky used three machine learning algorithms to generate paintings for a project called It’s Not Really You.12 One algorithm was trained to generate abstract art, and another algorithm’s job was to transform the first algorithm’s artwork into various painterly styles. Finally, the artist used an image recognition algorithm to give the images titles such as Colorful Salad, Train Cake, and Pizza Sitting on a Rock. The final artwork was a multialgorithm collaboration planned and orchestrated by the artist.

  Sometimes the algorithms are even more tightly integrated, using multiple functions at once without human intervention. For example, researchers David Ha and Jürgen Schmidhuber used evolution to train an algorithm inspired by the human brain to play one level of the computer game Doom.13 The algorithm consisted of three algorithms working together. A vision model was in charge of perceiving what was going on in the game—were there fireballs in view? Were there walls nearby? It transformed the 2-D image of pixels into the features it had decided were important to keep track of. The second model, a memory model, was in charge of trying to predict what would happen next. Just as the text-generating RNNs in this book look at past history to predict what letter or word is likely to come next, the memory model was an RNN that looked at previous moments in the game and tried to predict what would happen next. If there had been a fireball moving to the left a few moments earlier, it’s probably going to still be there in the next image, just a bit farther to the left. If the fireball had been getting bigger, it’s probably going to continue to get bigger (or it may hit the player and cause a huge explosion). Finally, the third algorithm was the controller, whose job was to decide what actions to take. Should it dodge to the left to avoid being hit by the fireball? Maybe that would be a good idea.
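
  As a rough schematic of that three-part division of labor (in PyTorch, with invented sizes; the actual paper used a variational autoencoder for vision, a mixture-density RNN for memory, and a tiny linear controller trained by evolution):

```python
import torch
from torch import nn

# All sizes are invented; this is a schematic, not the paper's models.
LATENT, HIDDEN, ACTIONS = 32, 256, 3

vision = nn.Sequential(                  # pixels -> compact features
    nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
    nn.Flatten(), nn.LazyLinear(LATENT))
memory = nn.LSTM(LATENT + ACTIONS, HIDDEN)        # predicts what's next
controller = nn.Linear(LATENT + HIDDEN, ACTIONS)  # decides what to do

def act(frame, prev_action, lstm_state):
    z = vision(frame)                             # what do I see?
    rnn_in = torch.cat([z, prev_action], dim=-1)
    h, lstm_state = memory(rnn_in.unsqueeze(0), lstm_state)  # what's next?
    action_logits = controller(torch.cat([z, h.squeeze(0)], dim=-1))
    return action_logits, lstm_state
```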

  So the three parts worked together to see fireballs, realize they were approaching, and dodge out of the way. The researchers chose each subalgorithm’s form so that it would be optimized for its specific task. This makes sense, since we learned in chapter 2 that machine learning algorithms do best when they have a very narrow task to work on. Choosing the correct form for a machine learning algorithm, or breaking a problem into tasks for subalgorithms, is a key way programmers can design for success.

 
