CHAPTER 8
Is an AI brain like a human brain?
Machine learning algorithms are just lines of computer code, but as we’ve seen, they can do things that seem very human—learning by testing strategies, taking lazy shortcuts to solve problems, or avoiding the test altogether by deleting the answers. Furthermore, the designs of many machine learning algorithms are inspired by real-life examples. As we learned in chapter 3, neural networks are loosely based on the neurons of the human brain, and evolutionary algorithms are based on biological evolution. It turns out that many of the phenomena that turn up in brains or in living organisms also turn up in the AIs that imitate them. Sometimes they even emerge independently, without a programmer deliberately programming them in.
AI DREAM WORLDS
Picture throwing a sandwich hard against the wall. (If it helps, picture it as one of the terrible rejected sandwiches in chapter 3.) If you concentrate, you’ll probably be able to vividly picture every step of the process: the smooth or knobbly feel of the bread slices between your fingers; the texture of the crust if you’re chucking a baguette or a roll. You can probably picture how much the bread will give under your fingers—maybe your fingers will be pressed into it a little bit, but they won’t go all the way through. You may also picture the trajectory your arm makes as you draw back for the throw and the point in the swing at which you’ll release the sandwich. You know that it’ll leave your hand under its own momentum and that it might wobble or spin slightly as it flies through the air. You can even predict where it’ll hit the wall, how hard, how the bread might deform or split, and what will happen to the filling. You know that it won’t rise like a balloon or disappear or flash green and orange. (Well, not unless it’s a peanut butter, helium, and alien-artifact sandwich.)
In short, you have internal models of sandwiches, the physics of throwing things, and walls. Neuroscientists have studied these internal models, which govern our perceptions of the world and our predictions about the future. When a batter swings at a ball, the swing is already committed before the ball is even halfway to the plate: there isn't enough time to watch the whole pitch, process what the eyes see, and get the right commands to the muscles. Instead of judging the flight of the ball as it comes in, the batter relies on an internal model of how a pitch behaves to time their swing. Many of our fastest reflexes work the same way, relying on internal models to predict the best reaction.
People who build AIs to navigate real or simulated landscapes, or to solve other tasks, often set them up with internal models as well. Part of the AI may be designed to observe the world, extract the important bits of information, and use them to build or update the internal model. Another part of the AI will use the model to predict what will happen if it takes various actions. Yet another part of the AI will decide which outcome is the best. As the AI trains, it gets better at all three tasks. Humans learn in a very similar way—constantly making and updating assumptions about the world around them.
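To make that division of labor concrete, here's a rough sketch in Python of what such an agent loop might look like. Everything in it (the toy perceive, world_model, and score functions) is a made-up stand-in for illustration, not code from any real system.

```python
# A toy sketch of an agent with an internal model. The three functions below
# stand in for the three trained parts described above; they are invented for
# illustration, not taken from any real system.

def perceive(observation):
    # Observe the world and boil it down to the important bits
    # (here, just a crude numeric summary).
    return sum(observation) / len(observation)

def world_model(features, action):
    # Predict what the world will look like if we take this action
    # (here, a toy linear guess).
    return features + 0.1 * action

def score(predicted_state):
    # Judge how good a predicted outcome is (here, closer to zero is better).
    return -abs(predicted_state)

def choose_action(observation, candidate_actions):
    # Imagine each action's outcome with the internal model, then pick the best.
    features = perceive(observation)
    return max(candidate_actions, key=lambda a: score(world_model(features, a)))

print(choose_action(observation=[0.2, 0.6, 0.4], candidate_actions=[-1, 0, 1]))
```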
Some neuroscientists believe that dreaming is a way of using our internal models for low-stakes training. Want to test out scenarios for escaping from an angry rhinoceros? It is far safer to test them out in a dream than by poking at a real rhino. Based on this principle, machine learning programmers sometimes use dream training to help their algorithms learn faster. In chapter 3, we looked at an algorithm—really three AIs in one—whose goal was to stay alive as long as possible in one level of the computer game Doom.1 By combining visual perception of the game screen, memory of what happened in the past, and a prediction of what will happen next, the programmers built an algorithm that could make an internal model of the game level and use it to decide what to do. Just as in the example of the human baseball player, internal models are some of our best tools for training algorithms to learn to take action.
The particular twist here, however, was having the AI train not in the real game but inside the model itself—that is, having the AI test out new strategies in its own dream version of the game rather than the real thing. There are some advantages to doing it this way: because the AI has mostly learned to build its model out of the most important details, the dream version is less computationally intensive to run. This process also speeds up training because the AI can focus on these important details and ignore the rest. Unlike human dreaming, AI dreaming allows us to look at the internal model, as if we were peeking into the AI’s dream. What we see is a sketchy, blurry version of the game level. We can gauge how important the AI finds each feature of the game by the detail with which it’s rendered in the dream world. In this case, the fireball-throwing monsters are barely sketched in, but the fireballs themselves are rendered in realistic detail. The brick patterns on the walls, interestingly, are also there in the internal model—perhaps they’re important for judging how close to the wall a player is.
And sure enough, in this pared-down version of the universe, the AI can hone its prediction-making and decision-making skills, eventually getting good enough to avoid most of the fireballs. The skills it learns in the dream world are transferrable to the real computer game as well, so it gets better at the real thing by training in its internal model.
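In outline, the dream-training loop looks something like the sketch below. This is not the actual Doom system (which used learned neural networks for every piece); the world_model and controller objects here are hypothetical stand-ins, and the point is only that the real game never gets run while the controller practices.

```python
import random

# A rough outline of dream training: the controller practices inside the learned
# world model instead of the real game. The world_model and controller objects
# are hypothetical stand-ins, not the actual Doom-playing system.

def dream_rollout(world_model, controller, start_state, steps=100):
    """Play one imagined episode entirely inside the internal model."""
    state, total_reward = start_state, 0.0
    for _ in range(steps):
        action = controller.choose(state)
        # The internal model predicts the next state and the reward,
        # so the real game is never queried during this loop.
        state, reward = world_model.predict(state, action)
        total_reward += reward
    return total_reward

def train_in_dream(world_model, controller, start_states, iterations=1000):
    for _ in range(iterations):
        start = random.choice(start_states)
        reward = dream_rollout(world_model, controller, start)
        # Nudge the controller toward strategies that scored well in the dream.
        controller.update(reward)
```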
Not all the AI’s dream-tested strategies worked in the real world, however. One of the things it learned was how to hack its own dream—just like all those AIs in chapter 6 that hacked their simulations. By moving in a certain way, the AI discovered that it could exploit a glitch in its internal model that would prevent the monsters from firing any fireballs at all. This strategy, of course, failed in the real world. Human dreamers can sometimes be similarly disappointed when they wake and discover they can no longer fly.
REAL BRAINS AND FAKE BRAINS THINKING ALIKE
The Doom-playing AI had an internal model of the world because its programmers chose to design it with one. But there are cases in which neural networks have independently arrived at some of the same strategies that neuroscientists have discovered in animal brains.
In 1997, researchers Anthony Bell and Terrence Sejnowski trained a neural network to look at various natural scenes (“trees, leaves, and so on”) and see what features it could detect. Nobody told it what specifically to look for, just that it should separate out things that were different. (This kind of free-form analysis of a dataset is called unsupervised learning.) The network ended up spontaneously developing a bunch of edge-detecting and pattern-detecting filters that resemble the kinds of filters scientists have found in human and other mammalian vision systems. Without being specifically told to do so, the artificial neural network arrived at some of the same visual processing tricks that animals use.2
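If you want to see something in this spirit for yourself, here's a small sketch using scikit-learn. It isn't Bell and Sejnowski's original code, just the same general recipe: grab small patches from a photograph, run independent component analysis on them, and look at the components it finds. On natural images, those components tend to come out looking like little edge detectors.

```python
import numpy as np
from sklearn.datasets import load_sample_image
from sklearn.decomposition import FastICA

# A sketch in the spirit of Bell & Sejnowski's experiment (not their code):
# sample patches from a natural scene and let ICA find whatever structure
# separates them. The learned components tend to resemble edge detectors.

image = load_sample_image("china.jpg").mean(axis=2)   # grayscale natural scene
rng = np.random.default_rng(0)

# Sample 5,000 random 8x8 patches and flatten each into a row.
patches = []
for _ in range(5000):
    r = rng.integers(0, image.shape[0] - 8)
    c = rng.integers(0, image.shape[1] - 8)
    patches.append(image[r:r + 8, c:c + 8].ravel())
patches = np.array(patches)
patches -= patches.mean(axis=0)

ica = FastICA(n_components=16, random_state=0, max_iter=500)
ica.fit(patches)
filters = ica.components_.reshape(16, 8, 8)   # each component is an 8x8 filter
print(filters.shape)
```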
There have been other cases like this. Google DeepMind researchers discovered that when they built algorithms that were supposed to learn to navigate, they spontaneously developed grid-cell representations that resemble those in some mammal brains.3
Even brain surgery works on neural networks, in a manner of speaking. Remember that in chapter 3 I described how researchers looked at the neurons in an image-generating neural network (a GAN) and were able to identify individual neurons that generated trees, domes, bricks, and towers. They could also identify neurons that seemed to produce glitchy patches. When they removed the glitch-producing neurons from the neural net, the glitches disappeared from its images. They also found that they could deactivate the neurons that were generating certain objects, and, sure enough, those objects would disappear from the images.4
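This kind of surgery boils down to silencing particular units and re-running the network. As a minimal sketch (not the researchers' actual code, and using a tiny stand-in layer rather than a GAN), here's how you might zero out one unit's activations in PyTorch with a forward hook:

```python
import torch
import torch.nn as nn

# A minimal sketch of "ablating" one unit: a forward hook zeroes out that
# unit's activations. The tiny Linear layer is a stand-in for a layer inside
# a GAN's generator; this is not the researchers' actual code.

layer = nn.Linear(4, 4)
unit_to_ablate = 2   # pretend this is the "glitch-producing" unit

def ablate(module, inputs, output):
    output = output.clone()
    output[:, unit_to_ablate] = 0.0   # silence that unit's contribution
    return output

handle = layer.register_forward_hook(ablate)
x = torch.randn(1, 4)
print(layer(x))      # the ablated unit's output is now zero
handle.remove()      # take the hook back out when we're done
```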
CONVERGENT EVOLUTION
Virtual nervous systems aren’t the only things that can resemble their real-life counterparts. Digital versions of evolution can come up with behaviors that have also evolved in real organisms—like cooperation, competition, deception, predation, and even parasitism. Even some of the strangest strategies of digitally evolved AIs have been found to have real-life equivalents.
In one virtual arena called PolyWorld, where simulated organisms could compete for food and resources, some creatures evolved the rather grim strategy of eating their children. Producing children consumed no resources in that world, but the children were a free source of food.5 And yes, real-life organisms have evolved a version of this as well. Some insects, amphibians, fish, and spiders produce unfertilized trophic eggs specifically for their offspring to eat. Sometimes the eggs are supplemental food, and in other cases, as in the case of Canthophorus niveimarginatus, a burrowing bug, the young are dependent on the eggs as a food source.6 Some ants and bees even produce trophic eggs as food for their queens. It's not just eggs that are consumed by their siblings. Some sharks give birth to live young—and the ones that make it to birth survived by eating their siblings in utero.
CATASTROPHIC FORGETTING
Remember from chapter 2 that the narrower its task, the smarter an AI seems. And you can’t start with artificial narrow intelligence, teach it to do task after task, and end up with artificial general intelligence. If we try to teach a narrow AI a second task, it’ll forget the first one. You’ll end up with a narrow AI that has only learned whatever you taught it last.
I see this in action all the time when I’m training text-generating neural networks.
For example, here’s some output from a neural net I trained on a bunch of Dungeons & Dragons spells. It did its job pretty well—these are pronounceable, plausible spells and might even fool people into thinking they’re real. (I did search through the output for the best ones.)
Find Faithful
Entangling Stone
Bestow Missiles
Energy Secret
Resonating Mass
Mineral Control Spell
Holy Ship
Night Water
Feather Fail
Hail to the Dave
Delay Tail
Stunker’s Crack
Combustive Blaps
Blade of the Darkstone
Distracting Sphere
Love Hatter
Seed of Dance
Protection of Person of Ability
Undead Snow
Curse of King of Furch
Then I trained the same neural network on a new dataset: the names of pie recipes. Would I get a neural net that could produce both pies and spells? After just a little bit of training, it did look as if that might be starting to happen: the D&D spells began to take on a distinctive flavor.
Discern Pie
Detect Cream
Tart of Death
Summon Fail Pie
Death Cream Swarm
Easy Apple Cream Tools
Bear Sphere Transport Pie
Crust Hammer
Glow Cream Pie
Switch Minor Pie
Wall of Tart
Bomb Cream Pie
Crust Music
Arcane Chocolation
Tart of Nature
Mordenkainen’s Pie
Rary’s Or Tentacle Cheese Cruster
Haunting Pie
Necroppostic Crostility
Tartle of the Flying Energy Crum
Alas, as training continued, the neural net quickly began to forget about the spells it had learned. It became good at generating pie names. In fact, it became great at generating pie names. But it was no longer a wizard.
Baked Cream Puff Cake
Reese’s Pecan Pie
Eggnog Peach Pie #2
Apple Pie With Fudge Treats
Almond-Blackberry Filling
Marshmallow Squash Pie
Cromberry Yas
Sweet Potato Piee
Cheesy Cherry Cheese Pie #2
Ginger Impossible Strawberry Tart
Coffee Cheese Pie
Florid Pumpkin Pie
Meat-de-Topping
Baked Trance Pie
Fried Cream Pies
Parades Or Meat Pies Or Cake #1
Milk Harvest Apple Pie
Ice Finger Sugar Pie
Pumpkin Pie With Cheddar Cookie
Fish Strawberry Pie
Butterscotch Bean Pie
Caribou Meringue Pie
This quirk of neural networks is known as catastrophic forgetting.7 A typical neural network has no way of protecting its long-term memory. As it learns new tasks, all its neurons are up for grabs, reconnected away from spell writing and put to use for pie inventing instead. Catastrophic forgetting is one thing that determines which problems are practical to solve with today’s AIs, and it shapes how we think about getting AI to do things.
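You can watch this happen in miniature. The sketch below (a toy two-feature classifier, made up for illustration and nothing like the spell-and-pie network) trains one set of weights on task A, then on task B, with nothing protecting the weights task A depended on; accuracy on task A falls back to roughly coin-flip level.

```python
import numpy as np

# A toy demonstration of catastrophic forgetting: one tiny classifier trained
# on task A, then on task B, with no protection for the weights task A used.

rng = np.random.default_rng(0)

def make_task(rule):
    X = rng.uniform(-1, 1, size=(200, 2))
    return X, rule(X).astype(float)          # 0/1 labels

task_a = make_task(lambda X: X[:, 0] > 0)    # task A: label depends on feature 0
task_b = make_task(lambda X: X[:, 1] > 0)    # task B: label depends on feature 1

w, b = np.zeros(2), 0.0

def train(X, y, steps=2000, lr=0.5):
    global w, b
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w + b)))   # logistic prediction
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)

def accuracy(X, y):
    return np.mean(((X @ w + b) > 0) == (y > 0.5))

train(*task_a)
print("task A accuracy after learning task A:", accuracy(*task_a))  # near 1.0
train(*task_b)
print("task A accuracy after learning task B:", accuracy(*task_a))  # near 0.5
print("task B accuracy after learning task B:", accuracy(*task_b))  # near 1.0
```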
Researchers are working on solving catastrophic forgetting, including trying to build in a kind of long-term memory made up of protected neurons, similar to the way human brains safely store long-term memories for decades.
Larger neural networks may be a bit more resilient against catastrophic forgetting, perhaps because their abilities are spread out among so many trained neurons that not all of them are repurposed during transfer learning. A large algorithm like GPT-2 (the big text-generating neural network from chapter 2) is still able to generate Harry Potter fanfiction even after I've trained it for a long time on recipes. All I have to do is prompt it with a snippet of a story about Harry and Snape, and the recipe-trained GPT-2 remembers how to fill in the rest of the story. Amusingly, it has a tendency to steer the story toward food-related conversations. Prompt it with a paragraph from a horror novel and eventually the characters will start sharing recipes and reminiscing about a “chocolate-covered, butter-and-cheese sandwich”; a conversation between Luke Skywalker and Obi-Wan Kenobi will soon turn to a discussion of Alderaanian fish sauce. In just a few paragraphs, a story that started with Snape confronting Harry about stolen potions became this dinner conversation about how to improve a soup recipe.
“I have to wonder though, if you actually ate this soup with a little fish in it. The soup is so full of flavor that there wasn’t even a single taste.”
“We ate this with a whole bunch of it.” Hermione pointed out. “We’re all eating this with a fish in it. It must be pretty good.”
“I think so,” Harry agreed. I have tried it with oyster sizzlers, with lobster, with shrimp and on lobster tails. It is very good.”
“I think it really was just a recipe for oyster sizzlers.”
“What was this? “Ron said from the kitchen.”
“That’s a very special soup to me because it’s so different. You have to start with the flavor and then gradually add other ingredients.”
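If you'd like to try this kind of prompting yourself, the sketch below uses the publicly released GPT-2 weights via the Hugging Face transformers library. It's not my recipe-tuned copy, and the prompt is one I made up here, so your continuation will be different (and probably less food-obsessed) than mine.

```python
from transformers import pipeline

# Prompt the publicly released GPT-2 with the start of a scene and let it
# continue. This is plain GPT-2, not the recipe-trained copy described above.
generator = pipeline("text-generation", model="gpt2")

prompt = 'Snape glared at Harry across the cauldron. "The soup," he said,'
result = generator(prompt, max_length=80, do_sample=True)
print(result[0]["generated_text"])
```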
Even if an AI gets large enough to handle several closely related tasks at once, it might end up doing each of them somewhat badly—remember the cat-generating neural net from chapter 4 that struggled to handle a variety of cat poses?
So far, the most common solution to catastrophic forgetting has been compartmentalization: every time we want to add a new task, we use a new AI. We end up with several independent AIs, each of which can do only one thing. But if we connect them all together and come up with a way of figuring out which AI we need at any given time, we will technically have an algorithm that can do more than one thing. Recall the Doom-playing AI that was really three AIs in one—one observing the world, one predicting what will happen next, and one deciding the best action to take.
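A bare-bones version of that "figure out which AI we need" glue might look like the sketch below. The task classifier and the two specialists are toy stand-ins (their outputs are just names borrowed from the lists above), but the shape of the idea is the same.

```python
# A bare-bones sketch of compartmentalization: several single-task models sit
# behind a dispatcher that decides which one a request should go to. The task
# classifier and the two "specialists" here are toy stand-ins.

def route(request, classify_task, specialists):
    task = classify_task(request)        # e.g. "spells" or "pies"
    return specialists[task](request)    # hand off to the right single-task model

# Toy usage:
specialists = {
    "spells": lambda _request: "Distracting Sphere",
    "pies":   lambda _request: "Butterscotch Bean Pie",
}
classify_task = lambda request: "pies" if "pie" in request.lower() else "spells"

print(route("Name a pie for me", classify_task, specialists))   # Butterscotch Bean Pie
print(route("I need a spell", classify_task, specialists))      # Distracting Sphere
```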
Some researchers see catastrophic forgetting as one of the major obstacles stopping us from building a human-level intelligence. If an algorithm can only learn one task at a time, how can it take on the huge variety of conversational, analytical, planning, and decision-making tasks that humans do? It may be that catastrophic forgetting will always limit us to single-task algorithms. On the other hand, if enough single-task algorithms could coordinate themselves like ants or termites, they could solve complex problems by interacting with one another. Future artificial general intelligences, if they exist, could be more like a swarm of social insects than like humans.
BIAS AMPLIFICATION
In chapter 7 we saw some of the many ways that AIs can learn bias from their training data. It only gets worse.
Machine learning algorithms don't just pick up bias from their training data; they tend to become even more biased than the data they learned from. From their perspective, they have simply discovered a useful shortcut rule that helps them match the humans in their training data more often.
You can see how shortcut rules might be helpful. An image recognition algorithm might not be great at recognizing handheld objects, but if it also sees things like kitchen counters and cabinets and a stove, it might guess that the human in the picture is holding a kitchen knife, not a sword. In fact, even if it has no idea how to tell the difference between a sword and a kitchen knife, that doesn’t matter as long as it knows to mostly guess “kitchen knife” when the scene is a kitchen. It’s an example of the class imbalance problem from chapter 6, in which a classifying algorithm sees many more examples of one kind of input than another and learns that it can get a lot of accuracy for free by assuming the rare cases never happen.
Unfortunately, when class imbalance interacts with biased datasets, it often results in even more bias. Some researchers at the University of Virginia and the University of Washington looked at how often an image-classifying algorithm thought that humans photographed in kitchens were women versus how often it thought they were men.8 (Their research, and the original human-labeled dataset, focused on a binary gender, though the authors noted that this is an incomplete definition of the gender spectrum.) In the original human-labeled dataset, the pictures showed a man cooking only 33 percent of the time. Clearly the data already had gender bias. When they trained an AI on these pictures, however, they found that the AI labeled only 16 percent of the images as “man.” It had decided that it could increase its accuracy by assuming that almost any human in a kitchen was a woman.
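You can see the arithmetic that pushes it that way. The toy calculation below is a deliberately oversimplified model, not the study's actual classifier: it imagines a classifier with no real gender cues that just guesses "man" at some fixed rate. On a dataset that is 33 percent men, guessing "man" less often than 33 percent keeps nudging expected accuracy upward, all the way to 67 percent if it never guesses "man" at all.

```python
# A deliberately oversimplified illustration (not the study's actual model):
# if a classifier had no real gender cues and just guessed "man" at some fixed
# rate, its accuracy on a 33%-man dataset would rise as that rate falls.

def expected_accuracy(guess_man_rate, true_man_rate=0.33):
    true_woman_rate = 1 - true_man_rate
    guess_woman_rate = 1 - guess_man_rate
    # Accuracy = P(guess man & truly man) + P(guess woman & truly woman),
    # assuming the guesses are independent of the true labels.
    return guess_man_rate * true_man_rate + guess_woman_rate * true_woman_rate

for rate in [0.33, 0.16, 0.0]:
    print(f"guess 'man' {rate:.0%} of the time -> expected accuracy "
          f"{expected_accuracy(rate):.0%}")
```

The real classifier did, of course, use image cues, but the incentive points the same way: leaning harder on the majority answer is an easy route to higher measured accuracy.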