You Look Like a Thing and I Love You

by Janelle Shane


  The AIs that autocomplete search-engine queries learn on the fly, and that can lead to weird results when humans are in the mix. The problem with humans is that if search-engine autocomplete makes a really hilarious mistake, humans will tend to click on it, which just makes the AI even more likely to suggest it to the next human. This famously happened in 2009 with the phrase “Why won’t my parakeet eat my diarrhea?”7 Humans found this suggested question so hilarious that soon the AI was suggesting it as soon as people began typing “Why won’t.” Probably a human at Google had to manually intervene to stop the AI from suggesting that phrase.

  As I mentioned in chapter 7, there are also dangers if predictive-policing algorithms learn on the job. If an algorithm sees that there are more arrests in a particular neighborhood than there are in others, it will predict that there will be more arrests there in the future, too. If the police respond to this prediction by sending more officers to the area, it may become a self-fulfilling prophecy: more police on the streets means that even if the actual crime rate is no higher than it is in other neighborhoods, the police will witness more crimes and make more arrests. When the algorithm sees the new arrest data, it may predict an even higher arrest rate in that neighborhood. If the police respond by increasing their presence in the neighborhood, then the problem will only escalate. Of course, it doesn’t require an AI to be susceptible to this kind of feedback loop—very simple algorithms and even humans fall for this as well.
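  To see how quickly that loop can run away, here's a minimal sketch in Python; the two neighborhoods, the patrol counts, and the "send an officer wherever the arrests were" rule are illustrative stand-ins, not a real predictive-policing system.

```python
# Minimal sketch of the self-fulfilling patrol loop described above. Both
# neighborhoods have the SAME underlying crime rate; the only difference is
# patrol levels, which are driven by last year's arrests. All numbers here
# are illustrative.

true_crime_rate = 0.05            # identical in both neighborhoods
population = 10_000
officers = {"A": 12, "B": 8}      # A starts with slightly more patrols

for year in range(5):
    arrests = {}
    for hood, cops in officers.items():
        # Police only witness a fraction of crimes; more officers on the
        # street means more of the same crimes get seen and recorded.
        witnessed = min(1.0, 0.02 * cops)
        arrests[hood] = round(population * true_crime_rate * witnessed)

    # "Predictive" step: shift an officer toward wherever arrests were
    # highest. This is where the feedback loop closes.
    busier = max(arrests, key=arrests.get)
    quieter = min(arrests, key=arrests.get)
    if busier != quieter and officers[quieter] > 1:
        officers[busier] += 1
        officers[quieter] -= 1

    print(f"year {year}: arrests={arrests} -> officers={officers}")
```

Both neighborhoods commit exactly the same number of crimes, but the extra officers in neighborhood A generate extra arrests, which pull in still more officers the following year.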

  Here’s a very simple feedback loop in action: in 2011 a biologist named Michael Eisen noticed something odd when a researcher in his lab tried to buy a particular textbook about fruit flies.8 The book was out of print but not terribly rare; there were used copies available on Amazon for around $35. The two new copies available, however, were priced at $1,730,045.91 and $2,198,177.95 (plus $3.99 shipping). When Eisen checked again the next day, both books had increased in price, to nearly $2.8 million. Over the next few days, a pattern emerged: in the morning, the company that sold the less expensive book would increase its price so that it was exactly 0.9983 times the price of the more expensive book. In the afternoon, the expensive book’s price would increase to become exactly 1.270589 times the price of the cheaper book. Both companies were apparently using algorithms to set their book prices. It was clear that one company wanted to charge as much as it could while still having the cheapest book available. But what was the motivation of the company that sold the more expensive book? Eisen noticed that that company had very good feedback scores and theorized that it was counting on this to induce some customers to pay a slightly higher price for the book—at which point it would order the book from the cheaper company and ship it to the customer, pocketing the profit. After about a week the spiraling prices dropped back to normal. Apparently some human had noticed the problem and corrected it. But companies use unsupervised algorithmic pricing all the time. Once, when I checked Amazon, there were several coloring books being offered for $2,999 apiece.
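  The two multipliers Eisen recorded make the runaway pricing easy to reproduce. Here's a minimal sketch, with arbitrary starting prices and the morning/afternoon schedule simplified to one update each per day:

```python
# Sketch of the two pricing bots Eisen observed: seller A prices at 0.9983
# times the competitor (slightly undercutting), while seller B prices at
# 1.270589 times the competitor (counting on its feedback scores to justify
# a markup). The starting prices are arbitrary.

price_a = 35.00   # seller with the (relatively) cheap new copy
price_b = 35.00   # seller with the expensive new copy

for day in range(1, 15):
    price_a = 0.9983 * price_b     # morning: undercut the expensive listing
    price_b = 1.270589 * price_a   # afternoon: mark up over the cheap listing
    print(f"day {day:2d}: A = ${price_a:,.2f}   B = ${price_b:,.2f}")

# Each full day multiplies both prices by 0.9983 * 1.270589, about 1.268,
# so the listings grow by roughly 27 percent a day until a human steps in.
```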

  So the book prices were the products of simple rules-based programs. But machine learning algorithms can make trouble in even more exciting new ways. A 2018 paper showed that two machine learning algorithms in a situation like the book-pricing setup above, each given the task of setting a price that maximizes profits, can learn to collude with each other in a way that’s both highly sophisticated and highly illegal. They can do this without explicitly being taught to collude and without communicating directly with each other—somehow, they manage to set up a price-fixing scheme just by observing each other’s prices. This has only been demonstrated in a simulation so far, not in a real-world pricing scenario. But people have estimated that a large portion of online prices are being set by autonomous AIs, so the prospect of widespread price fixing is worrying. Collusion is great for sellers—if everyone cooperates to set high prices, then profits go up—but it’s bad for consumers. Even without meaning to, sellers could potentially be using AI to do things that it’s illegal for them to do explicitly.9 This is just another face of the mathwashing phenomenon I brought up in chapter 7. Humans will have to make sure that their AIs aren’t being tricked by bad actors or accidentally becoming bad actors themselves.
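  Studies like that one typically use simple reinforcement learners. Here's a minimal sketch of that kind of setup, assuming two independent Q-learning agents that each observe only last round's pair of prices; the toy demand model and the learning parameters are my own simplifications, and whether prices actually settle above the competitive level depends heavily on those choices. The point is only the structure: neither agent is told anything about the other's rewards, yet each learns its strategy purely from the prices it sees.

```python
import random
from collections import defaultdict

# Two independent Q-learning sellers repeatedly pick a price, each seeing
# only the pair of prices charged last round. Nothing below tells them to
# cooperate; any coordination has to emerge from the learning itself.

PRICES = [1.0, 1.5, 2.0, 2.5, 3.0]      # the discrete prices a seller may pick
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount, exploration

def profit(my_price, rival_price):
    # Toy demand: the cheaper seller captures more of a market that shrinks
    # as prices rise. Purely illustrative.
    share = 0.7 if my_price < rival_price else 0.3 if my_price > rival_price else 0.5
    return my_price * max(0.0, 4.0 - my_price) * share

q = [defaultdict(float), defaultdict(float)]  # one Q-table per seller

def choose(agent, state):
    if random.random() < EPSILON:             # occasional exploration
        return random.choice(PRICES)
    return max(PRICES, key=lambda p: q[agent][(state, p)])

state = (2.0, 2.0)                            # last round's (price0, price1)
for step in range(50_000):
    actions = (choose(0, state), choose(1, state))
    for agent in (0, 1):
        reward = profit(actions[agent], actions[1 - agent])
        best_next = max(q[agent][(actions, p)] for p in PRICES)
        key = (state, actions[agent])
        q[agent][key] += ALPHA * (reward + GAMMA * best_next - q[agent][key])
    state = actions

print("long-run prices:", state)  # check whether these sit above the undercutting level
```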

  LET THE AI HANDLE THIS ONE

  Human-level performance is the gold standard for a lot of machine learning algorithms. After all, much of the time their task is to imitate examples of humans doing stuff: labeling pictures, filtering emails, naming guinea pigs. And in cases where their performance is more or less at a human level, they can (with supervision) be used to replace humans for tasks that are repetitive or boring. We’ve seen in earlier chapters that some news organizations are using machine learning algorithms to automatically create boring but acceptable articles on local sports or real estate. A project called Quicksilver automatically creates draft Wikipedia articles about female scientists (who have been noticeably underrepresented on Wikipedia), saving volunteer editors time. People who need to write audio transcripts or translate text use the (admittedly buggy) machine learning versions as a starting point for their own translations. Musicians can employ music-generating algorithms, using them to put together a piece of original music to exactly fit a commercial slot for which the music doesn’t have to be exceptional, just inexpensive. In many cases, the human role is to be an editor.

  And there are some jobs for which it’s even preferable not to use humans. People are more likely to open up about their emotions or disclose potentially stigmatizing information if they think they’re talking to a robot as opposed to a human.10,11 (On the other hand, healthcare chatbots could potentially miss serious health concerns).12 Bots have also been trained to look through disturbing images and flag potential crimes (though they tend to mistake desert scenes for human flesh).13 Even crime itself may be more easily committed by a robot than a human. In 2016, Harvard student Serena Booth built a robot that was meant to test some theories about whether humans trust robots too much.14 Booth built a simple remote-controlled robot and had it drive up to students, asking to be allowed access to a key card–controlled dorm. Under those circumstances, only 19 percent of people let it into the dorm (interestingly, that number was a bit higher when the students were in groups). However, if the same robot said it was delivering cookies, 76 percent let it in.

  As I mentioned above, some AIs may also be good at crime because of the mathwashing phenomenon. An AI’s decisions can be based on complex relationships between several variables, some of which may be proxies for information that it’s not supposed to have, like gender or race. That adds a layer of obfuscation that may—intentionally or not—be allowing it to get away with breaking laws.

  There are also plenty of cases in which AI is preferable because it exceeds human performance. For one, it’s usually much faster than humans. In some multiplayer computer games, when AI plays against humans, the AI has to be slowed down to give the humans a fighting chance. AI is also more consistent, if terrible at handling the unexpected. Can AI also be fairer? Potentially. An AI-powered system, at least, can be tested for fairness by running lots of test decisions and looking for statistical correlations that shouldn’t be there. By carefully adjusting the training data to make its statistics match the world as it should be rather than the world as it is, it would be possible in many cases to train an AI whose decisions are fair—at least, much fairer than your average human’s.
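  What might such a test look like? Here's a minimal sketch of the statistical check described above, with a hypothetical toy model and hand-made test cases standing in for the real thing:

```python
from collections import namedtuple
import statistics

# Sketch of the audit described above: run a batch of test cases through a
# trained model and compare approval rates across a protected attribute the
# model isn't supposed to use. The model and the data here are hypothetical.

Case = namedtuple("Case", ["features", "group"])

def audit(predict, test_cases):
    """Return per-group approval rates and the gap between them."""
    by_group = {}
    for case in test_cases:
        by_group.setdefault(case.group, []).append(case)
    rates = {
        group: statistics.mean(
            1 if predict(c.features) == "approve" else 0 for c in cases
        )
        for group, cases in by_group.items()
    }
    return rates, max(rates.values()) - min(rates.values())

# A deliberately skewed toy model and dataset, just to exercise the check:
toy_model = lambda f: "approve" if f["income"] > 50 else "deny"
cases = [Case({"income": 60}, "A"), Case({"income": 40}, "A"),
         Case({"income": 70}, "B"), Case({"income": 80}, "B")]
print(audit(toy_model, cases))  # a large gap is the correlation to investigate
```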

  ALGORITHMIC CREATIVITY?

  Will the music, movies, and novels of the future be written by AI? Maybe at least partially.

  AI-generated art can be striking, weird, and unsettling: infinitely morphing tulips; glitchy humans with half-melted faces; skies full of hallucinated dogs. A T. rex may turn into flowers or fruit; the Mona Lisa may take on a goofy grin; a piano riff may turn into an electric guitar solo. AI-generated text may take on the quality of surrealist performance art.

  When prompted with the following text: “My 10 favorite animals are: 1.” the neural network GPT-2 added this list:

  My 10 favorite animals are:

  1. zebras with a white scar on the back;

  2. insiduous spiders and octopus;

  3. frog with large leaves, hopefully black;

  4. cockatiel with scales;

  5. razorbill with wings hanging about 4 inches from one’s face and a heart tattoo on a frog.
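  Output in this vein is easy to reproduce with the publicly released GPT-2 weights, for example through the Hugging Face transformers library. The sketch below uses sampling settings of my own choosing, and every run produces a different list.

```python
# Sketch of prompting GPT-2 as described above, using the publicly released
# weights via the Hugging Face `transformers` library.

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "My 10 favorite animals are: 1."
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_length=100,                       # total length, prompt included
    do_sample=True,                       # sample instead of always taking the top word
    temperature=0.9,                      # a bit of randomness
    pad_token_id=tokenizer.eos_token_id,  # silences a padding warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```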

  Like AI problem solving, AI creativity could probably best be described as “AI-aided.”

  For a GAN to produce a painting, it first needs a dataset, and a human chooses what that dataset should be. Some of the most interesting GAN results occur when artists give the algorithms their own paintings, or their own photography, to learn from. The artist Anna Ridler, for example, spent a spring taking ten thousand photos of tulips, then used her photos to train a GAN that produced an endless series of nearly photorealistic tulips, each tulip’s stripiness tied to the price of Bitcoin. The artist and software engineer Helena Sarin has produced interesting GAN remixes of her own watercolors and sketches, morphing them into cubist or weirdly textured hybrids. Other artists are inspired to choose existing datasets—like public-domain Renaissance portraits or landscapes—and see what a GAN might make with them. Curating a dataset is also an artistic act—add more styles of painting, and a hybrid or corrupted artwork might result. Prune a dataset to a single consistent angle, style, or type of lighting, and the neural net will have an easier time matching what it sees to produce more realistic images. Start with a model trained on a large dataset, then use transfer learning to focus in on a smaller but more specialized dataset, for even more ways to fine-tune the results.

  People who train text-generating algorithms also can control their results via their datasets. Science fiction writer Robin Sloan is one of a few writers experimenting with neural network–generated text as a way of injecting some unpredictability into his writing.15 He built a custom tool that responds to his own sentences by predicting the next sentence in the sequence based on its knowledge of other science fiction stories, science news articles, and even conservation news bulletins. Demonstrating his tool in an interview with the New York Times, Sloan fed it the sentence “The bison are gathered around the canyon,” and it responded with “by the bare sky.” It wasn’t a perfect prediction in the sense that there was something noticeably off about the algorithm’s sentence. But for Sloan’s purposes, it was delightfully weird. He’d even rejected an earlier model he’d trained on 1950s and 1960s science fiction stories, finding its sentences too clichéd.

  Like collecting the datasets, training the AI is an artistic act. How long should training last? An incompletely trained AI can sometimes be interesting, with weird glitches or garbled spelling. If the AI gets stuck and begins to produce garbled text or strange visual artifacts like multiplying grids or saturated colors (a process known as mode collapse), should the training start over? Or is this effect kinda cool? As in other applications, the artist will also have to watch to make sure the AI doesn’t copy its input data too closely. As far as an AI knows, an exact copy of its dataset is just what it’s being asked for, so it will plagiarize if it possibly can.

  And finally, it’s the human artist’s job to curate the AI’s output and turn it into something worthwhile. GANs and text-generating algorithms can create virtually infinite amounts of output, and most of it isn’t very interesting. Some of it is even terrible—remember that many text-generating neural nets don’t know what their words mean (I’m looking at you, neural net that suggested naming cats Mr. Tinkles and Retchion). When I train neural nets to generate text, only a tiny fraction—a tenth or a hundredth—of the results are worth showing. I’m always curating the results to present a story or some interesting point about the algorithm or the dataset.

  In some cases, curating the output of an AI can be a surprisingly involved process. I used BigGAN in chapter 4 to show how image-generating neural nets struggle when trained on images that are too varied—but I didn’t talk about one of its coolest features: generating images that are a blend of multiple categories.

  Think of “chicken” as a point in space and “dog” as a point in space. If you take the shortest path between them, you pass other points in space that are somewhere between the two, in which chickendogs have feathers, floppy ears, and lolling tongues. Start at “dog” and travel toward “tennis ball,” and you’ll pass through a region of fuzzy green spheres with black eyes and boopable noses. This huge multidimensional visual landscape of possibility is called latent space. And once BigGAN’s latent space was accessible, artists began to dive in to explore. They quickly found coordinates where there were overcoats covered in eyes and trench coats covered in tentacles, angular-faced dog-birds with both eyes on one side of their faces, picture-perfect hobbit villages complete with ornate rounded doors, and flaming mushroom clouds with cheerful puppy faces. (ImageNet has a lot of dogs in it, as it turns out, so the latent space of BigGAN is also full of dogs.) Methods of navigating latent space become themselves artistic choices. Should we travel in straight lines or curves? Should we keep our locations close to our origin point or allow ourselves to veer off into extreme far-flung corners? Each of these choices drastically affects what we see. The rather utilitarian categories of ImageNet blend into utter weirdness.
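  Mechanically, that journey is just arithmetic on vectors. Here's a minimal sketch of a straight-line walk from "dog" toward "tennis ball"; the generator call at the end is a commented-out placeholder for whichever pretrained BigGAN-style model is actually being used, and the class indices follow the usual ImageNet-1k ordering.

```python
import numpy as np

# A straight-line interpolation through latent space: keep the latent point
# fixed and slide the class vector from all-dog to all-tennis-ball.

rng = np.random.default_rng(0)
z = rng.normal(size=128)                # one fixed point in the latent space

dog = np.zeros(1000)
dog[207] = 1.0                          # ImageNet-1k class 207: golden retriever
ball = np.zeros(1000)
ball[852] = 1.0                         # ImageNet-1k class 852: tennis ball

for t in np.linspace(0.0, 1.0, num=16):
    # Partway along the line, the model is being asked for something that is
    # part dog and part tennis ball: fuzzy green spheres with boopable noses.
    class_vec = (1 - t) * dog + t * ball
    # image = generator(z, class_vec)   # placeholder: plug in a real GAN here
```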

  Is all this art AI-generated? Absolutely. But is the AI the thing doing the creative work? Not by a long shot. People who claim that their AIs are the artists are exaggerating the capabilities of the AIs—and selling short their own artistic contributions and those of the people who designed the algorithms.

  CONCLUSION

  Life among our artificial friends

  Over the course of these pages, we’ve seen lots of different ways that AI can surprise us.

  Given a problem to solve, and enough freedom in how to solve it, AIs can come up with solutions that their programmers never dreamed existed. Tasked with walking from point A to point B, an AI may decide instead to assemble itself into a tower and fall over. It may decide to travel by spinning in tight circles or twitching along the floor in a writhing heap. If we train it in simulation, it may hack into the very fabric of its universe, figuring out ways to exploit physics glitches to attain superhuman abilities. It will take instructions literally: when told to avoid collisions, it will refuse to move; when told to avoid losing a video game, it will find the Pause button and freeze the game forever. It will find patterns hidden in its training data, even patterns its programmers didn’t expect. Some of the patterns may be ones we didn’t want it to emulate, like bias. Modular AIs may cascade together, cooperating to accomplish tasks that no single AI could tackle alone, acting like a phone full of apps or even a swarm of bees.

  As AI becomes ever more capable, it still won’t know what we want. It will still try to do what we want. But there will always be a potential disconnect between what we want AI to do and what we tell it to do. Will it get smart enough to understand us and our world as another human does—or even to surpass us? Probably not in our lifetimes. For the foreseeable future, the danger will not be that AI is too smart but that it’s not smart enough.

  On the surface, AI will seem to understand more. It will be able to generate photorealistic scenes, maybe paint entire movie scenes with lush textures, maybe beat every computer game we can throw at it. But underneath that, it’s all pattern matching. It only knows what it has seen and seen enough times to make sense of.

  Our world is too complicated, too unexpected, too bizarre for an AI to have seen it all during training. The emus will get loose, the kids will start wearing cockroach costumes, and people will ask about giraffes even when there aren’t any present. AI will misunderstand us because it lacks the context to know what we really want it to do.

  To take the best way forward with AI, we’ll have to understand it—understand how to choose the right problems for it to solve, how to anticipate its misunderstandings, and how to prevent it from copying the worst of what it finds in human data. There’s every reason to be optimistic about AI and every reason to be cautious. It all depends
on how well we use it.

  And watch out for those hidden giraffes.

  Acknowledgments

  This book would not exist without the hard work, insight, and generosity of a bunch of people who I’m delighted to thank here.

  A huge thanks to the team at Voracious, whose hard work turned my sprawling, meandering document into a thing that I love. Barbara Clark’s copyediting improved this book immeasurably, and it is lighter for the removal of a metric ton of actuallys. Thanks especially to my editor, Nicky Guerreiro, who emailed me out of the blue one day to say it was her fifth time stifling laughter in her open-plan office, and had I thought about how my blog might translate into a book? Without Nicky’s encouragement and keen insight, this book would not have the scope and courage that it does.

  Warm thanks also to my agent, Eric Lupfer, at Fletcher and Company for cheerfully guiding a first-time author through the many steps of turning a blog into a book.

  The first time I heard about machine learning was in 2002 when Erik Goodman gave a fascinating talk about evolutionary algorithms to the incoming freshmen at Michigan State University. I guess those anecdotes about algorithms breaking simulations and solving the wrong problem really stuck with me! Thanks for sparking that interest early—it has led me to so much joy.

 
