There’s another way in which machine learning algorithms can perform spectacularly worse than humans, and that’s because they’re susceptible to a weird, very cyberpunk sort of hacking.
ADVERSARIAL ATTACKS
Suppose you’re running security at a cockroach farm. You’ve got advanced image recognition technology on all the cameras, ready to sound the alarm at the slightest sign of trouble. The day goes uneventfully until, reviewing the logs at the end of your shift, you notice that although the system has recorded zero instances of cockroaches escaping into the staff-only areas, it has recorded seven instances of giraffes. Thinking this a bit odd, perhaps, but not yet alarming, you decide to review the camera footage. You are just beginning to play the first “giraffe” time stamp when you hear the skittering of millions of tiny feet.
What happened?
Your image recognition algorithm was fooled by an adversarial attack. With special knowledge of your algorithm’s design or training data, or even via trial and error, the cockroaches were able to design tiny note cards that would fool the AI into thinking it was seeing giraffes instead of cockroaches. The tiny note cards wouldn’t have looked remotely like giraffes to people—just a bunch of rainbow-colored static. And the cockroaches didn’t even have to hide behind the cards—all they had to do was keep showing the cards to the camera as they walked brazenly down the corridor.
Does this sound like science fiction? Okay, besides the part about the sentient cockroaches? It turns out that adversarial attacks are a weird feature of machine learning–based image recognition algorithms. Researchers have demonstrated that they could show an image recognition algorithm a picture of a lifeboat (which it identifies as a lifeboat with 89.2 percent confidence), then add a tiny patch of specially designed noise way over in one corner of the image. A human looking at the picture could tell that this is obviously a picture of a lifeboat with a small patch of rainbow static over in one corner. The AI, however, identifies the lifeboat as a Scottish terrier with 99.8 percent confidence.9 The researchers managed to convince the AI that a submarine was in fact a bonnet and that a daisy, a brown bear, and a minivan were all tree frogs. The AI didn’t even know that it had been fooled by that specific patch of noise. When asked to change a few pixels that would make the bonnet look like a submarine again, the algorithm changed pixels sprinkled throughout the image rather than targeting the guilty noise patch.
That tiny adversarial patch of static is the difference between a functioning algorithm and a mass cockroach breakout.
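For readers who want to see the mechanics, here is roughly how a patch like that gets designed when the attacker does have access to the model's gradients. This is a minimal sketch in PyTorch, not the researchers' actual code; `model` stands in for any differentiable image classifier, and the patch size, step count, and learning rate are made-up illustrative values.

```python
import torch
import torch.nn.functional as F

def make_adversarial_patch(model, image, target_class, size=40, steps=500, lr=0.05):
    """Optimize a small corner patch so the classifier reports `target_class`.

    `model` is assumed to take a (1, 3, H, W) image tensor and return class logits;
    `image` might be, say, a perfectly ordinary photo of a lifeboat.
    """
    _, _, h, w = image.shape
    patch = torch.rand(1, 3, size, size, requires_grad=True)   # start from random static
    mask = torch.zeros_like(image)
    mask[:, :, :size, :size] = 1.0                              # the patch lives in one corner
    optimizer = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        full = F.pad(patch.clamp(0, 1), (0, w - size, 0, h - size))  # place patch at top left
        patched = image * (1 - mask) + full * mask              # rest of the photo untouched
        loss = -F.log_softmax(model(patched), dim=1)[0, target_class]
        optimizer.zero_grad()
        loss.backward()                                         # gradients say how to tweak the
        optimizer.step()                                        # static, not the lifeboat
    return patch.detach().clamp(0, 1)
```

The thing to notice is that the optimization only ever touches the corner patch; the rest of the lifeboat photo stays exactly as it was, which is why a human sees nothing but a bit of rainbow static.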
It’s easiest to design an adversarial attack when you have access to the inner workings of the algorithm. But it turns out that you can fool a stranger’s algorithm, too. Researchers at LabSix have found that they can design adversarial attacks even when they don’t have access to the inner connections of the neural network. Using a trial-and-error method, they could fool neural nets when they had access only to their final decisions and even when they were allowed only a limited number of tries (100,000, in this case).10 Just by manipulating the images they showed it, they managed to fool Google’s image recognition tool into thinking a photo of skiers was a photo of a dog instead.
Here’s how: starting with a photo of a dog, they replaced some of its pixels one by one with pixels from a photo of skiers, making sure to only pick pixels that didn’t seem to have an effect on how much the AI thought the photo looked like a dog. If you played this game with a human, past a certain point the human would start to see the skiers overlaid on the picture of the dog. Eventually, when most of the pixels were changed, the human would see only skiers and no dog. The AI, however, still thought the picture was a dog, even after so many pixels were replaced that humans would see an obvious photo of skiers. The AI seemed to base its decisions on a few crucial pixels, their roles invisible to humans.
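In code, that trial-and-error idea looks something like the sketch below. It is a simplified outline, not LabSix's actual algorithm; the `classifier` function is assumed to hand back only a label and a confidence score, because that is all the attacker gets to see.

```python
import random

def blackbox_pixel_swap(classifier, dog_img, skier_img, label="dog", budget=100_000):
    """Turn a dog photo into a skier photo, pixel by pixel, while the AI keeps saying "dog".

    `classifier(image)` is assumed to return only (predicted_label, confidence);
    the attacker never sees the network's insides. Images are nested lists of
    pixels here, purely for illustration.
    """
    current = [row[:] for row in dog_img]                  # start from the dog photo
    _, best_conf = classifier(current)
    queries = 1
    coords = [(y, x) for y in range(len(current)) for x in range(len(current[0]))]
    random.shuffle(coords)                                 # try pixel positions in random order
    for y, x in coords:
        if queries >= budget:                              # only a limited number of tries
            break
        candidate = [row[:] for row in current]
        candidate[y][x] = skier_img[y][x]                  # copy one pixel over from the skiers
        pred, conf = classifier(candidate)
        queries += 1
        if pred == label and conf >= best_conf - 0.01:     # barely dented its "dog" confidence?
            current, best_conf = candidate, conf           # keep the swap
    return current   # looks like skiers to a person, still "dog" to the AI
```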
So could you protect your algorithm against adversarial attacks if you didn’t let anyone play with it or see its code? It turns out that it might still be susceptible if the attacker knows what dataset it has been trained on. As we’ll see later, this potential vulnerability shows up in real-world applications like medical imaging and fingerprint scanning.
The problem is that there are just a few image datasets in the world that are both free to use and large enough to be useful for training image recognition algorithms, and many companies and research groups use them. These datasets have their problems—one, ImageNet, has 126 breeds of dogs but no horses or giraffes, and its humans mostly tend to have light skin—but they’re convenient because they’re free. Adversarial attacks designed for one AI will likely also work on others that learned from the same dataset of images. The training data seems to be the important thing, not the details of the way the AI was designed. This means that even if you kept your AI’s code secret, hackers may still be able to design adversarial attacks that fool your AI if you don’t go to the time and expense of creating your own proprietary dataset.
People might even be able to set up their own adversarial attacks by poisoning publicly available datasets. There are public datasets, for example, to which people can contribute samples of malware to train anti-malware AI. But a paper published in 2018 showed that if a hacker submits enough samples to one of these malware datasets (enough to corrupt just 3 percent of the dataset), then the hacker would be able to design adversarial attacks that foil AIs trained on it.11
It’s not entirely clear why the training data matters so much more to the algorithm’s success than the algorithm’s design. And it’s a bit worrying, since it means that the algorithms may in fact be recognizing weird quirks of their datasets rather than learning to recognize objects in all kinds of situations and lighting conditions. In other words, overfitting might still be a far more widespread problem in image recognition algorithms than we’d like to believe.
But it also means that algorithms in the same family—algorithms that learned from the same training data—understand each other strangely well. When I asked an image-generating algorithm called AttnGAN to generate a photo of “a girl eating a large slice of cake,” it generated something barely recognizable. Blobs of cake floated around a fleshy hair-topped lump studded with far too many orifices. The cake texture was admittedly well done. But a human would not have known what the algorithm was trying to draw.
But do you know who can tell what AttnGAN was trying to draw? Other image recognition algorithms that were trained on the COCO dataset. Visual Chatbot gets it almost exactly right, reporting “a little girl is eating a piece of cake.”
The image recognition algorithms that were trained on other datasets, however, are mystified. “Candle?” guesses one of them. “King crab?” “Pretzel?” “Conch?”
The artist Tom White has used this effect to create a new kind of abstract art. He gives one AI a palette of abstract blobs and color washes and tells it to draw something (a jack-o’-lantern, for example) that another AI can identify.12 The resulting drawings look only vaguely like the things they’re supposed to be—a “measuring cup” is a squat green blob covered in horizontal scribbles, and a “cello” looks more like a human heart than a musical instrument. But to ImageNet-trained algorithms, the pictures are uncannily accurate. In a way, this artwork is a form of adversarial attack.
Of course, as in our earlier cockroach scenario, adversarial attacks are often bad news. In 2018 a team from Harvard Medical School and MIT warned that adversarial attacks in medicine could be particularly insidious—and profitable.13 Today, people are developing image recognition algorithms to automatically screen X-rays, tissue samples, and other medical images for signs of disease. The idea is to save time by doing high-throughput screening so humans don’t have to look at every image. Plus, the results could be consistent from hospital to hospital, everywhere the software is implemented—so they could be used to decide which patients qualify for certain treatments or to compare various drugs to one another.
That’s where the motivation for hacking comes in. In the United States, insurance fraud is already lucrative, and some healthcare providers are adding unnecessary tests and procedures to increase revenue. An adversarial attack would be a handy, hard-to-detect way to move some patients from category A to category B. There’s also temptation to tweak the results of clinical trials so a profitable new drug gets approved. And since a lot of medical image recognition algorithms are generic ImageNet-trained algorithms that have had a little extra training time on a specialized medical dataset, they’re relatively easy to hack. This doesn’t mean it’s hopeless to use machine learning in medicine—it just means that we may always need a human expert spot-checking the algorithm’s work.
Another application that may be particularly vulnerable to adversarial attack is fingerprint reading. A team from New York University Tandon and Michigan State University showed that it could use adversarial attacks to design what it called a masterprint—a single fingerprint that could pass for 77 percent of the prints in a low-security fingerprint reader.14 The team was also able to fool higher-security readers, or commercial fingerprint readers trained on different datasets, a significant portion of the time. The masterprints even looked like regular fingerprints—unlike other spoofed images that contain static or other distortions—which made the spoofing harder to spot.
Voice-to-text algorithms can also be hacked. Make an audio clip of a voice saying “Seal the doors before the cockroaches get in,” and you can overlay noise that a human will hear as subtle static but that will make a voice-recognition AI hear the clip as “Please enjoy a delicious sandwich.” It’s possible to hide messages in music or even in silence.
Resume screening services might also be susceptible to adversarial attack—not by hackers with algorithms of their own but by people trying to alter their resumes in subtle ways to make it past the AI. The Guardian reports: “One HR employee for a major technology company recommends slipping the words ‘Oxford’ or ‘Cambridge’ into a CV in invisible white text, to pass the automated screening.”15
It’s not like machine learning algorithms are the only technology that’s vulnerable to adversarial attacks. Even humans are susceptible to the Wile E. Coyote style of adversarial attack: putting up a fake stop sign, for example, or drawing a fake tunnel on a solid rock wall. It’s just that machine learning algorithms can be fooled by adversarial attacks that humans would never even register. And as AI becomes more widespread, we may be in for an arms race between AI security and increasingly sophisticated and difficult-to-detect hacks.
An example of an adversarial attack that’s targeted at humans with touch screens: some advertisers have put fake specks of “dust” on their banner ads, hoping that humans will accidentally click on the ads while trying to brush them off.16
MISSING THE OBVIOUS
Without a way to see what AIs are thinking, or to ask them how they came to their conclusions (people are working on this), usually our first clue that something has gone wrong is when the AI does something weird.
An AI shown a sheep with polka dots or tractors painted on its sides will report seeing the sheep but will not report anything unusual about it. When you show it a sheep-shaped chair with two heads, or a sheep with too many legs, or with too many eyes, the algorithm will also merely report a sheep.
Why are AIs so oblivious to these monstrosities? Sometimes it’s because they don’t have a way to express them. Some AIs can only answer by outputting a category name—like “sheep”—and aren’t given an option for expressing that yes, it is a sheep, but something is very, very wrong. But there may often be another reason. It turns out that image recognition algorithms are very good at identifying scrambled images. If you chop an image of a flamingo into pieces and rearrange the pieces, a human will no longer be able to tell that it’s a flamingo. But an AI may still have no trouble seeing the bird. It’s still able to see an eye, a beak tip, and a couple of feet, and even though those aren’t in the right spot relative to one another, the AI is only looking for the features, not how they’re connected. In other words, the AI is acting like a bag-of-features model. Even AIs that theoretically are capable of looking at large shapes, not just tiny features, seem to often act like simple bag-of-features models.17 If the flamingo’s eyes are on its ankles, or if its beak is lying several meters away, the AI sees nothing out of the ordinary.
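A toy sketch makes the bag-of-features idea concrete. This is not how any real network is implemented; `extract_patch_features` and `classify_pooled` are stand-ins for whatever the network has learned. The point is that the patch descriptions get averaged into an order-free summary before the final decision, so shuffling the patches changes nothing.

```python
import numpy as np

def bag_of_features_predict(image, extract_patch_features, classify_pooled, patch=32):
    """Toy bag-of-features classifier: chop the image into patches, describe each patch,
    then average the descriptions. Because averaging ignores order, a scrambled flamingo
    produces the same pooled summary as an intact one."""
    h, w, _ = image.shape
    feats = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            feats.append(extract_patch_features(image[y:y + patch, x:x + patch]))
    pooled = np.mean(feats, axis=0)   # "an eye, a beak tip, a couple of feet," with no layout kept
    return classify_pooled(pooled)    # e.g., "flamingo"
```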
Basically, if you’re in a horror movie where zombies start appearing, you might want to grab the controls from your self-driving car.
More worryingly, the AI in a self-driving car may miss other rare, but more realistic, road hazards. If the car in front of it is on fire, fishtailing on ice, or carrying a Bond villain who just dropped a load of nails on the road, a self-driving car won’t register anything wrong unless it’s been specifically prepared for this problem.
Could you design an AI to count eyes or identify flaming cars? Absolutely. An “on fire or not” AI could probably be pretty accurate. But to ask an AI to identify flaming cars and regular cars and drunk drivers and bicycles and escaped emus—this becomes a really broad task. Remember that the narrower the AI, the smarter it seems. Dealing with all the world’s weirdness is a task that’s beyond today’s AI. For that, you’ll need a human.
CHAPTER 9
Human bots (where can you not expect to see AI?)
Throughout this book we’ve learned that AIs can perform at the level of a human only in very narrow, controlled situations. When the problem gets broad, the AI starts to struggle. Responding to one’s fellow social media users is an example of a broad, tricky problem, and this is why what we call “social media bots”—rogue accounts that spread spam or misinformation—are unlikely to be implemented with AI. In fact, spotting a social media bot may be easier for an AI than being a social media bot. Instead, people who build social media bots are likely to use traditional rules-based programming to automate a few simple functions. Anything more sophisticated than that is likely to be a poorly paid human being instead of an actual AI. (There’s a certain irony to the idea of a human stealing a robot’s job.) In this chapter, I’ll talk about instances in which what we think of as bots are really human beings—and where you’re unlikely to see AI anytime soon.
A HUMAN IN BOT CLOTHING
People often give AIs tasks that are too hard. Sometimes, the programmers only find out there’s a problem when their AIs try and fail. Other times, they don’t realize that their AI is solving a different, easier problem than the one they had hoped it would solve (for example, relying on the length of a medical case file rather than its contents to identify problem cases).1 Still other programmers just pretend that they’ve figured out how to solve the problem with AI while secretly using humans to do it instead.
This latter phenomenon, claiming human performance as AI, is far more common than you’d think. The attraction of AI for many applications is its ability to scale to huge volumes, analyzing hundreds of images or transactions per second. But for very small volumes, it’s cheaper and easier to use humans than to build an AI. In 2019, 40 percent of European startups classified in the AI category didn’t use any AI at all.2
Sometimes using humans is only a temporary solution. A tech company may first build a human-powered mockup of its software while it works out things like user interfaces and workflow or while it gauges investor interest. Sometimes the human-powered mockup is even generating examples that will be used as training data for the eventual AI. This “fake it till you make it” approach can sometimes make a lot of sense. It can also be a risk—a company might end up demonstrating an AI that it can’t actually build. Tasks that are doable for humans might be really hard, or even impossible, for AI. Humans have a sneaky habit of doing broad tasks without even realizing it.
What happens then? One solution companies sometimes use is to have a human employee waiting to swoop in if an AI begins to struggle. That’s the way today’s self-driving cars generally work: the AI can handle maintaining speed or even steering on long stretches of highway or during long hours of slow-speed stop-and-go traffic. But a human has to be ready to help at a moment’s notice if there’s something the AI is unsure about. This is called the pseudo-AI or hybrid AI approach.
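Stripped down to its essentials, the hand-off logic is just a confidence threshold. The sketch below shows the general pattern, not any particular company's system; the model, the threshold, and the `escalate_to_human` function are all placeholders.

```python
def handle_request(request, model, escalate_to_human, threshold=0.90):
    """Hybrid ("pseudo-AI") routing: let the model answer when it is confident,
    and hand the tricky cases to a person."""
    label, confidence = model(request)      # the model returns its best guess plus a confidence score
    if confidence >= threshold:
        return label                        # routine case: the AI handles it on its own
    return escalate_to_human(request)       # anything the AI is unsure about goes to a human
```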
Some companies see pseudo-AI as a temporary bridge as they work on an AI solution they’ll be able to scale. It may not always be as temporary as they’d hope. Remember Facebook M from chapter 2, a personal-assistant AI app that would send the tricky questions to human employees? Though the idea was to eventually phase out the use of humans, the assistant job turned out to be too broad for the AI to ever figure out.
Other companies embrace the pseudo-AI approach as a way to combine the best of AI speed and human flexibility. Multiple companies have offered hybrid image recognition, where if the AI is unsure about an image, it gets sent to humans to categorize. A meal-delivery service uses AI-powered robots—but bicycle-riding humans bring food from the restaurants to the robots, and the AI only has to help the robots navigate for five to ten seconds between waypoints set by remote human drivers.3 Other companies are advertising hybrid AI chatbots: customers who begin by talking to an AI will be transferred to a human once the conversation gets tricky.
This can work well if customers know when they’re dealing with a human. But sometimes customers who thought their expense reports,4 personal schedules,5 and voice mails6 were being handled by an impersonal AI were shocked to learn that human employees were seeing their sensitive information—as were the human employees when they saw that they were being sent people’s phone numbers, addresses, and credit card numbers.