The One Device

by Brian Merchant


  “It was an assistant; it did language understanding. It didn’t do speech recognition,” he says. “It did some natural-language understanding as you type into it. But it was much more focused on things like scheduling and making a dossier on people when you meet them and things like that.

  “It was a cool, cool project but it was made for people typing on computers.”

  Gruber was introduced to the project when it was still in its “early prototype brainstorming phase” and he met with the two co-founders. “And I said, that’s a really good idea but this is a consumer play.… We need to make an interface for this,” Gruber says. “My little tiny team inside of Siri created that conversational interface. So the whole way you see it now, the same paradigm everyone uses is these conversational threads, there’s content in between.” It’s not just command and response, designed to be as efficient as possible. Siri talks to you. “There’s dialogue to disambiguate. It’s just this notion of a verbal, to-and-fro assistant that came out of there.”

  The project had begun the year after the first iPhone launched, and as the Siri project took shape, it was clear that it would be aimed at smartphones. “The Siri play was mobile from the beginning,” he says. “Let’s do an assistant, let’s make it mobile. And then let’s add speech, when speech is ready.… By the second year, the speech-recognition technology had gotten good enough we could license it.”

  Now Gruber and his colleagues had to think about how people might talk to an AI interface, something that had never really existed in the consumer market before. They would have to think about how to train people to know what a workable command was in Siri’s eyes, or ears.

  “We had to teach people what you could and couldn’t say to it, which is still a problem, but we did a better job as a start-up than I think that we currently do,” Gruber says. Siri would often be sluggish because it would take time to process commands and formulate a response. “The idea that Siri talks back to you and does snappy things and so on, that was an outgrowth of a problem of how do you deal with the fact that Siri can’t know most things? So you fall back on either web search or on a thing that looks like Siri knows something when it doesn’t.” Siri is, basically, buying time. “Like, Siri pretends to talk to you as if it knows you as a person who doesn’t really know that, but it’s a good illusion.” And that illusion becomes less necessary the more adapted to your voice Siri becomes.

  They also had to think about how best to foster engagement, to get people interested in returning to Siri. “That’s the thing—you want engagement,” Gruber says. “So we use a relatively straightforward way of doing conversation, but we focus a lot on content, not just form.

  “If you were given a thing to ask questions to, what would the top ten things be? And people ask, like, ‘What’s the meaning of life?’ And ‘Will you marry me?’ And all that stuff. And very quickly, we saw what were going to be the top questions and then we wrote really good answers. I hired a brilliant guy to write dialogue.” Gruber couldn’t give me his name, because he still works at Apple, but all signs point to Harry Sadler, whose LinkedIn page lists him as Manager, Siri Conversational Interaction Design. Today, an entire team writes Siri’s dialogue. And they spend a lot of time fine-tuning its tone.

  “We designed the character not to be gender-specific, not to even be species-specific. To try to pretend like humans are this funny species,” Gruber says. It “finds them humorous, and curious.” Originally, Siri was more colorful—it dropped f-bombs, teased users more aggressively, had more of a bombastic personality. But it was an open question. What do we want our artificially intelligent personal assistant to sound like? Whom do we want to talk to every day, and how do we want to be talked to?

  “I mean, it’s a great problem, right?” he says. “You have this giant audience of people, and you just have to write snappy little things and they’ll love it. Imagine you’re writing a book, and you’re developing a character. And you think about, what does this character do? Well, it’s an assistant that doesn’t really know human culture, is curious about it, but it does its very best, it’s professional. You can insult it, but it’s not going to take shit. But it’s not going to fight you either… that’s the thing it has to be, because Apple’s not going to put out quotable offensive things, even comebacks, even though we can write them. So that was really a fine art, to write that stuff.”

  Whoever designed it, Gruber credits him with perfecting that character. “He eventually owned it—he created the dialogue tone. As the writer. He really understood that you need a personality.”

  Still, someone would need to give voice to that personality. That someone was Susan Bennett, a sixty-eight-year-old voice actor who lived in the Atlanta suburbs. In 2005, Bennett spent every day of July recording every utterable word and vowel for a company called ScanSoft. It was arduous, tedious work. “There are some people that just can read hour upon hour upon hour, and it’s not a problem. For me, I get extremely bored,” Bennett said. It was also hard to retain the android-esque monotone for hours at a time. “That’s one of the reasons why Siri might sometimes sound like she has a bit of an attitude.” ScanSoft rebranded itself as Nuance, and the pre-Apple Siri bought its voice-recognition system—and Siri’s voice—for the app. She had no idea she was about to become the voice of AI—she didn’t find out she was Siri until 2011, when someone emailed her. Apple won’t confirm Bennett’s involvement, though speech analysts have reported that it’s her. “I had really ambivalent feelings,” she said. “I was flattered to be chosen to basically be the voice of Apple in North America, but having been chosen without my knowledge was strange. Especially since my voice was on millions and millions of devices.”

  “Even the name was very carefully culturally tested,” Gruber says. “It’s pronounceable and nonoffensive and had good connotations in all languages that we’ve seen, and that’s one of the reasons Apple kept it, I think, because it’s just a good name.”

  According to Kittlaus, Siri, which apparently means “beautiful victorious counselor” in Norwegian, was the name he wanted to give his daughter. But he had a son, so instead, Siri was born. So what is it—not she; Siri is definitely not a she, or a he—that was getting born?

  “You can think of it any way you want, but basically it’s not human. If you look at the lines, like ‘What’s your favorite color?’ And it goes… ‘You can’t see it in your spectrum’ or something like that. It would be kind of like what an AI would do if you made one, it didn’t grow up with a body—it has a different set of sensors. So it’s kind of like it’s trying to explain to mere mortals what it knows.”

  AI, which was born as a fictional conceit, has become embodied as one too.

  In 2010, after they’d settled on the name and with the speech-recognition technology ready for prime time, they launched the app. It was an immediate success. “That was pretty cool,” Gruber says. “We saw it touched a nerve when we launched in the App Store as a start-up and hit top of its category in one day.”

  It did not take long for Apple to come knocking. “We got called real quickly after that from Apple,” Gruber says. That call came directly from Steve Jobs himself. Siri was one of the final acquisitions he would oversee before his death. Apple snapped up the app for a reported $200 million—about as much as DARPA spent on the entire five-year CALO program that laid its groundwork.

  At first, Siri was notorious for misinterpreting commands, and the novelty of a voice-activated AI perhaps overpowered its utility. In 2014, Apple plugged Siri into a neural net, allowing it to harness machine-learning and deep-neural-network techniques while retaining much of its earlier approach, and its performance slowly improved.

  So just how smart can Siri get? “There’s no excuse for it not having super-powers,” Gruber says. “You know, it never sleeps, it can access the internet ten times faster than you, or whatever powers that you’d want a virtual assistant to have, but it doesn’t know you.”

  Gruber says Siri can’t offer emotional intelligence—yet. He says they need to find a theory to program first. “You can’t just say, oh, ‘Be a better girlfriend’ or ‘Be a better listener.’ That’s not a programmable statement. So what you can say is ‘Watch behaviors of the humans’ and ‘Here’s the things that you want to watch for that makes them happy, and here’s one thing that is bad, and do things to make them happier.’ AIs will do that.”

  Right now, Siri is limited to performing the basic functions of the devices it lives in. “It does a lot of things, but it doesn’t do all the things that an assistant can do. We think of it as, well, what do people do with their Apple devices? You navigate and you play music and that’s all the things that Siri is good at right now.” And Gruber and company are looking carefully at the kind of queries it routinely gets—queries that now number in the two-billion-per-week range. “If you’re in the AI game, that’s like Nirvana, right?” Gruber says. “So we now know a lot about what people want in life and what they want to say to a computer and what they say to an assistant.

  “We don’t give it to anyone outside the company—there’s a strong privacy policy. So we don’t even keep most of that data on the servers, if at all, for very long.… Speech recognition has gotten much better because we actually look at the data and run experiments on it.”

  He too is fully aware of Siri’s shortcomings. “Right now the illusion breaks down when either you have a speech-recognition issue, or you have a question that isn’t a common question or a request with an uncommon way of saying it.… How chatty can it get? How companion-like could it really be? Who’s the audience for that? Is it kids? Is it shut-ins?

  “But there are certain things you see it doesn’t do right now. Like, you can’t say, ‘Hey, Siri, don’t forget my room is 404’ and ‘Remind me when I’m hungry to eat or when I’m thirsty to drink water.’ It can’t do those things. It doesn’t know the world, it doesn’t see the world like we do. But if it’s hooked up to sensors that do, then there’s no reason why it can’t.”

  And how does Gruber want to change Siri? “My preference is that first, it needs to be more natural in the way it speaks to you.” He hopes to drive Siri to behave more like us. “So I want it to be a lot more humanlike and to not beep and make all this silly ‘It’s your turn now’ and all that. That’ll come. That’ll just come naturally. I am interested in focusing in on human needs.… That’s why we did text-interface and hands-free. So there’s a real genuine need for people not to be texting while driving. There’s also a need for people to deal with complexity. That’s hard to do right now. So Siri is kind of a GUI-buster, like it can break through all these complicated interfaces, and you can just say, ‘Remind me when I get home to call my mother’ and it can know when you get home, go, ‘Here’s a note. Click here to call your mother.’ Yeah, it could know when you’re home if you tell it in your address book, then it knows when you’re there from GPS. It knows your mom, it knows who your mom is, it knows what her number is, all that stuff.”

  Siri will then be accessing even more of our most personal data, I note. I ask him if he worries that Siri or any other AI could do something malignant with that sort of information. And, I guess, generally, is the father of Siri worried at all about the dawn of true artificial intelligence?

  “I’m not afraid of general intelligence in a computer,” Gruber says. “It will happen and I like it. I’m looking forward to it. It’s like being afraid of nuclear power—you know, if we designed nuclear technology knowing what we know now, we could make it safe, probably.” Yet critics like Elon Musk and Stephen Hawking have raised concerns that AI could evolve more quickly than we could control it—that it could pose an existential threat to humanity.

  “Oh, it’s great,” Gruber says of the discussion. “We’re kind of at the stage now where the Elon Musks of the world are saying, ‘Look, this is going to be powerful enough to destroy the earth. Now how do we want to deal with the technology?’ And I don’t like the way that we have dealt with nuclear, but it hasn’t killed us. I think we can do a lot better, but we have managed to thread that gauntlet, where we made it through the Cold War and didn’t kill ourselves.” Anyway, we don’t have to worry about Siri. “Siri wasn’t really about general intelligence, it’s about intelligence at the interface. So to me that’s the big problem. Our intel, our interfaces are hard to use and needlessly so.”

  There’s also plenty of room for AI to do good—which, as a matter of fact, is why Gruber’s here. He’d come on the TED cruise to see if there were any ways he could harness his expertise to help benefit ocean conservation. So far, he’d met with teams to discuss using pattern-recognition software and Google Earth to catch poachers and polluters.

  “Those are kind of the superpowers that only science fiction was talking about a few years ago,” he says. So, I ask, does the co-creator of Siri use his own AI? How?

  “Oh, yeah, all the time,” he says. “I use it twenty to thirty times a day. I mean, I get up: What’s the traffic? Open an app by name. I text people back and forth by Siri. Call people by name. Get in the car. Read notifications to me, respond to texts, obviously do navigation. So, the car. Find out where I’m going. Gas on the way to work. ‘Siri, where is this gas station? Take me to work.’ Get to work. ‘Siri, what’s my next meeting?’ You know, ‘Change my two o’clock to three o’clock.’ I mean, all that stuff and just all day long.”

  And then, I figure, it’s time for the million-dollar question: “Do you know Siri better than it knows you? Or does Siri know you better?”

  “That’s a fun question. I’m afraid we’re in the phase of the technology where I know Siri better than it knows me,” Gruber says. “But I’d like to turn that table around soon.”

  CHAPTER 11

  A Secure Enclave

  What happens when the black mirror gets hacked

  Half an hour after I showed up at Def Con, my iPhone got hacked.

  The first rule of attending the largest hacker conference in North America is to disable Wi-Fi and Bluetooth on all your devices. I had done neither. Soon, my phone had joined a public Wi-Fi network, without my permission.

  I had trouble with Safari when I tried to use Google; instead of search results, the page froze in the process of, it seemed, loading another page altogether.

  The good thing about getting hacked at Def Con, though, is that you are surrounded by thousands of information-security pros, most of whom will happily and eloquently tell you exactly how you got “pwned.”

  “You probably got Pineapple’d,” Ronnie Tokazowski, a security engineer for the West Virginia cybersecurity company PhishMe, tells me at the kind of absurd, faux-outdoors, French-themed buffet you can find only in a Las Vegas casino. We’re joined by veteran hacker (and magician) Terry Nowles and a father and son from Minnesota; dad’s a dentist, Don’s into Def Con.

  “The way the Wi-Fi Pineapple works is whenever your phone sends a beacon to look for an access point, instead of the Wi-Fi point saying, ‘I’m that connection,’ the wireless Pineapple will say, ‘Yes, that’s me, go ahead and connect,’” Tokazowski said. “Once you’re connected to the Pineapple, they can then mill your connection, they can reroute your traffic elsewhere, they can break your traffic, they can sniff passwords.”

  “They can see what I’m doing on my phone, basically,” I say.

  “Yeah.”

  “Could they then actually change anything on my phone?”

  “They would be able to sniff the traffic,” he says, meaning intercept the data passing through the network. “Once you’re connected to the network, they could start trying to throw attacks at your phone… But for the most part, the Pineapple is more for sniffing traffic.” If I logged on to Gmail, for instance, the hackers could force me to go somewhere else, a site of their choosing. Then they could launch a man-in-the-middle attack. “If you went to Facebook and went to your bank account, they’d be able to see that information too,” he says. “So, yeah, you just want to be careful not to connect to any Wi-Fi.”
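  The trick Tokazowski describes can be made concrete with a small simulation. This is purely illustrative—it is not real 802.11 code, and the class and function names (HonestAP, PineappleAP, first_ap_to_answer) are my own inventions—but it captures why a phone that probes for its remembered networks ends up on the rogue access point:

```python
# Illustrative simulation of the Wi-Fi Pineapple's probe trick:
# the phone asks for each network it remembers, and the rogue AP
# claims to be every single one of them.

class HonestAP:
    def __init__(self, ssid):
        self.ssid = ssid

    def respond_to_probe(self, probed_ssid):
        # A legitimate access point only answers for its own network name.
        return probed_ssid == self.ssid


class PineappleAP:
    def respond_to_probe(self, probed_ssid):
        # The rogue AP says "yes, that's me" to any name the phone asks for.
        return True


def first_ap_to_answer(remembered_networks, access_points):
    """Return (ssid, ap) for the first probe any access point claims to serve."""
    for ssid in remembered_networks:
        for ap in access_points:
            if ap.respond_to_probe(ssid):
                return ssid, ap
    return None


remembered = ["HomeWifi", "CoffeeShop"]  # hypothetical saved networks
aps = [HonestAP("OfficeWifi"), PineappleAP()]
ssid, winner = first_ap_to_answer(remembered, aps)
print(ssid, type(winner).__name__)  # the Pineapple wins by impersonating "HomeWifi"
```

  Because the phone trusts whichever access point answers first, the impersonator inherits the connection—and with it, the ability to watch or redirect the traffic, exactly as described above.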

  Okay, but how common is this, really?

  “Pineapples?” Ronnie says. “I can go buy one for a hundred, a hundred twenty bucks. They’re very, very, very common. Especially here.”

  Def Con is one of the largest and most notorious hacker gatherings in the world. For one weekend a year, twenty thousand hackers descend on Las Vegas to attend talks from the field’s luminaries, catch up with their contemporaries, bone up on the latest exploits and system vulnerabilities, and hack the shit out of one another.

  It’s also one of the best places to head if you want to wade into the security issues that confront iPhones and iPhone users the world over. As more people start regarding smartphones as their primary internet devices and conducting more of their sensitive affairs on them, smartphones are increasingly going to become targets of hackers, identity thieves, and incensed ex-lovers.

  Earlier, Def Con’s sister conference, the smaller, more expensive, and more corporate-friendly Black Hat, had made a surprise announcement that Apple’s head of security engineering and architecture, Ivan Krstić, would give a rare public talk about iOS security.

  In December 2015, Syed Rizwan Farook and Tashfeen Malik, a married couple who say they were acting on behalf of ISIS, shot and killed fourteen people and seriously wounded twenty-two more at a Christmas party at the San Bernardino County Department of Public Health, where Farook worked. The spree was declared an act of terror and was, at the time, the worst domestic attack since 9/11.

  During the FBI’s investigation, the agency recovered an iPhone 5c. It was owned by the county—and thus was public property—but it was issued to Farook, who had locked it with a personal passcode. The FBI couldn’t open the phone.

  You probably have a passcode on your phone (and if you’re one of the 34 percent of smartphone users who don’t use a password, you should!), ranging from four numbers (weak) to the new default of six characters or longer. If you input the wrong code, the screen will do that shake/buzz thing that sort of resembles a torpedo hit in old sci-fi movies. Then it makes you wait eighty milliseconds before trying again. Every time you get it wrong, the software forces you to wait longer before your next attempt, until you’re locked out completely.
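  The escalating delay works like a backoff schedule: cheap for the first few mistakes, punishing after that. Here is a toy model of the idea—the thresholds and delay values are my own illustration, not Apple’s actual schedule:

```python
# Toy model of escalating passcode-retry delays: each run of
# consecutive wrong guesses lengthens the forced wait, until the
# device stops accepting attempts entirely. Numbers are illustrative.

def wait_after_failure(failures):
    """Seconds of forced delay after `failures` consecutive wrong codes;
    None means the device is locked out completely."""
    if failures >= 8:
        return None                     # locked out: no more attempts
    delays = {5: 60, 6: 300, 7: 3600}   # escalate after repeated failures
    return delays.get(failures, 0)      # early misses cost almost nothing

# Even a modest schedule wrecks brute force: an attacker can't simply
# hammer through all 10,000 four-digit combinations when each burst of
# failures buys minutes or hours of forced waiting.
for n in range(1, 9):
    print(n, wait_after_failure(n))
```

  The design choice matters because it barely inconveniences a legitimate owner who fat-fingers the code once or twice, while making exhaustive guessing impractically slow.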

 
