The Silicon Jungle

by Shumeet Baluja


  What was Rajive supposed to say? He could say I told you so, I told you this would eventually happen. He could try to explain it again to Alan, but he wouldn’t pay enough attention to understand. Instead, he gave Alan a way out, something that wouldn’t look bad when this mess made its way back to DC—and what he would say was true, too—just not the whole truth. “We don’t have all the data we need. The data is much harder to come by when people aren’t just handing it to you.” No amount of resources, politicking, summits, or anything else was going to give them all the data Sebastin had access to.

  “And people are willing to give all this data to Sebastin, and not to us?” Alan shook his head in disgust as he turned from Rajive.

  “Alan, people didn’t give the information to Sebastin. They gave it to Ubatoo. Sebastin just had to find someone at Ubatoo to hand it to him.”

  “So how did he convince Ubatoo to give him all this information and we can’t convince them to give us the time of day?”

  “ACCL is the darling of Silicon Valley right now. There’s no deep secret to it—people just want to help them.”

  “Idiots. All of them, idiots,” Alan said dismissively. “Which idiot in Ubatoo has been helping your friend, Sebastin?”

  Rajive flipped through his files before responding about “his friend.” “He has two contacts inside Ubatoo. The head of their data-mining division, a Dr. Atiq Asad—Asad was a former Berkeley professor, now a vice president at Ubatoo. Sebastin also worked with Stephen Thorpe, an intern—but the extent of his involvement is still not known. I’ll read his file to figure out his role before we touch down.”

  “What do you think, Rajive? What happened to Sebastin? Have a change of heart? I guess a few million dollars just aren’t enough for some people?”

  “He already has a small fortune, so I suspect that the original reason he worked with us was a sense of patriotic duty. Maybe he decided a vacation home was more important. I don’t know.”

  “Patriotic duty? Sebastin? You’re giving him too much credit, Rajive. I bet he’s like the rest—just a greedy S.O.B. from the start who would do anything for a quick buck, no matter how small the amount or how stupid the act,” Alan countered. With that off his chest, he visibly relaxed, sat back and chewed on some more ice.

  “About tomorrow—what do we do if Sebastin and ACCL are dead ends?” Alan asked.

  “If I’m right, nobody else at ACCL will have any clue as to what’s going on. Then it’s time to talk with Dr. Asad and the intern. We need to start assessing what the full extent of the damage is.”

  “I can’t wait to see Asad’s face when they realize they’ve refused to give us access to their data, but they’ve been handing it over to some terrorist who’s selling it to the highest bidder. This is going to be one hell of a train wreck.” And this was the reason that Alan had to come on this trip. There was a chance, though it was Alan and Rajive’s job to keep it from happening, that this was about to become a very high-profile media frenzy. “Ubatoo hands data over to Unknown Terrorists,” “Ubatoo spurns U.S.—Helps Terrorists Instead.” There were a lot of bad, very public, endings to this story if not dealt with appropriately. Personally, Alan would have been fine with some press; the more the merrier. If handled with a little finesse and creativity, he was confident he could spin this story into a huge public-relations win for himself and even NCTC, but in today’s world with president-mandated interagency cooperation and the explicit order to project a single unified focus—nothing doing. Everything had to be kept quiet. Rajive was a good man, but he wasn’t ready for this.

  “For their sakes, I hope Sebastin hasn’t sold his list to anyone yet—it’ll be ugly for Dr. Asad if he did,” Rajive said, following up on Alan’s thought.

  “I presume that the team will be adequately briefed before we get there?”

  Rajive didn’t reply. It was more of a command than a question. With that, they leaned back in their chairs, loosened their ties and settled in for the few hours left on their journey. Within a few minutes, Alan was asleep. Rajive, though, still had to prepare for tomorrow.

  -A TINKER BY ANY OTHER NAME-

  August, 2008.

  An enormous number of people had found their way onto “The List.” When people talked about being on a watch list, what they usually meant, though they likely didn’t know it, was either TIDE, the Terrorist Identities Datamart Environment, or the FBI’s domestic terrorist watch list. Although TIDE, maintained by the NCTC, had been repeatedly subject to newer cleverer acronyms, the main characteristics stayed constant:

  1. You don’t want to be on it.

  2. If you’re on it, you probably don’t know it.

  3. There were many reasons you could be put on it and only a tiny number of reasons for you to ever be removed from it.

  4. If you’re on it, you probably know others who are on it.

  5. If you’re not on it, you probably know others who are on it.

  6. There are a lot of people on it, and it will double in size before it gets a new acronym.

  It was #6 that troubled those who worked with this list, not the fact that its name would change. The list grew too quickly. The problem was not in finding people to add to the list, it was in finding reasons to keep people off the list.

  Within three years of the 9/11 atrocities, in not-so-fringe segments of society, on college campuses in parts of the U.S., and in much of Europe, it was, sad to say, entirely fashionable to empathize with the terrorists, even while fastidiously denouncing their tactics. Youths, Muslim or not, were finding that the “angry teenager rebelling against the establishment” phase of life that had only twenty years ago meant a spiked haircut and too much colored hair gel, had come to mean “rebelling against American values.” For too many, it meant at least empathizing with the world’s frustrations toward America.

  Fortunately, these frustrations typically amounted to bouts of angst and even self-discovery, but rarely to any action. Unfortunately for anyone in the business of eavesdropping, it was typically all talk. This meant the triggers for the first level of alarms that were in place to mark a person as “interesting” were perpetually being tripped. Further, with the easy availability of all types of information on the Internet, too many details and trigger words were readily available for the fastidious rebel to stumble upon. These words, when said by a person who had just activated a level-one alarm, would set off the second level of alarms. The more well-versed and intelligently you spoke of your dissatisfaction and frustrations with America, the more alarms you triggered.

  The result was that the watch lists were enormous and expanding rapidly. The gargantuan amounts of data on “these people” far exceeded the capacity to analyze it. Data collection inside the NSA, FBI, CIA, NCTC, and everywhere else, was not the problem. The problem was knowing what to do with it. Until that was addressed, the mounds of evidence that may or may not be contained in the data collected were left untapped.

  Insights on how to tackle this problem came from, curiously, one of the academic-outreach programs that the NSA had sponsored. In exchange for funding to support a graduate student, professors would reconfigure their work, or at least their student’s work, to match the goals of the funding agency. In fact, such had been the case for Molly; her advisor’s funding had come directly from the Department of Defense, one of the largest funders of anthropological research.

  At the same time that Harry Chaff walked into Professor Aore’s lab at Georgia Tech, the blue skies of the last few precious days between summer’s end and the start of classes swiftly turned to a dismal grey. Harry Chaff was the newly appointed Dean of the College of Computing and had been making his rounds of the faculty members’ labs, getting to know his department. Dean Chaff loudly and awkwardly introduced himself from the doorway and added, “I’d like to hear about the work you’re doing, is right now a good time?”

  Had it been a few days later, when classes were in full swing, Professor Aore would have had the ready excuse of preparing for classes to avoid being disturbed. Instead, Professor Aore had no choice but to drop everything to accommodate Dean Chaff. Within ten minutes, he found himself demoing his latest research, trying to impress. Professor Aore, his students, and Professor Mikens from the linguistics department had been working on a new approach to online video recommendations. Based on a user’s personal video viewing history (for example, DVDs rented or online videos watched), their newly created system would recommend other videos that the user would enjoy. The system, though steeped in the well-studied graph-theoretic mathematical techniques of finding stationary points in large graphs, was constructed with a far more accessible goal: to sell their invention to a dot-com and reap a portion of the lucrative rewards that were so easily handed out in Silicon Valley.

  Professor Aore started his demo to Dean Chaff with a few compelling examples. “Imagine you’ve just seen a few movies and want to find another one to watch.” He clicked on a few videos: The Break-Up, The Holiday, and Monster-in-Law. Then he sat back and let his system take over.

  The screen sputtered out a few cryptic lines of gibberish, at the bottom of which appeared the automatically recommended movie titles. Professor Aore happily explained, “We’ve figured out what your tastes are and we’ve recommended that you watch Bridget Jones’s Diary.” Professor Aore watched as Dean Chaff sat rocking slowly back and forth in his chair, expressionless.

  There really wasn’t more to show at this point. Professor Aore tried to spell out the accomplishments for Dean Chaff, “To make this simple recommendation, we figured out that you’re most like a twenty-five to thirty-five-year-old female,” he said, pointing to some words in the lines of gibberish that had just scrolled by. “Then, we looked at the actors in these movies and matched them to similar actors.” He pointed to another line, “We looked at the movies’ genres, directors, viewers’ reviews, professional reviews, and then we synthesized all of this information to give you movies you would most like,” he concluded. Still nothing from Dean Chaff, just a slow back and forth in his squeaky chair.

  Professor Aore waited expectantly before continuing. “Why don’t you tell me some movies you like, and we’ll see what it comes up with?”

  To this, Dean Chaff responded quickly, with no hesitation, “In alphabetical order, my top three favorite movies are Apollo 13, Close Encounters of the Third Kind, and Contact.” Professor Aore hurriedly found the movies and entered them into his system.

  A moment later, Professor Aore was reading the results aloud. “It decided that you are likely a male—88 percent probability, and that you are over forty-five years old—64 percent probability. It recommended 2001: A Space Odyssey for you.”

  Dean Chaff nodded his head approvingly in time with his rocking. “I’ve already seen that one, though,” he replied. “What else?”

  Professor Aore scanned further down the page, “Battlestar Galactica?”

  The rocking and nodding was discernibly faster. “Really? The original or the remake?”

  Professor Aore returned to his screen, trying to contain his quickly growing annoyance. The anger soon turned to alarm when he realized that nothing on his screen indicated which version of the movie was recommended. “Remake,” Professor Aore guessed, voice cracking as he failed to suppress his anxiety.

  “Well done!” Dean Chaff replied enthusiastically. He was impressed enough to give it further thought—but not by the underlying mathematics or even the potential for a sale to a dot-com. Instead, he imagined an untapped pipeline carrying copious amounts of funding straight from the NSA to Georgia Tech’s College of Computing. “Does this thing work for anything other than movies?”

  “Of course it does.” Professor Aore stated defensively, as if merely questioning the system’s broad usefulness was a direct insult cast upon the intellect of generations of his family. “The mathematics behind it are solid. Naturally, it’ll work on any type of data.”

  “Perfect, I’d like you and Dr. Mikens to apply for a grant with the NSA—to help them catch terrorists. I think that this is just what they need . . .”

  Though Dean Chaff never had it in him to find the right words to explain his notions enough to enthrall Professor Aore the way he would have liked, it was his job to find the connections between people and projects that others would normally miss. What the NSA needed, simply put, was to figure out the problem of ranking. Given that a million people were in the TIDE list, they needed to figure out which ones should be prioritized higher than others—which ones should make it to the top? Which ones were more likely to act rather than to just complain? What the NCTC needed was their own version of tinkers to help rank potential terrorists. And this, like ranking movies that were similar to each other, is exactly what Dean Chaff hoped Professor Aore’s system could do. Though all of this should have been done from the start within the NSA or the NCTC, to do so demanded in-house experts with access to enormous amounts of data, years of hands-on experience and innovation, and most of all unwavering dedicated focus and support. And as Rajive had described in his presentation, these were not always cultivated in-house as they should have been.

  Within a few months, Professor Aore and Professor Mikens had adapted their algorithm and submitted a grant that had been funded directly by the NSA. They had proposed completing a tightly directed morphological analysis of well-publicized web sites of known terrorism organizations and of terrorist supporters. They had matched the words, phrases, and idioms found on those web sites to pamphlets, newspapers, and other web sites from around the world. Based on the ones that matched closely, they derived a list of books, authors, pamphlets, and web sites that should be actively monitored. It was no different than finding and recommending similar movies. With this information, the NSA only had to find the people who read those pamphlets, had those books, or visited those web sites, and use this information to help prioritize the one million and growing names in the TIDE list.

  Dean Harry Chaff may have been awkward, too abrupt, and unable to stay still, but his vague notion proved right. At the completion of their grant, the NSA offered Professors Aore and Mikens follow-up funding. This time, they were asked to create a tool that could be deployed autonomously behind the secure walls of the NSA, with no access given to the professors. The NSA wanted to repeat the same process on a confidential list of currently monitored, but far less publicized, web sites. These were the web sites on which the activities (postings, conversations, e-mails of the participants) were of interest to both the NSA and NCTC.

  Enter Rajive.

  Rajive took charge of the project. In a joint collaboration between the NSA and NCTC, their system was created and deployed behind secure walls.

  The content of the web sites was analyzed, and, using the good professor’s program, it was automatically matched to a list of books. From this, CL-72 was born, a list of books that were the closest matches to the documents that their program was asked to analyze. This was just one of the many CLs to emerge from the program. Each CL detailed some of the attributes and activities to look out for, besides just books and reading patterns, when hunting down would-be terrorists.

  Perhaps most interesting was the fact that nobody could accurately articulate why any of the books or other attributes were actually on the list. The two professors who created the program didn’t know anything about the conversations, e-mails, and web sites upon which the program was run (they had just handed over the program; nobody inside the NSA or NCTC told them the actual data that was going to be used) and therefore had no idea of what lists were created, nor the content of the lists (nor did they even know the existence of such lists). The government scientists, who actually ran the computer program, hadn’t taken the time yet to fully understand the specifics of the algorithms. Needless to say, the expectations weren’t high for the project.

  “I have a meeting in an hour, so just give me the highlights. Tell me what I should tell everyone about your project. I’ll have about ten minutes,” Alan said impatiently.

  Sure. I’ll spoon feed you months of work so you can gloss over it at your meeting. No problem, Rajive thought. “The most important thing you can tell them is that Professor Aore’s and Mikens’s work is completed, and it’s deployed. We have 119 candidate lists, CLs, that we expect to get out of this. Once we have the people who match the lists, we can re-rank all the people in TIDE and any other terrorist lists we may have floating around—according to their importance to us.”

  “Sounds like a good start. What do you call this program?”

  “We don’t have a name for it,” Rajive replied.

  “I can’t present it without a name. Just make one up.”

  “How about Tide-Sorter?”

  Alan contorted his face. “Tide-Sorter? Tide-Sorter? Come on Rajive, sounds like a laundry detergent. You’re the whiz-kid. Think of a decent name. Give me an acronym. We need an acronym.”

  “I’ll think about it.”

  “Fine. You were saying something about 119? 119 what?”

  “We have 119 attributes of potential terrorists that we examine. The more attributes the would-be terrorists match, the more likely they are to be of high interest to us. Like CL-45, for example; it’s a list of talks, lectures, and concerts. If the suspect went to those talks, his importance gets notched higher. CL-72 is a list of sixty books—read the books on the list, and that will up your importance to us, too. Some other examples—CL-11, that’s a list of travel destinations. That’s a particularly good one. Go to places on that list—like Iraq, Afghanistan, that will add a few more points to your name. We have 119 of these lists.”

 
