Where Wizards Stay Up Late Page 17 Read online free by Matthew Lyon

Fourth was Utah. By now it was December—prime ski season. There also happened to be a Network Working Group meeting scheduled at the site. Keen skiers all, the whole BBN team, even Frank Heart, went to Salt Lake City to plug in the IMP. (Ironically, Barker was the only one excluded from the Utah trip—a fact he would not let the others forget for many years.)

The layout of the growing number of communications links was becoming an interesting problem. For one thing, there was not a point-to-point link between every pair of sites. For reasons of economy, Roberts decided that no direct link was needed between UCLA and Utah, or between Santa Barbara and Utah, so that all traffic destined for Utah had to go through the IMP at SRI. That was fine as long as it was up and running. If it crashed, the network would divide and Utah would be cut off until SRI was brought back on-line. As it would turn out, the four-node network that Roberts designed was not a robust web of redundant connections.
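The partition risk described above can be checked with a small reachability sketch. Only the SRI-to-Utah link and the two missing Utah links come from the text; the three links among UCLA, SRI, and Santa Barbara, and the names `LINKS` and `reachable`, are assumptions made for illustration.

```python
from collections import deque

# Links of the four-node network. The text states only that Utah connects
# through SRI; the UCLA/SRI/UCSB triangle is an assumption for illustration.
LINKS = [("UCLA", "SRI"), ("UCLA", "UCSB"), ("SRI", "UCSB"), ("SRI", "UTAH")]


def reachable(start, links, down=frozenset()):
    """Return the sites reachable from `start` when the sites in `down` have crashed."""
    if start in down:
        return set()
    # Build an adjacency map that skips every link touching a crashed site.
    neighbors = {}
    for a, b in links:
        if a in down or b in down:
            continue
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Breadth-first search outward from the starting site.
    seen = {start}
    queue = deque([start])
    while queue:
        site = queue.popleft()
        for nxt in neighbors.get(site, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(reachable("UCLA", LINKS)))                # all four sites
print(sorted(reachable("UCLA", LINKS, down={"SRI"})))  # Utah is cut off
```

With SRI marked as down, a search from UCLA reaches only UCLA and Santa Barbara, which is the divided network the paragraph describes.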

Disruptions in the system also manifested themselves in less obvious ways. This was made clear very early, when the students at Santa Barbara began doing exactly what Heart had feared they might: fiddling with their new toy. And their attitude was, Why not? They had never had to worry about outside connections, and it didn’t occur to them that something they did in their computer lab might have an effect elsewhere. “We merrily thought the IMP was ours to play with,” recalled Roland Bryan, a Santa Barbara researcher. “We were testing it out, turning it on and off, resetting it, reloading it, and trying again.” As a result, people who were taking network measurements, and who counted on the network path through Santa Barbara, would have their experiment thrown off. “Although we did not hurt the links between other sites, we were disrupting the data traffic analysis being carried on by BBN and UCLA,” Bryan said. “We didn’t think about the fact that every time we did that, someone out there would suffer.”

By the end of 1969, the Network Working Group still hadn’t come up with a host-to-host protocol. Under duress to show something to ARPA at a meeting with Roberts in December, the group presented a patched-together protocol—Telnet—that allowed for remote log-ins. Roberts was not pleased with the limited scope of the effort. Though Telnet was clearly useful and fundamental in that it let one terminal reach multiple remote computers, a remote login program by itself didn’t solve the problem of letting two computers work together. Moreover, Telnet was a way to use the network, not a lower-level building block. Roberts sent them back to keep trying. After another year of meetings and several dozen RFCs, in the summer of 1970 the group reemerged with a preliminary version of a protocol for basic, unadorned host-to-host communications. When the “glitch-cleaning committee” finished its work a year later, the NWG at last produced a complete protocol. It was called the Network Control Protocol, or NCP.

In January 1970, Bob Kahn decided that with the first four nodes working, it was time to test his various scenarios in which the network could suffer congestive failure. The kind of lockup that most worried him, the scenario he had suggested to Crowther months earlier, would be caused by congestion at a destination IMP. He had speculated that storage buffers would become so full that the packets necessary for reassembling messages wouldn’t be able to flow into a destination IMP, which itself would be filled with dismembered message parts awaiting completion.
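A toy model may make Kahn's reassembly scenario easier to picture. It is a deliberately simplified sketch, not the IMP's actual buffer management: the function name `deliver`, the slot counts, and the rule that a full buffer refuses every new packet are all assumptions.

```python
def deliver(arrivals, buffer_slots, packets_per_message):
    """Toy destination IMP.

    `arrivals` lists the message id carried by each arriving packet, in order.
    Returns (completed_messages, locked_up).
    """
    held = {}       # message id -> number of packets currently buffered
    completed = []  # messages fully reassembled and handed to the host
    for msg in arrivals:
        if sum(held.values()) >= buffer_slots:
            # Completed messages are released at once, so a full buffer holds
            # only fragments. No packet can enter, so no message can finish.
            return completed, True
        held[msg] = held.get(msg, 0) + 1
        if held[msg] == packets_per_message:
            completed.append(msg)
            del held[msg]  # reassembly done: free the slots
    return completed, False


# Four two-packet messages arriving one after another fit in four slots.
print(deliver(["A", "A", "B", "B", "C", "C", "D", "D"], 4, 2))
# The same packets interleaved fill every slot with half a message.
print(deliver(["A", "B", "C", "D", "A", "B", "C", "D"], 4, 2))
```

In the interleaved case each of the four slots holds the first half of a different message, so the second halves can never be accepted. That is the sense in which the destination is "filled with dismembered message parts awaiting completion."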

To test that hypothesis and appease Kahn, Heart suggested that Kahn and Dave Walden fly out to Los Angeles to put the network through its paces. Kahn had several experiments in mind. He wanted to send all possible permutations of traffic from IMP to IMP, changing the size of the packets and the frequency at which they were sent, in an attempt to induce deadlock. Walden went along because he was the hands-on programmer who knew how to manipulate the code and make the packets do what Kahn wanted them to do. Walden took charge of reconfiguring the IMPs to send traffic in specific patterns. He could elongate or truncate the packets, send them out every three seconds or every half second. The IMP software, the algorithms, the whole design was in for a major wringing out.

The first thing Kahn set up was a test to demonstrate that his fear of a reassembly lockup was well founded. Just as Kahn had predicted, by besieging the IMPs with packets, within a few minutes he and Walden were able to force the network into catatonia. “I think we did it in the first twelve packets,” Kahn recalled. “The whole thing came to a grinding halt.”

Kahn was vindicated. He and Walden stayed on for a number of days, continuing the experiments. For Walden, who had spent so many months cooped up in Cambridge writing code in a quasi vacuum, it was gratifying to see the network in operation, even if his goal now was to break it. He was having a blast. “I was hacking for pay,” Walden recalled. “I was driven to learn as much as I could as fast as I could.”

Kahn and Walden established a routine. Every morning, they got up and ate breakfast at the Sambo’s restaurant next to their hotel in Santa Monica. Walden used these mornings as an opportunity to indulge his native Californian’s taste for fresh-squeezed orange juice, still a rarity back in Boston. Then they drove to the UCLA campus and spent all day and much of the evening testing the limits of the IMPs. Sometimes they took a dinner break; sometimes they didn’t notice that dinnertime had come and gone. They took one night off to see the movie M*A*S*H, which had just been released.

Often they were joined by Cerf, and occasionally by Crocker and Postel as well. At one point in the testing, Cerf programmed the Sigma-7 to generate traffic to the IMP and used the host machine to gather data on the results. This was the first time he had worked closely with Kahn on a challenging project, cementing a professional link that would last for years to come.

By the end of the week, Kahn’s notebook was filled with data proving his case. When he and Walden returned to Cambridge, they shared their findings with Crowther and Heart. Crowther didn’t say much, but Kahn suspected that the battery of tests caused him to start thinking about the problem. “Somehow Crowther must have registered in the back of his mind that if two of us were coming back and reporting this problem, maybe there was an issue,” Kahn said. Back in the lab, Crowther built a simulation of what Kahn and Walden had done in the field and discovered for himself that the network could indeed lock up. He reported his findings to a slightly crestfallen Heart, who instructed Crowther to work with Kahn on fixing the problem. “Bob got to feel a lot better, and Frank got to feel a little worse,” Walden said of the entire episode. “Of course, Frank never thought the thing was perfect, but he always got discouraged when things didn’t go right.”

Heart had every reason to look past the few flaws that were beginning to show up in the nascent network. After all, problems with congestion control could be fixed. On a larger scale, the company had taken on a risky experiment, involving ideas and techniques never tested before. And it had worked. The hardware worked and the software worked. And the unique ways in which ARPA went about its business and its relationship with its contractor worked too.

Above all, the esoteric concept on which the entire enterprise turned—packet-switching—worked. The predictions of utter failure were dead wrong.

6

Hacking Away and Hollering

The network was real, but with only four nodes clustered on the West Coast, its topology was simple, the experiment small. East Coast computing powerhouses like MIT and Lincoln Laboratory, where so much was happening, weren’t connected. The very spot where Bob Taylor had daydreamed about a network, the ARPA terminal room in the Pentagon, wasn’t yet wired in. Nor was BBN itself. All were awaiting new machines, which Honeywell promised were in production as Christmas came in 1969.

The past twelve months had been rough on ARPA. The agency’s budget had reached a historic peak and gone into decline. The Vietnam War was consuming everything. In December 1969 ARPA had been pushed out of its headquarters in the Pentagon and forced to move into a leased office building in Arlington, Virginia. Director Stephen Lukasik called it “the American equivalent of being banished to Siberia.” The ARPA that once rated a Pentagon-issue American flag behind the director’s desk was quietly stripped of such trappings. Despite low morale, ARPA officials kept their flag and displayed it in the new headquarters, hoping no one important would notice that it had only forty-eight stars.

Computing continued to be the one line in the agency’s budget that didn’t turn downward at the beginning of the 1970s. Larry Roberts was determined to win support from the top, and did. He was equally determined to get an additional dozen principal investigators across the country to buy into the idea of the ARPA network. He kept pushing. He kept steady pressure on new sites to prepare for the day when an IMP would arrive at their doors with a team from BBN to connect their host computers to the network. It wasn’t a question of if but when, and Roberts always posed it that way.

In Cambridge, the activity on Moulton Street began taking on an air of production—“the factory,” some called it. The majority of the effort moved into the large room at the back of the low building with the loading dock where deliveries from Honeywell were received; and there each new machine was set up to be debugged and tested before being shipped into the field. At the same time, Heart’s team continued apace improving the design of the IMP, developing, testing, and tweaking the software and hardware. The fifth, sixth, and seventh 516 computers arrived from Honeywell in the first months of the year.

In late March the first cross-country circuit in the ARPA experimental network was installed. The new 50-kilobit line connected UCLA’s computer center to BBN’s Moulton Street site, which became the fifth node in the network. And it wasn’t just a breakthrough symbolically from the West Coast to the East (frontier expansion in reverse); the transcontinental link was also an immediate boon to network maintenance and troubleshooting.

In the months before BBN had its own machine and was connected to the network, dealing with network problems in the four-node cluster out west was a job handled by people who were on site. More often than not, it meant someone in California or Utah would spend hours on the phone talking cross-country with someone at BBN, while the unfortunate soul who had volunteered to fix the problem shuttled between the telephone and the IMP to carry out the verbal instructions coming from Cambridge. During the early going, Heart’s team spent a lot of time on the phone and kept a more or less continuous presence in the field, parentally guiding the startup of the infant network. At one point, Walden flew to Utah, Stanford, Santa Barbara, and UCLA to hand-deliver a new software release.

But when an IMP was installed at BBN in the early spring of 1970, suddenly there was a way to ship data and status reports electronically from the West Coast IMPs directly back to BBN. Heart’s obsession with reliability had resulted in more than computers encased in heavy steel. His insistence on building robust computers—and on maintaining control over the computers he put in the field—had inspired the BBN team to invent a technology: remote maintenance and diagnostics. The BBN team had designed into the IMPs and the network the ability to control these machines from afar. This they used both for troubleshooting and sometimes actually fixing the message processors by remote control, as well as keeping a watchful eye on the IMPs twenty-four hours a day.

Horrified as he was at the prospect of graduate students messing with his IMPs, Heart had sought to build machines that could run largely unattended. He channeled his obsession into inventing this set of remarkably useful tools and system management techniques. Features for the remote control of the network had been built integrally throughout the IMP’s hardware and software designs.

At BBN, on the hardware side, a Teletype terminal with logging capabilities was added to the BBN message processor, along with special warning lights and an audible alarm to indicate network failures. In designing the IMPs, BBN made it possible to loop the host and modem interfaces of the machine, so they could conduct “loopback” tests. The loopback test, which could be performed remotely, connected an IMP’s output to its input, effectively isolating the IMP from the rest of the network. This generated test traffic through the interface and allowed BBN to check the returning traffic against the outgoing traffic generated by the IMP.

Loop tests were extremely important; they provided a way of isolating sources of trouble. By a process of elimination, looping one component or another, BBN could determine whether a problem lay with the phone lines, the modems, or an IMP itself. If test traffic completed the loop intact and error-free, then the problem was almost certainly in some exterior portion of a circuit—most likely in the phone company’s lines or in the modems. And the loopback tests were conducted often enough that two of the IMP Guys, Ben Barker and Marty Thrope, became expert at whistling down the line at just the right frequency to imitate the signals that the telephone company used to test the lines.
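The process of elimination can be sketched as a series of loops, each checked by comparing returning traffic with what was sent. The loop points, the channel functions, and the names `loopback_ok` and `locate_fault` are hypothetical; the text does not say how BBN's tests were actually coded.

```python
def loopback_ok(channel, test_traffic):
    """Send each test packet through `channel` and compare it with what returns."""
    return all(channel(packet) == packet for packet in test_traffic)


def locate_fault(loops, test_traffic):
    """Try the loops in order, innermost first.

    `loops` is a list of (name, channel) pairs. Returns the name of the first
    loop whose returning traffic differs from the outgoing traffic, or None
    if every loop comes back clean.
    """
    for name, channel in loops:
        if not loopback_ok(channel, test_traffic):
            return name
    return None


def clean(packet):
    """A healthy loop returns the packet unchanged."""
    return packet


def noisy(packet):
    """A faulty loop, modeled here as corrupting the last byte."""
    return packet[:-1] + b"\x00"


traffic = [b"\x55\xaa", b"\xff\x01"]
loops = [("IMP interface", clean), ("modem", clean), ("phone line", noisy)]
print(locate_fault(loops, traffic))
```

If the inner loops return the test traffic intact and only the widest loop fails, the fault lies in the portion that the wider loop added, here the phone line.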

From the PDP-1 computer room at Moulton Street where BBN’s network monitoring equipment was installed, the IMP Guys could tell when a phone circuit anywhere in the network acted up. They could see, by the quality of messages and packets crossing a circuit, when the signal quality was being degraded, when the line was dropping bits, when it was introducing noise, or fading altogether. When the problem was in the phone lines or the modems, then the phone company would be called upon to fix it.

The engineers at BBN relished opportunities to spook the telephone company repair people with their ability to detect, and eventually predict, line trouble from afar. By examining the data, BBN could sometimes predict that a line was about to go down. The phone company’s repair offices had never heard of such a thing and didn’t take to it well. When BBN’s loopback tests determined there was trouble on a line, say, between Menlo Park (Stanford) and Santa Barbara, one of Heart’s engineers in Cambridge picked up the phone and called Pacific Bell. “You’re having trouble with your line between Menlo Park and Santa Barbara,” he’d say.

“Are you calling from Menlo Park or Santa Barbara?” the Pacific Bell technician would ask.

“I’m in Cambridge, Massachusetts.”

“Yeah, right.”

Eventually, when BBN’s calls proved absolutely correct, the telephone company began sending repair teams out to fix whatever trouble BBN had spotted.

Due to the difficulty of remotely detecting component failures in the geographically dispersed system, the network software grew more complicated with time. Among the basic assumptions made by the IMP Guys was that the most effective way of detecting failures was through an active reporting mechanism. They designed their system so that each IMP periodically compiled a report on the status of its local environment—number of packets handled by the IMP, error rates in the links, and the like—and forwarded the report through the network to a centralized network operations center that BBN had set up in Cambridge. The center would then integrate the reports from all the IMPs and build a global picture of the most likely current state of the network.
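The reporting scheme can be pictured as a small aggregation step at the center. This is a sketch under assumptions: the report fields, the status labels, the default interval, and the names `StatusReport` and `network_picture` are illustrative, not BBN's actual report format.

```python
from dataclasses import dataclass


@dataclass
class StatusReport:
    imp: str              # site name of the reporting IMP
    minute: int           # time the report was compiled, in minutes
    packets_handled: int  # traffic counter since the last report
    line_errors: dict     # neighbor site -> errors seen on that link


def network_picture(reports, now, known_imps, interval=15):
    """Combine the latest report from each IMP into one status per site."""
    latest = {}
    for report in reports:
        current = latest.get(report.imp)
        if current is None or report.minute > current.minute:
            latest[report.imp] = report
    picture = {}
    for imp in known_imps:
        report = latest.get(imp)
        if report is None or now - report.minute > interval:
            picture[imp] = "silent"       # no recent report: possibly down
        elif any(report.line_errors.values()):
            picture[imp] = "line errors"  # running, but a link looks bad
        else:
            picture[imp] = "ok"
    return picture


reports = [
    StatusReport("UCLA", 50, 1200, {"SRI": 0}),
    StatusReport("SRI", 55, 900, {"UTAH": 3}),
    StatusReport("UTAH", 20, 100, {}),
]
print(network_picture(reports, now=60, known_imps=["UCLA", "SRI", "UCSB", "UTAH"]))
```

Treating a missing or stale report as a signal in its own right is what lets the center infer the "most likely" state of a site it cannot observe directly.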

In the first few months they called it the Network Control Center. But in fact it was really nothing more than a fairly small corner of an office at BBN. And the network monitoring was informal. The logging Teletype was connected via the network itself to all the IMPs in the field, and the terminal clacked away, taking reports from each IMP every fifteen minutes. Every once in a while, out of curiosity, someone at BBN would go in and look at the log that was running out of the machine, just to see what was going on with the network. No one had specific responsibility for scanning the log. No one checked, outside of business hours, so there were sometimes long periods in which line failures went undetected, especially at night. But if someone at a network site actually called in to say, “Hey there seems to be a problem,” then one of the IMP Guys would immediately go look at the Teletype log and try to figure out what was going on.

Heart’s team had designed the IMPs to run unattended as much as possible, bestowing on the IMPs the ability to restart by themselves after a power failure or crash. The “watchdog timer” was the crucial component that triggered self-corrective measures in the IMPs. The network as a whole “did a lot of looking in its navel all the time,” said Heart, “sending back little messages telling us how it was feeling, and telling us what kind of things were happening where, so we could in fact initiate work” if necessary.

Not everything could be diagnosed or restored to working order remotely. There were times when people at BBN noticed that an IMP had just stopped running, making it necessary to reload the software. First, BBN would have to alert someone at the other end of the line and ask him to reinsert a paper tape or flip a few switches and push the reset button. So from Moulton Street they would ring the “butler” bell, the telephone switchbox attached to every IMP, hoping to reach someone. Since the IMPs were installed in large, highly active computer centers, there was no telling who might answer. It was almost like calling a pay phone. You might get an expert or you might get a janitor or some undergraduate who didn’t have a clue as to what was going on. Regardless of who picked up the phone, the technicians at BBN would try to talk that person through whatever fixes were necessary. Even people at the sites who actually knew a thing or two about the computers were asked, just the same, to follow BBN’s strict instructions.

“I can remember spending an occasional half hour to an hour with the telephone glued to my ear,” recalled Cerf, whose hearing impairment made him no friend of the telephone in the first place, “following instructions from someone at BBN saying, ‘Push this button. Flip this thing. Key these things in,’ to try to figure out what had gone wrong and get the thing running again.” In a notebook, BBN kept a detailed list of sites, exact locations for each machine, and contacts at the various locations, which in at least one case included the building guard as a resource person of last resort.
