Grace Hopper and the Invention of the Information Age
Just before the May ONR symposium, the Automatic Programming Department’s efforts focused on improving the A-2 compiler. The resultant A-3 wrote more efficient machine code, but it was far from user-friendly. According to Hopper, Laning and Zierler’s efforts opened her eyes to the full potential of pseudo-code6 and confirmed in her mind that automatic programming could bridge the gap between user and computer, thus eliminating the need for programmers outside of research and development laboratories.7
Hopper and her colleagues explored the possibility of an equation-based programming language, and by 1956 they had modified A-3 to the point where it could support a user-friendly source code. The resultant AT-3 compiler was later named MATH-MATIC. Hopper had finally freed programmers from UNIVAC’s awkward three-address coding format. MATH-MATIC pseudo-code was quite natural to write and further separated the user from the eccentricities of hardware.
MATH-MATIC also appeased more experienced programmers, for it had the ability to handle lower-level languages. That is, the compiler could process statements written in A-3 pseudo-code or even in C-10 machine code alongside the new source code. This flexibility enabled experienced programmers to manually modify compiler code output, thus ensuring the production of the most efficient final code possible. This was feasible because MATH-MATIC first translated its source code into A-3 pseudo-code as an intermediate step. A-3 was then translated into machine code.
In many respects MATH-MATIC outstripped the capabilities of the hardware for which it was designed. The 1,000-word internal memory was so limited that Hopper was forced to create an elaborate system of “virtual” memory. That is, MATH-MATIC made full use of the UNIVAC’s ten magnetic servo tapes, automatically orchestrating their use to make the machine appear as if it had more internal memory. For instance, if a generated object code was too large to fit into internal memory or on a single magnetic tape, the compiler automatically inserted the necessary control transfers and input/output statements to reload memory from other tapes as often as required.8
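The tape-reloading scheme described above can be illustrated with a short modern sketch. It is not a reconstruction of MATH-MATIC's actual mechanism: the function name, the `LOAD_SEGMENT` step, and the one-slot reservation are assumptions made for illustration. The idea shown is only that a program too large for memory is cut into memory-sized pieces, each ending with an inserted step that loads the next piece from tape.

```python
def split_into_segments(instructions, memory_words):
    """Split a long instruction list into memory-sized segments.

    Hypothetical model: each segment except the last ends with an
    inserted LOAD_SEGMENT step that brings in its successor from tape.
    """
    if memory_words < 2:
        raise ValueError("need room for one instruction plus a reload step")
    body = memory_words - 1  # reserve one slot for the reload step
    segments = []
    for start in range(0, len(instructions), body):
        segments.append(list(instructions[start:start + body]))
    # Chain the segments: every one but the last loads the next from tape.
    for index in range(len(segments) - 1):
        segments[index].append(f"LOAD_SEGMENT {index + 1}")
    return segments


program = [f"I{n}" for n in range(7)]
segments = split_into_segments(program, memory_words=4)
```

With seven instructions and a four-word memory, the sketch yields three segments, none longer than four entries, and the programmer never writes a reload step by hand.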
Sadly, for all their creative efforts, MATH-MATIC’s designers could not overcome the programming language’s one glaring shortcoming: unbearably long run times when compiling. MATH-MATIC, like Laning and Zierler’s algebraic compiler, had identified the paradox associated with automatic programming during much of the 1950s: the more user-friendly the source code, the longer the run times. Until this was rectified, automatic programming had no comparative advantage over teams of skilled programmers writing machine code.
SOLVING THE EFFICIENCY PARADOX: JOHN BACKUS AND FORTRAN
Charlie Adams’s short description of Laning and Zierler’s algebraic compiler did not go unnoticed by another ONR symposium attendee, John Backus. The 29-year-old Backus had joined IBM as a programmer in 1950 and had spent his first years there developing Speedcode for the IBM 701. An avid student of Hopper’s automatic programming work, Backus was impressed by the A-2 compiler’s speed but frustrated by its awkwardness. Part of the A-2’s inefficiency, Backus believed, was attributable to the limitations of the UNIVAC hardware. The 1,000-word memory and the lack of an index register caused the A-2 to spend much of its processing time on “housekeeping chores.”9 If automatic programming was to be accepted on a larger scale, Backus believed, the efficiency paradox had to be solved.
In January 1954, John Backus, Harlan Herrick, and Irving Ziller set out to develop a more “streamlined” compiler for IBM’s proposed 704 computer. The MIT-inspired hardware design, complete with core memory, floating decimal processor, and index register, provided the IBM team with unprecedented design flexibility. But despite these advantages, Backus recalled, the group spent months debating the theoretical design limits of automatic programming. Though he did not mention Hopper’s name directly, Backus suggested that one of the reasons the computing community as a whole was skeptical of automatic programming in the mid-1950s “came from the energetic public relations efforts of some visionaries to spread the word that their ‘automatic programming’ systems had almost human abilities to understand the language and needs of the user.”10 But, as in the case of the A-2, on further inspection most compiling systems proved to be “complex, exception-ridden performers of clerical tasks.”11
Laning and Zierler’s algebraic compiler served as evidence that prestigious institutions such as MIT were taking automatic programming seriously, prompting Backus to write Laning a letter shortly after the May symposium. In the letter, Backus informed Laning that his team at IBM was working on a similar compiler, but that they had not yet done any programming or even any detailed planning.12 To help formulate the specifications for their proposed language, Backus requested a demonstration of the algebraic compiler, which he and Ziller received in the summer of 1954. Much to their dismay, the two experienced firsthand the efficiency dilemma of compiler-based language design. The MIT source code was commendable, but the compiler slowed down the Whirlwind computer by a factor of 10. Since computer time was so dear a commodity, Backus realized that only a compiler that maximized efficiency could hope to compete with human programmers. Despite this initial disappointment, Laning and Zierler’s work inspired Backus to attempt to build a compiler that could translate a rich mathematical language into a sufficiently economical program at a relatively low cost.13
On 10 November 1954, Backus submitted a report titled “Preliminary Report: Specifications for the IBM Mathematical Formula Translating System, FORTRAN” to his boss, Cuthbert Hurd. The report stipulated that FORTRAN would “comprise a large set of programs to enable the IBM 704 to accept a concise formulation of a problem in terms of a mathematical notation and to produce automatically a high-speed 704 program for the solution of the problem.”14 The report also suggested that a novice programmer would be able to manipulate FORTRAN’s notation after a one-hour course. In effect, Backus was promising ease of coding with unprecedented speed of execution.
Backus initially thought the time from specifications to prototype would be 6 months. It turned out to be 30 months. During that time, Backus and his team of twelve programmers developed a variety of compiling techniques that both increased the efficiency of the generated machine code and decreased the computer time needed to compile. These advances included separate compilation of commonly used subroutines, detection of identical subexpressions (thus eliminating duplicate calculations), methods to avoid recompilation during the debugging phase, and self-checking mechanisms that flagged user-based errors when preparing source code.15
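The detection of identical subexpressions can be illustrated with a brief sketch. This is a modern, dictionary-based approach chosen for clarity, not the method used in the original FORTRAN compiler. The nested-tuple expression format, the temporary names `T1`, `T2`, and the function name are all assumptions made for the example.

```python
def compile_expression(expr):
    """Flatten a nested-tuple expression into steps, computing each
    distinct subexpression only once."""
    steps = []  # emitted steps: (temp_name, operator, left, right)
    seen = {}   # (operator, left, right) -> temp name already computed

    def visit(node):
        if isinstance(node, str):  # a plain variable name
            return node
        op, left, right = node
        key = (op, visit(left), visit(right))
        if key not in seen:        # first occurrence: emit a new step
            seen[key] = f"T{len(steps) + 1}"
            steps.append((seen[key],) + key)
        return seen[key]           # repeats reuse the earlier result

    result = visit(expr)
    return result, steps


# (A + B) * (A + B): the sum should be computed a single time.
expr = ("*", ("+", "A", "B"), ("+", "A", "B"))
result, steps = compile_expression(expr)
```

Here the sum `A + B` appears twice in the source expression but produces only one step; the multiplication then uses that single result for both operands.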
Though the FORTRAN operator’s manual was completed by the fall of 1956, the compiler itself was not distributed to IBM 704 installations until April 1957. Within a year after distribution, half of the IBM 704 installations were using FORTRAN to solve more than half of all mathematical problems.16 Subsequently, compilers were produced for the IBM 705 and the IBM 650, quickly making FORTRAN the most widely used automatic program of its day. By 1961, UNIVAC users demanded a compatible FORTRAN compiler and abandoned Hopper’s MATH-MATIC. Backus’s focus on compiler efficiency, coupled with IBM’s growing market share in hardware, resulted in the first programming-language standard. For the first time, programmers on different machines could speak the same language.
Grace Hopper at a UNIVAC keyboard, 1960. Courtesy of Smithsonian Institution.
CREATING A BUSINESS LANGUAGE: B-0 AND FLOW-MATIC
FORTRAN, like MATH-MATIC, was designed to permit engineers and mathematicians to write source code using standard mathematical symbols. But in the winter of 1955–56, while these two languages were still being developed, Grace Hopper was planning a far more radical programming language. She believed that computers were more than just elaborate calculators—that they were also data-processing and decision-making tools. Therefore, she wanted her Automatic Programming Department to create a unique compiler that would enable managers and administrators to write business programs using standard business vocabulary.
On 31 January 1955, Hopper submitted to Remington Rand management a report (titled “The Preliminary Definition of a Data Processing Compiler”) that outlined the original specifications for the first B-0 business language. The most significant concept outlined in this report concerned the nature of the compiler’s pseudo-code. The report stated that both symbols and abbreviations should be replaced with understandable English words. “Oh, I loved the symbols, but I saw there were a lot of people that didn’t, and I wanted them to be able to use the computer as well,” Hopper recalled.17 The typical business administrator or manager, she argued, would rather use standard business vocabulary to delineate both operations and data names. Therefore, A × B = C should be written as
MULTIPLY BASE-PRICE AND DISCOUNT-PERCENT GIVING DISCOUNT-PRICE
At first glance the latter description seems verbose, but Hopper defended English source code on several counts. Mathematics utilized a widely accepted set of symbols; business language did not. According to B-0 co-developer Mary Hawes, Hopper first directed the design team to study business expressions used by different UNIVAC users in order to generate a suitable business vocabulary. “When we tried mnemonic abbreviations,” Hawes recalled, “we found different abbreviations for the same term in different departments of the same company, not to mention the differences among companies.”18 Instead of attempting to invent symbols and abbreviations to represent basic business terms (e.g., tax, gross pay, and price), Hopper came up with an elegant solution that would allow programmers to define their own pseudo-code language by assigning program variables to any desired English name. For example, X1, X2, and X3 could be assigned “number of employees,” “numbers of hours worked,” and “hours at 1½ times pay.”
Such unprecedented source-code flexibility created significant challenges for the B-0 design team. First, the habit of using fixed word lengths had to be broken, which was easier said than done after years of programming with this constraint. Additionally, a single variable could now be defined by more than one word. How would the computer know that “employee number” was one rather than two variables? “The contingency was taken care of by defining a word . . . as a group of letters preceded by a space symbol and followed by another space symbol,” Mary Hawes explained.19 Sometimes, shifts in perspective yielded simple, elegant solutions.
Hopper also noted some of the unintended positive byproducts of pseudo-code. English-based programs were self-documenting, permitting a second programmer or technician to pick up where a colleague left off. This changed programming from an individualistic endeavor to a group activity, thus setting the foundation for teams of programmers to work on more complex projects. In addition, managers with limited computer skills could understand the logic and purpose of a given program, making administrative assessment possible (much to the chagrin of some programmers).20
To help Remington Rand management accept the seemingly strange concept that computers could translate and compile source code written in English, the Automatic Programming Department prepared a series of small demonstration programs. The first program, a simple inventory control application, contained only 20 lines of English-based code. Upon compilation, the English sentences were successfully expanded into machine code. To make the demonstration more impressive, and thus to increase the probability of funding, Hopper had her team rewrite the test compiler so it could also translate pseudo-code written in either French or German21:
IF GREATER GO TO OPERATION 1
OTHERWISE GO TO OPERATION 2
SI PLUS GRAND ALLEZ À OPÉRATION 1
AUTREMENT ALLEZ À OPÉRATION 2
WENN GRÖSSER GEHEN ZU BEDIENUNG 1
ANSONSTEN GEHEN ZU BEDIENUNG 2
Each of these three versions of the pseudo-code generated the same machine code, despite the language differences. Hopper had proved that compiler pseudo-code could be designed to meet the needs of any business, even one in another country. Surprisingly, Hopper’s theoretical linguistic demonstration was not well received. Management was concerned that Hopper’s plans were too ambitious, and that the Automatic Programming Department was wasting time and energy exploring such marginal areas as multilingual programming. “It was completely self-evident [to management] that an American computer built in blue-belt Pennsylvania couldn’t possibly be programmed in French or German,” she recalled.22 Hopper had to assure her superiors that the proposed business language would only be in English.
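The principle behind the multilingual demonstration, that the words of the pseudo-code are only labels for an underlying operation, can be sketched in a few lines. The phrases below are taken from the demonstration; the operation names `JUMP_IF_GREATER` and `JUMP`, the table, and the function are hypothetical and are not drawn from the B-0 compiler itself.

```python
# Hypothetical phrase table: each phrase, in any language, is simply a
# key that selects the same internal operation.
PHRASES = {
    "IF GREATER GO TO OPERATION": "JUMP_IF_GREATER",
    "OTHERWISE GO TO OPERATION": "JUMP",
    "SI PLUS GRAND ALLEZ À OPÉRATION": "JUMP_IF_GREATER",
    "AUTREMENT ALLEZ À OPÉRATION": "JUMP",
    "WENN GRÖSSER GEHEN ZU BEDIENUNG": "JUMP_IF_GREATER",
    "ANSONSTEN GEHEN ZU BEDIENUNG": "JUMP",
}


def translate(line):
    """Split a statement into its phrase and its numeric target, then
    look the phrase up in the table."""
    phrase, _, target = line.rpartition(" ")
    return (PHRASES[phrase], int(target))


english = translate("IF GREATER GO TO OPERATION 1")
french = translate("SI PLUS GRAND ALLEZ À OPÉRATION 1")
german = translate("WENN GRÖSSER GEHEN ZU BEDIENUNG 1")
```

All three calls return the same pair, which is the sense in which the English, French, and German statements are interchangeable: supporting another language means adding rows to the table, not changing the translation logic.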
B-0, as the business language compiler was designated, became available to UNIVAC customers at the start of 1958. Before its completion, Remington Rand merged with Sperry Gyroscope Corporation to form Sperry Rand. The marketing department of the new company renamed the business language FLOW-MATIC. (In addition, AT-3 was renamed MATH-MATIC.) The completed version of FLOW-MATIC had a rich library of operational verbs that appeared to meet the application needs of most businesses.23 These verbs included editing commands so information could be formatted before output. Furthermore, FLOW-MATIC provided unparalleled flexibility in data designation, thus allowing file names to be given complicated descriptions.
As they had had to do with previous UNIVAC compilers, Hopper and the staff of the Automatic Programming Department had to “sell” FLOW-MATIC to a skeptical programming and business community. “We finally got it running,” Hopper recalled, “and there we were, research and development group, with a product, and nobody to use it. We had to go out and sell the idea of writing programs in English.”24 The group turned immediately to dependable UNIVAC customers, and by the spring of 1958 US Steel, Westinghouse, the Air Force Comptroller, and the Navy Bureau of Ships were writing payroll and inventory applications in FLOW-MATIC.
Once again, the harshest critics of the new programming language were programmers. Many programmers believed FLOW-MATIC English phraseology to be superfluous, especially when basic operations such as MULTIPLY and SUBTRACT had to be written out. Moreover, the length of commands increased the time needed to compile the pseudo-code into machine code. Most important, the pseudo-code was so far removed from the inner workings of the machine that even the most gifted programmer could not express all needed operations. By making FLOW-MATIC accessible to a wider group of users, its designers had sacrificed a certain amount of flexibility and control.
Faced with a skeptical programming community, Hopper dedicated much of her time to communicating the benefits of automatic programming. Between 1955 and 1959 she published ten articles on the subject, with titles such as “Programming Business-Data Processors,” “Computer Programs in English,” and “From Programmer to Computer.” Hopper emphasized that automatic programming was economical, easy to learn, easy to debug, and easy to maintain, and that it reduced the time needed to build applications and solve problems. During this same period, Hopper gave dozens of papers at various conferences and symposiums and encouraged members of her staff to do the same. She continued to help organize programming symposia.
Hopper remained active in the Association for Computing Machinery. From 1956 to 1958 she served on the association’s council, and in 1957 and 1958 she was a member of the editorial board for the association’s main publication, Communications of the ACM. By 1960, Hopper had indeed navigated her way to the center of the computer revolution. She would use her prominent position to help guide the development of COBOL, which remains to this day the most successful computer programming language.
11 DISTRIBUTED INVENTION MATURES: GRACE HOPPER AND THE DEVELOPMENT OF COBOL
By the late 1950s, Grace Hopper’s style of invention had taken full form. Though she had created the original A-0 compiler on her own, subsequent iterations of her prototype evolved out of a distributed process of invention and development. A-1, A-2, A-3, MATH-MATIC, and FLOW-MATIC were “invented” by Hopper in the sense that she coordinated the creative efforts of a heterogeneous group of programmers and users. Many of the co-inventors worked for her in Sperry Rand’s Automatic Programming Division, but Hopper did not limit her innovative alliances to her company’s borders.
Input from users was an essential component of the development process. Hopper freely distributed code and provided user manuals. In return she received feedback and suggestions for compiler improvement. Some users went so far as to rewrite or expand the original code. Hopper would review these changes and incorporate the best practices. The end result was a computer language that continually improved in an organic fashion, nourished by the collective insights of Hopper’s group of innovators.
During the 1950s, maintaining and expanding that network became a central activity for Hopper. The diversity of the expanding computing industry called for a person who could bridge a variety of subgroups within that community. Hopper moved freely in many of these disparate worlds, and could speak about computers and programming in both technical and nontechnical terms, depending on her audience. Her position as the director of automatic programming development at the largest computer manufacturer afforded her access to senior management throughout the entire industry, while making her aware of the requirements of influential customers. As a reserve officer in the Navy, she personified the growing bond between industry and the military during the Cold War era, and she spent 2 weeks per year on active duty assessing the Navy’s computing needs. And in 1958 she returned to her academic roots, taking a position as an adjunct professor at the University of Pennsylvania, where she lectured about programming to a new generation of computer enthusiasts.
Hopper also became a leading figure in the ever-growing Association for Computing Machinery. She headed the association’s Programming Committee, which was tasked with keeping track of the latest developments within the field and disseminating this information in the programming community. Much to her credit, ACM conferences allocated a growing number of panels and papers to issues related to programming. Hopper chaired many of these panels, and she encouraged members of her programming team at Remington Rand to speak about their latest work. During 1953 and 1954, Hopper also headed up the ACM’s Nomenclature Committee and was responsible for publishing the first complete computer glossary of terms. Though at first glance “Nomenclature Committee Chair” is not the most striking title, the position gave Hopper an opportunity to define a common language for the industry as a whole.