It Began with Babbage


by Dasgupta, Subrata


  To repeat, by the mid 1960s, a fairly large body of knowledge about compiling would accumulate. However, the writers of the FORTRAN compiler in the mid 1950s had very little prior knowledge or experience to guide them; they were the producers of some of the knowledge that entered the texts and monographs of the 1960s and 1970s.

  The FORTRAN compiler was, in present-centered language, a one-pass compiler: the source program in its original form was “fed” to the compiler just once; the latter “saw” the source program only once.59 Thereafter, the source program would be analyzed, dissected, deconstructed—mangled, so to speak—all in the cause of producing “optimized” code for the IBM 704.60

  In the case of the FORTRAN compiler, there was not a single writer. Compiler writing was a group effort, and different individuals were given responsibility for different “sections” of the compiler. The compiler was organized into six sections to work in sequence, each section passing its output to the one that followed, with the last section assembling the final object program in IBM 704 machine code.61 Naturally, “code optimization” was a significant functional goal for the compiler, and various kinds of optimization strategies were distributed among the different sections. Some of the algorithms invented for object code generation and code optimization would have consequences, and they came to influence later compilers. These included an algorithm developed for compiling assignment statements involving arithmetic expressions (for example, a statement such as X = (A[I] * B[I])/C[K])62 and an algorithm for allocating machine registers to variables (the register allocation problem, in present-centered language).63 As a later author would write, for these reasons the FORTRAN compiler for the IBM 704 was certainly the most significant of the first generation of compilers.64

  XI

  The FORTRAN project demonstrated empirically the viability of the high-level programming language concept much as the EDSAC and the Manchester Mark I had demonstrated the viability of the stored-program computer concept. However, although the stored-program schema (see Chapter 8, Section V) became the paradigm in the realm of computer architecture, in that virtually all computers that followed adhered to this schema, this was not so in the realm of programming. Assembly languages continued to prevail, especially for writing “programming systems”—programs that created the interface between the physical machine and the “ordinary” computer user. Assemblers, loaders, subroutine linkage editors, interpreters, specialized subroutines, and the compilers themselves were written in the assembly languages of the relevant computers, because this allowed systems programmers to have direct control over machine resources in a way the high-level language programmers could not have.

  Concern for program efficiency was, no doubt, the driving factor for the preference for assembly languages. However, it is also tempting to think that, much as Latin gave clerics and the learned classes in medieval Europe an esoteric tongue that separated them from the unlearned “masses,” thereby endowing the former with a certain kind of power the masses could not possess, so also each assembly language was an esoteric tongue that programmers could use to protect their guild and skills from the larger community of computer users.

  XII

  However, as cultural anthropologists and linguists well know, culture and language are intimately related. In the realm of computers, “culture” means the problem environment in which a community of computer users resides. FORTRAN was adopted by the community of scientists and engineers because the language matched the reality of their problem environment.

  During the mid 1950s, the other major computer culture was that of business computing (or data processing). And this culture—its “universe of discourse,” its “worldview”—was significantly different from the culture that embraced FORTRAN. The people who worked in data processing were not at home in the world of abstract symbols, nor were they usually mathematically savvy,65 as Grace Hopper, Howard Aiken’s collaborator on the Harvard Mark I machine of the 1940s remembered (see Chapter 5, Section VI), speaking in 1981 of her experience of the 1950s when she was employed by Remington-Rand (who had absorbed the Eckert-Mauchly Computer Corporation), the makers of the UNIVAC computers.66 In her words, they were “word manipulators.”

  Hopper knew well about FORTRAN from its earliest inception.67 She also knew about the algebraic translator Laning and Zierler had developed for the Whirlwind I.68 At Remington-Rand in 1953/1954, under her supervision, the A-2 compiler had been built to translate programs in algebraic form into UNIVAC machine code. It was almost inevitable that, involved as she was with business computing, Hopper would want to design a language/translator that met the needs of the business computing culture.

  In January 1955, Hopper and her colleagues at Remington-Rand wrote a “formal proposal for writing a data processing compiler.”69 As with Backus and his group the year before, it was not the language that dominated her thinking but the compiler. The language itself would comprise English-language words composed into sentences.70 The outcome was a compiler for a language called FLOW-MATIC,71 which was used by Remington-Rand customers.72

  FLOW-MATIC’s place in this story lies in that it was the first attempt to build a language/compiler system for business computing, and it had some influence on what became the most significant development in language design for this computing culture. This latter development began by way of meetings of a large group of “computer people” (including Grace Hopper) representing government, users, consultants, and computer manufacturers to decide on a “common business language for automatic digital computers.”73 The immediate outcome of these meetings was the formation, in June 1959, of a committee called the Committee on Data Systems Languages (CODASYL), together with subcommittees that would examine the development of a language for business computing.74 If the FORTRAN project manifested signs of Big Science (see Section VII, this chapter), here was Big Science in an altogether different sense: language design-by-committee.

  The resulting language, the first complete specification of which was produced in December 1959, was named COBOL (COmmon Business Oriented Language).75 Like FORTRAN, COBOL would evolve through several versions in its first dozen years of existence, as COBOL 60, COBOL 61, COBOL 61 EXTENDED, COBOL 65, COBOL 68, and COBOL 70. Like FORTRAN, COBOL formed a very distinctive language genus. Much as FORTRAN became the effective lingua franca of the scientific computer culture, so also COBOL emerged as the lingua franca of the business computing culture, at least in English-speaking parts of the world. As in the case of FORTRAN, standards would be established for COBOL by both national and international standardization bodies (specifically, the American National Standards Institute and the International Standards Organization).76

  However, unlike the case for FORTRAN, COBOL did not spawn other “species” of the genus. More markedly, unlike the case for FORTRAN, in the realm of computer science COBOL had the uneasiest of places. As Jean Sammet (1928–), one of the original designers of the first version of COBOL (COBOL 60) remarked wistfully in 1981, “most computer scientists are simply not interested in COBOL.”77

  Perhaps this was because computer scientists were far more interested in languages that could be used to program complex, interesting algorithms. And with some exceptions (such as sorting and searching large files of data) this was not really the stuff of business computing.

  XIII

  It seems fair to say that the designers of programming languages, circa 1956 and 1957, were not overly conscious of the description of programming languages as a scientific problem in its own right, or that it was a problem of practical importance. This situation changed with the advent of what we may call the Algol movement, and with some computer scientists’ encounters with the ideas of Noam Chomsky.

  XIV

  In summer 1958, a meeting was convened at ETH, the Swiss Federal Institute of Technology, Zurich, organized jointly by the ACM and the Gesellschaft für Angewandte Mathematik und Mechanik (GAMM); the former is an American society, the latter a German one. The objective was to discuss the design of an internationally accepted (“universal”) machine-independent programming language.78 Thus was launched a remarkable international scientific collaboration. It all began when, after the development of FORTRAN, groups of people on both sides of the Atlantic independently believed that, out of the potential Babelic profusion of tongues, a single tongue should emerge, a single language that would serve to communicate programs between humans as well as between humans and machines, an invented language that would be as abstract as mathematics and natural language are (to enable interhuman communication of computations), and as necessarily and practically close to the genre of real stored-program computers. People had been watching with unease how each new computer was spawning its own distinct programming language, or so it seemed to Alan Perlis (1922–1990), a mathematician-turned-computer scientist at the Carnegie Institute of Technology, Pittsburgh (later, Carnegie Mellon University), recalling in 1986 the situation a quarter century before.79 The time for action to counter this trend seemed at hand.

  Under the auspices of the ACM, a committee was struck in June 1957 “to study the matter of creating a universal programming language.”80 This committee spawned a subcommittee in 1958, comprising John Backus (IBM), William Turanski (Remington-Rand), Alan Perlis, and John McCarthy (MIT). Its mandate was to draft a language proposal for submission to the ACM.81

  There were those in Europe with similar thoughts. In 1955, at an international symposium on automatic computing held in Darmstadt, (then West) Germany, several participants spoke on the need for “unification” (a term surely in the minds of many in a land split into two nations), that is, for a single universal programming language.82 A working group was established as a subcommittee of GAMM to design such a language. In fall 1957, this subcommittee wrote to the president of the ACM, John W. Carr, III, suggesting that the two organizations collaborate in this common enterprise.83 Thus came about the ACM–GAMM meeting in Zurich in summer 1958. Each organization was represented by four people, including Backus and Perlis from America, and Friedrich Bauer (1924–) and Rutishauser from Europe.84

  The significance of this project was remarkable as an exercise in international cooperation. If the search for a universal is an ideal of science—of the natural sciences—then this project was whole-heartedly scientific in spirit. The members of this group were in the business of creating an artifact, no doubt, and that surely belongs to engineering; but, engineering designers are not usually interested in universals. They design and create to solve specific problems, and the solution is all, regardless of its generality. The programming “linguists” who gathered in Zurich in summer 1958 were, however, intent on the creation of a single universal language. Here was a gathering of “scientists of the artificial” (see Prologue, Section III) who aspired to the highest ideal of the natural sciences: universal principles, concepts, theory. The difference lay in that, in a natural science (such as physics), principles, laws, theories are deemed universal with respect to all physical time and space, whereas in the artificial science of language design its principles, syntax, and semantics would be considered universal with respect to the world of computation only. Perhaps in the emerging discipline of computer science, the only precedent to such a grand objective of universality was Turing’s design of his abstract, mathematical machine during the late 1930s (see Chapter 4).

  The international group identified more specific goals. They agreed that the new language should be “as close as possible to standard mathematical notation,” that it should be “readable with little further explanation,” that it should be used to describe numeric computations “in publications,” and that it should be “readily translatable into machine code by the machine itself.”85 Both human–human communication and human–machine communication were central to their enterprise. However, the group recognized that, in using the language for publication and for automatic compilation, there might be discrepancies “between the notation used in publication” and the “characters available on input–output mechanisms.”86 This meant that the language for the purpose of publication and, especially, the language for the purpose of computer execution each could vary from one environment to another. They decided that, in defining their language, such variability and potential vagaries of both print fonts and input/output “hardware” would be disregarded. Rather, they would attend to an abstract representation that they called a reference language from which appropriate publication and hardware languages might later be spawned.87

  So this yet-to-be-designed programming language would have three “levels” of description: as a reference language, as a publication language, and as a hardware language. The group of eight, however, would concern themselves only with the reference language.

  The result of this meeting of eight minds, “who met, reasoned about, argued over, insisted on, and ultimately compromised over the issues” during an eight-day period,88 was a language they named Algol (an acronym for ALGOrithmic Language). Later, this version of the language was called Algol 58.89

  Algol 58 had a very short life. A description of the language, edited by Perlis and Klaus Samelson (1918–1970), and coauthored by the entire group, was published in Communications of the ACM in 1958 and in the brand-new German journal Numerische Mathematik in 1959.90

  On the one hand, Algol 58 began to be used immediately as a publication language for presenting algorithms in journals; an IBM group attempted to implement it on the IBM 709.91 On the other hand, much as mathematicians pounce on new proofs of theorems to hunt out their flaws and possible errors, so also readers of the Algol 58 report found many definitions incomplete, contradictory, or inadequate for the description of numeric algorithms.92 Improvements and modifications were suggested from both sides of the Atlantic. In Europe, an ALGOL Bulletin was launched in 1959 under the editorship of Peter Naur (1928–), a Danish astronomer-turned-computer scientist who worked for the independent computer laboratory Regnecentralen in Copenhagen.93 This would be the medium for all matters Algol.

  Another international meeting on Algol was deemed necessary. In fact, there ensued more than one. Under the auspices of the United Nations Educational, Scientific and Cultural Organization (UNESCO), an international conference on “information processing” was held in Paris in June 1959. Quite apart from the Algol context, this was significant historically in that it marked the immediate prelude to the founding, in 1960, of the International Federation for Information Processing (IFIP), which became the official international umbrella organization for all individual national computing societies, the United Nations of computing, so to speak. The UNESCO Paris conference later came to be recognized as the first of the triennial IFIP Congresses, the official worldwide conference sponsored by IFIP.94

  The design of the “new” Algol was on the agenda of this conference. However, proposals for change were still forthcoming, so yet another conference was held in January 1960, also in Paris, and attended by a group of 13—from America, Britain, Denmark, France, West Germany, Holland, and Switzerland.95

  The new language proposed by this group was called Algol 60. A formal report on Algol 60 at the reference language level, edited by Naur and coauthored by all 13 members of the group, was quickly published, again in Communications of the ACM and Numerische Mathematik.96

  Once more, inconsistencies were found in the report. Many discussions ensued. Another Algol meeting was found necessary, this being held at the IFIP Congress of 1962 in Rome. There were several changes in the composition of the committee. The product of their deliberations was the Revised Report on the Algorithmic Language ALGOL 60, edited by Naur and coauthored by the same group of 13 who had authored the first Algol 60 report. The Revised Report was published, this time, not only in Communications of the ACM and Numerische Mathematik, but also in the British Computer Journal, which had been launched in 1957.97 It became the final, “official” definition of Algol 60.

  The Algol project, spanning roughly 1958 to 1962 and culminating in the publication of the Algol 60 Revised Report, manifested vividly the role of criticism in the development of scientific knowledge. Austrian-British philosopher of science Sir Karl Popper had characterized this feature of science by way of the schema

  P1 → TT → EE → P2.

  Here, P1 is the initial problem situation (or goal), TT is a tentative theory advanced as an explanation or solution for P1, EE is the process of error identification and elimination applied to TT relative to P1, and the outcome is a new problem situation (or revised goal) P2. The cycle renews itself until a problem situation gives rise to a tentative theory for which no error can be identified.98

  Popper’s schema was intended for the natural sciences—or rather, for any enterprise that was genuinely deemed to be called a science. We see in the succession of events in the Algol project an identical process at work in the realm of a science of the artificial.

  XV

  We get a sense of the nature of the Algol 60 programming style (and can compare it with that of FORTRAN) with a small example. The Algol 60 program reads in three positive integers from an input tape and computes their greatest common divisor (GCD). In the program, the name buffer refers always to the next input value on the tape. The computation of “GCD (first, second, third)” is done through a subroutine, called a procedure in Algol, that computes the GCD of two numbers at a time, that is, as “GCD (GCD (first, second), third).”

  The entire program is enclosed in an entity called block, which begins and ends with the “reserved” words begin and end respectively. (In Algol 60 and, indeed, all languages belonging to the Algol genus, reserved words were in bold type.) Five integer variables are declared inside the (outer) block, of which the first three serve to hold the three input integers and the last two are “internal” variables used for computation. Also declared within the outer block is a procedure named gcdoftwonumbers; it also comprises a block. The symbol := indicates assignment; inside the procedure, decision (or conditional) statements are specified: if … then … and if … then … else …. Also shown is the goto statement, which is an unconditional branch.
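
  The Algol 60 listing itself does not appear in this text. As a rough illustration only, the computation described above can be sketched in Python. This is not the original program: the use of Euclid's remainder method inside the two-number procedure is an assumption (the description does not say which method the procedure uses), the input tape is modeled as a plain Python sequence, and the names gcd_of_three and read_three_and_compute are hypothetical. Only gcd_of_two_numbers and buffer echo names mentioned in the description.

```python
from typing import Iterable


def gcd_of_two_numbers(a: int, b: int) -> int:
    """GCD of two positive integers.

    Assumption: Euclid's remainder method; the text does not specify
    how the Algol procedure gcdoftwonumbers computes its result.
    """
    while b != 0:
        a, b = b, a % b
    return a


def gcd_of_three(first: int, second: int, third: int) -> int:
    """Pairwise reduction: GCD(GCD(first, second), third)."""
    return gcd_of_two_numbers(gcd_of_two_numbers(first, second), third)


def read_three_and_compute(tape: Iterable[int]) -> int:
    """Read three integers from a (simulated) input tape and return their GCD.

    'buffer' stands in for the next unread value on the tape.
    """
    buffer = iter(tape)
    first = next(buffer)
    second = next(buffer)
    third = next(buffer)
    return gcd_of_three(first, second, third)


if __name__ == "__main__":
    print(read_three_and_compute([48, 36, 60]))
```

  The sketch keeps the two-level structure the description emphasizes, an outer routine that holds the three inputs and an inner procedure that handles two numbers at a time, but it uses a loop where the Algol version is said to rely on conditional statements and goto.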

 
