It Began with Babbage
Page 21
The point about necessity is worth noting. Concern for the preparation of a mathematical problem as a computation reach back to Ada Lovelace, as we have seen (see Chapter 2, Section VIII). However, now we see the emergence of a concern for a systematic notation for expressing computer programs. In Goldstine’s flow diagram, we see the beginnings of another part of the stored-program computing paradigm: what would later be called programming languages.
In their 1947 memorandum titled Planning and Coding Problems for an Electronic Computing Instrument, Goldstine and von Neumann articulated explicitly at the very beginning the complexity of their problem.12 A computer’s “control organ” does not just move from one instruction to the next in a linear fashion. If this was the case, there would be no problem. The complexity lay in that the control organ has to “jump” every now and then, forward or backward, and sometimes it has to repeat sequences of instructions. In other words, the control organ must execute branches and iterations. And because in the IAS computer (for which this document was written)13 the instructions in the Selectron-based main memory (see Chapter 8, Section XV) could be modified by other instructions, there was also the possibility that, in repeating a sequence of instructions, the instructions may have been altered.14
The intricacies of programming were being put on the table, and the future programmer of the IAS computer was forewarned by Goldstine and von Neumann that the actual sequence of instructions executed in real time may not coincide with the ordering of the instructions in the program text. The latter is static; the former, dynamic. The point they were making, that the relation of the actual computer program to the mathematical procedure being programmed was not simply a matter of translation, was an important early insight into the complexity of the programming activity. It was not simply a question of translating a piece of mathematical text into code.15
In the literary realm, as every thoughtful translator of literary texts written in one language into another has recognized, translation is a creative process, for it entails not just a linguistic conversion, but a mapping from one culture to another. Literary translation has often been referred to as transcreation to acknowledge the creative aspect of this process.16
The problem of “coding” (that is, programming) is also a problem of mapping from one culture to another—in the case of a solution to a mathematical problem, for example, mapping from mathematical culture to machine culture. Ever since Lovelace, those who have attempted to prepare problems for automatic computation have realized this, but it seems fair to say that Goldstine and von Neumann were probably the first to treat this problem of intercultural translation as a problem in its own right, and to bestow on it a huge measure of respect.
A program, a “coded instruction sequence,” itself seems a static entity—simply a sequence of instructions residing in specific locations of memory. This static appearance is misleading. What one must keep in mind is that a program “text” represents a process. And so the art of preparing a program should begin, as Goldstine and von Neumann advised, by planning the course of the process itself, its dynamic aspect, and then extract from this the actual instruction sequence as a “secondary” operation.17 Here, Goldstine and von Neumann were grappling with an entirely new intellectual experience, and the insight it led them to was that one should first master the process (a dynamic entity), then program text (a static entity) can be extracted from the process.
Thus, the planning of a program should begin by depicting graphically the movement of the “control organ” through the instructions held in memory as the program is being executed. This graphic depiction (“schematic”) is the flow diagram for the control organ.18
At the heart of why the program text does not mirror the process undertaken by the control organ is the phenomenon of repetition—the fact that the control organ may have to move several times through the same locations in memory to repeat a sequence of instructions.19 The authors gave a litany of examples from mathematical situations of this sort. A repetitive process—or “iterative” process—would be best “visualized as a loop.” Goldstine and von Neumann called it an inductive loop. And although a linear sequence of operations with no iteration within it would be represented by a directed line, iterative (or inductive) loops would be symbolized by directed lines that closed in on themselves.
And so the language of the flow diagram unfolds. New symbols are introduced to represent how or when the control organ enters into a process and when it exists—to signify alternative pathways within a process, to specify conditions that cause the control organ to select one or the other of the alternative pathways, to describe actual arithmetic and other numeric operations performed during the course of the process.
The actual (arithmetic) operations performed during the course of a computational process are described by expressions and the destinations of the expression values—in present-centered language, by means of assignment statements. However, Goldstine’s flow diagramming language also has something very interesting: “assertion boxes.”20
An assertion given in an assertion box inserted in a certain point in a flow diagram specifies certain relations are always satisfied whenever the control organ reaches that point.21 This concept of assertions, describing relations between variables, such as x + y = n, or x < z, that always hold at certain points in the process is a way of stating that, in the midst of all the changes that occur during the course of a process, certain relations always hold—rather as the ancient Greek philosopher Parminedes insisted: behind all the appearance of change there is an unchanging constancy. As we will see later, this concept of static assertions placed within the representations of a dynamic process such as a computer program will become one of the linchpins in the development of (in present-centered language) programming methodology. Goldstine and von Neumann anticipated this later development, although it does not seem that they influenced those developments. They articulated a programming methodology of their own: programs (or coding, in their language) begins with the construction of flow diagrams.22
Their flow diagramming notation had a syntax. There were rules and principles that would have to be followed in drawing a flow diagram, and the reader of such a diagram would have to be conversant with the syntactic principles. In this sense, their flow diagram was more than a notational system; it was a language. Having first drawn the flow diagram showing the overall flow of control (the “dynamic” or “macroscopic” picture, in their language), the programmer then proceeds to code the instructions for the individual tasks shown in the “operation boxes” within a flow diagram. They called this the “static” or “microscopic” stage of programming. The main virtue here was that one could work on the coding of the individual operation boxes independent of one another.23 In contemporary language, their programming methodology advocated a blend of the hierarchical (macroscopic/microscopic) and the modular (independence of the individual boxes).
Figure 9.1 gives a flavor of how a Goldstine–von Neumann flow diagram might look. The boxes marked # signify assertions; the other boxes specify operations. At two points in the flow diagram alternate pathways are shown, with the path chosen depending on whether the pertinent expression is positive or negative. Iterations, possibly nested in other iterations, are also shown in the diagram.
IV
If Preliminary Discussion of the Logical Design of an Electronic Computing Instrument by Burks and colleagues was a manuscript for the architecture of a stored-program computer, Planning and Coding Problems for an Electronic Computing Instrument was as surely a manifesto for a methodology of programming that computer. As we have seen, this methodology enfolded both the design of a language in which to describe programs as well as the process of translating a problem into a computation.
However, there was much more to programming and its methodology than this. At the time Goldstine and von Neumann wrote Planning and Coding Problems for an Electronic Computing Instrument, work was already underway in Cambridge in England
on the EDSAC. And Maurice Wilkes understood clearly that as soon as the EDSAC was working, the focus of interest must shift to the realm of programming. He understood that even the simplest task demanded of the EDSAC required a nontrivial program of some sort.24 And so, very early during the EDSAC project, he assembled a group of people interested in the problem of program development. A committee was assembled to create a library of subroutines on which users could draw without having to write them from scratch every time they were needed.25
FIGURE 9.1 The Goldstine-von Neumann Flow Diagram.
Thus, it was not enough to create a language for expressing a program in some human-comprehensible term; it was not enough to show how the description should be transformed into code for a real computer. “Higher level” building blocks had to be created to facilitate the ordinary user’s task of programming a stored-program computer. The subroutine was such a type of building block.
The idea of the subroutine had actually occurred to John Mauchly even before this time. In January 1947, he presented a paper titled Preparation of Problems for EDVAC-Type Machines at the Harvard symposium on digital computing machines organized by Howard Aiken (see Chapter 8, Section XVI). In this paper, Mauchly dwelled on a basic decision problem that all computer designers would face: to distinguish between operations that must be built into the machine (hardwired, in present-centered terms) and those that should be programmed.26 He made the point that this would depend on the frequency of operations and the ease with which the “nonbuilt” operations could be programmed on a given machine.
He then noted that some operations that are not “built-in” may still occur with sufficient frequency that, although it may not be justified to “build them into” the machine, one should not have to program them ab initio every time. Rather, he imagined magnetic tapes that would contain small programs that performed these operations, prepared once and for all, which could then be summoned into use whenever needed by different programmers for their respective programs. Mauchly used the term subroutines to signify such frequently used and usable programs. But, he warned, they must be sufficiently flexible to be genuinely general purpose.27 In fact, half of Mauchly’s paper was devoted to the topic of subroutines and how to implement them in a practical way.
That same year, a month after Mauchly’s presentation, in his lecture to the London Mathematical Society (see Chapter 8, Section VIII), Alan Turing also mentioned, although far more briefly, the necessity of what he called “subsidiary tables”—essentially, subroutines.28
There is also anecdotal evidence, recalled many years later, that Grace Murray Hopper, who had worked with Aiken on the development of the Harvard Mark I and coauthored a three-part paper on that machine (see Chapter 5, Sections V and VI), was aware of the subroutine concept at that time (the mid 1940s). Recollecting her experiences in a keynote address given at a conference on the history of programming languages in June 1978, she spoke of writing subroutines, although they were not called as such, but rather thought of simply as “pieces of code.”29
Thus the concept of the subroutine was much “in the air” in the mid to late 1940s. The question of who “invented” the subroutine, like the larger question of who “invented” the stored-program computer, is highly problematic, for it depends on what we mean by invention.
If we were to signify by “invention” the conceptualization of an idea, then Mauchly should probably have the credit. If Hopper’s recollection was correct, then the term “subroutine” existed as early as 1944. Mauchly, however, clearly conceived the notion of subroutines existing as entities (on magnetic tape) independent of the physical computer but summoned to use on and by the physical computer when required. Subroutines are programs that would become building blocks in the creation of larger programs. Here was an idea of the autonomy of a computer program—an artifact at once apart from the physical computer but with a usefulness that depended on the physical computer—a realization of the liminality of programs (see Prologue, Section III).
On the other hand, if invention in the world of useful artifacts entails not just the conception or ideation but the construction of the first artifacts that carry the idea into a complete, demonstrable form, then it would seem that David Wheeler in the Cambridge University Mathematical Laboratory was the “real” inventor of the subroutine as a class of liminal computational artifacts.
V
Wheeler, as a student of Trinity College, Cambridge, read for the mathematics tripos and obtained his degree in 1948.30 He joined the EDSAC project as a research student immediately after graduation.31 When the time came for the EDSAC group to think about how the machine should be used, Wilkes suggested to Wheeler that he should “imagine” that the machine was completed and investigate how it should be used.32
This kind of “imagining” could never have happened in the days when an artifact’s conceptual design existed entirely in the mind of a single creator, and the creator made the artifact based on his private mental concept, had computer designs still been at the “craft stage” (see Chapter 2, Section VIII).33 Wheeler’s “imagining” was possible because the EDSAC existed, if not materially, then in abstract form as a design that was shared by members of the EDSAC team. In present-centered language, this design was the EDSAC architecture—a liminal artifact, abstract on the one hand, yet its physical realization sufficiently well understood that Wheeler could “imagine” the workings of the physical computer as if it had been completed and in use.
Wheeler’s response was twofold. The first was the development and use of subroutines and the creation of a subroutine library. At the Cambridge conference in June 1949, Wheeler presented a paper titled Planning the Use of a Paper Library and began by noting that “different problems have many parts in common.”34 He went on to say that these “parts” may be prepared “once and for all” but—echoing Mauchly—they must be sufficiently flexible so that they can be adapted to differing problem contexts, thus the prospect of a library of subroutines.35
Then, after defining “programme” and “routine,” he explained that a subroutine “is the coded form of an element in a routine (e.g., square root, input and binary decimal conversion, etc.)” (see Section I, this chapter). He went on to state the advantages of a subroutine and a library of such entities: simplification of the task of preparing problems for computation, facilitating the understanding of routines by other users, and, if the subroutines are correct then the chances of errors in a routine as a whole are considerably reduced. These advantages originate in the fact that subroutines are larger functional “units” in a program than the order codes themselves.
Wheeler did not just conceive an idea, however; he actually built subroutines and demonstrated how they could be used. He identified certain problems engendered by the subroutine concept and proposed solutions to them.
One of the problems was that instructions (orders) within a subroutine refer to memory locations (addresses) of the “data” (operands, as they were being called by 195136) relevant to that subroutine only. This meant that the subroutine would have to be placed in a fixed set of locations in the EDSAC memory. It would always have to be placed in that same set of locations. Wheeler proposed, instead, a more flexible solution that allowed for the “relocatability” of a subroutine, which involved putting the subroutine on the tape in such a way that it deferred fixing its location in memory until when it would be input into the machine. The latter task, in turn, would necessitate a “special” subroutine.37
Another problem was to make a subroutine for a particular kind of task sufficiently general by using parameters that could be set to specific values for each instance of the subroutine’s use. These parameters could be “external”—set to values at the beginning of the subroutine’s use, which would remain fixed throughout the execution of the subroutine—or “internal,” meaning that their values could vary during the solution of the problem.38
A third problem was where and how a subroutine should be invoked for
execution. Here, Wheeler distinguished between two types of subroutines. In the first, and simpler, situation, control is transferred to the beginning order of the subroutine and, when the latter ends, execution control simply flows to the order located immediately after the subroutine in memory. Wheeler termed such a subroutine “open.”39 But, this was not very flexible, for if the subroutine was required to be used at several points in the main program, as many copies of the subroutine would have to be inserted in the appropriate places. Instead, Wheeler conceived the closed subroutine, which would be called from some location within a program and, on its execution, control would be transferred to the point in the “calling” program immediately after the location from when the subroutine was called.40
So, a closed subroutine can be placed anywhere in the EDSAC memory, outside the main program. When it was needed to be called from a point in the calling (main) program—say, from address n—control “jumps” to the first instruction in the subroutine. At the end of the subroutine’s execution, control “jumps back” to address n + 1. This mechanism came to be called the Wheeler jump (Figure 9.2).
VI
Wheeler’s other major invention was as consequential as that of the subroutine. To understand the nature of this second contribution, let us follow his own words:
It is inconvenient to write our orders in their binary form, and so we shall write them in the following form. The first five binary digits [within an order, called the function digits, which specify the function or operation to be performed] will be represented by the capital letter that represents these digits on the keyboard perforator. The next 11 binary digits [signifying the operand in memory] will be represented by an integer n in the normal decimal form. This integer is the address referred to in the order.… The last digit [signifying the length of the operand, “long” (binary 1) or “short” (binary 0)] will be represented by D if … [long] and if … [short] it will be represented by F.41