Programming, for Dijkstra, was a human activity; and as a human, he had to live with his cognitive limits.8 Like Herbert Simon, who recognized the cognitive limits to human rationality (see Chapter 14, Section III), Dijkstra understood the limits to the human ability to cope with complexity. Simon circumvented bounded rationality by way of heuristic problem solving. Dijkstra sought ways out of the messiness of the programming habits of his time. One way was to discipline the use of certain types of statements in programming languages—in particular, the goto statement in Algol-like languages. In 1968, he wrote a letter to the editor of the Communications of the ACM, published under the title “Go To Statement Considered Harmful,” in which he proposed that the goto statement be simply abolished from all programming languages, for it was an open invitation to make one’s program unnecessarily messy, and thus unnecessarily complex. The goto statement, he pointed out, was entirely avoidable.9 This letter proved astonishingly controversial; many thought it posed a threat to their very freedom as programmers, a restriction, they believed, on their “first amendment” rights.
As for large programs such as OS/360 and Multics—he was writing in the first age of massive operating systems (see Chapter 15, Section XIII)—their complexity had to be controlled, or rather mastered, in some fashion. The solution lay in the Latin adage divide et impera, divide and rule.10
But how should one divide to rule successfully? If a program is to be decomposed into many parts—or is to be composed out of many parts—then decomposition/composition of the parts must be done so that there is a harmonious fit between the parts to achieve a unity of the whole.
There was a corollary to this. Conventional wisdom at the time held that program debugging—weeding out errors (bugs) from a program already written—was a standard operating procedure. Dijkstra had a horror of debugging. Holding to the precept that “prevention is better than cure,” he believed that program development should prevent nasty bugs from entering a program at all, from the moment its construction begins.11
III
So how does one divide and rule? Dijkstra laid out the fundamentals of the gospel that became structured programming in his 1965 paper.
The programmer first identifies and completely specifies the functional requirements of the individual program components—that is, what each component is required to do. He then demonstrates, rigorously, to his own satisfaction, that if the components are put together properly, then the program will solve the intended problem. He then implements the individual components independently so that their respective functional requirements are met.12 However, an individual component of the total program may itself be of such complexity that it, in turn, may need to be decomposed into subparts using the same procedure just described.13
This, then, was the basic manifesto of structured programming, and Dijkstra used this term for the first time in 1969 at the NATO conference on software engineering.14 Later, other terms would also be used to refer to this approach: program development by stepwise refinement and top-down programming.
The division into parts comes with responsibilities. One must ensure that the specifications of the components are such that, collectively, they do the job.15 This is where aesthetics—elegance and clarity—enter the picture.16 The insides of the components are (not yet) of concern. Indeed, they must not be of concern; they may not even have been constructed. The specifications describe the functional behavior of the parts, and the initial task of verification must be concerned only with these functional specifications and their “fitting together” into the desired whole, so that, quite independently of their internal details, the components, when put together, do not interfere with one another. Thus, one can guarantee the correctness of the whole program.17
Only when this has been achieved does the programmer proceed to the implementation of the components. This implementation, in effect the component’s internal design and construction, must satisfy its exterior, functional specification. If the component is still too complex (for the “small” human brain to cope with), then it, too, must be divided into subcomponents, their functional specifications established, and their correct working together verified before the interiors of the subcomponents are constructed, and so on.
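To make the procedure concrete, here is a minimal sketch (ours, in a modern language Dijkstra did not use, with invented names) of a program first composed out of components on the strength of their specifications alone, each component then implemented, and if necessary further decomposed, afterward:

# Hypothetical top-level composition (all names invented): its correctness is
# argued from the components' specifications alone, before their interiors exist.
def sort_report(records):
    cleaned = remove_invalid(records)   # spec: keep only well-formed records
    ordered = sort_by_key(cleaned)      # spec: same records, ordered by key
    return format_lines(ordered)        # spec: one printable line per record

# Each component is then implemented independently so as to meet its own
# specification; a component that is still too complex is decomposed again.
def remove_invalid(records):
    return [r for r in records if "key" in r]

def sort_by_key(records):
    return sorted(records, key=lambda r: r["key"])

def format_lines(records):
    return [str(r["key"]) + ": " + str(r.get("value", "")) for r in records]

print(sort_report([{"key": 2, "value": "b"}, {"key": 1, "value": "a"}, {"bad": 0}]))

The correctness of sort_report can be argued from the three specifications before a single line of the components’ interiors exists; only then are those interiors written and checked against their own specifications.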
IV
In 1968, Dijkstra published a paper on a multiprogramming system he and a small team of colleagues were then designing and building at THE.18 There was no reference to his “Programming Considered as a Human Activity” paper. In fact, there were no references at all, an idiosyncrasy of his writing more reminiscent of a much earlier era of scientific publication than of his own time. But the paper gives us a sense of how the principle of divide et impera was applied in the design of a real operating system. This was not structured programming in the way described earlier; it was not “top down.” Rather, it was an exercise in what, in present-centered language, would be called bottom-up, hierarchical structured design. However, the imperative to produce a verifiably correct system as part of the design process was still in evidence. Just as significantly, Dijkstra claimed that the method presented in the paper could demonstrate the logical correctness (“soundness”) of the multiprogramming system at the design stage itself and thus make its empirical testing much easier after implementation.19
At the time the paper was written, the system had not been completed. However, Dijkstra claimed, the rigor with which the system design was being proved correct would greatly assuage any fear that the system, when implemented and put into operation, would fail.20 We will see later how programs could be proved correct. What especially interests us about the THE multiprogramming system, however, is how the credo of divide and rule was realized in the design of a complex piece of software. In fact, divide and rule was applied in two different ways. First, Dijkstra conceived the whole computing system logically as “a society of sequential processes.”21 Each distinct user program corresponded to a sequential process; each input and output device (in operation) constituted a sequential process.22
A sequential process (not to be confused with a sequential circuit or machine [see Chapter 15, Sections II–IV]) is a procedure or a program in a state of execution on its own “virtual” (or abstract) processor; it is a serial process. Sequential processes are abstract. Logically speaking, each process goes about “doing its own thing,” independent of other processes, except when they need to communicate with one another. A “society” of sequential processes logically operates concurrently, each on its own virtual processor. Because the processes are abstract, there is no sense of physical time; rather, there is a sense of abstract time, so that each process executes at its own abstract speed. So when two processes P1 and P2, working concurrently, need to communicate—for example, P1 sends a message to P2, which must receive it—they must be synchronized at the time communication is made. Such communication would involve P1 writing its message into a storage area (a “buffer”) and P2 reading this message. So P1’s and P2’s accesses to the buffer must not interfere with each other. This is the “mutual exclusion problem” involving shared resources that Dijkstra first studied and solved (by inventing a fundamental mechanism he called semaphores).23
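A minimal sketch (ours, in Python, purely for illustration; the process names and messages are invented) of two processes synchronizing on a shared one-slot buffer with semaphores, in the spirit of Dijkstra’s P (wait) and V (signal) operations:

import threading

buffer = None
empty = threading.Semaphore(1)   # "the buffer may be written"; initially 1
filled = threading.Semaphore(0)  # "the buffer may be read"; initially 0

def P1():                        # sender: writes messages into the shared buffer
    global buffer
    for msg in ["hello", "world"]:
        empty.acquire()          # P(empty): wait until the buffer is free
        buffer = msg
        filled.release()         # V(filled): signal that a message is ready

def P2():                        # receiver: reads each message exactly once
    for _ in range(2):
        filled.acquire()         # P(filled): wait until a message is present
        print(buffer)
        empty.release()          # V(empty): free the buffer for the next message

t1 = threading.Thread(target=P1)
t2 = threading.Thread(target=P2)
t1.start(); t2.start()
t1.join(); t2.join()

Because the two semaphores force the writer and the reader to alternate, P1 and P2 can never access the buffer at the same time, which is exactly the mutual exclusion the shared buffer requires.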
The abstractness of sequential processes was important, for the processes in such a “society” live and work harmoniously and cooperatively with one another quite independently of the physical processors (material artifacts) that actually run them.24 In fact, the THE system was, in part, a realization of a theory of “cooperating sequential processes” Dijkstra had developed in 1965.25
The organization of the whole computing system into a “society” of abstract, independent but cooperating sequential processes was one aspect of the divide-and-rule strategy used in building the THE multiprogramming system. The processes were also structured hierarchically into multiple levels. Each level comprised one or more sequential processes that created abstractions out of
more basic, more concrete processes or resources. Thus, the lowest level, level 0, abstracted the physical processors. Above this level, the number of processors actually shared among the processes was rendered invisible;26 processors, so to speak, lost their individual identity.27 At the next level, level 1, processes abstracted physical memory into segments (in the virtual memory sense [see Chapter 15, Section XII]). At level 2, the sharing of the operator’s console was abstracted away: above this level, each process had its own “console” with which to hold “conversations” with the operator. Virtual consoles were created; the fact that there may have been only one (shared) physical console “disappeared from the picture.”28 At level 3, sequential processes concerned with buffering input streams and unbuffering output streams were realized; virtual input/output communication units were created. Finally, at level 4, the user programs resided.29
Decomposing and organizing the computing system—dividing and ruling—in these two ways produced what Dijkstra would later call a “hierarchical ordering of sequential processes.”30 Structuring a programming system hierarchically in this bottom-up fashion, one level at a time, has great aesthetic appeal; it is an elegant way of managing the complexity of a large system. After the sequential processes at level i have been designed, implemented, and proved correct, one need not worry about them; they become reliable components for building the sequential processes at level i + 1.
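The flavor of this layering can be suggested by a rough sketch (ours, in Python, not drawn from the THE system itself), in which each level is written, and can be checked, purely in terms of the interface offered by the level beneath it:

# A rough sketch of bottom-up layering: each level uses only the interface
# of the level below, never its internals.
class Level0:
    """Hides the physical processors: callers only ask for work to be run."""
    def run(self, work):
        return work()          # real scheduling across processors hidden here

class Level1:
    """Builds segmented (virtual) memory, using only Level0's interface."""
    def __init__(self, level0):
        self.level0 = level0
        self.segments = {}
    def store(self, segment, value):
        self.level0.run(lambda: self.segments.update({segment: value}))
    def load(self, segment):
        return self.level0.run(lambda: self.segments.get(segment))

# Once level i is trusted, level i + 1 treats it as a reliable component.
memory = Level1(Level0())
memory.store("s0", 42)
print(memory.load("s0"))       # -> 42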
V
But how does one prove that an individual program (or a sequential process) is correct? And why is it important to prove correctness as opposed to the usual method in programming of testing a program experimentally on a computer?
Dijkstra raised this latter issue in his 1969 paper, “Structured Programming,” presented at the second NATO conference on software engineering. Proving program correctness was preferable to program testing because the latter had an inherent limitation: although empirical testing of a program can reveal the presence of errors, it can never prove their absence.31 This assertion became a much-quoted aphorism, a mantra, in later times. The logic of the statement was impeccable. We do not know whether Dijkstra had read the writings of the philosopher of science Sir Karl Popper (1902–1994),32 but there was complete resonance between Dijkstra’s aphorism and Popper’s assertion that no amount of empirical evidence can ever prove the truth of a scientific theory, whereas just one piece of counterevidence is sufficient to falsify it. Thus, the “falsifiability” of a scientific theory—demonstrating an error in the theory—is analogous to demonstrating the presence of a bug in a program; demonstrating the absence of error in a scientific theory is the analog of demonstrating the absence of a bug in a program. Neither can be shown by empirical (“inductive”) means—experiment or observation.
The kinds of science with which Popper was concerned were the natural sciences—physics, chemistry, biology, and the like. In the natural sciences, the object of interest being “natural,” one has to perform experiments or make observations; a theory in physics or in biology has to be tested against reality. Popper’s point was that one must test a scientific theory—that is unavoidable—but not in order to prove it (which is logically impossible); rather, one tests it in order to falsify or refute it.
But programs—software—are artifacts; they are not discovered, but invented. Moreover, programs are symbolic artifacts in the same sense that mathematical theorems are symbolic artifacts. Computer science is, after all, a science of the artificial, not a science of the natural (see Chapter 1). Thus, there is a way out for programs that does not exist for a theory in a natural science. One can formally prove the correctness of a program in the same way one can formally prove the correctness of a theorem. Indeed, in this view, a program is a theorem. It says that “if this procedure is followed, then such and such a function will be computed.” And so, as in mathematics and logic, one can apply the axiomatic approach.
We have already seen the axiomatic method “in action” in the realm of computing in Turing’s work on the Entscheidungsproblem (see Chapter 4) and in Allen Newell’s and Herbert Simon’s work on the LT, or Logic Theorist (see Chapter 14, Section V). To prove a theorem in mathematics or logic, one has a set of basic definitions and axioms, a set of rules of inference, and a body of already proved theorems. One then applies the rules of inference to appropriate axioms, definitions, and prior theorems to construct a logically rigorous chain of inferences whose end product is the theorem of interest; the chain of reasoning is the proof.
A similar approach can apply—indeed, for Dijkstra, must apply—in the case of programs. Thus, a new arena in computer science within the subparadigm of programming methodology was born: formal program verification.
VI
Dijkstra was by no means the first to think such thoughts. In 1949, at the Cambridge conference on automatic calculating machines where the EDSAC was first demonstrated publicly (see Chapter 8, Section XVI), the redoubtable Alan Turing presented a two-page paper in which he anticipated the germinal ideas underlying formal program verification.33
Turing pointed out that “checking”—his word—whether a program is correct can be greatly facilitated if the programmer states assertions that are expected to be true at certain points in the program. For example (using Algol-like notation), immediately after the execution of the assignment statement
x := y + z
the assertion
x = y + z
will always be true. Or, following the execution of the statement
if x ≤ 0 then x := 0 else x := x + 1
the assertion
x ≥ 0
will always be true, regardless of which of the two assignments is executed.
These assertions, Turing noted, can be such that after they are checked individually to be correct, the correctness of the whole program follows. We are reminded here of the “assertion boxes” Herman Goldstine and John von Neumann created as part of the flow diagram notation they invented for specifying algorithms (see Chapter 9, Section III).
Turing also made the distinction between what would later be called “partial” and “total” correctness. Partial correctness is the correctness of a program assuming it terminates—that is, it comes to a stop. Total correctness is concerned with proving that the program does terminate. Recall that finiteness or termination is a defining characteristic of an algorithm (see Chapter 15, Section VIII).
Turing illustrated his ideas by way of an example. The problem he used was an algorithm (“a routine,” in his words) to compute the factorial of a number n (denoted n!) without using a multiply operation, with multiplication being carried out by repeated additions. The algorithm was represented by a flowchart.
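Turing’s routine can be paraphrased (in Python rather than his flowchart notation; the assertions shown are in his spirit, not his exact ones) as a factorial computed without a multiply operation, multiplication being replaced by repeated addition, with assertions checked at intermediate points:

def factorial(n):
    assert n >= 0                  # assertion expected to hold on entry
    fact = 1                       # assertion: fact = 1 = 1! (and 0! by convention)
    for k in range(2, n + 1):
        total = 0
        for _ in range(k):         # fact * k computed by k repeated additions
            total += fact
        assert total == fact * k   # assertion checked at this point of the routine
        fact = total               # assertion: fact now equals k!
    return fact

assert factorial(5) == 120

Checking each such assertion individually establishes partial correctness; total correctness additionally requires the argument that both loops terminate, which here follows because each runs a fixed, finite number of times.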
Sometimes an idea is too far ahead of its time. Turing’s paper apparently made no waves when it appeared. It died a quiet death and lay forgotten until well after formal program verification became an established subparadigm; the paper was “discovered” almost 20 years afterward.34
VII
It is quite possible to follow Dijkstra’s gospel of structured programming without actually proving the correctness of the resulting program. Structured programming represented both a mentality and a method for managing the intellectual burden of developing computer programs. One can follow the divide-and-rule philosophy, proceeding from the higher level specification of the program as a whole through decomposition into parts and subparts in a top-down, hierarchical fashion, without formal proofs along the way. However, the jewel in the crown of structured programming was formal verification. For this, Dijkstra’s preference was the axiomatic approach of mathematics. Herein lay the aesthetics.
It is one thing to aspire to prove programs correct as one proves theorems. It is another thing to achieve this aspiration. To understand this issue, we recall the fundamentals of Dijkstra’s gospel. The function of an individual program must first be specified rigorously in some formal language, such as a language of logic—for example, the “predicate calculus.” Let us denote this specification by S; it describes the intended behavior of the program (yet to be written). The program itself must then be described in some appropriate programming language. Let us denote this program by P. The process of formal proof involves showing, by logical argument, that P does indeed behave in the manner prescribed by S. But this requires the availability of a precisely defined semantics of the language in which P has been written. Let us denote this semantics by σ. Verification thus requires the ménage à trois <S, P, σ>.
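By way of a hypothetical illustration (ours, not Dijkstra’s), suppose P is intended to compute the factorial of a number, as in Turing’s example of the previous section. Written in the style of a precondition and a postcondition (one common way of expressing such specifications), the three ingredients might look as follows:

S (the specification): if n ≥ 0 holds before P is executed, then r = n! = 1 × 2 × ... × n holds when P terminates.
P (the program): a routine, in some Algol-like language, that computes r by a loop.
σ (the semantics): precise rules stating what each statement form of that language (assignment, conditional, loop) does to the values of the program’s variables.

Formal verification then consists in showing, using σ, that executing P from any state satisfying the precondition of S yields a state satisfying its postcondition.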
As we have seen, although the syntax of Algol 60 was defined formally by a context-free grammar expressed in the meta-language BNF (see Chapter 13, Section XVI), its semantics was in English. A much more formal and unambiguous meta-language than English (or any other natural language) would have to be used for the purpose of formal verification.
VIII
In fact, inventing a meta-language for a formal semantics of programming languages was very much in the air during the 1960s, quite apart from the needs of structured programming. After all, the user of a programming language would need to know the precise meaning of the various (precisely defined) syntactic units in the language, such as declarations of variables, assignment statements, conditional (if then) statements, repetition (for) statements, procedures (proc), program blocks (begin end statements), and so on. Furthermore, the compiler writer needed to know the precise meaning of these syntactic units in order to translate a program in that language correctly into object code for the target machine.
One approach, originating in the ideas of John McCarthy (the inventor of LISP [see Chapter 14, Section IX]) in 1963 and Peter Landin (1930–2009) in 1964, associated with each statement type in a programming language an interpretation of how that statement would be executed on a standardized (“canonical”) abstract machine.35
This approach to defining semantics came to be called operational semantics, because meanings of syntactic units were defined by showing the sequence of operations that would be performed on the standardized abstract machine to execute or realize that syntactic element. The idea was a kind of formalization of the informal semantics described in English, but with respect to a formal and abstract computer.
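To give the flavor of the idea, here is a small sketch (ours, far simpler than McCarthy’s or Landin’s formulations) in which the “meaning” of each statement form is the transformation it performs on the state of an abstract machine, modeled here as a mapping from variable names to values:

# The abstract machine's state is a mapping from variable names to values; the
# "meaning" of each statement form is the state transformation it performs.
def run(stmt, state):
    kind = stmt[0]
    if kind == "assign":                  # ("assign", name, expr): name := expr
        _, name, expr = stmt
        state[name] = evaluate(expr, state)
    elif kind == "seq":                   # ("seq", s1, s2): do s1, then s2
        run(stmt[1], state)
        run(stmt[2], state)
    elif kind == "if":                    # ("if", cond, s_then, s_else)
        run(stmt[2] if evaluate(stmt[1], state) else stmt[3], state)
    return state

def evaluate(expr, state):
    if isinstance(expr, int):             # a constant
        return expr
    if isinstance(expr, str):             # a variable: look it up in the state
        return state[expr]
    op, a, b = expr                       # ("+", e1, e2) or ("-", e1, e2)
    x, y = evaluate(a, state), evaluate(b, state)
    return x + y if op == "+" else x - y

# The meaning of  x := y + z,  starting from y = 2, z = 3:
print(run(("assign", "x", ("+", "y", "z")), {"y": 2, "z": 3}))

Running the representation of x := y + z from a state in which y = 2 and z = 3 yields a state in which x = 5; the sequence of operations performed in doing so is, precisely, the operational meaning of the statement.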