It Began with Babbage
Page 33
105. N. Chomsky. (1957). Syntactic structures. The Hague: Mouton.
106. J. Lyons. (1970). Chomsky (p. 9). London: Fontana/Collins.
107. Chomsky, 1957, op cit., p. 13.
108. Chomsky, 1956, op cit., p. 114.
109. Ibid.
110. Ibid.
111. Chomsky, 1956, op cit., pp. 116–118; Chomsky, 1957, op cit., pp. 26–33.
112. Chomsky, 1957, op cit., p. 29.
113. N. Chomsky. (1959). On certain formal properties of grammar. Information & Control, 2, 136–167.
114. Rutihauser, op cit., p. 274.
115. See, for example, S. A. Greibach. (1966). The unsolvability of the recognition of linear context free languages. Journal of the Association for Computing Machinery, 13, 582–587; S. Ginsburgh. (1966). The mathematical theory of context free languages. New York: McGraw-Hill.
116. R. W. Floyd. (1963). Syntax analysis and operator precedence. Journal of the Association for Computing Machinery, 10; T. Kasami. (1965). An efficient recognition and syntax analysis algorithm for context free languages. Scientific report no. AFCRL-65–758. Bedford, MA: Air Force Cambridge Research Laboratory; P. Z. Ingermann. (1966). A syntax-oriented translator. New York: Academic Press; D. H. Younger. (1967). Recognition and parsing of context-free languages in time n2. Information & Control, 10, 181–208; J. Earley. (1968). An efficient context-free parsing algorithm. PhD dissertation, Carnegie-Mellon University.
117. Rutihauser, op cit., p. 275.
118. O.- J. Dahl & K. Nygaard. (1966). SIMULA: An Algol-based simulation language. Communications of the ACM, 9, 671–682.
119. A. van Wijngaarden (ed.), B. J. Mailloux, J. E. L. Peck, C. H. A. Koster. (1969). Report on the algorithmic language ALGOL 68. Numerische Mathematik, 14, 79–218.
120. A. van Wijngaarden, B. J. Mailloux, J. E. L. Peck, C. H. A. Koster, M. Sintsoff, C. H. Lindsay, L. G. L. T. Meerttens, & R. G. Fisker. (1975). Revised report on the algorithmic language ALGOL 68. Acta Informatica, 5, 1–234.
121. J. E. L. Peck. (Ed.). (1971). ALGOL 68 implementation. Amsterdam: North-Holland.
122. J. T. Bonner. (1988). The evolution of complexity by means of natural selection. Princeton, NJ: Princeton University Press. See, however, D. W. McShea. (1997). Complexity in evolution: A skeptical assessment. Philosophica, 59, 79–112, for an alternative view of the relationship between biological evolution and the evolution of complexity.
123. S. Dasgupta. (1997). Technology and complexity. Philosophica, 59, 113–140.
124. N. Wirth. (1971). The programming language PASCAL. Acta Informatica, 1, 35–63.
125. B. Randell & L. J. Russell. (1964). Algol 60 implementation. New York: Academic Press; E. I. Organick. (1973). Computer systems organization: The B5700/6700 series. New York: Academic Press; R. W. Doran. (1979). Computer architecture: A structured approach. New York: Academic Press.
126. K.. E. Iverson. (1981). Transcript of presentation. In Wexelblat (pp. 674–682), op cit., p. 682.
127. A.. D. Falkoff & K. E. Iverson. (1981). The evolution of APL. In Wexelblat (pp. 661–674), op cit., p. 663.
128. In January 1972, freshly arrived as a graduate student in computer science at the University of Alberta, Edmonton, Canada, I recall attending a talk by Iverson delivered at the College of Education.
129. K. E. Iverson. (1980). Notation as a tool of thought (Turing Award lecture). Communications of the ACM, 23, 444–465.
130. K. E. Iverson. (1962). A programming language. New York: Wiley.
131. “Devotees” is the right word. APL commanded a fierce and loyal following the likes of which was rare during the first decade of programming languages.
132. Iverson, 1981, op cit., p. 675.
133. F. P. Brooks, Jr. & K. E. Iverson. (1969). Automatic data processing: System/360 edition. New York: Wiley.
134. A. D. Falkoff, K. E. Iverson, & E. H. Sussenguth. (1964). A formal description of System/360. IBM Systems Journal, 3, 191–262.
135. For a survey of this genus as it had developed from the mid 1960s through its first two decades, see S. Dasgupta. (1982). Computer design and description languages. In M. C. Yovits (Ed.), Advances in computers (Vol. 21, pp. 91–155). New York: Academic Press.
136. Falkoff & Iverson, op cit., p. 664.
137. Ibid., p. 665.
138. Ibid., p. 666.
139. A. D. Falkoff & K. E. Iverson. (1966). APL360. White Plains, NY: IBM Corporation; A. D. Falkoff & K. E. Iverson. (1968). APL360 user’s manual. White Plains, NY: IBM Corporation.
14
Going Heuristic
I
LET US REWIND the historical tape to 1945, the year in which John von Neumann wrote his celebrated report on the EDVAC (see Chapter 9). That same year, George Polya (1887–1985), a professor of mathematics at Stanford University and, like von Neumann, a Hungarian-American, published a slender book bearing the title How to Solve It.1
Polya’s aim in writing this book was to demonstrate how mathematical problems are really solved. The book focused on the kinds of reasoning that go into making discoveries in mathematics—not just “great” discoveries by “great” mathematicians, but the kind a high school mathematics student might make in solving back-of-the-chapter problems.
Polya pointed out that, although a mathematical subject such as Euclidean geometry might seem a rigorous, systematic, deductive science, it is also experimental or inductive.2 By this he meant that solving mathematical problems involves the same kinds of mental strategies—trial and error, informed guesswork, analogizing, divide and conquer—that attend the empirical or “inductive” sciences. Mathematical problem solving, Polya insisted, involves the use of heuristics—an Anglicization of the Greek heurisko—meaning, to find. Heuristics, as an adjective, means “serving to discover.”3
We are often forced to deploy heuristic reasoning when we have no other options. Heuristic reasoning would not be necessary if we have algorithms to solve our problems; heuristics are summoned in the absence of algorithms. And so we seek analogies between the problem at hand and other, more familiar, situations and use the analogy as a guide to solve our problem, or we split a problem into simpler subproblems in the hope this makes the overall task easier, or we summon experience to bear on the problem and apply actions we had taken before with the reasonable expectation that it may help solve the problem, or we apply rules of thumb that have worked before.
The point of heuristics, however, is that they offer promises of solution to certain kinds of problems but there are no guarantees of success. As Polya said, heuristic thinking is never considered as final, but rather is provisional or plausible.4
Problem solving, discovering, inventing—great and small, by scientists, inventors, artists, professional practitioners (such as doctors, engineers, designers, architects) or by students or, indeed, by people in the course of their everyday life—have always involved the use of heuristics, although perhaps most people are never conscious of it as a particular method of reasoning, or that it has a name. Like Monsieur Jourdain in the play Le Bourgeois Gentilhomme (1670) by the French playwright Molière (1622–1673) who suddenly learned that he had been speaking prose all his life without knowing it, Polya brought to the mathematical reader’s attention that he had been using heuristic reasoning all his mathematical life without knowing it.
II
An undergraduate physics major at Stanford University, where Polya taught, took some of his courses. Through them he came to know the ideas in How to Solve It. The undergraduate’s name was Allen Newell (1927–1992). By way of Polya, he was introduced to the concept of heuristics.5
In 1952, while working as an applied mathematician at the RAND Corporation, the California-based think tank, Newell met a social scientist by the name of Herbert Simon (1916–2001).6 Simon was 36 at the time. By then, he had already established a formidable reputation not only as a social scientist, but also as a polymath: later, the term “Renaissance man” would be applied to him.7
&nbs
p; Simon’s dissertation, for which the University of Chicago awarded him a PhD in political science in 1942, became, appropriately revised, a book titled Administrative Behavior (1947).8 That same year, he published his first article on the foundations of Newtonian mechanics in the august Philosophical Magazine.9 He coauthored another book on Public Administration (1950).10 During the 1940s, while an assistant professor of political science at the Illinois Institute of Technology in Chicago, he began publishing papers in leading economics journals such as the Quarterly Journal of Economics and Econometrica, and established a close association with the Cowles Commission for Research in Economics, a collection of some of the most eminent economists of the post-World War II era, several of whom were future Nobel laureates as, indeed, Simon himself was.11 He published an article in 1952 on causality in the Journal of Philosophy.12 By 1950—he was then a full professor in the Graduate School of Industrial Administration at the Carnegie Institute of Technology (later, Carnegie-Mellon University) in Pittsburgh—Simon was seriously contemplating using principles of servomechanisms (from cybernetics, created by Norbert Wiener and others, circa 1948 [see Chapter 12, Section I]) in organization theory, and, in 1952, he published a very formal article in Econometrica on servomechanism theory applied to production control.13 That same year, he wrote an article on the interaction of social groups in the American Sociological Review.14 Perhaps most relevant to our story, by the mid 1950s, he had read Edmund Berkeley’s Giant Brains (1950) and Faster Than Thought (1953), edited by Vivian Bowden.15
The meeting of Simon and Newell heralded the beginning of an extraordinary scientific partnership that would span some 20 years (and a friendship that ended only with Newell’s death in 1992). Newell became Simon’s doctoral student, but theirs was never a teacher/guru–student/disciple relationship. It was always that of collaborators–colleagues. What interests us in this story is how they transformed the idea of heuristic reasoning and problem solving as done by human beings into a research tradition or a subparadigm within the computer science paradigm itself.
III
We cannot really understand how Newell and Simon embarked on their particular intellectual enterprise without knowing something of its prehistory. It began in a discipline as far removed from computer science as one can imagine. In 1947, Simon published Administrative Behavior (1947), wherein he presented a fundamental insight for which he would receive the Nobel Prize in economics 31 years later. The Nobel Foundation citation for 1978 tells us that the prize was awarded for Simon’s “pioneering research into the decision-making process in economic organizations.”16 This “pioneering research” was first embedded in Administrative Behavior, which dealt not with the economic milieu at all, but with decision making in administrative organizations. The administrative “actor,” Simon pointed out (drawing on the ideas of administrative theorist Chester Barnard17) is a purposive (or goal-oriented) being. He desires to take action, make decisions, that lead to the attainment of goals. For Simon, purposive behavior is rational if the choice of means leads to the attainment of goals.
However, such purposive behavior faces several oppositions. There are limits to the decision maker’s innate cognitive capabilities. There are limits to one’s knowledge about all the factors that must be taken into consideration to make a fully rational decision. There are limits to the decision maker’s ability to cope with the complexity of the environment in which decisions have to be made, and with the rich profusion of alternative actions (choices, decisions) and their consequences.
All these constraints suggest that, except in the simplest situations, it is virtually impossible for an individual to achieve fully rational behavior. There are limits to his or her rationality. The decision maker’s behavior is governed by what Simon first called subjective rationality.18 A decade later, he renamed this notion bounded rationality, and this is the term by which this phenomenon has come to be known.19
By 1950, Simon had begun to apply his theory of bounded rationality to the realm of economic decision making20—to build a bridge between administrative theory and economic theory. And this bridge was to be paved with the stuff of psychology.21 In fact, Simon was tending toward a psychological (or behavioral) theory of what might be called a “universal decision maker,” regardless of the domain in which decisions are to be made—including chess. Bounded rationality lay at the core of this theory.
His interest in chess reached back to boyhood. In high school he had studied the game seriously, to the extent that he had mastered the basic decision-making issues in chess and tic-tac-toe (H. A. Simon, personal communication, November 4, 2000). This interest intensified years later when he attended a lecture by the ubiquitous John von Neumann in 1952 on computer chess, which, as we have seen, was studied by Claude Shannon circa 1949 to 1950 (see Chapter 11, Section VII). He was sufficiently stimulated by the lecture actually to engage, by mid 1953, with the design of chess-playing programs.22 The chess player—human or automaton—is as much constrained by bounded rationality in making decisions of which moves to make as are administrative or economic decision makers.
The operative word here is computer chess. In summer 1954, Simon taught himself to program the IBM 701. By then, computers were much on his mind23—and not just in the realm of chess. As a visiting scientist at the RAND Corporation’s Systems Research Laboratory, his interest in computers had been stimulated after meeting Allen Newell, and another person, Cliff Shaw (1922–1991), “an outstandingly talented and sophisticated systems programmer.”24 Their work made Simon realize that the computer was much more than a number processor—that it was, in fact, a general symbol processor.25
Newell was also much interested in computer chess. In 1955, he presented a paper on a “chess machine” at the Western Joint Computer Conference.26 That same year, Simon published what was perhaps his first definitive article on bounded rationality, titled “A Behavioral Model of Rational Choice” in the Quarterly Journal of Economics.27
The question was, Simon wrote, how to cope with bounded rationality. “Classic” concepts of rationality assumed complete information available to the decision maker, and that she computes all possible outcomes of choices and then selects the “best” (optimal) decision from all the possibilities. In fact, there was no empirical evidence, Simon pointed out, that people actually went about the matter in this way. His task in the article, Simon wrote, was to propose a model that would endow the decision maker with a form of rationality that was consistent with the cognitive limits possessed by organisms—including, especially, humans—in the actual kinds of environments—social as well as physical—organisms occupy.28
So what does a decision maker really do? Instead of seeking an optimal outcome he will establish an “acceptable” or “satisfactory” goal, an aspiration level29 that will be less ambitious than the optimal. This will reduce the amount of information the individual needs to make a decision. The decision maker then searches the reduced “space” of possibilities for a choice that meets the aspired goal. So the decision will not be optimal. Rather, in resigned recognition of the reality of bounded rationality, the decision will be “satisfactory.” Or, as Simon will coin the term in 1956, it will be satisficing.30 To satisfice is to choose what is “good” rather than what is “best.” The bounded rational decision maker is a satisficing, rather than an optimizing, being.
However, even satisficing leaves a formidable problem for the decision maker. How does one search the “space” of possibilities? This is where heuristics enters the discourse. In his design of a chess machine, Newell incorporated a variety of heuristics that the computer could deploy to select its moves. He was, of course, not the first to have done so; recall Claude Shannon’s minmax strategy—a heuristic—of 1950 (see Chapter 11, Section VII). However, although Newell’s immediate intent was to design a program to play chess, like Simon he had larger ambitions. Chess was a means to understand how the mind deals with the complexities of the real-world environment, and how compute
rs could simulate such mindlike behavior. Newell and Simon pondered at great length the possibility of modeling human thinking by computers.31 Here was a meeting of profoundly like minds.
IV
In September 1956, at the same symposium at MIT where Noam Chomsky presented his seminal paper in linguistics (see Chapter 13, Section XVIII), Newell and Simon announced the first fruits of their joint ruminations. They described a computer program called Logic Theorist (LT), which was capable of discovering proofs for theorems in symbolic logic.32
A computer program had been given a name, exactly as individual computers were given such names as the ENIAC and the EDSAC in the 1940s. This, itself, gives cause for momentary pause. It suggests that a very special artifact had been created, for here was a program that exhibited behavior akin to the highest form of human intelligence. A reader of their 1956 work would have irresistibly harked back to Alan Turing’s 1950 essay “Computing Machinery and Intelligence” (see Chapter 11, Section XI).
LT was able to prove several theorems stated (and proved) in one of the chapters of Principia Mathematica (1910–1913), the massive treatise on the logical foundations of mathematics, authored by A. N. Whitehead and Bertrand Russell. Symbolic logic was the meta-language (see Chapter 13, Section XVI) for describing mathematics in the Principia, and the entities in this work were expressions (axioms and theorems) in symbolic logic and the proofs of the theorems.
By the mid 1950s, computers were being used routinely to solve complex mathematical problems such as finding roots of polynomials, evaluating matrices, and solving differential equations. What was it about LT that merited it a name of its own, an individuality, special attention?
The answer lay in the nature of proofs in axiomatic (formal) systems of the kind mathematics and symbolic logic were concerned with (see Chapter 4, Section I, for more on axiomatic systems). One begins with a set of axioms—expressions that are assumed to be true—and definitions. There will also be a small number of “rules of inference,” by means of which new true expressions (theorems) can be inferred (or deduced) beginning with axioms, definitions, and other previously proved theorems. A proof of a theorem is a sequence of expressions e1, e2, … eN, where e1 is either an axiom or a previously proved theorem, each ei is derived from its predecessor by applying the rules of inference and using the axioms and definitions, and eN is the theorem to be proved.