It Began with Babbage
A neuron generally has more than one synapse impinging on it. At any instant, a neuron is activated when the sum of its “input” synapses’ activities reaches a certain threshold. Such synapses are called excitatory synapses. If the threshold is not reached, the neuron remains quiescent. However, there are also inhibitory synapses that inhibit the excitation of neurons on which they impinge regardless of the excitation level of excitatory synapses connecting to that same neuron.
A neuron, according to the general understanding of the time, has an “all-or-none” nature; it is either active or inactive. It is, thus, a binary digital device. The neural activity of a network of neurons can be determined by the pattern of binary activity of its constituent neurons. It was this apparent binary character of neurons that prompted McCulloch and Pitts to draw on Boolean (or propositional) logic to describe neural activity—just as, 5 years earlier, Claude Shannon had analyzed relay switching circuits using Boolean logic (see Chapter 5, Section IV). They imagined the behavior of a neuron or a neuron network in terms of the language of Boolean propositions,12 and they represented the neuron accordingly.
1. The activity of the neuron is an “all-or-none” process.
2. A certain fixed number of synapses must be excited … in order to excite a neuron at any time.
3. The only significant delay within the nervous system is synaptic delay.
4. The activity of an inhibitory synapse absolutely prevents excitation of the neuron at that time.
5. The structure of the net does not change with time.13
An example of a McCulloch–Pitts neuron is shown in Figure 11.1. Here, A and B signify synaptic “inputs” to the neuron, and C denotes the “output.” The number in the “neuron” shows the threshold of activation. Suppose at the start of some fixed time interval t both A and B are inactive. Then, at the start of the next time interval, t + 1, the neuron is inactive. If either A or B is active and the other inactive, then the neuron also remains inactive because the threshold of activation has not been reached. Only if both A and B are active at the start of time interval t will the neuron be activated at the start of time interval t + 1. Here, the neuron functions as a Boolean AND device satisfying the proposition C = A AND B.
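The behavior just described can be sketched in a few lines of Python. This is an illustrative model only, not the notation McCulloch and Pitts used. The function name `mp_neuron` is hypothetical, and the threshold of 2 is an assumption: the text says only that the threshold appears in Figure 11.1, and 2 is the value the described behavior implies for two excitatory inputs.

```python
def mp_neuron(inputs, threshold):
    """State of the neuron at time t + 1, given its excitatory inputs at time t.

    inputs: iterable of 0/1 synaptic activities at time t.
    threshold: number of active synapses needed to fire (assumed value).
    """
    return 1 if sum(inputs) >= threshold else 0


# Truth table for two inputs A and B with an assumed threshold of 2.
and_table = {(a, b): mp_neuron((a, b), threshold=2) for a in (0, 1) for b in (0, 1)}
```

Under these assumptions, the only entry of `and_table` equal to 1 is the one for A = 1, B = 1, which agrees with C = A AND B.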
Figure 11.2 shows a neuron with two excitatory synapses A and B, and one inhibitory synapse C as inputs. The threshold value is 1. If the inhibitory synapse is inactive, then activation of A or B will excite the neuron and it will “fire.” However, if the inhibitory synapse C is active, then no matter what the states of excitation of A and B, the neuron will not excite. The Boolean proposition describing this neuron is D = (A OR B) AND (NOT C).
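A minimal sketch of this neuron, again in Python, makes the role of absolute inhibition explicit. The names `mp_neuron_inhibited` and `figure_11_2` are hypothetical. The threshold of 1 comes from the text, and the early return models the assumption that an active inhibitory synapse absolutely prevents excitation.

```python
def mp_neuron_inhibited(excitatory, inhibitory, threshold):
    """Neuron state at t + 1 from synaptic activity at t, with absolute inhibition."""
    if any(inhibitory):
        # One active inhibitory synapse blocks firing outright.
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def figure_11_2(a, b, c):
    # Threshold 1, as stated in the text: either A or B suffices when C is silent.
    return mp_neuron_inhibited((a, b), (c,), threshold=1)


# Compare against the Boolean proposition D = (A OR B) AND (NOT C) for all inputs.
matches = all(
    figure_11_2(a, b, c) == int((a or b) and not c)
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
)
```

Checking all eight input combinations is enough here, because a neuron with three binary inputs has no other cases.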
McCulloch and Pitts described the behavior of both single neurons and neuron networks in terms of Boolean propositions or expressions. The correspondence with switching circuit behavior as Shannon described in 1938 (see Chapter 5, Section IV) is evident to anyone familiar with Shannon’s work. Although the notation used by McCulloch and Pitts and the mathematics they present to demonstrate their “logical calculus” were complex, their core result was clear. The behavior of neural activity could be described by Boolean (logical) expressions; conversely, any Boolean proposition could be realized by a network of McCulloch-Pitts neurons, and although the neurons themselves are simple in behavior, they can give rise to neuron systems of considerable complexity.
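As a hedged illustration of how simple neurons compose into a network that realizes a Boolean proposition, the sketch below wires three threshold units to compute A XOR B. The choice of XOR, the wiring, and the names `neuron` and `xor_net` are assumptions made for this example. They are not taken from the McCulloch-Pitts article, and other wirings would serve equally well.

```python
def neuron(excitatory, inhibitory=(), threshold=1):
    """All-or-none unit: fires if enough excitatory inputs are active and none inhibits."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def xor_net(a, b):
    """Two-layer net for A XOR B; the output appears two synaptic delays after the inputs."""
    # Layer 1 (time t + 1): one neuron detects "both", the other "at least one".
    both = neuron((a, b), threshold=2)
    either = neuron((a, b), threshold=1)
    # Layer 2 (time t + 2): "either" excites the output, "both" inhibits it.
    return neuron((either,), inhibitory=(both,), threshold=1)


xor_table = [xor_net(a, b) for a in (0, 1) for b in (0, 1)]
```

The network needs two layers because no single unit of this kind can fire for exactly one active input out of two. The cost of the extra layer is one additional synaptic delay.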
FIGURE 11.1 An Abstract Neuron with Excitatory Inputs.
FIGURE 11.2 An Abstract Neuron with an Inhibitory Input.
IV
Let us leave aside how their article was received by its primary target readers—the neurophysiological and theoretical biological community. Its place in this story lies in that it caught the attention of the irrepressible von Neumann. Tucked away in the EDVAC report, written two years after the publication of the McCulloch-Pitts article, was an observation of the “all-or-none” character of relay elements in digital computing devices.14 But then, von Neumann continued, the neurons in the brains of “higher animals” also manifest this binary character; they possess two states: “quiescent” and “excited.”15 Referring to the McCulloch-Pitts article, von Neumann noted that the behavior of neurons could be “imitated” by such binary artifacts as telegraph relays and vacuum tubes.16
The parallel between binary circuit elements (relays and vacuum tubes) in a digital computer and neurons in the brain is thus established. Drawing on the McCulloch-Pitts neuron, von Neumann envisioned digital circuits as a network of idealized circuit elements, which he called E-elements, that “receives the excitatory and inhibitory stimuli and emits its own stimuli”17 after an appropriate fixed “synaptic delay.”18 A significant portion of the EDVAC report is then devoted to the description of the E-elements and networks of E-elements, including the structure and behavior of arithmetic circuits modeled by such networks.19
von Neumann had, albeit briefly, almost casually identified a relationship between the circuits in the brain and circuits in the digital computer, but this was merely a scenting of blood. For a man of his restless intellectual capacity and curiosity, these allusions in the EDVAC report were only the beginning of a new scientific track.
In September 1948, a number of scientists from different disciplines—mathematics, neurophysiology, and psychology—assembled in Pasadena, California. They were participating in a conference titled Cerebral Mechanisms in Behavior, funded by the Hixon Foundation and, thus, named the Hixon Symposium.20 This symposium has an important place in the histories of psychology and cognitive science. Behaviorism, the dogma that eschewed any discussion of mind, mentalism, cognition as being within the purview of scientific psychology—a dogma that became the dominant paradigm in American experimental psychology throughout much of the first half of the 20th century from the time of World War I—came under serious attack from people such as McCulloch and neuropsychologist Karl Lashley (1890–1958).
And there was von Neumann. In a lecture later published in the Hixon Symposium proceedings as a 40-page chapter, he pursued in some detail the blood he had scented 3 years before. He titled his exposition “The General and Logical Theory of Automata.”21
If Leonardo Torres y Quevedo had redirected the old idea of automata from active to thinking artifacts in 1915 (see Chapter 3, Section IX), if Alan Turing had launched a branch of intellectual inquiry into how abstract automata could work with his Entscheidungsproblem paper of 1936 (see Chapter 4, Section III), then by way of his Hixon paper, von Neumann surely gave the field a name. The subject of automata theory is not the automata of Hellenistic antiquity, but abstract computational machines such as the Turing machine. The epicenter of what would later be called theoretical computer science lay at the door of automata theory.
V
The kind of theory von Neumann advocated for the study of automata lay in the realm of mathematics and logic. In this he followed the approach adopted by McCulloch and Pitts. It was to be an axiomatic theory. Beginning with fundamental undefined concepts, assumptions, and propositions (axioms), and using well-understood rules of reasoning, one derives logical consequences of these fundamentals (see Chapter 4, Section I for a brief discussion of the axiomatic approach). The axiomatic world, familiar to those of a mathematical or logical disposition,22 is a formal world, quite unlike the severely empirical world that people like Wilkes, Kilburn, and Mauchly inhabited. Abstract automata, like actual digital electronic computers, are artifacts—inventions of the human mind—but electronic computers belong to the realm of the empirical; abstract automata belong to the realm of the formal. And so, axiomatizing the behavior of abstract automata meant that their building blocks be treated as “black boxes” with internal structures that are ignored (or abstracted away), but with functional behaviors that are well defined and visible.23 Defining the behavior of McCulloch-Pitts neurons by logical (Boolean) expressions was an instance of the axiomatization of actual neurons. Their internal structure, which obeys the laws of physics and chemistry, can be ignored.24
This approach is bread and butter to mathematicians and logicians, and, indeed, to certain theoretical physicists and biologists. But, von Neumann cautioned, there is a fundamental limitation of the axiomatic approach when applied to empirical objects such as neurons. The approach is only as good as the fundamental assumptions or axioms. One must be sure that the axioms are valid and are consistent with observed reality. The formal world must have resonance with the empirical world. To ensure this validity, the theorist has to rely on the empirical scientists—in the case of neurons, the neurophysiologists and biochemists.25
von Neumann’s primary concern was not neurons in the head but “artificial automata”—more specifically, computing machines.26 And although such automata are vastly less complicated than the nervous system, he found the idea of investigating the behavior of neural machines in terms of automata enticing—hence the comparative study of neural systems in living matter and artificial automata. More ambitiously, it could be claimed that he was aiming to establish a universal automata theory that applied as much to the natural as to the artificial, unifying nature and artifact in some specific sense. von Neumann was, of course, quite aware that the neuron has both binary, digital (“all-or-none”), and nondigital or analog characteristics.27 In contrast, computing machines of the kind recently conceived were digital.28 Nonetheless, as a stage of the axiomatic approach one could consider the living organism as if it was a purely digital automaton.29 This suggested, to von Neumann, that there were two kinds of automata—natural and artificial, a first step in unification and universalization. Moreover, even though such artifacts as the electromechanical relay and the electronic vacuum tube were digital entities, they were really rather complicated analog mechanisms that obeyed the laws of physics. They become digital entities under certain restricted conditions.30 There was, then, a small difference between such devices and biological neurons.31 Neither was really of an all-or-nothing character, but both could be so regarded if (a) they could operate under certain conditions in an all-or-nothing manner and (b) such operating conditions were the normal conditions under which they would be used.32
Like relays and vacuum tubes, biological neurons are electrical switching units.33 This, of course, was the assumption undergirding the McCulloch-Pitts model of nervous activity, which enabled them to draw on Boolean logic to describe the behavior of neuron networks.
However, McCulloch and Pitts were not interested in computers per se, whereas von Neumann was. And so, comparison between organisms and computing machines—between the natural and the artificial—followed: their relative sizes in terms of the number of basic switching elements (the central nervous system, according to estimates of the time, had 10^10 neurons34; existing machines such as the ENIAC or the IBM version of the Harvard Mark I had about 20,000 switching elements—relays and vacuum tubes35), the relative sizes of the switching organs (“the vacuum tube … is gigantic compared to a nerve cell,” the ratio of the sizes “about a billion” to one36), and the relative switching speeds (that is, the speed at which digital elements can switch states from active to inactive or vice versa); vacuum tubes were vastly faster than neurons, with the switching speed ratio being something like 1200:1.37
von Neumann did not actually present an axiomatic theory of automata but, rather, the promise of such a theory. There were already, he noted, prior results that would contribute to this theory: the McCulloch-Pitts theory according to which the functions of nervous activity can be described by a formal (or abstract or idealized) McCulloch-Pitts neural network,38 and Turing’s description and construction of an abstract computing machine (in particular, his formulation of a universal automaton that could perform computations performed by any other computing machine).39
But von Neumann went beyond what Turing or McCulloch and Pitts had offered. He reasoned that if automata theory was to be universal in scope, embracing the natural and the artificial, it had to tackle the problem of self-reproduction, for in this lay the essence of biological systems. And for this, Turing’s universal computing machine did not suffice. Turing’s machine could only produce strings of 0s and 1s on a tape as output. von Neumann’s vision was more daring; he desired automata that could produce as output other automata.40
So the “General” in the title of his article went beyond unifying natural and artificial computers. Although the universal Turing machine allowed for computational universality, what von Neumann sought was also constructive universality.41 It was no longer a matter of comparing cerebration with computation, but of uniting the organic and the mechanical in a more fundamental sense—the possibility, so to speak, of “artificial life.”
von Neumann then speculated in a general way on the construction of such a self-reproducing automaton. Consider first an automaton A that has associated with it a description of itself, d(A). Using this description, A produces a copy of itself. However, A is not self-reproducing because it does not make a copy of its own description, d(A). So next consider another automaton, B, that when supplied with d(A), makes a copy of just this description.
Suppose now that machines A and B are combined by way of a control device C—call this combination A + B + C automaton D—that, when provided with d(A), passes it to A for constructing a copy of A, passes d(A) to B to produce a copy of d(A), and inserts the copy of d(A) into the new automaton A. So D can reproduce A along with the description d(A) of A.
Last, consider the machine D provided with its own description d(D). Call this combination automaton E. What E can do is produce another identical automaton E. E is thus self-reproducing because it can produce E that can produce another E and so on.
As von Neumann put it, d(D) “is roughly effecting the function of a gene” whereas the copying automaton B “performs the fundamental act of reproduction, the duplicating of the genetic material, which is clearly the fundamental operation in the multiplication of living cells.”42
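The construction of A, B, C, D, and E can be mimicked in a toy Python model. It is a sketch under heavy assumptions. A “description” is merely a tuple of part names, nothing is physically built, and the behavior of the parts is written directly as the hypothetical functions `construct` (for A), `copy_description` (for B), and `reproduce` (for the control device C) rather than derived from the parts themselves. It shows only the logical shape of the argument: a machine that both interprets and copies its own description yields offspring identical to itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Description = Tuple[str, ...]  # hypothetical encoding: a tuple of part names


@dataclass(frozen=True)
class Automaton:
    parts: Tuple[str, ...]                     # the machinery itself
    description: Optional[Description] = None  # the attached description, if any


def construct(d: Description) -> Automaton:
    """A: build the automaton that d describes, with no description attached."""
    return Automaton(parts=tuple(d))


def copy_description(d: Description) -> Description:
    """B: duplicate a description without interpreting it."""
    return tuple(d)


def reproduce(parent: Automaton) -> Automaton:
    """C: the control step of D = A + B + C, applied to the parent's description."""
    if parent.description is None:
        raise ValueError("no description to work from")
    body = construct(parent.description)         # pass the description to A
    tape = copy_description(parent.description)  # pass the description to B
    return Automaton(body.parts, tape)           # insert the copy into the new automaton


d_D: Description = ("A", "B", "C")                     # d(D): D consists of A, B, and C
E = Automaton(parts=("A", "B", "C"), description=d_D)  # E = D + d(D)

child = reproduce(E)
grandchild = reproduce(child)
```

In this toy model `child` and `grandchild` compare equal to `E` while being distinct objects, and each can reproduce in turn. An automaton built by `construct` alone carries no description, so `reproduce` rejects it, which mirrors the observation that A by itself is not self-reproducing.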
von Neumann’s “general theory of automata” was, first, a comparative study of biological and artificial neuron systems; second, it explored, in a highly speculative way, the idea of self-reproducing automata, thereby positing a general theory that could mimic the self-reproducing capacity of biological cells. It embraced not only the computing automaton à la Turing, but also something startlingly new: a self-constructing automaton. In fact, this would be the beginning of an entirely different branch of automata theory that came to be called cellular automata theory, which considered how arrays of individual automata (“cells”) could work to transmit information between cells, perform computation, and construct various computational organs.43
VI
New scientific paradigms, new artistic and literary styles, and new technological movements are created by a select few. The birth of new creative impulses belongs to the realm, primarily, of intellectual history rather than to social or cultural history. It is only when that impulse or movement is recognized as a paradigm that it spreads to the larger population. Revolutionary science, art, design then become normal science, art, design. Which is why we find the birthing years of computer science dominated by a small number of people, some of whom appear, disappear, and reappear during these early years. It is as if, having carved out a space of their own—indeed, created a space of their own—having been led to a virgin land, they invent for themselves a space within that land before other newcomers have the chance to come on it. von Neumann was one such person, as we have seen. Turing was surely another (and he will appear once more very soon). Shannon was a third such person.
Shannon, as we have also seen, in 1938, connected the technological art of switching circuit design with the abstract, symbolic logic of Boolean algebra. It is, perhaps, for this reason that the design of circuits that input, process, and output the Boolean values—1 and 0 or TRUE and FALSE—is called logic design.44 In fact, during the late 1940s, Shannon did much more. In 1948, then a mathematician at the Bell Telephone Laboratories in Murray Hill, New Jersey, Shannon published an article on a mathematical theory of communication.45 A year later, with Warren Weaver (1894–1978), a mathematician and preeminent science administrator at the Rockefeller Foundation, Shannon published a book that developed this theory more fully.46
The mathematical theory of communication is otherwise and more succinctly called information theory, which forms the theoretical foundation of telecommunications engineering and has also influenced the study of human communication.47 The word information is used in information theory in a specific sort of way; it has nothing to do with how “information” is understood in everyday language. We usually think of information as being about something. In common parlance, information has meaning, there is a semantic aspect to it. In information theory, however, information is devoid of meaning. It is simply the commodity that is transmitted across communication “channels,” whether between human beings, along telegraph wires, or across telephone lines. The unit of information in information theory is called the bit (short for binary digit, a term coined by another distinguished Bell Laboratory mathematician and Shannon’s colleague, John W. Tukey [1915–2000]48). It means nothing but itself, just as a unit of money refers to nothing but itself.
Shannon is commonly referred to as the Father of Information Theory, but—like all such glib journalistic appellations—this, too, must be viewed quizzically if only because the history of the origins of information theory began well before Shannon.49 What is beyond dispute is that he has a preeminent place in the creation of information theory.
Insofar as the transmission and storage of information bits are prominent aspects of the design of computer systems, Shannon’s contribution to information theory has an obvious place in the history of computing. But this is not why he appears in this part of our story. Shannon was one of those individuals who, during the 1940s, crossed interdisciplinary boundaries with total insouciance, who ignored the narrow domestic walls the poet Tagore had dreamed of demolishing. He was, after all, a contemporary of Wiener, and it is this trait that ushers him into this chapter.