The Man Who Knew Too Much: Alan Turing and the Invention of the Computer (Great Discoveries)
So—let us return to our human computer. In keeping with the industrial ethos of the age, let us imagine her passing her days in a factory or sweatshop akin to the vast tracts of human toil depicted in Dickens novels or in The Man in the White Suit. This factory, however, produces not buttons or button hooks or even train engine parts, but numbers. In it, multitudes of women sit at tables, each slaving away at a different algorithm. There are as many women in the factory as there are algorithms. One is divining cube roots. Another is breaking composite numbers into primes. Still another is compiling tables of logarithms. Our particular computer, luckily or not, is responsible for one of the simplest, if most tedious, of algorithms: she is adding numbers together. Because this is a Turing factory, the computers are working not on two-dimensional pads but on one-dimensional tapes, of which, presumably, each has an infinite supply. These tapes are “divided into squares like a child’s arithmetic book.” This simply means that the operations most of us carry out vertically she carries out horizontally. At present our particular computer is sitting before her tape with a pencil, working out a sum:
Or, as it would appear on the tape:
How long would it take most of us to perform this elementary algorithm? Assuming we did not use calculators, but simply lined the numbers up and performed the usual operations of adding and carrying, probably about half a minute. We might well, however, make mistakes. More importantly, in order to complete the task at hand, we would need to divide the operation up into what computer programmers call subroutines, for the simple reason that few of us have memories capable of holding all the figures at one time. Turing acknowledges this from the very beginning, noting that the justification for his definition of computable numbers “lies in the fact that the human memory is necessarily limited.” This is especially the case when it comes to what Turing calls “compound symbols,” of which he notes, “The difference from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance. This is in accordance with experience. We cannot tell at a glance whether 999999999999999 and 999999999999999 are the same.”
It is now that Turing introduces into his argument what he calls the computer’s “state of mind” as she does her work. Arguments of this sort, he freely admits, “are bound to be, fundamentally, appeals to intuition, and for this reason rather unsatisfactory mathematically. The real question is ‘What are the possible processes which can be carried out in computing a number?’”
He attacks this question first by presenting a thorough account of just what is going on in the computer’s head as she does her job. At any moment, he explains, her behavior is determined by two factors: the symbols she is looking at and her “state of mind.” Obviously there is a limit to how many symbols she can take in at any one moment. For example, if you show me the number 352, I can remember it and repeat it back to you without difficulty. On the other hand, if you show me the number 352,798,634,001, I will probably have to break it down into discrete units in order to repeat it back. As Turing explains, if our computer wishes to observe more symbols than her memory allows, she “must use successive observations. We will also suppose that the number of states of mind which need to be taken into account is finite.”
In effect, Turing is trying to break down the process of elementary arithmetic into its most basic parts, much as an inquisitive child might disassemble a machine in order to see how it works. In order to fulfill her task, he writes, the computer must first “split up” the algorithm that she is performing “into ‘simple operations’ which are so elementary that it is not easy to imagine them further divided.” A “simple operation” is one in which “not more than one symbol is altered. Any other changes can be split up into simple operations of this kind.”
To get an idea of what Turing means here, let us return to our computer and her tape. Before her she reads an equation. (Let’s make it a shorter one, because the pages of a book make it difficult to print a very long segment of tape.)
The first numeral she looks at is the last of the three numerals making up the number 815: 5. She then looks at the 9 that is the last numeral of the first number, 6,439. Putting these together, she gets 14. But in order to perform this calculation—and this is Turing’s crucial point—she has only to observe one square at a time. She also has to carry the 1 in 14, and in order to do this, she must insert into her calculation a numeral that will not be part of the final result but that occupies a square of the tape (and her mind) only until she is ready to add together the numerals to the left of the two she has just dealt with. In this diagram I shall indicate this temporary symbol by printing it in boldface italics:
The next observed squares would be the 3 from the first number and the 1 from the second. These add up to 4, to which the computer also has to add the 1 that is the remainder of the earlier addition of 9 and 5. The 1 is thus erased and replaced by the next permanent number, which is a 5.
The computer continues in this manner until she obtains the required answer:
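The column-by-column procedure she follows—add the rightmost digits, write down one figure, scribble a temporary carry, erase it as it is absorbed into the next column—can be sketched in a few lines of Python. (The function name and layout are mine, not Turing's; this is an illustration of the procedure, not of his machine.)

```python
def add_digit_by_digit(a: str, b: str) -> str:
    """Add two denary numbers one column at a time, right to left,
    keeping an explicit carry -- the 'temporary symbol' the human
    computer scribbles on the tape and later erases."""
    # Pad the shorter number with leading zeros so the columns line up.
    n = max(len(a), len(b))
    a, b = a.zfill(n), b.zfill(n)
    digits, carry = [], 0
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry   # e.g. 5 + 9 gives 14
        digits.append(str(total % 10))      # write down the 4 ...
        carry = total // 10                 # ... and carry the 1
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))
```

Run on the sum just traced, `add_digit_by_digit("6439", "815")` yields `"7254"`, and at no point does the loop hold more than one column and one carry in its "mind."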
There are two other aspects of the computational procedure that must be addressed if we are to establish a full-fledged technical description of it. The first is the problem of what Turing calls “changes of distribution of the observed squares.” What if our computer is at work on an especially arduous calculation—one in which each of the figures to be added contains, say, 100 numerals? In such a case, she must observe and absorb a very long sequence of squares on the tape. But because she can take in, in a single glance, only a specific length of the tape, she will have to divide up the computational process into subprocesses.
The second matter that must be taken into account is the question of what Turing calls “immediate recognisability.” Squares “marked by special symbols,” one might think, would be by definition “immediately recognisable” to the computer. This makes obvious sense: when calculating, our minds will of course instantly distinguish such symbols as +, –, =, and π from the numbers that surround them. But what happens when we encounter a sequence that constitutes a special symbol, such as the numbers that in mathematical papers are used to label equations and theorems? Just as compound symbols such as 999999999999999 can be used to indicate large numbers, in “most mathematical papers the equations and theorems are numbered. Normally the numbers do not go beyond (say) 1,000. But if the paper was very long, we might reach Theorem 157767733443477; then, further on in the paper, we might find ‘. . . hence (applying Theorem 157767733443477) we have . . .’ ” But just as a human computer could compare the two numbers “figure by figure, possibly ticking the figures off in pencil to make sure of their not being counted twice,” it is possible to design a machine capable of doing, more or less, the same thing.
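The figure-by-figure comparison Turing describes—ticking each digit off in pencil rather than trusting a single glance—amounts to something like the following sketch (the function name is mine):

```python
def same_number(a: str, b: str) -> bool:
    """Compare two long numerals figure by figure, the way the human
    computer ticks digits off in pencil, rather than at a glance."""
    if len(a) != len(b):       # a glance CAN catch a length mismatch...
        return False
    # ...but matching lengths force us to walk the digits in step.
    return all(da == db for da, db in zip(a, b))
```

The point is that no single step of the comparison requires holding more than one pair of digits in mind, which is exactly what makes it mechanizable.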
We now have a sort of road map of the procedure by which the computer fulfills her job at the factory: through a complex and, yes, cumbrous succession of steps—observations, recognitions, operations—she is able to do the sum with which we began: she adds 746,380 and 9,251,754,803 and obtains 9,252,501,183. The “simple operations” into which her procedure breaks down consist of:
(a) Changes of the symbol on one of the observed squares.
(b) Changes of one of the squares observed to another square within [a certain number of] squares of one of the previously observed squares.
And since “some of these changes necessarily involve a change of state of mind,” the most general single operation must therefore be taken to be one of the following:
(A) A possible change (a) of symbol together with a possible change of state of mind.
(B) A possible change (b) of observed squares, together with a possible change of state of mind.
“The operation actually performed,” Turing concludes, “is determined . . . by the state of mind of the computer and the observed symbols. In particular, they determine the state of mind of the computer after the operation is carried out.”
It is at this point that Turing offers the closest thing to a triumphant rhetorical flourish in the whole of his paper. “We may now,” he writes, “construct a machine to do the work of this computer.” And yet, lest his reference to “states of mind” should provoke balking from mathematicians made uncomfortable by such unorthodox methods, he first offers an alternative argument, which depends on the idea of a “note of instructions” provided to the computer before she begins her work. According to this argument, the computer—and who can blame her, given the tedious nature of her work?—takes a lot of breaks. She performs one step in her addition, then gets up and has a sandwich. She performs another step, then has a cup of tea. She performs another step, then goes to the bathroom. At this rate it’s going to take her an inordinately long time to finish her work, but no matter. Speed, in this factory, is not of the essence. Besides, she has her “note of instructions”—which, of course, corresponds to the “table of behavior” with which Turing began his discussion of the hypothetical a-machine.
2.
So now let us make the cinematic jump that lies at the heart of Turing’s paper. All at once the millions of women in the factory disappear. Our computer disappears, in order, we hope, to take up a more restful and satisfying occupation. In her place, suddenly, there sits a Turing machine. Indeed, in the place of every one of the women, there sits a Turing machine. A factory full of Turing machines, each one performing some specific algorithm. Let us for the moment focus on the machine that has replaced our particular friend, the woman who tots up figures. To simplify matters—and because the machine has just started its job—we will give it an even simpler task to perform than the ones at which its human counterpart was at work. We are going to ask the Turing machine to add 2 and 2.*
At this point we need to change the notational system that we are using. Our human computer has used the familiar Arabic notation, a system which, because it employs ten symbols (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), is known as the denary system. In undertaking the addition algorithm, on the other hand, our machine will employ the much simpler unary system, which requires only one symbol: a 1. In the unary system, the number 2 is written as 11, the number 3 as 111, the number 4 as 1111, and so on. Most of us actually do make use of the unary system on occasion, when, for instance, we are keeping score during card games.
So—the tape runs through the machine. For such an elementary operation as addition, only one symbol is required: the symbol 1. As the tape approaches the machine, which is presumably situated somewhere to the left of the two sets of numbers, it reads as follows:
The machine works in exactly the same way as our human computer: that is to say, it observes the squares one at a time and then performs a specific operation on each of them. The operation is in every case determined by the machine’s table of behavior, which in this case is set up as a series of m-configurations each labeled with a letter. Depending on its m-configuration, the machine will respond in different ways to each of the two symbols, a 1 and a blank, then move into a new m-configuration. (From here on I shall intermittently use “state” as a synonym for “m-configuration.”)
In adding 2 and 2 (or, as the machine sees it, 11 and 11), the machine follows the instruction table, with the following result. It begins in state A, then works through the succession of blanks, to which it makes no changes. Upon arriving at the first 1, it switches into state B. In state B, it arrives at the second 1, and moves, once again, one square to the right. (For 1’s, states A and B are identical.) It then arrives, in state B, at a blank, which according to its instructions it erases and replaces with a 1. The machine now shifts into state C, in which it encounters—but makes no changes to—the next 1. However, it has now shifted into state D. This means that it erases the next 1, leaving a blank. At this point the machine stops, having effected its operation. (In the following diagrams, the arrow indicates the position of the scanner on the machine.)
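The table of behavior just narrated can be written out as a small Python simulation. The state names A through D follow the account above; the dictionary-of-tuples layout is my own shorthand, not Turing's notation, and the tape uses a space for a blank square.

```python
# Each rule maps (state, scanned symbol) to (symbol to write,
# head movement, next state). None as the next state means halt.
ADDER_TABLE = {
    ("A", " "): (" ", +1, "A"),   # work through the leading blanks
    ("A", "1"): ("1", +1, "B"),   # first 1: switch into state B
    ("B", "1"): ("1", +1, "B"),   # for 1's, A and B behave alike
    ("B", " "): ("1", +1, "C"),   # fill the gap between the numbers
    ("C", "1"): ("1", +1, "D"),   # pass over one 1 unchanged
    ("D", "1"): (" ", 0, None),   # erase the last 1 and stop
}

def run(tape: list) -> list:
    state, pos = "A", 0           # scanner starts at the left
    while state is not None:
        write, move, state = ADDER_TABLE[(state, tape[pos])]
        tape[pos] = write
        pos += move
    return tape

tape = run(list("  11 11 "))      # 2 + 2, in unary
```

After the machine halts, the tape holds a single run of four 1's—the unary numeral for 4.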
As we can see, the two sets of 2 have now been replaced by one set of 4. By means of an algorithm that never even requires the mention of the usual terms associated with addition, we have established that 2 + 2 = 4.
In “Computable Numbers,” Turing gives two examples of a-machines. These, too, require us to make use of a new notational system—the binary system. To understand the difference between the binary, the unary, and the denary systems, it is helpful to imagine a ladder with infinitely many rungs, each of which corresponds to a natural number. In the unary system, these rungs are of equal width. In the denary system, the rungs thicken with each power of 10. That is to say, the first 9 rungs are of equal thickness. At the 10th rung a thickening occurs. The rungs remain the same thickness as the 10th rung until the 100th rung is reached, at which point the rungs thicken again, remaining the same thickness until the 1,000th rung is reached . . . and on and on. Each time one of these thickenings takes place, an extra numeral is added.
In the binary system, the rungs of the ladder thicken according to exactly the same scheme as in the denary system, except that instead of thickening with each power of 10, they thicken with each power of 2. Likewise, just as an extra numeral is added in the denary system at 10, 100, 1,000, etc., in the binary system an extra numeral is added at 2, 4, 8, 16, etc. Because the shifts occur at powers of 2, however, the binary system requires only two symbols to the denary system’s 10: 0 and 1.
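The ladder image can be checked directly: the number of numerals a number needs is just the number of times it can be divided by the base before vanishing, and that count ticks up exactly at the powers of the base. A small sketch (the function name is mine):

```python
def numerals(n: int, base: int) -> int:
    """How many figures n requires when written in the given base."""
    count = 0
    while n:
        n //= base    # strip off one numeral's worth of information
        count += 1
    return count
```

In denary the count jumps from 1 to 2 at the 10th rung; in binary it jumps from 2 to 3 at the 4th rung (3 is written 11, but 4 is written 100).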
The great advantage of the binary system in computer programming is that it allows the use of Boolean algebra, with the 1 and 0 corresponding to the on and off position of a valve, switch, or circuit. In addition, the system permits a far more economical coding of large numbers than would be possible with the unary system. Lastly, the binary system simplifies the coding of letters, mathematical symbols, and punctuation marks, which can also be represented in binary form.
The first of the a-machines that Turing gives as examples is a very simple machine designed to generate the infinite sequence 010101 . . . (in this case the ellipses indicate that the sequence goes on forever without changing). This machine differs from the one described previously in that it must print and recognize two symbols: 0 and 1. Its table of behavior looks like this:
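In Turing's published table this machine has four m-configurations, which he labels b, c, e, and f: two of them print a figure and move right, and two merely move right, leaving a blank square between figures. A Python sketch of a simulator running that table (the layout is my shorthand, not Turing's notation):

```python
# Each state maps to (symbol to print or None, move right, next state).
RULES = {
    "b": ("0", 1, "c"),   # print 0, move one square right
    "c": (None, 1, "e"),  # move right, leaving the square blank
    "e": ("1", 1, "f"),   # print 1, move one square right
    "f": (None, 1, "b"),  # move right, and start the cycle again
}

def first_figures(n_steps: int) -> str:
    tape, state, pos = {}, "b", 0
    for _ in range(n_steps):
        symbol, move, state = RULES[state]
        if symbol is not None:
            tape[pos] = symbol
        pos += move
    # Read off the printed figures in tape order, skipping blanks.
    return "".join(tape[i] for i in sorted(tape))
```

Eight steps of the machine print the figures 0101 on alternate squares, and the cycle b → c → e → f → b repeats forever.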
A slightly more complicated Turing machine prints the sequence 001011011101111011111. . . . This machine must be capable of printing ə, x, 0, and 1. (The odd symbol ə is used as a placeholder that indicates the initiation of the sequence, and is not a part of the sequence itself; like the x, it serves to minimize the number of steps the computer needs to take in order to get the result.) As Turing explains,
The first three symbols on the tape will be “0”; the other figures follow on alternate squares. On the intermediate squares we never print anything but “x.” These letters serve to “keep the place” for us and are erased when we have finished with them. We also arrange that in the sequence of figures on alternate squares there should be no blanks.*
The machine thus “scribbles” intermediate steps in the calculation much as our human computer did. Its table of behavior, needless to say, is somewhat more complicated than those for the previous machines. (Because this machine requires so many more moves, here we shall use L to signify “move one square to the left,” R to signify “move one square to the right,” P to signify “print,” and E to signify “erase.”)
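Before tracing the table, it is worth making the pattern in the target sequence explicit: a 0 followed by no 1's, then a 0 followed by one 1, then a 0 followed by two 1's, and so on. The figures can therefore be generated directly—this sketch bypasses the machine's placeholder and x bookkeeping entirely, and is a picture of the output, not of Turing's mechanism:

```python
def figures(n_blocks: int) -> str:
    """The figures of Turing's second example sequence: each block is
    a 0 followed by one more 1 than the block before it."""
    return "".join("0" + "1" * n for n in range(n_blocks))
```

Six blocks give 001011011101111011111, the stretch of the sequence quoted above; the machine's x marks exist precisely to count out how many 1's the current block has earned.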
Turing goes on to show the first few sequences of symbols that the table generates, along with the m-configurations that come into play at each stage. The tape begins in state A and prints this sequence:
The machine, which is now scanning a 0, then shifts into state B. Following its new set of instructions, it does nothing but shift into state C, at which point it is issued another set of instructions, telling it to move two spaces to the right. Because the machine is still in state C, and has encountered another 0, it moves two further spaces to the right:
The machine has now encountered a blank square. Consulting its instruction table, it sees that when in state C, it should, upon encountering a blank square, print a 1, move one square to the left, and shift into state D:
In state D, its instruction when encountering a blank square is to move two squares to the left, remaining in state D. Encountering another blank square, the machine moves two further squares to the left, where it encounters an ə. Its new instruction is to move one square to the right and shift into state E. It now scans a 0. In state E, its instruction for any symbol is to move two squares farther to the right, remaining in state E. Scanning yet another 0, it moves two further squares to the right, where it encounters a 1. Moving two further squares to the right, it encounters a blank. In state E, its instruction upon encountering a blank is to print a 0, move two squares to the left (where it encounters a 1), and return to state B. Following the instructions from state B for what to do when landing on a 1, the machine moves one square to the right, prints an x, then moves three squares to the left:
The machine, still in state B, has now met up with another 0. It shifts into state C, moves two squares to the right, encounters a 1, moves two further squares to the right, encounters a 0, moves two further squares to the right, encounters a blank, prints a 1, and moves one square to the left. Now the tape reads like this: