I Am a Strange Loop
A Tiny Spark in Gödel’s Brain
We now return to the story of Kurt Gödel and his encounter with the powerful idea that all sorts of infinite classes of numbers can be defined through various kinds of recursive rules. The image of the organic growth of an infinite structure or pattern, all springing out of a finite set of initial seeds, struck Gödel as much more than a mere curiosity; in fact, it reminded him of the fact that theorems in PM (like theorems in Euclid’s Elements) always spring (by formal rules of inference) from earlier theorems in PM, with the exception of the first few theorems, which are declared by fiat to be theorems, and thus are called “axioms” (analogues to the seeds).
In other words, in the careful analogy sparked in Gödel’s mind by this initially vague connection, the axioms of PM would play the role of Fibonacci’s seeds 1 and 2, and the rules of inference of PM would play the role of adding the two most recent numbers. The main difference is that in PM there are several rules of inference, not just one, so at any stage you have a choice of what to do, and moreover, you don’t have to apply your chosen rule to the most recently generated theorem(s), so that gives you even more choice. But aside from these extra degrees of freedom, Gödel’s analogy was very tight, and it turned out to be immensely fruitful.
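For readers who like to see such things rendered in the idiom of present-day computers, here is a tiny sketch in Python (purely my own illustration; needless to say, nothing of the sort existed in Gödel’s day) of how an infinite club of numbers sprouts from a couple of seeds via a single growth rule:

```python
# A sketch of my own: an infinite "club" of numbers sprouting from two seeds
# via one growth rule, just as Fibonacci's F numbers do.
def f_numbers(how_many):
    """Return the first few F numbers: 1, 2, 3, 5, 8, 13, 21, 34, 55, ..."""
    club = [1, 2]                          # the seeds (the analogue of PM's axioms)
    while len(club) < how_many:
        club.append(club[-1] + club[-2])   # the growth rule (the analogue of a rule of inference)
    return club[:how_many]

print(f_numbers(9))   # [1, 2, 3, 5, 8, 13, 21, 34, 55]
```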
Clever Rules Imbue Inert Symbols with Meaning
I must stress here that each rule of inference in a formal system like PM not only leads from one or more input formulas to an output formula, but it does so by purely typographical means — that is, via purely mechanical symbol-shunting that doesn’t require any thought about the meanings of symbols. From the viewpoint of a person (or machine) following the rules to produce theorems, the symbols might as well be totally devoid of meaning.
On the other hand, each rule has to be very carefully designed so that, given input formulas that express truths, the output formula will also express a truth. The rule’s designer (Russell and Whitehead, in this case) therefore has to think about the symbols’ intended meanings in order to be sure that the rule will work exactly right for a manipulator (human or otherwise) who is not thinking about the symbols’ intended meanings.
To give a trivial example, suppose the symbol “∨” were intended to stand for the concept “or”. Then a possible rule of inference would be:
From any formula “P ∨ Q” one can derive the reversed formula “Q ∨ P”.
This rule of inference is reasonable because whenever an or-statement (such as “You’re crazy or I’m crazy”) is true, then so is the flipped-around or-statement (“I’m crazy or you’re crazy”).
This particular ∨-flipping rule happens not to be one of PM’s rules of inference, but it could have been one. The point is just that this rule shows how one can mechanically shunt symbols and ignore their meanings, and yet preserve truth while doing so. This rule is rather trivial, but there are subtler ones that do real work. That, indeed, is the whole idea of symbolic logic, first suggested by Aristotle and then developed piecemeal over many centuries by such thinkers as Blaise Pascal, Gottfried Wilhelm von Leibniz, George Boole, Augustus De Morgan, Gottlob Frege, Giuseppe Peano, David Hilbert, and many others. Russell and Whitehead were simply developing the ancient dream of totally mechanizing reasoning in a more ambitious fashion than any of their predecessors had.
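Just to drive home how mindless such a rule is, here is how the ∨-flipping rule might look as a little computer routine (again a sketch of my own, in Python; it naïvely chops at the first “∨” and so ignores nested formulas, but that is beside the point). Notice that it consults nothing but the arrangement of the marks:

```python
# The hypothetical ∨-flipping rule as blind symbol-shunting: the routine
# never asks what "∨" means; it merely rearranges marks.
def flip_or(formula):
    """From a formula of the shape "P ∨ Q", mechanically produce "Q ∨ P"."""
    left, right = formula.split(" ∨ ", 1)   # chop at the first "∨" (naive, but illustrative)
    return right + " ∨ " + left

print(flip_or("You're crazy ∨ I'm crazy"))   # I'm crazy ∨ You're crazy
```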
Mechanizing the Mathematician’s Credo
If you apply PM’s rules of inference to its axioms (the seeds that constitute the “zeroth generation” of theorems), you will produce some “progeny” — theorems of the “first generation”. Then apply the rules once again to the first-generation theorems (as well as to the axioms) in all the different ways you can; you will thereby produce a new batch of theorems — the second generation. Then from that whole brew comes a third batch of theorems, and so on, ad infinitum, constantly snowballing. The infinite body of theorems of PM is fully determined by the initial seeds and by the typographical “growth rules” that allow one to make new theorems out of old ones.
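The snowballing itself is easy to mimic in a few lines of code (once more my own sketch; the “rules” below are stand-ins that take two inputs apiece, whereas PM’s actual rules of inference take one or more premises of quite specific forms):

```python
from itertools import product

# Illustrative only: grow "theorems" from "axioms" by blindly applying every
# rule to everything generated so far, generation after generation.
def generations(axioms, rules, how_many):
    theorems = set(axioms)                      # the zeroth generation: the seeds
    for _ in range(how_many):
        batch = {rule(a, b)
                 for rule in rules
                 for a, b in product(theorems, repeat=2)}
        theorems |= batch                       # the brew snowballs: old plus new
        yield sorted(theorems)

# A toy run: from the single seed 1 and the single "rule" of addition,
# successive generations of a club of positive integers pour forth.
for generation in generations({1}, [lambda a, b: a + b], 3):
    print(generation)
```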
Needless to say, the hope here is that all of these mechanically generated theorems of PM are true statements of number theory (i.e., no false statement is ever generated), and conversely, it is hoped that all true statements of number theory are mechanically generated as theorems of PM (i.e., no true statement is left ungenerated forever). The first of these hopes is called consistency, and the second one is called completeness.
In a nutshell, we want the entire infinite body of theorems of PM to coincide exactly with the infinite body of true statements in number theory — we want perfect, flawless alignment. At least that’s what Russell and Whitehead wanted, and they believed that with PM they had attained this goal (after all, “s0 + s0 = ss0” was a theorem, wasn’t it?).
Let us recall the Mathematician’s Credo, which in some form or other had existed for many centuries before Russell and Whitehead came along:
X is true because there is a proof of X;
X is true and so there is a proof of X.
The first line expresses the first hope expressed above — consistency. The second line expresses the second hope expressed above — completeness. We thus see that the Mathematician’s Credo is very closely related to what Russell and Whitehead were aiming for. Their goal, however, was to set the Credo on a new and rigorous basis, with PM serving as its pedestal. In other words, where the Mathematician’s Credo merely speaks of “a proof ” without saying what is meant by the term, Russell and Whitehead wanted people to think of it as meaning a proof within PM.
Gödel himself had great respect for the power of PM, as is shown by the opening sentences of his 1931 article:
The development of mathematics in the direction of greater exactness has — as is well known — led to large tracts of it becoming formalized, so that proofs can be carried out according to a few mechanical rules. The most comprehensive formal systems yet set up are, on the one hand, the system of Principia Mathematica (PM) and, on the other, the axiom system for set theory of Zermelo-Fraenkel (later extended by J. v. Neumann). These two systems are so extensive that all methods of proof used in mathematics today have been formalized in them, i. e., reduced to a few axioms and rules of inference.
And yet, despite his generous hat-tip to Russell and Whitehead’s opus, Gödel did not actually believe that a perfect alignment between truths and PM theorems had been attained, nor indeed that such a thing could ever be attained, and his deep skepticism came from having smelled an extremely strange loop lurking inside the labyrinthine palace of mindless, mechanical, symbol-churning, meaning-lacking mathematical reasoning.
Miraculous Lockstep Synchrony
The conceptual parallel between recursively defined sequences of integers and the leapfrogging set of theorems of PM (or, for that matter, of any formal system whatever, as long as it had axioms acting as seeds and rules of inference acting as growth mechanisms) suggested to Gödel that the typographical patterns of symbols on the pages of Principia Mathematica — that is, the rigorous logical derivations of new theorems from previous ones — could somehow be “mirrored” in an exact manner inside the world of numbers. An inner voice told him that this connection was not just a vague resemblance but could in all likelihood be turned into an absolutely precise correspondence.
More specifically, Gödel envisioned a set of whole numbers that would organically grow out of each other via arithmetical calculations much as Fibonacci’s F numbers did, but that would also correspond in an exact one-to-one way with the set of theorems of PM. For instance, if you made theorem Z out of theorems X and Y by using typographical rule R5, and if you made the number z out of numbers x and y using computational rule r5, then everything would match up. That is to say, if x were the number corresponding to theorem X and y were the number corresponding to theorem Y, then z would “miraculously” turn out to be the number corresponding to theorem Z. There would be perfect synchrony; the two sides (typographical and numerical) would move together in lock-step. At first this vision of miraculous synchrony was just a little spark, but Gödel quickly realized that his inchoate dream might be made so precise that it could be spelled out to others, so he started pursuing it in a dogged fashion.
Flipping between Formulas and Very Big Integers
In order to convert his intuitive hunch into a serious, precise, and respectable idea, Gödel first had to figure out how any string of PM symbols (irrespective of whether it asserted a truth or a falsity, or even was just a random jumble of symbols haphazardly thrown together) could be systematically converted into a positive integer, and conversely, how such an integer could be “decoded” to give back the string from which it had come. This first stage of Gödel’s dream, a systematic mapping by which every formula would receive a numerical “name”, came about as follows.
The basic alphabet of PM consisted of only about a dozen symbols (other symbols were introduced later but they were all defined in terms of the original few, so they were not conceptually necessary), and to each of these symbols Gödel assigned a different small integer (these initial few choices were quite arbitrary — it really didn’t matter what number was associated with an isolated symbol).
For multi-symbol formulas (by the way, in this book the terms “string of symbols” — “string” for short — and “formula” are synonymous), the idea was to replace the symbols, one by one, moving left to right, by their code numbers, and then to combine all of those individual code numbers (by using them as exponents to which successive prime numbers are raised) into one unique big integer. Thus, once isolated symbols had been assigned numbers, the numbers assigned to strings of symbols were not arbitrary.
For instance, suppose that the (arbitrary) code number for the symbol “0” is 2, and the code number for the symbol “=” is 6. Then for the three symbols in the very simple formula “0=0”, the code numbers are 2, 6, 2, and these three numbers are used as exponents for the first three prime numbers (2, 3, and 5) as follows:
2² · 3⁶ · 5² = 72900
So we see that 72900 is the single number that corresponds to the formula “0=0”. Of course this is a rather large integer for such a short formula, and you can easily imagine that the integer corresponding to a fifty-symbol formula is astronomical, since it involves putting the first fifty prime numbers to various powers and then multiplying all those big numbers together, to make a true colossus. But no matter — numbers are just numbers, no matter how big they are. (Luckily for Gödel, there are infinitely many primes, since if there had been merely, say, one billion of them, then his method would only have let him encode formulas made of a billion symbols or fewer. Now that would be a crying shame!)
The decoding process works by finding the prime factorization of 72900 (which is unique), and reading off the exponents that the ascending primes are raised to, one by one — 2, 6, 2 in this case.
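Readers who would enjoy seeing the coding and decoding carried out mechanically may like the following small sketch in Python (the assignment of 2 to “0” and 6 to “=” follows the example just given; everything else about the little program is merely my own illustration):

```python
# Gödel's coding in miniature: only the two symbols from the running example
# are assigned code numbers here; the rest of PM's alphabet is omitted.
CODE = {"0": 2, "=": 6}
DECODE = {number: symbol for symbol, number in CODE.items()}

def primes(how_many):
    """Return the first few prime numbers by simple trial division."""
    found, candidate = [], 2
    while len(found) < how_many:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def godel_number(formula):
    """Raise the k-th prime to the code of the k-th symbol and multiply."""
    number = 1
    for p, symbol in zip(primes(len(formula)), formula):
        number *= p ** CODE[symbol]
    return number

def decode(number):
    """Recover the formula by reading off exponents in the (unique) prime factorization."""
    symbols = []
    for p in primes(64):              # plenty of primes for short formulas
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(DECODE[exponent])
    return "".join(symbols)

print(godel_number("0=0"))    # 72900
print(decode(72900))          # 0=0
```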
To summarize, then, in this non-obvious but simple manner, Gödel had found a way to replace any given formula of PM by an equivalent number (which other people soon would dub its Gödel number). He then extended this idea of “arithmetization” to cover arbitrary sequences of formulas, since proofs in PM are sequences of formulas, and he wanted to be able to deal with proofs, not just isolated formulas. Thus an arbitrarily long sequence of formulas could be converted into one large integer via essentially the same technique, using primes and exponents. You can imagine that we’re talking really big numbers here.
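Continuing the same sketch (and reusing the functions primes and godel_number defined just above), a whole sequence of formulas, such as a proof, can be bundled into one integer by letting each formula’s own Gödel number serve as an exponent:

```python
def godel_number_of_sequence(formulas):
    """Encode a sequence of formulas as one integer: raise the k-th prime
    to the Gödel number of the k-th formula (illustrative sketch only)."""
    number = 1
    for p, formula in zip(primes(len(formulas)), formulas):
        number *= p ** godel_number(formula)
    return number

# Even a silly two-line "proof" made of our tiny formula yields a colossus:
print(len(str(godel_number_of_sequence(["0=0", "0=0"]))))   # roughly 57,000 digits
```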
In short, Gödel showed how any visual symbol-pattern whatsoever in the idiosyncratic notation of Principia Mathematica could be assigned a unique number, which could easily be decoded to give back the visual pattern (i.e., sequence of symbols) to which it corresponded. Conceiving of and polishing this precise two-way mapping, now universally called “Gödel numbering”, constituted the first key step of Gödel’s work.
Very Big Integers Moving in Lock-step with Formulas
The next key step was to make Fibonacci-like recursive definitions of special sets of integers — integers that would organically grow out of previously generated ones by addition or multiplication or more complex computations. One example would be the wff numbers, which are those integers that, via Gödel’s code, represent “well-formed” or “meaningful” formulas of PM, as opposed to those that represent meaningless or ungrammatical strings. (A sample well-formed formula, or “wff ” for short, would be “0+0=sss0”. Though it asserts a falsity, it’s still a meaningful statement. On the other hand, “=)0(=” and “00==0+=” are not wffs. Like the arbitrary sequence of pseudo-words “zzip dubbiwubbi pizz”, they don’t assert anything.) Since, as it happens, longer wffs are built up in PM from shorter wffs by just a few simple and standard rules of typographical juxtaposition, their larger code numbers can likewise be built up from the smaller code numbers of shorter ones by just a few simple and standard rules of numerical calculation.
I’ve said the foregoing rather casually, but in fact this step was perhaps the deepest of Gödel’s key insights — namely, that once strings of symbols had been “arithmetized” (given numerical counterparts), then any kind of rule-based typographical shunting-around of strings on paper could be perfectly paralleled by some kind of purely arithmetical calculation involving their numerical proxies — which were huge numbers, to be sure, but still just numbers. What to Russell and Whitehead looked like elaborate symbol-shunting looked like a lot of straightforward number-crunching to Kurt Gödel (although of course he didn’t use that colorful modern term, since this was all taking place back in the prehistoric days when computers didn’t yet exist). These were simply two different views of what was going on — views that were 100 percent equivalent and interchangeable.
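Here is one small taste of that equivalence (my own sketch once more, reusing primes and godel_number from above): even so humble a typographical act as placing one string directly after another is mirrored by a recipe that touches nothing but the two strings’ numerical proxies:

```python
def concatenate_numbers(x, y):
    """Given the Gödel numbers x and y of two strings, compute (by factoring,
    shifting exponents onto later primes, and multiplying) the Gödel number
    of the second string written directly after the first (sketch only)."""
    ps = primes(64)
    def exponents(n):
        exps = []
        for p in ps:
            if n == 1:
                break
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exps.append(e)
        return exps
    result = 1
    for p, e in zip(ps, exponents(x) + exponents(y)):   # y's exponents slide to later primes
        result *= p ** e
    return result

# "0=" followed by "0" had better encode the same string as "0=0":
assert concatenate_numbers(godel_number("0="), godel_number("0")) == godel_number("0=0")
```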
Glimmerings of How PM Can Twist Around and See Itself
Gödel saw that the game of building up an infinite class of numbers, such as wff numbers, through recursion — that is, making new “members of the club” by combining older, established members via some number-crunching rule — is essentially the same idea as Fibonacci’s recursive game of building up the class of F numbers by taking sums of earlier members. Of course recursive processes can be far more complicated than just taking the sum of the latest two members of the club.
What a recursive definition does, albeit implicitly, is to divide the entire set of integers into members and non-members of the club — that is, those numbers that are reachable, sooner or later, via the recursive building-up process, and those that are never reachable, no matter how long one waits. Thus 34 is a member of the F club, whereas 35 is a non-member. How do we know 35 is not an F number? That’s very easy — the rule that makes new F numbers always makes larger ones from smaller ones, and so once we’ve passed a certain size, there’s no chance we’ll be returning to “pick up” other numbers in that vicinity later. In other words, once we’ve made the F numbers 1, 2, 3, 5, 8, 13, 21, 34, 55, we know they are the only ones in that range, so obviously 35, 36, and so on, up to 54, are not F numbers.
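Indeed, the test is so simple that it practically writes itself as a little program (a sketch of my own, as before):

```python
# Since the growth rule only ever makes bigger F numbers from smaller ones,
# membership can be decided by growing the club until we reach or pass n.
def is_f_number(n):
    a, b = 1, 2                # the seeds
    while a < n:
        a, b = b, a + b        # each step strictly enlarges a
    return a == n

print(is_f_number(34), is_f_number(35))   # True False
```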
If, however, some other club of numbers is defined by a recursive rule whose outputs are sometimes bigger than its inputs and other times are smaller than its inputs, then, in contrast to the simple case of the F club, you can’t be so sure that you won’t ever be coming back and picking up smaller integers that were missed in earlier passes.
Let’s think a little bit more about the recursively defined club of numbers that we called “wff numbers”. We’ve seen that the number 72900 possesses “wff-ness”, and if you think about it, you can see that 576 and 2916 lack that quality. (Why? Well, if you factor them and look at the exponents of 2 and 3, you will see that these numbers are the numerical encodings of the strings “=0” and “0=”, respectively, neither of which makes sense, whence they are not well-formed formulas.) In other words, despite its odd definition, wff-ness, no more and no less than squareness or primeness or Fibonacci’s F-ness, is a valid object of study in the world of pure number. The distinction between members and non-members of the “wff club” is every bit as genuine a number-theoretical distinction as that between members and non-members of the club of squares, the club of prime numbers, or the club of F numbers, for wff numbers are definable in a recursive arithmetical (i.e., computational) fashion. Moreover, it happens that the recursive rules defining wff-ness always produce outputs that are bigger than their inputs, so that wff-ness shares with F-ness the simple property that once you’ve exceeded a certain magnitude, you know you’ll never be back visiting that zone again.
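To make this concrete, here is a toy version of such a test (a drastically shrunken fragment of my own devising, not PM’s real grammar): in a mini-language whose only symbols are “0” and “=”, let us call a string well formed exactly when it reads “0=0”. The routine reuses decode from the earlier sketch, so the wff-ness of a number is settled by arithmetic alone, namely by factoring:

```python
def has_wffness(n):
    """Does the integer n encode a well-formed formula of the toy fragment?"""
    try:
        return decode(n) == "0=0"      # factor n, read off the string, test its shape
    except KeyError:                   # some exponent codes no symbol at all
        return False

print(has_wffness(72900), has_wffness(576), has_wffness(2916))   # True False False
```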
Just as some people’s curiosity was fired by the fact of seeing a square in Fibonacci’s recursively defined sequence, so some people might become interested in the question as to whether there are any squares (or cubes, etc.) in the recursively defined sequence of wff numbers. They could spend a lot of time investigating such purely number-theoretical questions, never thinking at all about the corresponding formulas of Principia Mathematica.
One could be completely ignorant of the fact that Gödel’s wff numbers had their origin in Russell and Whitehead’s rules defining well-formedness in Principia Mathematica, just as one can study the laws of probability without ever suspecting that this deep branch of mathematics was originally developed to analyze gambling. What long ago inspired someone to dream up a particular recursive definition obviously doesn’t affect the numbers it defines; all that matters is that there should be a purely computational way of making any member of the club grow out of the initial seeds by applying the rules some finite number of times.