Once Upon an Algorithm
Page 22
However, this is not the case for loops in general. The walking loop to find your colleague’s office will not terminate if the termination condition is to find an office that doesn’t exist. The reason you wouldn’t search for a nonexisting office forever is that the office-finding problem is only a small part of a much larger set of goals for your workday, and if the algorithm is unable to solve the subproblem, it will be abandoned, and either a different method will be adopted or the goal will be replaced by another goal. If you consider programming a robot that simply executes the office-finding loop and that does not understand it as part of a bigger mission, the robot would go on forever (or until it runs out of energy or is stopped by a human).
So how can we distinguish between a terminating and a nonterminating loop? As we have seen, the answer is trivial in the case of for loops, which have a fixed number of iterations and thus always terminate. In the case of repeat loops (and while loops) we have to understand the relationship between the termination condition and the steps in the body of the loop. An interesting question to be investigated in chapter 11 is whether there is an algorithm that we can use to determine the termination behavior of loops. The answer may surprise you.
The runtimes of algorithms matter. A linear algorithm is preferable to a quadratic one, and an exponential algorithm is impractical for all but trivial input sizes. However, an exponential algorithm is still better than one that never terminates even for small inputs. The question of termination of an algorithm comes down to the question of termination of loops; it is one of utmost importance. This is also the question that haunts Phil Connors as he desperately tries to identify and satisfy the termination condition of the Groundhog Day loop.
11
Happy Ending Not Guaranteed
Loops and recursion inject power into algorithms. Loops enable algorithms to process inputs of arbitrary size and complexity. Without loops, algorithms could deal only with small and simple cases of inputs. Loops get algorithms off the ground. They are to algorithms what wings are to airplanes: without wings, airplanes still can move, but their full transportation potential cannot be realized. Similarly, there are some computations that algorithms can describe without loops, but the full power of computation can be realized only with loops. Such power also needs to be controlled, however. As the Groundhog Day story illustrates vividly, a control structure over which one has no control is not a blessing but a curse. Goethe’s poem “The Sorcerer’s Apprentice” may also come to mind (which may be better known in the United States through the Mickey Mouse sequence in the film Fantasia). The key to controlling a loop is to understand the condition that decides when it ends. Phil Connors can eventually end the loop he is in, leading to a happy ending. But this happens more as a coincidence than by following a plan.
Since algorithms are generally expected to terminate, one would like to know before executing an algorithm whether all the loops in it will terminate. Trying to figure this out can be quite a tedious and complicated endeavor, and therefore we ideally would like to delegate this task to an algorithm itself. The quest for an algorithm that can decide whether another algorithm will terminate is a famous problem in computer science called the halting problem. This chapter explains and discusses this problem and shows that it is actually unsolvable, which means that no such algorithm can exist. This is a rather surprising fact that reveals a lot about the limitations of algorithms and computation in general.
Even though the events that are unfolding in the Groundhog Day story look like a loop, the script of the story does not contain a loop. Instead, all repetitions of actions are spelled out in detail. Thus, it is not hard to determine that the story does end. After all, a movie producer must know the length of a movie before starting the project. But we can imagine a version of the story as a play in which the acting is not prescribed but improvised. That description could include a loop with a termination condition, but for the improv performance of Groundhog Day, it is not clear how long it would take. And if it weren’t for the ultimate exhaustion of the actors (and the audience), we wouldn’t know in advance if it terminated at all.
Out of Control
On his second day in Punxsutawney, Phil Connors starts to suspect that he might be reliving the previous day, after he wakes up to the same song on the radio and meets the same people in the same situations. Being deferred by someone on the phone to the next day, he replies:
Well, what if there is no tomorrow? There wasn’t one today.
[Phone disconnects.]
During the many repetitions of Groundhog Day, most events and encounters are identical to the day before. However, some details vary because Phil Connors reacts differently each time. Initially, as he gets used to his new situation, he tries to take advantage of and manipulate people by accumulating information over successive iterations of the Groundhog Day. This is most prominent in his attempt to learn everything about his producer-turned-lover Rita.
His strategy works, in principle at least, because of the fundamentally different way in which he and other people experience the Groundhog Day loop. Most important, while he perceives the repetition of the day, all other people are not aware of it at all. In fact, when he tries to share his plight with others, for example, with Rita or later with a psychiatrist, they think he is crazy. This fact provides him with an important advantage, since, in contrast to other people, he has the ability to remember information that happened in previous iterations.
A similar situation exists in the loops of algorithms. Some things remain constant over different loop iterations while others change. For example, in the loop to find an element in a list (see chapter 5) the list of elements doesn’t change, and neither does the element that we are looking for, but the current position in the list and thus the currently inspected element changes in every iteration until the element is found or the end of the list is reached.
A closer look reveals that the loop body and the termination condition have access to the state of the world, also just called state. As explained in chapter 10, this access happens through variables. The instructions of the loop body can both read the variables and change their values. In contrast, the termination condition can only read the variables to produce a true or false value to end or continue the loop, respectively.
In the element-finding algorithm the state consists of the list to be searched, the element to be found, and the position in the list that is currently being searched. To make the discussion more concrete, I use the following analogy.
Suppose you have taken a picture of a flower on one of your recent hikes and want to know what kind of flower it is. To this end, you want to look it up in a book of plants that contains on each page a description for a particular flower, including a picture. Assuming that the pictures in the book are not given in any particular order, you have to go through all the pages of the book to find the page that describes your flower. To make sure you find an entry for your flower, you have to look at each page until you have found it or exhausted all the pages. A simple method is to start with the first page, then look at the second, and so on.
The state for this search consists of three items: the picture of your flower, the book, and the current page in the book, given, for example, by a bookmark. This state is read by the termination condition, which checks whether the bookmark for the current page has reached the end of the book or whether the picture on the current page shows the same flower as your picture. In either of these two cases, the loop ends, but only in the latter case with a successful search. The loop must also contain an instruction to turn the page if the picture has not been found yet and when there are more pages with pictures to inspect. This step modifies the state of the loop because it changes the current page. This is also the only modification of the state that occurs in the loop.
It is clear that the state modification is critically important for the search algorithm because without turning pages you would not be able to find the picture (unless it happens to be
on the first page).
This fact is clear to Phil Connors, and he therefore tries to change things around Punxsutawney to let the Groundhog Day termination condition become true so that he can exit the loop. But not only is the state of the Groundhog Day loop much larger—it includes all the people (and Punxsutawney Phil), their thoughts and attitudes, and so on—it is not even clear what the state consists of. And so it is rather by coincidence that he is able to finally escape the loop.
Figure 11.1 Unfolding a loop means to create a sequence of copies of its body. For each iteration of the loop a separate copy of the loop body is created. The figure shows the state that is modified and how it changes (if at all). If the state never changes so that the termination condition is fulfilled, the sequence becomes infinite.
You may object that the Groundhog Day loop analogy to computing is invalid because the computer that executes it and the rules by which it operates are completely fictional—some kind of metaphysical purgatory engine to produce good people. While the Groundhog Day story is certainly fictional, its loop metaphor is apt. It also illustrates that computation can happen under very different circumstances, with computers that operate under a vast variety of rules. In fact, computation is often used to simulate hypothetical worlds and scenarios. As long as the rules for computation are logical and consistent, there is no limit to the computational scenarios that our imagination can create.
Are We There Yet?
As mentioned in chapter 1, the description of an algorithm and its execution are two very different things. Loops and their termination behavior bring this issue to the fore, as one may wonder how a finite description of a loop can lead to an infinitely long computation. This phenomenon can be best understood through a method for tracing the execution of a loop called loop unfolding (see figure 11.1). In the example of finding a flower picture in a book, the loop body consists of the action to turn the page. Unfolding the loop body means to produce a sequence of “turn page” instructions.
In the case of the Groundhog Day, the unfolding is less obvious, but the idea still applies. Phil Connors lives each day acting according to his goals, guided by his character, and reacting to the encounters he makes. Since we don’t have a precise description of this behavior, we can’t call it an algorithm, but the execution of the Groundhog Day loop still unfolds the actions from each day into a long sequence that eventually leads to the satisfaction of the termination condition. We cannot pin down a specific action that changed the state so that the termination condition was fulfilled. This is different from the example of finding a picture in a book, where it is clear that turning the page is a necessary action to change the state so that the search algorithm can terminate.
For example, suppose that we alternate between flipping forward and backward a page. While this is changing the state by turning pages, it obviously will prevent the algorithm from terminating (unless the book has only two pages). Of course, few people would do such a thing, but this example illustrates that changes to the state are not enough to ensure the termination of an algorithm. Rather, an algorithm must make the right changes to achieve termination.
A related aspect is that we want the algorithm not only to terminate but also to produce a correct result. Correctness in this example means that we either stop on the page that contains the sought flower (if it is in the book), or we stop on the last page knowing that the flower is not contained in the book. But now consider the following variation of the algorithm, which turns not one single page but two or more pages at a time. In this case, we may miss the picture in the book. The algorithm will still terminate, but when we have reached the last page, we can’t be sure that the flower is not contained in the book.
The termination condition of the Groundhog Day loop is Phil Connors’s being a good person. Since he doesn’t know this, he initially tries out all sorts of actions to change the state to achieve termination. This includes different forms of suicide and even the murder of Punxsutawney Phil—all to no avail. As he realizes that he has no control over the loop, he changes his attitude from cynical to making the best of the day by helping other people. Ultimately, as his transformation into a good person is complete, he can successfully exit the moral loop. And to make the happy ending perfect, Rita reciprocates his love for her.
The task that Phil Connors faces is especially hard, since he is flying blind. He has to satisfy the termination condition of the Groundhog Day loop without even knowing what it is. Moreover, he doesn’t know what the underlying state is and how to change it. A truly daunting task. Surely, finding out the termination of an algorithm must be much easier because we can see the termination condition, know what the underlying state is, and can see how the actions in the loop body might change the state.
No End in Sight
Suppose you are presented with an algorithm and have to decide whether the algorithm is worth running, that is, whether it produces a result in a finite amount of time. How would you go about it? Knowing that the only source of nonterminating behavior is loops, you would first identify all the loops in the algorithm. Then, for each loop you would try to understand the relationship between the termination condition and the instructions of the loop body, because the termination of a loop depends on the termination condition and how the values on which it depends are transformed by the loop body. This analysis will allow you to decide whether a particular loop terminates and whether the algorithm has a chance of terminating. To ensure the termination of the algorithm, you have to perform this analysis for each loop in the algorithm.
Since termination is such an important property of algorithms—it separates algorithms that actually solve a problem from those that don’t—it would be nice to know the termination property for any algorithm that one may want to use. However, performing a termination analysis for an algorithm is not an easy task, and it might take a considerable amount of time to carry out. Thus, we are tempted to automate this task, that is, to create an algorithm, say, Halts, that does the termination analysis automatically. Since we have many other algorithms for analyzing algorithms (for example, for parsing, see chapter 8), this doesn’t seem like such an outlandish idea.
Unfortunately, it is impossible to construct the algorithm Halts. It is not that this is too difficult at the moment or that computer scientists have not thought about the problem long and hard enough. No, we know that it is impossible in principle to craft an algorithm Halts. It is impossible now, and it will never be possible. This fact is often referred to as the unsolvability of the halting problem. The halting problem is a major problem in computer science. It was conceived by Alan Turing in 1936 as an example of an undecidable problem.
But why can’t we create the algorithm Halts? Since any algorithm to be analyzed by Halts is itself given by a finite description, it seems we would have to inspect only a finite number of instructions and how they affect the state that determines the termination condition.
To understand why it is indeed impossible to define Halts, let us first construct an algorithm, Loop, whose termination behavior is clear (see figure 11.2). Loop takes a number as a parameter and assigns it to the variable x. Application of Loop to the number 1—written as Loop(1)—yields a computation that assigns 1 to the variable x and stops, since the termination condition of the repeat loop is true (middle graphic in the figure). Otherwise, for any other number, it loops back to the assignment of the parameter to x. For instance, application of Loop to the number 2, Loop(2), yields a computation that assigns 2 to the variable x and loops forever, since the termination condition of the repeat loop is false (right-hand graphic in the figure).
Figure 11.2 The algorithm Loop stops when called with the number 1 as an argument but loops forever when called with any other argument. Left: The definition of the algorithm Loop. Middle: Application of Loop to the number 1. Right: Application of Loop to the number 2.
The strategy to show that Halts cannot exist is to assume that it does and then demonstrate that this assum
ption leads to a contradiction. This strategy is called proof by contradiction and is widely used in mathematics.
Thus, what would the algorithm Halts look like? As illustrated by the algorithm Loop, the termination behavior generally depends on the input to an algorithm, which means that Halts should have two parameters: one parameter, say, Alg, for the algorithm to be examined, and another parameter, say, Inp, for the input that the algorithm should be tested with. Thus, the algorithm Halts has the structure illustrated in figure 11.3.1
Equipped with the algorithm Halts, we can define another algorithm, Selfie, which leads to a contradiction in the assumption that Halts can exist. Selfie employs Halts in a rather peculiar way: it tries to determine whether an algorithm terminates when executed given its own description as input. In that case, it enters a nonterminating loop. Otherwise, it stops. The definition of Selfie is shown in figure 11.4.