Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers


by John MacCormick


  DIGITAL SIGNATURES IN PRACTICE

  Early on in this chapter, we learned that end-users like you and me don't have much need to sign things digitally. Some computer-savvy users do sign things like e-mail messages, but for most of us, the primary use of digital signatures is the verification of downloaded content. The most obvious example of this is when you download a new piece of software. If the software is signed, your computer “unlocks” the signature using the signer's public key and compares the result with the signer's “message”—in this case, the software itself. (As mentioned earlier, in practice the software is reduced to a much smaller message called a secure hash before it is signed.) If the unlocked signature matches the software, you get an encouraging message; otherwise, you see a more dire warning: examples of both were shown in the figure on page 150.
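  The lock-and-unlock structure of this check can be sketched with a toy RSA-style scheme. The numbers below are tiny textbook values and there is no padding, so this is nothing like production cryptography, but the shape is the same: the signer locks a secure hash of the message with a private key, and anyone can unlock the signature with the public key and compare.

```python
import hashlib

# Toy RSA-style parameters (illustrative only -- real signatures use
# keys thousands of bits long and padding schemes such as RSA-PSS).
p, q = 61, 53
n = p * q            # the public modulus (the chapter's "clock size"), 3233
e = 17               # public exponent (part of the public key)
d = 2753             # private exponent (the signer's secret "padlock")

def secure_hash(message: bytes) -> int:
    """Reduce the message to a small number via a secure hash."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """The signer locks the hash of the message with the private key."""
    return pow(secure_hash(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """The verifier unlocks the signature with the public key and
    compares the result with the hash of the message itself."""
    return pow(signature, e, n) == secure_hash(message)

software = b"contents of setup.exe"   # stand-in for a downloaded file
sig = sign(software)
print(verify(software, sig))            # True: encouraging message
print(verify(software, (sig + 1) % n))  # False: tampered signature, dire warning
```

  Note that tampering with either the signature or the software breaks the match, which is exactly why copying a signature onto different content achieves nothing.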

  As has been emphasized throughout, all of our schemes require some sort of trusted “bank” to store the signers' public keys and clock sizes. Fortunately, as you have probably noticed, you don't need to take a trip to a real bank every time you download some software. In real life, the trusted organizations that store public keys are known as certification authorities. All certification authorities maintain servers that can be contacted electronically to download public key information. So when your machine receives a digital signature, it will be accompanied by information stating which certification authority can vouch for the signer's public key.

  You have probably already noticed a problem here: sure, your computer can go ahead and verify the signature with the designated certification authority, but how can we trust the authority itself? All we have done is transfer the problem of verifying the identity of one organization (the one that sent you the software, say NanoSoft.com) to the problem of verifying the identity of another organization (the certification authority, say, TrustMe Inc.). Believe it or not, this problem is typically solved by the certification authority (TrustMe Inc.) referring you to yet another certification authority (say, PleaseTrustUs Ltd.) for verification, also via a digital signature. This type of chain of trust can be extended indefinitely, but we will always be stuck with the same problem: how can we trust the organization at the end of the chain? The answer, as shown in the figure above, is that certain organizations have been officially designated as so-called root certificate authorities, or root CAs for short. Among the better-known root CAs are VeriSign, GlobalSign, and GeoTrust. The contact details (including internet addresses and public keys) of a number of root CAs come pre-installed in your browser software when you acquire it, and that is how the chain of trust for digital signatures becomes anchored in a trustworthy starting point.

  A chain of trust for obtaining keys needed to verify digital signatures.
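  The chain-walking logic itself is simple enough to sketch. In the toy version below, the actual signature checks are stubbed out (a real verifier would check each issuer's signature at the marked line), and the company names are just the chapter's fictitious running examples; the point is how trust flows upward until it reaches a root CA that came pre-installed.

```python
# A structural sketch of certificate-chain verification. Only the
# chain walk is modeled; real systems also verify a signature at
# every link (see the comment below).

TRUSTED_ROOTS = {"PleaseTrustUs Ltd."}   # pre-installed in the browser

# Each certificate records whose key it vouches for and which
# authority issued (signed) it. The root CA issues its own.
CERTS = {
    "NanoSoft.com":       {"key": "nano-key",  "issuer": "TrustMe Inc."},
    "TrustMe Inc.":       {"key": "trust-key", "issuer": "PleaseTrustUs Ltd."},
    "PleaseTrustUs Ltd.": {"key": "root-key",  "issuer": "PleaseTrustUs Ltd."},
}

def chain_of_trust(subject: str) -> bool:
    """Follow issuer links until we reach a root CA we already trust."""
    seen = set()
    while subject not in TRUSTED_ROOTS:
        if subject in seen or subject not in CERTS:
            return False   # a loop, or an issuer nobody vouches for
        seen.add(subject)
        # (a real verifier would check the issuer's signature here)
        subject = CERTS[subject]["issuer"]
    return True

print(chain_of_trust("NanoSoft.com"))   # True: anchored at a root CA
print(chain_of_trust("EvilCorp"))       # False: no chain to a root
```

  The loop check matters: a ring of authorities all vouching for one another proves nothing unless the ring passes through a pre-installed root.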

  A PARADOX RESOLVED

  At the start of this chapter, I pointed out that the very phrase “digital signature” could be regarded as an oxymoron: anything digital can be copied, yet a signature should be impossible to copy. How was this paradox resolved? The answer is that a digital signature depends on both a secret known only to the signer and on the message being signed. The secret (which we called a padlock throughout this chapter) stays the same for each message signed by a particular entity, but the signature is different for each message. Thus, the fact that anyone can easily copy the signature is irrelevant: the signature cannot be transferred to a different message, so merely copying it does not create a forgery.

  The resolution of this paradox is not just a cunning and beautiful idea. Digital signatures are also of immense practical importance: without them, the internet as we know it would not exist. Data could still be exchanged securely using cryptography, but it would be far more difficult to verify the source of any data received. This combination of a profound idea with such wide practical impact means that digital signatures are, without doubt, one of the most spectacular achievements of computer science.

  10

  What Is Computable?

  Let me remind you of some of the problems of computing machines.

  —RICHARD FEYNMAN (1965 Nobel Prize in physics)

  We've now seen quite a number of clever, powerful, and beautiful algorithms—algorithms that turn the bare metal of a computer into a genius at your fingertips. In fact, it would be natural to wonder, based on the rhapsodic rhetoric in the preceding chapters, if there is anything that computers cannot do for us. The answer is absolutely clear if we limit ourselves to what computers can do today: there are plenty of useful tasks (mostly involving some form of artificial intelligence) that computers can't, at present, perform well. Examples include high-quality translation between languages like English and Chinese, automatically controlling a vehicle to drive safely and quickly in a busy city environment, and (as a teacher, this is a big one for me) grading students' work.

  Yet, as we have seen already, it is often surprising what a really clever algorithm can achieve. Perhaps tomorrow, someone will invent an algorithm that will drive a car perfectly or do an excellent job of grading my students' work. These do seem like hard problems—but are they impossibly hard? Indeed, is there any problem at all that is so difficult, no one could ever invent an algorithm to solve it? In this chapter, we will see that the answer is a resounding yes: there are problems that can never be solved by computers. This profound fact—that some things are “computable” and others are not—provides an interesting counterpoint to the many algorithmic triumphs we've seen in the preceding chapters. No matter how many clever algorithms are invented in the future, there will always be problems whose answers are “uncomputable.”

  The existence of uncomputable problems is striking enough on its own, but the story of their discovery is even more remarkable.

  The existence of such problems was known before the first electronic computers were ever built! Two mathematicians, one American and one British, independently discovered uncomputable problems in the late 1930s—several years before the first real computers emerged during the Second World War. The American was Alonzo Church, whose groundbreaking work on the theory of computation remains fundamental to many aspects of computer science. The Briton was none other than Alan Turing, who is commonly regarded as the single most important figure in the founding of computer science. Turing's work spanned the entire spectrum of computational ideas, from intricate mathematical theory and profound philosophy to bold and practical engineering. In this chapter, we will follow in the footsteps of Church and Turing on a journey that will eventually demonstrate the impossibility of using a computer for one particular task. That journey begins in the next section, with a discussion of bugs and crashes.

  BUGS, CRASHES, AND THE RELIABILITY OF SOFTWARE

  The reliability of computer software has improved tremendously in recent years, but we all know that it's still not a good idea to assume software will work correctly. Very occasionally, even high-quality, well-written software can do something it was not intended to do. In the worst cases, the software will “crash,” and you lose the data or document you were working on (or the video game you were playing—very frustrating, as I know from my own experience). But as anyone who encountered home computers in the 1980s and 90s can testify, computer programs used to crash an awful lot more frequently than they do in the 21st century. There are many reasons for this improvement, but among the chief causes are the great advances in automated software checking tools. In other words, once a team of computer programmers has written a large, complicated computer program, they can use an automatic tool to check their newly created software for problems that might cause it to crash. And these automated checking tools have been getting better and better at finding potential mistakes.

  So a natural question to ask would be: will the automated software-checking tools ever get to the point where they can detect all potential problems in all computer programs? This would certainly be nice, since it would eliminate the possibility of software crashes once and for all. The remarkable thing that we'll learn in this chapter is that this software nirvana will never be attained: it is provably impossible for any software-checking tool to detect all possible crashes in all programs.

  It's worth commenting a little more on what it means for something to be “provably impossible.” In most sciences, like physics and biology, scientists make hypotheses about the way certain systems behave, and conduct experiments to see if the hypotheses are correct. But because the experiments always have some amount of uncertainty in them, it's not possible to be 100% certain that the hypotheses were correct, even after a very successful experiment. However, in stark contrast to the physical sciences, it is possible to claim 100% certainty about some of the results in mathematics and computer science. As long as you accept the basic axioms of mathematics (such as 1 + 1 = 2), the chain of deductive reasoning used by mathematicians results in absolute certainty that various other statements are true (for example, “any number that ends in a 5 is divisible by 5”). This kind of reasoning does not involve computers: using only a pencil and paper, a mathematician can prove indisputable facts.

  So, in computer science, when we say that “X is provably impossible,” we don't just mean that X appears to be very difficult, or might be impossible to achieve in practice. We mean that it is 100% certain that X can never be achieved, because someone has proved it using a chain of deductive, mathematical reasoning. A simple example would be “it is provably impossible that a multiple of 10 ends with the digit 3.” Another example is the final conclusion of this chapter: it is provably impossible for an automated software-checker to detect all possible crashes in all computer programs.

  PROVING THAT SOMETHING ISN'T TRUE

  Our proof that crash-detecting programs are impossible is going to use a technique that mathematicians call proof by contradiction. Although mathematicians like to lay claim to this technique, it's actually something that people use all the time in everyday life, often without even thinking about it. Let me give you a simple example.

  To start with, we need to agree on the following two facts, which would not be disputed by even the most revisionist of historians:

  1. The U.S. Civil War took place in the 1860s.

  2. Abraham Lincoln was president during the Civil War.

  Now, suppose I made the statement: “Abraham Lincoln was born in 1520.” Is this statement true or false? Even if you knew nothing whatsoever about Abraham Lincoln, apart from the two facts above, how could you quickly determine that my statement is false?

  Most likely, your brain would go through a chain of reasoning similar to the following: (i) No one lives for more than 150 years, so if Lincoln was born in 1520, he must have died by 1670 at the absolute latest; (ii) Lincoln was president during the Civil War, so the Civil War must have occurred before he died—that is, before 1670; (iii) but that's impossible, because everyone agrees the Civil War took place in the 1860s; (iv) therefore, Lincoln could not possibly have been born in 1520.

  But let's try to examine this reasoning more carefully. Why is it valid to conclude that the initial statement was false? It is because we proved that this claim contradicts some other fact that is known to be true. Specifically, we proved that the initial statement implies the Civil War occurred before 1670—which contradicts the known fact that the Civil War took place in the 1860s.

  Proof by contradiction is an extremely important technique, so let's do a slightly more mathematical example. Suppose I made the following claim: “On average, a human heart beats about 6000 times in 10 minutes.” Is this claim true or false? You might immediately be suspicious, but how would you go about proving to yourself that it is false? Spend a few seconds now trying to analyze your thought process before reading on.

  Again, we can use proof by contradiction. First, assume for argument's sake that the claim is true: human hearts average 6000 beats in 10 minutes. If that were true, how many beats would occur in just one minute? On average, it would be 6000 divided by 10, or 600 beats per minute. Now, you don't have to be a medical expert to know that this is far higher than any normal pulse rate, which is somewhere between 50 and 150 beats per minute. So the original claim contradicts a known fact and must be false: it is not true that human hearts average 6000 beats in 10 minutes.
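  The arithmetic behind this argument is short enough to spell out explicitly: assume the claim, derive the per-minute rate it implies, and check that rate against the known normal range.

```python
# Proof by contradiction, heartbeat version: assume the claim is true,
# derive a consequence, and compare it with a known fact.
claimed_beats, minutes = 6000, 10
implied_rate = claimed_beats // minutes   # 600 beats per minute, if the claim held

# Known fact: a normal resting pulse is roughly 50 to 150 beats per minute.
contradiction = not (50 <= implied_rate <= 150)
print(implied_rate)     # 600
print(contradiction)    # True: the consequence is false, so the claim is false
```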

  In more abstract terminology, proof by contradiction can be summarized as follows. Suppose you suspect that some statement S is false, but you would like to prove beyond doubt that it is false. First, you assume that S is true. By applying some reasoning, you work out that some other statement, say T, must also be true. If, however, T is known to be false, you have arrived at a contradiction. This proves that your original assumption (S) must have been false.

  A mathematician would state this much more briefly, by saying something like “S implies T, but T is false, therefore S is false.” That is proof by contradiction in a nutshell. The following table shows how to connect this abstract version of proof by contradiction with the two examples above:

  Statement S (assumed true)               | Consequence T (implied by S)          | Known fact contradicting T
  Lincoln was born in 1520                 | The Civil War occurred before 1670    | The Civil War took place in the 1860s
  Hearts average 6000 beats in 10 minutes  | A pulse averages 600 beats per minute | A normal pulse is 50 to 150 beats per minute

  For now, our detour into proof by contradiction is finished. The final goal of this chapter will be to prove, by contradiction, that a program which detects all possible crashes in other programs cannot exist. But before marching on toward this final goal, we need to gain familiarity with some interesting concepts about computer programs.

  PROGRAMS THAT ANALYZE OTHER PROGRAMS

  Computers slavishly follow the exact instructions in their computer programs. They do this completely deterministically, so the output of a computer program is exactly the same every time you run it. Right? Or wrong? In fact, I haven't given you enough information to answer this question. It's true that certain simple computer programs produce exactly the same output every time they are run, but most of the programs we use every day look very different every time we run them. Consider your favorite word processing program: does the screen look the same every time it starts up? Of course not—it depends on what document you opened. If I use Microsoft Word to open the file “address-list.docx,” the screen will display a list of addresses that I keep on my computer. If I use Microsoft Word to open the file “bank-letter.docx,” I see the text of a letter I wrote to my bank yesterday. (If the “.docx” here seems mysterious to you, check out the box on the facing page to find out about file name extensions.)

  Let's be very clear about one thing: in both cases, I'm running exactly the same computer program, which is Microsoft Word. It's just that the inputs are different in each case. Don't be fooled by the fact that all modern operating systems let you run a computer program by double-clicking on a document. That is just a convenience that your friendly computer company (most likely Apple or Microsoft) has provided you. When you double-click on a document, a certain computer program gets run, and that program uses the document as its input. The output of the program is what you see on the screen, and naturally it depends on what document you clicked on.

  Throughout this chapter, I'll be using file names like “abcd.txt.” The part after the period is called the “extension” of the file name—in this case, the extension of “abcd.txt” is “txt.” Most operating systems use the extension of a file name to decide what type of data the file contains. For example, a “.txt” file typically contains plain text, a “.html” file typically contains a web page, and a “.docx” file contains a Microsoft Word document. Some operating systems hide these extensions by default, so you might not see them unless you turn off the “hide extensions” feature in your operating system. A quick web search for “unhide file extensions” will turn up instructions on how to do this.

  Some technical details about file name extensions.
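  For the curious, splitting an extension off a file name takes one line in most programming languages. Here is a Python sketch (note that Python's os.path.splitext keeps the period attached to the extension):

```python
import os.path

# How a file name splits into a base and an extension; operating
# systems use the extension to guess what type of data the file holds.
for name in ["abcd.txt", "bank-letter.docx", "photo.jpg"]:
    base, ext = os.path.splitext(name)
    print(base, ext)   # e.g. "abcd" and ".txt"
```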

  In reality, the input and output of computer programs is quite a bit more complex than this. For instance, when you click on menus or type into a program, you are giving it additional input. And when you save a document or any other file, the program is creating additional output. But to keep things simple, let's imagine that programs accept exactly one input, which is a file stored on your computer. And we'll also imagine that programs produce exactly one output, which is a graphical window on your monitor.
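  In this simplified model, a program is nothing more than a rule that turns the contents of one input file into one output. Two toy "programs" in that spirit (the names and the sample document are purely illustrative):

```python
# The chapter's simplified model: a program takes exactly one input
# (the contents of a file) and produces exactly one output (what
# would be shown on the screen).

def count_lines(file_contents: str) -> str:
    """A toy 'program' whose output reports how many lines its input has."""
    return f"{len(file_contents.splitlines())} line(s)"

def shout(file_contents: str) -> str:
    """Another toy 'program': it echoes its input in capital letters."""
    return file_contents.upper()

document = "Dear bank,\nPlease send money."   # contents of some file
print(count_lines(document))   # 2 line(s)
print(shout(document))         # the same letter, capitalized
```

  Notice that either "program" happily accepts any file contents at all, which is exactly the point of the next few paragraphs.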
  Unfortunately, the modern convenience of double-clicking on files clouds an important issue here. Your operating system uses various clever tricks to guess which program you would like to run whenever you double-click on a file. But it's very important to realize that it's possible to open any file using any program. Or to put it another way, you can run any program using any file as its input. How can you do this? The box on the next page lists several methods you can try. These methods will not work on all operating systems, or on all choices of input file—different operating systems launch programs in different ways, and they sometimes limit the choice of input file due to security concerns. Nevertheless, I strongly urge you to experiment for a few minutes with your own computer, to convince yourself that you can run your favorite word processing program with various different types of input files.

  Here are three ways you could run the program Microsoft Word using stuff.txt as the input file:

  • Right-click on stuff.txt, choose “Open with…,” and select Microsoft Word.

  • First, use the features of your operating system to place a shortcut to Microsoft Word on your desktop. Then drag stuff.txt onto this Microsoft Word shortcut.

  • Open the Microsoft Word application directly, go to the “File” menu, choose the “Open” command, make sure the option to display “all files” is selected, then choose stuff.txt.

  Various ways of running a program with a particular file as its input.

  Microsoft Excel run with “photo.jpg” as its input. The output is garbage, but the important point is that you can, in principle, run any program on any input you want.

  Obviously, you can get rather unexpected results if you open a file using a program it was not intended for. In the figure above, you can see what happens if I open the picture file “photo.jpg” with my spreadsheet program, Microsoft Excel. In this case, the output is garbage and is no use to anyone. But the spreadsheet program did run, and did produce some output.

 
