Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers


by John MacCormick


  The debate over the validity of the Church-Turing thesis rages on. But if its strongest version holds, then our computers aren't the only ones humbled by the limits of undecidability. The same limits would apply not only to the genius at our fingertips, but also to the genius behind them: our own minds.

  11

  Conclusion: More Genius at Your Fingertips?

  We can only see a short distance ahead, but we can see plenty there that needs to be done.

  —ALAN TURING, Computing Machinery and Intelligence, 1950

  I was fortunate, in 1991, to attend a public lecture by the great theoretical physicist Stephen Hawking. During the lecture, which was boldly titled “The Future of the Universe,” Hawking confidently predicted that the universe would keep expanding for at least the next 10 billion years. He wryly added, “I don't expect to be around to be proved wrong.” Unfortunately for me, predictions about computer science do not come with the same 10-billion-year insurance policy that is available to cosmologists. Any predictions I make may well be disproved during my own lifetime.

  But that shouldn't stop us thinking about the future of the great ideas of computer science. Will the great algorithms we've explored remain “great” forever? Will some become obsolete? Will new great algorithms emerge? To address these questions, we need to think less like a cosmologist and more like a historian. This brings to mind another experience I had many years ago, watching some televised lectures by the acclaimed, if controversial, Oxford historian A. J. P. Taylor. At the end of the lecture series, Taylor directly addressed the question of whether there would ever be a third world war. He thought the answer was yes, because humans would probably “behave in the future as they have done in the past.”

  So let's follow A. J. P. Taylor's lead and bow to the broad sweep of history. The great algorithms described in this book arose from incidents and inventions sprinkled throughout the 20th century. It seems reasonable to assume a similar pace for the 21st century, with a major new set of algorithms coming to the fore every two or three decades. In some cases, these algorithms could be stunningly original, completely new techniques dreamed up by scientists. Public key cryptography and the related digital signature algorithms are examples of this. In other cases, the algorithms may have existed in the research community for some time, waiting in the wings for the right wave of new technology to give them wide applicability. The search algorithms for indexing and ranking fall into this category: similar algorithms had existed for years in the field known as information retrieval, but it took the phenomenon of web search to make these algorithms “great,” in the sense of daily use by ordinary computer users. Of course, the algorithms also evolved for their new application; PageRank is a good example of this.

  Note that the emergence of new technology does not necessarily lead to new algorithms. Consider the phenomenal growth of laptop computers over the 1980s and 1990s. Laptops revolutionized the way people use computers, by vastly increasing accessibility and portability. And laptops also motivated hugely important advances in such diverse areas as screen technology and power management techniques. But I would argue that no great algorithms emerged from the laptop revolution. In contrast, the emergence of the internet is a technology that did lead to great algorithms: by providing an infrastructure in which search engines could exist, the internet allowed indexing and ranking algorithms to evolve toward greatness.

  Therefore, the undoubted acceleration of technology growth that continues to unfold around us does not, in and of itself, guarantee the emergence of new great algorithms. In fact, there is a powerful historical force operating in the other direction, suggesting that the pace of algorithmic innovation will, if anything, decrease in the future. I'm referring to the fact that computer science is beginning to mature as a scientific discipline. Compared to fields such as physics, mathematics, and chemistry, computer science is very young: it has its beginnings in the 1930s. Arguably, therefore, the great algorithms discovered in the 20th century may have consisted of low-hanging fruit, and it will become more and more difficult to find ingenious, widely applicable algorithms in the future.

  So we have two competing effects: new niches provided by new technology occasionally provide scope for new algorithms, while the increasing maturity of the field narrows the opportunities. On balance, I tend to think that these two effects will cancel each other out, leading to a slow but steady emergence of new great algorithms in the years ahead.

  SOME POTENTIALLY GREAT ALGORITHMS

  Of course, some of these new algorithms will be completely unexpected, and it's impossible to say anything more about them here. But there are existing niches and techniques that have clear potential. One of the obvious trends is the increasing use of artificial intelligence (and, in particular, pattern recognition) in everyday contexts, and it will be fascinating to see if any strikingly novel algorithmic gems emerge in this area.

  Another fertile area is a class of algorithms known as “zero-knowledge protocols.” These protocols use a special type of cryptography to achieve something even more surprising than a digital signature: they let two or more entities combine information without revealing any of the individual pieces of information. One potential application is for online auctions. Using a zero-knowledge protocol, the bidders can cryptographically submit their bids to each other in such a way that the winning bidder is determined, but no information about the other bids is revealed to anyone! Zero-knowledge protocols are such a clever idea that they would easily make it into my canon of great algorithms, if only they were used in practice. But so far, they haven't achieved widespread use.
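
  A genuine zero-knowledge protocol is too intricate to sketch here, but its simplest building block, the cryptographic commitment, gives a taste of how a bid can be kept hidden yet binding. The Python sketch below is purely illustrative (the function names and the use of SHA-256 are my own assumptions, not a real auction protocol): each bidder publishes a hash commitment first, and reveals the bid only after all commitments are in, so nobody can change a bid after seeing the others.

      import hashlib
      import secrets

      def commit(bid):
          """Commit to a bid: publish the digest, keep bid and nonce secret."""
          nonce = secrets.token_bytes(16)  # randomness hides the bid value
          digest = hashlib.sha256(nonce + str(bid).encode()).hexdigest()
          return digest, nonce

      def reveal_ok(digest, bid, nonce):
          """Check that a revealed (bid, nonce) pair matches the commitment."""
          return hashlib.sha256(nonce + str(bid).encode()).hexdigest() == digest

      # Phase 1: every bidder publishes a commitment.
      a_digest, a_nonce = commit(120)
      b_digest, b_nonce = commit(95)

      # Phase 2: bids are revealed and checked against the commitments.
      assert reveal_ok(a_digest, 120, a_nonce)
      assert reveal_ok(b_digest, 95, b_nonce)

  Note the gap between this sketch and the real thing: commit-and-reveal exposes every bid at the end, whereas a true zero-knowledge auction protocol would reveal only the winner.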

  Another idea that has received an immense amount of academic research but limited practical use is a technique known as “distributed hash tables.” These tables are an ingenious way of storing the information in a peer-to-peer system—a system that has no central server directing the flow of information. At the time of writing, however, many of the systems that claim to be peer-to-peer in fact use central servers for some of their functionality and thus do not need to rely on distributed hash tables.
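
  To make the idea of a distributed hash table concrete, here is a minimal Python sketch of consistent hashing, the core trick that DHT designs such as Chord build on: node names and keys are hashed onto a circular number line, and each key is stored at the first node clockwise from it. The class and names below are hypothetical simplifications; a real DHT adds routing tables so that no peer needs to know the entire node list.

      import hashlib
      from bisect import bisect_right

      def ring_position(name):
          """Hash a node name or key to a point on the ring."""
          return int(hashlib.sha256(name.encode()).hexdigest(), 16)

      class ToyDHT:
          """Each key lives at the first node clockwise from its position."""

          def __init__(self, node_names):
              self.ring = sorted((ring_position(n), n) for n in node_names)

          def node_for(self, key):
              positions = [pos for pos, _ in self.ring]
              i = bisect_right(positions, ring_position(key)) % len(self.ring)
              return self.ring[i][1]

      dht = ToyDHT(["peer-a", "peer-b", "peer-c"])
      print(dht.node_for("some-file"))  # any peer can compute this locally

  The payoff of this arrangement is that when a peer joins or leaves, only the keys on its own segment of the ring need to move, rather than everything being reshuffled.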

  The technique of “Byzantine fault tolerance” falls in the same category: a surprising and beautiful algorithm that can't yet be classed as great, due to lack of adoption. Byzantine fault tolerance allows certain computer systems to tolerate any type of error whatsoever (as long as there are not too many simultaneous errors). This contrasts with the more usual notion of fault tolerance, in which a system can survive more benign errors, such as the permanent failure of a disk drive or an operating system crash.
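
  One small ingredient of Byzantine fault tolerance can be sketched here: in protocols such as PBFT, a system of 3f + 1 replicas tolerates up to f arbitrarily faulty ones, and a client accepts an answer once f + 1 identical replies arrive, since at least one of those must come from a correct replica. The Python fragment below illustrates only that final vote (the names are mine, and the real protocol also needs several rounds of agreement among the replicas themselves).

      from collections import Counter

      def accept_reply(replies, f):
          """Accept a value once at least f + 1 identical replies arrive."""
          if not replies:
              return None
          value, votes = Counter(replies).most_common(1)[0]
          # f + 1 matching replies must include at least one correct replica.
          return value if votes >= f + 1 else None

      # Four replicas (f = 1): one replies arbitrarily, but truth wins out.
      print(accept_reply(["commit", "commit", "commit", "abort"], f=1))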

  CAN GREAT ALGORITHMS FADE AWAY?

  In addition to speculating about what algorithms might rise to greatness in the future, we might wonder whether any of our current “great” algorithms—indispensable tools that we use constantly without even thinking about them—might fade in importance. History can guide us here, too. If we restrict attention to particular algorithms, it is certainly true that algorithms can lose relevance. The most obvious example is in cryptography, in which there is a constant arms race between researchers inventing new crypto algorithms, and other researchers inventing ways to crack the security of those algorithms. As a specific instance, consider the so-called cryptographic hash functions. The hash function known as MD5 is an official internet standard and has been widely used since the early 1990s, yet significant security flaws have been found in MD5 since then, and its use is no longer recommended. Similarly, we discussed in chapter 9 the fact that the RSA digital signature scheme will be easy to crack if and when it becomes possible to build quantum computers of a reasonable size.
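
  To see what a cryptographic hash function actually does, here is a small Python example using the standard hashlib library. A hash function condenses any message into a short, fixed-size digest, and a tiny change to the message scrambles the digest completely; MD5 appears below only to illustrate the point, since it is no longer considered safe.

      import hashlib

      message = b"Pay Alice $100"

      # MD5: collisions can now be manufactured; shown for illustration only.
      print(hashlib.md5(message).hexdigest())

      # SHA-256 remains uncracked at the time of writing.
      print(hashlib.sha256(message).hexdigest())

      # Changing one character yields a completely different digest.
      print(hashlib.sha256(b"Pay Alice $900").hexdigest())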

  However, I think examples like this answer our question too narrowly. Sure, MD5 is broken (and, by the way, so is its main successor, SHA-1), but that doesn't mean the central idea of cryptographic hash functions is irrelevant. Indeed, such hash functions are used extremely widely, and there are plenty of uncracked ones out there. So, provided we take a broad enough view of the situation and are prepared to adapt the specifics of an algorithm while retaining its main ideas, it seems unlikely that many of our presently great algorithms will lose their importance in the future.

  WHAT HAVE WE LEARNED?

  Are there any common themes that can be drawn out from the great algorithms presented here? One theme, which was a great surprise to me as the author of the book, is that all of the big ideas can be explained without requiring previous knowledge of computer programming or any other computer science. When I started work on the book, I assumed that the great algorithms would fall into two categories. The first category would be algorithms with some simple yet clever trick at their core—a trick that could be explained without requiring any technical knowledge. The second category would be algorithms that depended so intimately on advanced computer science ideas that they could not be explained to readers with no background in this area. I planned to include algorithms from this second category by giving some (hopefully) interesting historical anecdotes about the algorithms, explaining their important applications, and vociferously asserting that the algorithm was ingenious even though I couldn't explain how it worked. Imagine my surprise and delight when I discovered that all the chosen algorithms fell into the first category! To be sure, many important technical details were omitted, but in every case, the key mechanism that makes the whole thing work could be explained using nonspecialist notions.

  Another important theme common to all our algorithms is that the field of computer science is much more than just programming. Whenever I teach an introductory computer science course, I ask the students to tell me what they think computer science actually is. By far the most common response is “programming,” or something equivalent such as “software engineering.” When pressed to provide additional aspects of computer science, many are stumped. But a common follow-up is something related to hardware, such as “hardware design.” This is strong evidence of a popular misconception about what computer scientists really do. Having read this book, I hope you have a much more concrete idea of the problems that computer scientists spend their time thinking about, and the types of solutions they come up with.

  A simple analogy will help here. Suppose you meet a professor whose main research interest is Japanese literature. It is extremely likely that the professor can speak, read, and write Japanese. But if you were asked to guess what the professor spends the most time thinking about while conducting research, you would not guess “the Japanese language.” Rather, the Japanese language is a necessary piece of knowledge for studying the themes, culture, and history that comprise Japanese literature. On the other hand, someone who speaks perfect Japanese might be perfectly ignorant of Japanese literature (there are probably millions of such people in Japan).

  The relationship between computer programming languages and the main ideas of computer science is quite similar. To implement and experiment with algorithms, computer science researchers need to convert the algorithms into computer programs, and each program is written in a programming language, such as Java, C++, or Python. Thus, knowledge of a programming language is essential for computer scientists, but it is merely a prerequisite: the main challenge is to invent, adapt, and understand algorithms. After seeing the great algorithms in this book, it is my hope that readers will have a much firmer grasp of this distinction.

  THE END OF OUR TOUR

  We've reached the end of our tour through the world of profound, yet everyday, computation. Did we achieve our goals? And will your interactions with computing devices be any different as a result?

  Well, it's just possible that next time you visit a secure website, you'll be intrigued to know who vouched for its trustworthiness, and check out the chain of digital certificates that were inspected by your web browser (chapter 9). Or perhaps the next time an online transaction fails for an inexplicable reason, you'll be grateful instead of frustrated, knowing that database consistency should ensure you won't be charged for something you failed to order (chapter 8). Or maybe you'll be musing to yourself one day, “Now wouldn't it be nice if my computer could do this for me”—only to realize that it's an impossibility, because your wished-for task can be proved undecidable using the same method as for our crash-finding program (chapter 10).

  I'm sure you can think of plenty more examples in which knowledge of the great algorithms might change the way you interact with a computer. However, as I was careful to state in the introduction, this wasn't the primary objective of the book. My chief goal was to give readers enough knowledge about the great algorithms that they gain a sense of wonder at some of their ordinary computing tasks—much as an amateur astronomer has a heightened appreciation of the night sky.

  Only you, the reader, can know whether I succeeded in this goal. But one thing is certain: your own personal genius is right at your fingertips. Feel free to use it.

  ACKNOWLEDGMENTS

  You road I enter upon and look around! I believe you are not all that is here;

  I believe that much unseen is also here.

  —WALT WHITMAN, Song of the Open Road

  Many friends, colleagues, and family members read some or all of the manuscript. Among them are Alex Bates, Wilson Bell, Mike Burrows, Walt Chromiak, Michael Isard, Alastair MacCormick, Raewyn MacCormick, Nicoletta Marini-Maio, Frank McSherry, Kristine Mitchell, Ilya Mironov, Wendy Pollack, Judith Potter, Cotten Seiler, Helen Takacs, Kunal Talwar, Tim Wahls, Jonathan Waller, Udi Wieder, and Ollie Williams. Suggestions from these readers resulted in a large number of substantial improvements to the manuscript. The comments of two anonymous reviewers also resulted in significant improvements.

  Chris Bishop provided encouragement and advice. Tom Mitchell gave permission to use his pictures and source code in chapter 6.

  Vickie Kearn (the book's editor) and her colleagues at Princeton University Press did a wonderful job of incubating the project and bringing it to fruition.

  My colleagues in the Department of Mathematics and Computer Science at Dickinson College were a constant source of support and camaraderie.

  Michael Isard and Mike Burrows showed me the joy and beauty of computing. Andrew Blake taught me how to be a better scientist.

  My wife Kristine was always there and is here still; much unseen is also here.

  To all these people I express my deepest gratitude. The book is dedicated, with love, to Kristine.

  SOURCES AND FURTHER READING

  As explained on page 8, this book does not use in-text citations. Instead, all sources are listed below, together with suggestions of further reading for those interested in finding out more about the great algorithms of computer science.

  The epigraph is from Vannevar Bush's essay “As We May Think,” originally published in the July 1945 issue of The Atlantic magazine.

  Introduction (chapter 1). For some accessible, enlightening explanations of algorithms and other computer technology, I recommend Chris Bishop's 2008 Royal Institution Christmas lectures, videos of which are freely available online. The lectures assume no prior knowledge of computer science. A. K. Dewdney's New Turing Omnibus usefully amplifies several of the topics covered in the present volume and introduces many more interesting computer science concepts—but some knowledge of computer programming is probably required to fully appreciate this book. Juraj Hromkovič's Algorithmic Adventures is an excellent option for readers with a little mathematical background, but no knowledge of computer science. Among the many college-level computer science texts on algorithms, three particularly readable options are Algorithms, by Dasgupta, Papadimitriou, and Vazirani; Algorithmics: The Spirit of Computing, by Harel and Feldman; and Introduction to Algorithms, by Cormen, Leiserson, Rivest, and Stein.

  Search engine indexing (chapter 2). The original AltaVista patent covering the metaword trick is U.S. patent 6105019, “Constrained Searching of an Index,” by Mike Burrows (2000). For readers with a computer science background, Search Engines: Information Retrieval in Practice, by Croft, Metzler, and Strohman, is a good option for learning more about indexing and many other aspects of search engines.

  PageRank (chapter 3). The opening quotation by Larry Page is taken from an interview by Ben Elgin, published in Businessweek, May 3, 2004. Vannevar Bush's “As We May Think” was, as mentioned above, originally published in The Atlantic magazine (July 1945). Bishop's lectures (see above) contain an elegant demonstration of PageRank using a system of water pipes to emulate hyperlinks. The original paper describing Google's architecture is “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” written by Google's co-founders, Sergey Brin and Larry Page, and presented at the 1998 World Wide Web conference. The paper includes a brief description and analysis of PageRank. A much more technical, wide-ranging analysis appears in Langville and Meyer's Google's PageRank and Beyond—but this book requires college-level linear algebra. John Battelle's The Search begins with an accessible and interesting history of the web search industry, including the rise of Google. The web spam mentioned on page 36 is discussed in “Spam, Damn Spam, and Statistics: Using Statistical Analysis to Locate Spam Web Pages,” by Fetterly, Manasse, and Najork, and published in the 2004 WebDB conference.

  Public key cryptography (chapter 4). Simon Singh's The Code Book contains superb, accessible descriptions of many aspects of cryptography, including public key. It also recounts in detail the story of the secret discovery of public key cryptography at GCHQ in Britain. Bishop's lectures (see above) contain a clever practical demonstration of the paint-mixing analogy for public key crypto.

  Error correcting codes (chapter 5). The anecdotes about Hamming are documented in Thomas M. Thompson's From Error-Correcting Codes through Sphere Packings to Simple Groups. The quotation from Hamming on page 60 is also given in this book and derives from a 1977 interview of Hamming by Thompson. Mathematicians will greatly enjoy Thompson's delightful book, but it definitely assumes the reader has a healthy dose of college math. Dewdney's book (see above) has two interesting chapters on coding theory. The two quotations about Shannon on pages 77-78 are taken from a brief biography by N. J. A. Sloane and A. D. Wyner, appearing in Claude Shannon: Collected Papers edited by Sloane and Wyner (1993).

 
