The Technology Trap


by Carl Benedikt Frey


  —WILLIAM NORDHAUS, “TWO CENTURIES OF PRODUCTIVITY GROWTH IN COMPUTING”

  The Luddites of the early 19th century surely had their voice heard, as did their likeminded emulators over the following decades. However, they could hardly expect to make a dent on their fate: democracy was still highly limited and living standards still very low for the vast majority, so that most people were just consumed by the need to provide for their basic needs. Much has changed since, and nowadays virtually every individual in advanced western countries has come to expect to be entitled, at least in principle, to full participation in every realm of society: the political, the economic, the cultural. The expectation is not just to vote in periodic elections but to have an influence via “participatory democracy”; not just to hold a job, but to partake in the benefits of economic growth—this is what constitutes “the democratization of expectations.”

  —MANUEL TRAJTENBERG, “AI AS THE NEXT GPT: A POLITICAL-ECONOMY PERSPECTIVE”

  The Danish physicist Niels Bohr supposedly once quipped that “God gave the easy problems to the physicists.” Since the scientific revolution, the steady accumulation of scientific knowledge has given the physical sciences much improved means of predicting outcomes. In economics, the opposite is true. While the laws of physics apply across time and space, in economics and other social sciences, boundary conditions are not timeless. Arguably, the predictability of economic outcomes peaked before the Industrial Revolution, when growth was slow or stagnant.

  It is true that technological progress follows an evolutionary process, meaning that invariant statements cannot be made over the long run. As we have seen in the preceding chapters, the potential scope of automation has steadily expanded over time. But we can establish some near-term engineering bottlenecks that currently set the boundaries for the type of tasks computers can perform. As we saw in chapter 9, routine jobs were eliminated in large numbers beginning in the 1980s. But already in the 1960s, the Bureau of Labor Statistics made the following observation: “Mechanization may indeed have created many dull and routine jobs; automation, however, is not an extension but a reversal of this trend: it promises to cut out just that kind of job and to create others of higher skill.”1 They predicted the Great Reversal two decades before it happened by observing what computers can do. Because it takes time before technologies are adopted and put into widespread use, we can infer the exposure of current jobs to future automation by examining technologies that are still imperfect prototypes.

  There is no economic law that postulates that the next three decades must mirror the last three. Much depends on what happens in technology and how people adjust. It is possible that we are on the cusp of a series of enabling technological breakthroughs that will create an abundance of new jobs for middle-class people. However, the empirical reality of the last decades points in the opposite direction, and there are good reasons to think that current trends will continue at least for some time, unless policies are implemented to counteract them. The employment prospects for the middle class crucially hinge upon what computers can and cannot do. And the division of labor between man and machine is constantly evolving. Recent breakthroughs in artificial intelligence (AI) mean that for the first time in history, machines are now able to learn. To better understand the next wave of automation, let’s begin by looking at exactly what computers can do in the age of AI.

  12

  ARTIFICIAL INTELLIGENCE

  A perfect storm of advances, including larger databases, Moore’s Law, and clever algorithms, has paved the way for much of the recent progress in artificial intelligence (AI). Most significantly over the past decade, this has led to automation extending beyond routine jobs and into new and unexpected areas. In the past rule-based era of computing, automation was limited to deductive instructions that had to be specified by a computer programmer. By discovering ways of automating things that we struggle to articulate or explain, like how to drive a car or to translate a news story, AI allows us to unravel Polanyi’s paradox, at least in part (see chapter 9).1 The fundamental difference is that instead of automating tasks by programming a set of instructions, we can now program computers to “learn” from samples of data or “experience.” When the rules of a task are unknown, we can apply statistics and inductive reasoning to let the machine learn by itself.
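  The contrast can be made concrete with a minimal sketch in Python (the spam-filter setting, feature names, and data below are invented for illustration): in the rule-based approach a programmer must spell out the decision logic, whereas in the learning approach the same kind of rule is induced from labelled examples.

```python
# Toy illustration: rule-based vs. learned classification.
from sklearn.tree import DecisionTreeClassifier

# Hand-coded rule: the programmer must articulate the logic explicitly.
def rule_based_spam_filter(num_links, has_greeting):
    if num_links > 3 and not has_greeting:
        return "spam"
    return "not spam"

# Learned rule: the same decision is induced from examples (features, labels).
X = [[0, 1], [1, 1], [5, 0], [7, 0], [2, 1], [6, 0]]   # [num_links, has_greeting]
y = ["not spam", "not spam", "spam", "spam", "not spam", "spam"]
model = DecisionTreeClassifier().fit(X, y)

print(rule_based_spam_filter(6, False))    # rule written by hand
print(model.predict([[6, 0]])[0])          # rule inferred from data
```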

  Outside of the technology sector, AI is still in the experimental stage. Yet the frontiers of AI research are steadily advancing, which in turn has expanded the potential set of tasks that computers can perform. The victory of DeepMind’s AlphaGo over the world’s best professional Go player, Lee Sedol, in 2016 is probably the best-known example. With the defeat of Sedol, humans lost their competitive edge in the last of the classical board games, two decades after being superseded in chess. As we all know, in a six-game match played in 1996, the chess champion Garry Kasparov prevailed against IBM’s Deep Blue by three wins to one, but he lost a historic rematch a year later.

  Relative to chess, the complexity of Go is striking. Go is played on a nineteen-by-nineteen grid, whereas chess uses a board of eight-by-eight squares. As the mathematician Claude Shannon demonstrated in 1950, in his seminal paper on how to program a machine to play chess, a lower-bound estimate of the number of possible moves in chess is greater than the number of atoms in the observable universe, and the number of possible moves in Go is vastly greater still. Indeed, even if every atom in the universe were its own universe and had inside it the number of atoms in our universe, there would still be fewer atoms than the number of possible legal moves in Go. The illimitable complexity of the game means that not even the best players are capable of breaking it down into meaningful rules. Instead, professionals play by recognizing patterns that emerge “when clutches of stones surround empty spaces.”3 As discussed above, humans still held the comparative advantage in pattern recognition when Frank Levy and Richard Murnane published their brilliant book The New Division of Labor in 2004.4 At the time, computers were nowhere near capable of challenging the human brain in identifying patterns. But now they are.
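  The arithmetic behind that comparison is rough but easy to check, using standard order-of-magnitude estimates rather than figures from the text (roughly 10^80 atoms in the observable universe and roughly 2 × 10^170 legal Go positions):

```python
# Order-of-magnitude check using standard published estimates.
atoms_in_universe = 10**80
legal_go_positions = 2 * 10**170
print(atoms_in_universe**2 < legal_go_positions)   # True: a universe of universes still falls short
```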

  Much more important than the fact that AlphaGo won is how it did so. While Deep Blue was a product of the rule-based age of computing, whose success rested upon the ability of a programmer to write explicit if-then-do rules for various board positions, AlphaGo’s evaluation engine was not explicitly programmed. Instead of following rules prespecified by a programmer, the machine was able to mimic tacit human knowledge, circumventing Polanyi’s paradox. Deep Blue was built on top-down programming. AlphaGo, in contrast, was the product of bottom-up machine learning. The computer inferred its own rules from a series of trials using a large data set. To learn, AlphaGo first watched previously played professional Go games, and then it played millions of games against itself, steadily improving its performance. Its training data set, consisting of thirty million board positions drawn from 160,000 games by professional players, was far greater than the experience any professional player could accumulate in a lifetime. The event marks what Erik Brynjolfsson and Andrew McAfee have called the “second half of the chessboard.”5 As Scientific American marveled, “An era is over and a new one is beginning. The methods underlying AlphaGo, and its recent victory, have huge implications for the future of machine intelligence.”6

  Deep Blue may have beaten Kasparov at chess. But ironically, at any other task, Kasparov would have won. The only thing Deep Blue could do was evaluate two hundred million board positions per second. It was designed for one specific purpose. AlphaGo, on the other hand, relies on neural networks, which can be used to perform a seemingly endless number of tasks. Using neural networks, DeepMind has already achieved superhuman performance at some fifty Atari video games, including Video Pinball, Space Invaders, and Ms. Pac-Man.7 Of course, a programmer provided the instruction to maximize the game score, but an algorithm learned the best game strategies by itself over thousands of trials. Unsurprisingly, AlphaGo (or AlphaZero, as the generalized version is called) also outperforms preprogrammed computers at chess. It took AlphaZero four hours to learn the game well enough to beat the best computers.
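  A toy sketch of the underlying idea, reduced to a five-cell corridor rather than an Atari game, shows how a strategy can emerge from nothing but a reward signal and repeated trials. This is plain tabular Q-learning on an invented environment, far simpler than DeepMind’s systems:

```python
# Toy tabular Q-learning: the only instruction supplied is the reward at the far
# end of a corridor; the policy of "always move right" is discovered by trial and error.
import random

N_STATES, ACTIONS = 5, (-1, +1)            # positions 0..4; reward only at position 4
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1      # learning rate, discount, exploration rate

def greedy(s):
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])  # break ties randomly

for _ in range(500):                        # 500 self-played "games"
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS) if random.random() < epsilon else greedy(s)
        s_next = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s_next == N_STATES - 1 else 0.0
        # nudge the value estimate toward what this trial actually delivered
        Q[(s, a)] += alpha * (reward + gamma * max(Q[(s_next, b)] for b in ACTIONS) - Q[(s, a)])
        s = s_next

print({s: greedy(s) for s in range(N_STATES - 1)})   # learned policy: move right everywhere
```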

  Much recent progress, like AlphaGo’s triumph, has been aided by exponentially growing data sets, collectively known as big data. When things are digitized, they can be stored and transferred at virtually no cost. The digitization of just about everything generates billions of gigabytes on a daily basis through web browsers, sensors, and other networked devices. Digital books, music, pictures, maps, texts, sensor readings, and so on constitute massive bodies of data, providing the raw material of our age. As an ever-growing percentage of the world’s population becomes digitally connected, more and more people gain access to a significant share of the world’s accumulated knowledge. This also means that more and more people are able to add to this knowledge base, creating a virtuous cycle. As billions of people interact online, they leave digital trails that allow algorithms to tap into their experience. According to Cisco, worldwide internet traffic will increase nearly threefold over the next five years, reaching 3.3 zettabytes per year by 2021.8 To put this number in perspective, researchers at the University of California, Berkeley estimate that the information contained in all books worldwide is around 480 terabytes, while a text transcript of all the words ever spoken by humans would amount to some five exabytes.9
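  The magnitudes quoted above are easier to grasp when set against one another. The figures in the snippet below are taken from the text, and the units are decimal (a zettabyte is 10^21 bytes):

```python
# Back-of-the-envelope check of the magnitudes quoted in the text.
ZETTABYTE, EXABYTE, TERABYTE, GIGABYTE = 1e21, 1e18, 1e12, 1e9

annual_traffic = 3.3 * ZETTABYTE          # Cisco forecast for 2021
all_books      = 480 * TERABYTE           # Berkeley estimate, all books worldwide
all_speech     = 5   * EXABYTE            # transcript of all words ever spoken

print(annual_traffic / 365 / GIGABYTE)    # ~9 billion gigabytes of traffic per day
print(annual_traffic / all_speech)        # ~660 times all words ever spoken, every year
print(all_speech / all_books)             # spoken words dwarf the world's books ~10,000-fold
```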

  Data can justly be regarded as the new oil. As big data gets bigger, algorithms get better. When we expose them to more examples, they improve their performance in translation, speech recognition, image classification, and many other tasks. For example, an ever-larger corpus of digitized human-translated text means that we are able to better judge the accuracy of algorithmic translators in reproducing observed human translations. Every United Nations report, which is always translated by humans into six languages, gives machine translators more examples to learn from.10 And as the supply of data expands, computers do better.
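  A small illustration of the pattern, using a stock digit-recognition data set rather than translation data, shows the same algorithm improving as its training set grows:

```python
# Learning curve sketch: one algorithm, progressively more examples, better accuracy.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for n in (50, 200, 800, len(X_train)):                 # grow the training set
    model = LogisticRegression(max_iter=5000).fit(X_train[:n], y_train[:n])
    print(n, round(accuracy_score(y_test, model.predict(X_test)), 3))
```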

  Google Translate draws on a plethora of algorithms, but it would be far less pervasive without the great leap in computer hardware powered by Moore’s Law. Many of the building blocks of computing—processing speed, microchip density, storage capacity, and so on—have seen decades of exponential improvements. For example, the idea of artificial neural networks (that is, layers of computational units that mimic how neurons connect in the brain) has been around since the 1980s, but the networks performed poorly due to constraints imposed by computational resources. So up until recently, machine translations relied on algorithms that analyzed phrases word by word from millions of human translations. However, phrase-based machine translations suffered from some serious shortcomings. In particular, the narrow focus meant that the algorithm often lost the broader context. A solution to this problem has been found in so-called deep learning, which uses artificial neural networks with more layers. These advances allow machine translators to better capture the structure of complex sentences. Neural Machine Translation (NMT), as it is called, used to be computationally expensive both in training and in translation inference. But due to the progression of Moore’s Law and the availability of larger data sets, NMT has now become viable.
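  A minimal sketch of such an encoder, with toy sizes chosen arbitrarily and no resemblance to Google’s production system, shows what “more layers” means in practice: the whole sentence is read through a stack of layers rather than scored phrase by phrase.

```python
# Toy neural encoder: a sentence is passed through stacked layers so that each
# word's representation carries information about its surrounding context.
import torch
import torch.nn as nn

VOCAB, EMBED, HIDDEN, LAYERS = 10_000, 128, 256, 4    # toy sizes, chosen arbitrarily

class TinyEncoder(nn.Module):
    """Reads a whole source sentence and compresses it into hidden states."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMBED)        # word ids -> vectors
        self.rnn = nn.GRU(EMBED, HIDDEN, num_layers=LAYERS, batch_first=True)

    def forward(self, token_ids):
        return self.rnn(self.embed(token_ids))         # (per-word outputs, stacked-layer summary)

sentence = torch.randint(0, VOCAB, (1, 12))            # a fake 12-token sentence
outputs, state = TinyEncoder()(sentence)
print(outputs.shape, state.shape)                      # context for each word, plus layer states
```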

  In machine translation, deep learning is not without its own drawbacks. One major challenge relates to the translation of rare words. For example, if you type the Japanese word for “once-in-a-lifetime encounter” into an NMT-based system, your output is likely to be “Forrest Gump.” While this might seem strange at first, this happened to be the subtitle of the Japanese version of the film. And because the word is rare, it did not show up in many other contexts. However, machine learning researchers have found some creative ways of circumventing this problem, at least in part, by dividing words into subunits. As a team of Google researchers demonstrated in a 2016 Nature article, the use of “word-units” and neural networks collectively reduced error rates by 60 percent, compared to the old phrase-based system.11 Though Google’s NMT system still lags behind human performance, it is catching up.
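  The subword idea can be illustrated with a toy segmenter (the vocabulary below is invented): a rare or unseen word is broken into pieces the system has seen many times, instead of being matched against a single whole-word entry.

```python
# Greedy longest-match segmentation into known subword units (toy vocabulary).
SUBWORDS = {"once", "in", "a", "life", "time", "encounter", "un", "translat", "able"}

def segment(word, vocab=SUBWORDS):
    """Break a word into the longest known pieces, falling back to characters."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):              # try the longest piece first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])                     # unseen fragment: keep the character
            i += 1
    return pieces

print(segment("untranslatable"))   # ['un', 'translat', 'able']
print(segment("lifetimes"))        # ['life', 'time', 's']
```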

  Like steam, electricity, and computers, AI is a general purpose technology (GPT), which has a wide range of applications. As the economists Iain Cockburn, Rebecca Henderson, and Scott Stern have shown, there has been a dramatic shift in AI-related publications, from computer science journals to application-oriented outlets. In 2015, the authors estimate, nearly two-thirds of all AI publications were outside the field of computer science.12 Their finding is consistent with the general observation that AI is being applied to a cascading variety of tasks. The same technology that has shown promising results in machine translation is also performing visual tasks, such as image recognition. Starting from the individual pixels in an image, these algorithms work up through increasingly complex features, like geometric patterns.
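  A toy convolutional network makes the “working up from pixels” description concrete; the layer sizes are arbitrary and the network is untrained, so it illustrates the architecture rather than any production system.

```python
# Toy convolutional network: early layers respond to local pixel patterns such as
# edges, later layers to increasingly complex combinations of them.
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # simple local features
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # combinations of features
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                               # scores for 10 labels
)

image = torch.rand(1, 3, 32, 32)          # one fake 32x32 colour image
print(classifier(image).shape)            # torch.Size([1, 10])
```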

  Image recognition has seen exponential progress in recent years. Error rates in the labeling of images have fallen from 30 percent in 2010 to 2 percent in 2017.13 While in many cases the technology is still at an experimental stage, it is already showing promising results. In Germany, for example, trials of automatic face recognition technology to identify people passing through Berlin’s Südkreuz railway station have proven successful, aiding the work of security officials. Interior Minister Thomas de Maizière reported that the right person had been recognized 70 percent of the time, while the algorithm had flagged the wrong person in less than 1 percent of cases, despite poor image quality.14 The same type of AI that identifies faces has also proven adept at diagnosing disease. New research published in Nature Medicine shows that AI is already capable of distinguishing between different types of lung cancers, using pathology images. And it does so with 97 percent accuracy.15 Another Nature article, published in 2017, used neural networks and a data set of 129,450 clinical images to test AI’s performance against twenty-one board-certified dermatologists and found that AI has already reached human-level performance: “The [algorithm] achieves performance on par with all tested experts across both tasks, demonstrating an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists. Outfitted with deep neural networks, mobile devices can potentially extend the reach of dermatologists outside of the clinic. It is projected that 6.3 billion smartphone subscriptions will exist by the year 2021 and can therefore potentially provide low-cost universal access to vital diagnostic care.”16

  Machines are not just turning into better translators and diagnosticians. They are becoming better listeners, too. Speech recognition technology is improving at staggering speed. In 2016, Microsoft announced a milestone in reaching human parity in transcribing conversations. And in August 2017, a research paper published by Microsoft’s AI team revealed additional improvements, reducing the error rate from 6 percent to 5 percent.17 And just as image recognition technology promises to replace doctors in diagnostic tasks, advances in speech recognition and user interfaces promise to replace workers in some interactive tasks. As we all know, Apple’s Siri, Google Assistant, and Amazon’s Alexa rely on natural user interfaces to recognize spoken words, interpret their meanings, and respond to them accordingly. Using speech recognition technology and natural language processing, a company called Clinc is now developing a new AI voice assistant to be used in drive-through windows of fast-food restaurants like McDonald’s and Taco Bell.18 And in 2018, Google announced that it is building AI technology to replace workers in call centers. Virtual agents will answer the phone when a customer calls. If a customer request involves something the algorithm cannot yet handle, the caller will automatically be rerouted to a human agent. Another algorithm then analyzes these conversations to identify patterns in the data, which in turn helps improve the capabilities of the virtual agent.19 As the technology evolves, its effects on the labor market could be significant. Despite decades of companies’ moving jobs offshore, roughly 2.2 million Americans still work in 6,800 call centers across the country, and several hundred thousand do similar jobs in smaller sites.20
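  The hand-off logic described above can be sketched in a few lines; the classifier, threshold, and function names here are hypothetical stand-ins, not Google’s or Clinc’s actual interfaces.

```python
# Sketch of a virtual agent with a human fallback: low-confidence requests are
# rerouted, and every exchange is logged as data for the next training round.
def handle_call(utterance, classify_intent, human_queue, log):
    intent, confidence = classify_intent(utterance)      # e.g. ("opening_hours", 0.93)
    log.append((utterance, intent, confidence))           # raw material for improving the agent
    if confidence < 0.8 or intent == "unknown":           # threshold is an arbitrary choice
        human_queue.append(utterance)                      # reroute the caller to a person
        return "Transferring you to an agent."
    return f"Handling request: {intent}"

# Example use with a stand-in classifier.
log, humans = [], []
fake_classifier = lambda text: ("opening_hours", 0.93) if "open" in text else ("unknown", 0.2)
print(handle_call("When are you open?", fake_classifier, humans, log))
print(handle_call("I want to dispute a charge", fake_classifier, humans, log))
```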

  * * *

  One of the greatest leaps forward has taken place in autonomous driving. In 2004, the Defense Advanced Research Projects Agency (DARPA)—set up by President Dwight Eisenhower in 1958, in response to the Soviet Union’s launch of the first artificial earth satellite, Sputnik 1—held its first “grand challenge” for driverless cars. The goal was to drive 142 miles through the Mojave Desert within ten hours without any human assistance. The farthest any of the vehicles got was 7.1 miles, and several cars did not even get off the starting line. The $1 million prize went unclaimed. Yet in 2016, the world’s first self-driving taxis were picking up passengers in Singapore.

  Recent progress in autonomous driving owes much to big data and clever algorithms. It is now possible to store representations of a complete road network in a car, which simplifies the navigation problem. The changing of seasons, which brings challenges like snow, was long a key bottleneck to algorithmic navigation. But by storing records from the last time that snow fell, AI can now handle this problem.21 AI researchers have shown that algorithmic drivers are now able to identify major changes in the environment in which they operate, such as roadwork.22 In a major study, my Oxford engineering colleagues Bonolo Mathibela, Paul Newman, and Ingmar Posner concluded: “A vehicle can therefore prepare for the possibility of encountering humans on the road, or areas where [the vehicle] may not be stationary—thus gaining a dynamic sense of situational awareness, like a human.”23
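  A toy version of that change-detection step, far simpler than the cited research, compares the current sensor sweep against a map stored from an earlier pass and flags whatever has changed:

```python
# Toy change detection against a stored prior map (illustrative only).
import numpy as np

prior_map = np.zeros((5, 5))              # 0 = free space recorded on an earlier drive
current_scan = prior_map.copy()
current_scan[2, 1:4] = 1                  # new obstacles detected across the lane, e.g. roadwork

changed = np.argwhere(current_scan != prior_map)
if changed.size:
    print("Environment has changed at cells:", changed.tolist())  # treat the area with caution
```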

 
