by Ajay Agrawal
people who have the lowest wages, namely, pure manual labor jobs that can’t
be outsourced at all. Information technology might be progressive at the
lower end of the income distribution while hollowing out the middle, arguably
a phenomenon we have seen in the United States. The biggest effects
for income distribution might be across borders rather than within nations.
Or, to put it another way, Africa may never have the chance to follow in the
footsteps of Japan and South Korea with respect to industrialization.
From an egalitarian point of view, these distributional effects may be hard
to address, precisely because they cross borders. Citizens are often willing to
support income redistribution within their nations, but they are much less
likely to favor significant investments in foreign aid, especially when it is to
distant nations rather than to neighbors or major trading partners.
15.3 The Political Economy of Artificial
Intelligence and Income Redistribution
Discussions of artificial intelligence sometimes postulate large numbers
of unemployed or underemployed people, possibly living off a guaranteed
annual income or some other form of massive redistribution. On one hand, I
can see the reason for considering a shift to larger cash payments. Yet the eco-
nomics, politics, and sociology of guaranteed income may create problems.
If you ask which are the countries today where citizens hardly do any
work, Brunei and Qatar, two resource-rich monarchies, come to mind. In
each country people get a lot of money from the government, and foreign
workers do much of the labor. From an analytical point of view, that is not
so different from relying on robots.
The recent histories of those countries indicate that redistribution is a
politically tricky concept. Imagine for instance a polity where virtually the
entire gross domestic product is in some way recycled or redistributed. I
expect the resulting political economy would not resemble that of Norway,
as Norway without oil still would have a living standard close to that of
Sweden or Denmark. Brunei or Qatar without fossil fuels likely would be
much poorer. Given that reality, when so much of the gross domestic product
(GDP) is being redistributed through politics, I wonder if this is compatible
with American or Western notions of democracy. For instance, the oligarchic
political forces that control the oil might make upfront offers to the
interest groups that might oppose them and cement their control. Indeed
those monarchies do seem to be stable, and it is far from obvious that they
are evolving toward democracy. Their governments are partially benevolent
toward the citizenry, but they also use a lot of the surplus to achieve their
own ends, which may be religious or ideological. It seems countries that rely
on fossil fuels for their GDP don’t end up with the thick middle class that
in the West at least partially controls the government, and is also a dominant
force in our civic society and social capital. Possibly oil-rich countries
do not have the economic base to sustain a version of Western-style liberal
democracy, and that has something to do with so much of the GDP being
recycled and redistributed. That is correlated with having a politically weak
middle class and an opposition that is too easily bought off; at least that is
what we have observed to date in some fossil-fuel-rich small states.
The experience of Brunei and Qatar also raises the question of what the
governmental authority should be redistributing. In simple economic mod-
els, cash is redistributed to those who typically need it most. But in more
comfortable settings with a lot of resource wealth, it also may be necessary
to redistribute status. That’s harder to do; for the social scientist, it is also
harder to model. We may need to redistribute the notion of having a mean-
ingful job because although Qatar and Brunei have high per capita incomes,
including at the median, it is not obvious to all outside observers that their
citizens are happy and fulfilled.
It’s possible that government “make-work” jobs will supply status to
people, but there is also a danger the make-work component will be too
obvious, and the resulting jobs will bring low rather than high status. In the
last US presidential campaign, Hillary Clinton spoke more of redistribution
and Donald Trump talked more of jobs; Trump’s message seemed to be the
more effective of the two.
Some desired redistributions may cross gender lines. For instance, as the
population ages there will be a greater care burden for women than men,
as women seem to put more time and effort into caring for their aging parents.
Redistribution of money toward women may help, but at its core the
problem may be one of stress rather than money per se. A change in social
norms may produce a better and more effective redistribution than simply
sending around checks.
If we think of caring for the elderly as a potential job with a lot of growth
potential, on average women may be better at this than men, which in the
labor market context serves as a penalty on being male, again to speak of the
averages only. More generally, the shift toward service-sector jobs may favor
women more than unskilled men. Once again, the public policies needed for many
men may differ from those needed for women, and cash is not always
the appropriate tool for recognizing those distinctions.
The general idea is that in these stranger futures, what redistribution is, or
has to be, is something quite different from what it is in the simple Paretian
model. That is a frontier issue where we economists haven’t done much
work at all, but the ongoing progress of AI may make those questions all
the more relevant.
III Machine Learning and Regulation
Artificial Intelligence, Economics,
and Industrial Organization
Hal Varian
16.1 Introduction
Machine learning (ML) and artificial intelligence (AI) have been around
for many years. However, in the last five years, remarkable progress has been
made using multilayered neural networks in diverse areas such as image recognition,
speech recognition, and machine translation. Artificial intelligence
is a general purpose technology that is likely to impact many industries. In
this chapter I consider how machine learning availability might affect the
industrial organization of both firms that provide AI services and industries
that adopt AI technology. My intent is not to provide an extensive overview
of this rapidly evolving area, but instead to provide a short summary of
some of the forces at work and to describe some possible areas for future
16.2 Machine-
Imagine we have a set of digital images along with a set of labels that
describe what is depicted in those images—things like cats, dogs, beaches,
mountains, cars, or people. Our goal is to use this data to train a computer
to learn how to predict labels for some new set of digital images. (For a nice
demonstration, see .com/ vision where you can upload a photo
and retrieve a list of labels appropriate for that photo.)

Hal Varian is an emeritus professor at the University of California, Berkeley, and chief economist at Google.
Carl Shapiro and I started drafting this chapter with the goal of producing a joint work.
Unfortunately, Carl became very busy and had to drop out of the project. I am grateful to him for the time he was able to put in. I also would like to thank Judy Chevalier and the participants of the NBER Economics of AI conference in Toronto, Fall 2017.

The classical approach to machine vision involved creating a set of rules
that identified pixels in the images with human-recognizable features such
as color, brightness, and edges and then using these features to predict labels.
This “featurization” approach had limited success. The modern approach
is to work directly with the raw pixels using layered neural networks. This
has been remarkably successful, not only with image recognition but also
with voice recognition, language translation, and other traditionally difficult
machine-learning tasks. Nowadays computers can outperform humans in
many of these tasks.
This approach, called deep learning, requires (a) labeled data for training,
(b) algorithms for the neural nets, and (c) special-purpose hardware to run
the algorithms. Academics and tech companies have provided training data
and algorithms for free, and compute time in cloud-computing facilities is
available for a nominal charge.
1. Training data. Examples are OpenImages, a data set of 9.5 million
labeled images, and the Stanford Dog Data set, 20,580 images of 120 breeds
of dogs.
2. Algorithms. Popular open-source packages include TensorFlow, Caffe,
MXNet, and Theano.
3. Hardware. CPUs (central processing units), GPUs (graphics processing
units), and TPUs (tensor processing units) are available via
cloud-computing providers. These facilities allow the user to organize vast amounts
of data, which can be used to train machine-learning models.
Of course, it is also important to have experts who can manage the data,
tune the algorithms, and nurture the entire process. These skills are, in fact,
the main bottleneck at the moment, but universities are rapidly rising to the
challenge of providing the education and training necessary to create and
utilize machine learning.
In addition to machine vision, the deep learning research community has
made dramatic advances in speech recognition and language translation.
These areas also have been able to make this progress without the sorts of
feature identification that had been required for previous ML systems.
Other types of machine learning are described in the Wikipedia entry on
this topic. One important form of machine learning is reinforcement learn-
ing. This is a type of learning where a machine optimizes some task such as
winning at chess or video games. One example of reinforcement learning is
a multiarmed bandit, but there are many other tools used, some of which
involve deep neural nets.
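As a rough illustration of the multiarmed bandit mentioned above, the following minimal sketch uses an epsilon-greedy rule, which is only one of several possible bandit strategies and is chosen here for brevity. The payoff probabilities, the number of steps, and the function name `epsilon_greedy` are all hypothetical and are not taken from the chapter.

```python
import random


def epsilon_greedy(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit, mostly exploiting the best-looking arm.

    true_probs are hypothetical win probabilities, unknown to the learner.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms      # number of times each arm was pulled
    values = [0.0] * n_arms    # running mean reward observed for each arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        # incremental update of the mean reward for the chosen arm
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


counts, values = epsilon_greedy([0.2, 0.5, 0.8])
best_arm = max(range(len(values)), key=lambda a: values[a])
print("estimated best arm:", best_arm)
print("pulls per arm:", counts)
```

The learner is never told the payoff probabilities. It discovers them by experimenting, and it gradually concentrates its pulls on the arm that appears to pay best.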
Reinforcement learning is a type of sequential experimentation and is
therefore fundamentally about causality: moving a particular chess piece
from one position to another causes the probability of a win to increase.
This is unlike passive machine-learning algorithms that use only observational
data.
Reinforcement learning can also be implemented in an adversarial context.
For example, in October 2017 DeepMind announced a machine-learning
system, AlphaGo Zero, that developed a highly effective strategy by playing
Go games against itself!
The model of “self-taught machine learning” is an interesting model for
game theory. Can deep networks learn to compete and/or learn to cooperate
with other players entirely on their own? Will the learned behavior look anything
like the equilibria for game-theoretic models we have built? So far these
techniques have been applied primarily to full information games. Will they
work in games with incomplete or asymmetric information?
There is a whole subarea of AI known as adversarial AI (or adversarial
ML) that combines themes from AI, game theory, and computer security
and examines ways to attack and defend AI systems. Suppose, for example,
that we have a trained image recognition system that performs well, on
average. What about its worst-case performance? It turns out that there are
ways to create images that appear innocuous to humans that will consistently
fool the ML system. Just as “optical illusions” can fool humans, these
“ML illusions” can fool machines. Interestingly, the optimal illusions for
humans and machines are very different. See Goodfellow et al. (2017) for
illustrative examples and Kurakin, Goodfellow, and
Bengio (2016) for a technical report. Computer science researchers have
recognized the connections with game theory; in my opinion, this area
offers many interesting opportunities for collaboration. (See, e.g., Sreevallabh
and Liu 2017.)
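To make the idea of an “ML illusion” concrete, here is a minimal sketch under strong simplifying assumptions. A linear classifier stands in for the trained image recognition system, and each input value is nudged by a small amount in the direction that lowers the classifier's score (the idea behind the fast gradient sign method). The weights, the "pixel" values, and the perturbation size `eps` are hypothetical numbers chosen for illustration, not results from the cited papers.

```python
def score(w, x):
    """Linear classifier score; a positive score means the 'positive' label."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def adversarial(w, x, eps):
    """Shift every input value by at most eps so that the score falls."""
    return [xi - eps * sign(wi) for wi, xi in zip(w, x)]


n = 1000  # hypothetical number of pixels
w = [0.01 if i % 2 == 0 else -0.01 for i in range(n)]  # hypothetical weights
x = [0.6 if i % 2 == 0 else 0.4 for i in range(n)]     # hypothetical image

eps = 0.15
x_adv = adversarial(w, x, eps)

print("original score:", round(score(w, x), 3))
print("perturbed score:", round(score(w, x_adv), 3))
print("largest pixel change:", max(abs(a - b) for a, b in zip(x, x_adv)))
```

No single pixel moves by more than `eps`, yet the many small changes add up across all the inputs and flip the predicted label. This is one reason worst-case performance can differ so sharply from average performance.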
16.2.1 What Can Machine Learning Do?
The examples of machine learning presented in the popular press emphasize
novel applications, such as winning at games like chess, Go, and
Pong. However, there are also many practical applications that use machine
learning to solve real-world business problems. A good place to see what
kinds of problems ML can solve is Kaggle. This company sets up
machine-learning competitions. A business or other organization provides some data,
a problem statement, and some prize money. Data scientists then use the
data to solve the problem posed. The winners get to take home the prize
money. There are well over 200 competitions on the site. Here are a few of
the most recent.
• Passenger Threats. Improve accuracy of Homeland Security threat rec-
ognition: $1,500,000.
• Home Prices. Improve accuracy of Zillow’s home-price prediction:
• Traffic to Wikipedia Pages. Forecast future traffic to Wikipedia pages:
• Personalized Medicine. Predict effect of genetic variants to enable personalized
medicine: $15,000.
• Taxi Trip Duration. Predict total ride duration of taxi trips in New
York: $30,000.
• Product Search Relevance. Predict relevance of search results on
homedepot.com: $40,000.
• Clustering Questions. Can you identify question pairs that have the
same intent?: $25,000.
• Cervical Cancer Screening. Which cancer treatments will be most effective?:
$100,000.
• Click Prediction. Can you predict which recommended content each
user will click?: $25,000.
• Inventory Demand. Maximize sales and minimize returns of bakery
goods: $25,000.
What is nice is that these are real questions and real money from orga-
nizations that want real answers for real problems. Kaggle gives concrete
examples of how machine learning can be applied for practical business
16.2.2 What Factors Are Scarce?
Suppose you want to deploy a machine-learning system in your organization.
The first requirement is to have a data infrastructure that collects
and organizes the data of interest—a data pipeline. For example, a retailer
would need a system that can collect data at point of sale, and then upload
it to a computer that can then organize the data into a database. This data
would then be combined with other data, such as inventory data, logistics
data, and perhaps information about the customer. Constructing this data
pipeline is often the most labor intensive and expensive part of building a
data infrastructure, since different businesses often have idiosyncratic legacy
systems that are difficult to interconnect.
Once the data has been organized, it can be collected together in a
data warehouse. The data warehouse allows easy access to systems that can
manipulate, visualize, and analyze the data.
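The retailer example can be sketched in a few lines. This is a toy illustration only: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a data warehouse, and the table names, column names, and records are all hypothetical.

```python
import sqlite3

# In-memory database standing in for the data warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (sku TEXT, store TEXT, units INTEGER);
    CREATE TABLE inventory (sku TEXT, store TEXT, on_hand INTEGER);
    """
)

# Hypothetical point-of-sale records uploaded from the stores.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("bread", "A", 3), ("bread", "A", 2), ("milk", "A", 1), ("bread", "B", 4)],
)

# Hypothetical inventory records from a separate system.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("bread", "A", 10), ("milk", "A", 5), ("bread", "B", 0)],
)

# Combine the two sources: units sold next to stock on hand, per item and store.
rows = conn.execute(
    """
    SELECT s.sku, s.store, SUM(s.units) AS units_sold, i.on_hand
    FROM sales AS s
    JOIN inventory AS i ON s.sku = i.sku AND s.store = i.store
    GROUP BY s.sku, s.store, i.on_hand
    ORDER BY s.sku, s.store
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

In practice the hard part is not the join itself but getting each legacy system to deliver clean, consistently keyed records in the first place.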
Traditionally, companies ran their own data warehouses that required not
only purchase of costly computers, but also required human system admin-
istrators to keep everything functioning properly. Nowadays, it is more and
more common to store and analyze the data in a cloud-computing facility
such as Amazon Web Services, Google Cloud Platform, or Microsoft Azure.
1. Disclosure: I was an angel investor in Kaggle up until mid-2017, when it was acquired by Google. Since then, I have had no financial interest in the company.
The cloud provider takes care of managing and updating the hardware
and software necessary to host the databases and tools for data analysis.
From an economic point of view, what is interesting is that what was previously
a fixed cost to the users (the data center) has now turned into a vari-