by Ajay Agrawal
Athey’s chapter lays out an exciting future for empirical work in economics.
It makes clear that there are real complementarities between ML techniques and
econometric techniques, and that she and others are working to develop the
relevant methodological tools and make them available to applied researchers.
Athey also points out that the growth of ML and ML-based decision-making raises
a number of new questions, such as how to avoid “gaming” of the algorithms as
they become known and how to ensure algorithms are fair and nondiscriminatory,
and that economists and other social scientists seem particularly well suited
to shed light on these types of issues.
While Athey discusses the current opportunities for economists to utilize
“off-the-shelf” ML methodologies in their research (for example, to systematize
model selection and robustness checks, to create variables, or to carry out
prediction exercises), I believe this point deserves even greater emphasis. The
opportunities for researchers to integrate ML techniques into traditional
reduced-form or structural empirical work seem enormous. This is because ML, at
a fundamental level, takes inputs that do not look like data and turns them
into an output that looks very much like the type of data that we can include
in traditional econometric analyses. Machine learning is a machinery for
prediction. Sometimes that prediction exercise looks like the kind of
prediction exercise we might carry out with a simple logit or probit model. For
example, we might have data on which students graduate college along with a
number of their attributes upon admission, and we might use this data to
develop a model that predicts the probability of graduation for each new
college applicant.
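The graduation example can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the data are synthetic, the single covariate is a made-up standardized admission score, and the logit is fit by plain gradient ascent rather than by any particular statistical package.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, lr=0.5, steps=1000):
    """Fit P(y = 1 | x) = sigmoid(a + b * x) by gradient ascent on the mean log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals y - p are the building block of the logit score equations.
        resid = [y - sigmoid(a + b * x) for x, y in zip(xs, ys)]
        a += lr * sum(resid) / n
        b += lr * sum(r * x for r, x in zip(resid, xs)) / n
    return a, b


# Synthetic (made-up) data: x is a standardized admission score,
# y = 1 if the student graduated. The "true" parameters are assumptions.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [1 if rng.random() < sigmoid(0.5 + 1.5 * x) else 0 for x in xs]

a_hat, b_hat = fit_logit(xs, ys)


def predict_graduation(score):
    """Predicted graduation probability for a new applicant with this score."""
    return sigmoid(a_hat + b_hat * score)
```

Calling `predict_graduation` on a new applicant's score returns a number between 0 and 1, which is exactly the kind of output that can enter a downstream analysis as an ordinary variable.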
However, much of the excitement around ML algorithms is that they
can handle data sets that are “unstructured,” that is, data sets that do not
contain a set of neatly labeled covariates in a series of columns. Indeed, ML
does not even require the covariates to be specified or labeled. The algorithm
determines what the relevant covariates are. Consider text. Text doesn’t look
like data. We cannot easily put text, whether long bodies of text or short
fragments of text, into regression models. But what ML can do is take text as
an input and predict a variety of things about that text (its content, its
sentiment, its political leaning), and these can be used as variables in
traditional empirical analyses. As a very simple example, in Gans, Goldfarb,
and Lederman (2017), we use a sentiment analysis algorithm to classify the
sentiment of over four million unique tweets to or about a major US airline.
This allows us to construct a variety of variables that measure not only the
quantity but also the sentiment of “voice” to an airline on a given day, which
can be used in our empirical analysis. Absent the algorithm, we would be able
to count up the number of tweets, but would have a much harder time classifying
the sentiment of the tweets for anything other than a sample small enough to
code by hand.
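To illustrate how classified text becomes regression-ready variables, the sketch below turns a handful of invented tweets into daily counts and sentiment shares. The word lists, the tweets, and the function names are all hypothetical, and the toy lexicon rule merely stands in for a trained sentiment classifier; it is not the algorithm used in Gans, Goldfarb, and Lederman (2017).

```python
from collections import defaultdict

# Hypothetical word lists; a real application would use a trained classifier.
POSITIVE = {"great", "thanks", "love", "friendly"}
NEGATIVE = {"delayed", "lost", "rude", "cancelled", "worst"}


def classify(text):
    """Label a tweet as positive, negative, or neutral by counting lexicon hits."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def daily_voice(tweets):
    """Aggregate (date, text) pairs into one row of variables per day."""
    days = defaultdict(lambda: {"n_tweets": 0, "positive": 0, "negative": 0, "neutral": 0})
    for date, text in tweets:
        days[date]["n_tweets"] += 1
        days[date][classify(text)] += 1
    return {
        date: {**row, "share_negative": row["negative"] / row["n_tweets"]}
        for date, row in days.items()
    }


# Invented example tweets.
tweets = [
    ("2017-01-01", "Flight delayed again, worst service"),
    ("2017-01-01", "Thanks for the friendly crew!"),
    ("2017-01-01", "Boarding now"),
    ("2017-01-02", "They lost my bag and the agent was rude."),
]
daily = daily_voice(tweets)
```

Each entry of `daily` holds both a quantity measure (`n_tweets`) and sentiment measures (`share_negative`) for one day, mirroring the distinction between the volume and the tone of “voice.”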
Tweets are only one example. There are many potentially interesting and
informative sources of text that, with ML, can now be exploited in empirical
research. Examples include other types of social media posts, online reviews,
patent applications, job descriptions, newspaper articles, commercial
contracts, court transcripts, research papers, email communications, customer
service logs, performance evaluations, and financial filings, to name just a
few. Indeed, some of these examples have been discussed by others in this
volume. Machine-learning technologies open the door to novel sources of data
that economists can use to answer important questions in a variety of fields.
Finally, in addition to thinking about how we as researchers might integrate
ML techniques into our own work, it seems critical to also think about how
organizations’ integration of ML into their decision-making may impact our
research. Despite the growing use of randomized experiments, most research in
applied economics still relies on observational data. Observational data, of
course, creates challenges for causal identification because the
data-generating process is unlikely to be random. We believe that observed
equilibrium prices are the result of the interaction of supply and demand, and
we therefore cannot regress quantity on price to estimate the slope of a demand
curve. Or, to use an example from organizational economics, we believe that
organizational forms are chosen optimally to maximize performance, including
economizing on transaction costs, and therefore we cannot simply regress
performance on organizational form in order to estimate the performance
implications of firm boundary decisions. We develop theoretical models to help
us understand the data-generating process, which, in turn, informs both our
concerns about causality and the identification strategies that we develop.
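The supply-and-demand point can be checked with a small simulation. This is a sketch under stated assumptions: linear demand and supply curves, hypothetical parameter values, and independent shocks of equal variance. The naive regression of quantity on price recovers a mixture of the two slopes rather than the demand slope.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var


# Hypothetical structural parameters (assumptions for illustration only).
A, B = 10.0, 1.0  # demand: q = A - B * p + u, so the true demand slope is -B
C, D = 2.0, 1.0   # supply: q = C + D * p + v

rng = random.Random(42)
prices, quantities = [], []
for _ in range(5000):
    u = rng.gauss(0, 1)            # demand shock
    v = rng.gauss(0, 1)            # supply shock
    p = (A - C + u - v) / (B + D)  # market-clearing price
    q = A - B * p + u              # quantity on the demand curve at that price
    prices.append(p)
    quantities.append(q)

naive_slope = ols_slope(prices, quantities)
```

With these assumed parameters the population value of the naive slope is (D·Var(u) − B·Var(v)) / (Var(u) + Var(v)), which equals zero here, so `naive_slope` lands far from the true demand slope of −1 even in a large sample.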
As organizations increasingly allocate decisions to ML-based algorithms, we
need to ask what implications this will have for the variation we observe and
exploit in the data we use for research. There are a number of factors to
consider. First, ML-based decisions are generally opaque. Thus, even the
organizations deploying the ML may not be able to explain how certain decisions
were made, and so we may not be able to understand the data-generating process
in some cases. Second, to the extent that organizations use ML to optimize
decisions (for example, to target advertising toward those for whom it will
have the largest impact or to admit the MBA students who are predicted to be
the most successful upon graduation), the use of ML may exacerbate selection
problems. The treated and nontreated groups that we observe in our data may be
even more different on unobservables when those two groups are the result of
ML-based decisions. On the other hand, in some instances ML-based decisions may
come closer to the behavioral models we specify. For example, many structural
papers in industrial organization specify complicated pricing or entry models.
Machine-learning-based algorithms may come closer to solving these problems
than individual decision-makers within a firm. Finally, as ML and other
artificial intelligence technologies diffuse across organizations, they are
likely to diffuse at different rates. This means that, at least in some data
sets, we are likely to observe a mix of ML-based and traditional
decision-making that creates another potentially important source of unobserved
heterogeneity. Overall, as applied researchers working with real-world data
sets, we need to recognize that increasingly the data we are analyzing is going
to be the result of decisions that are made by algorithms in which the
decision-making process may or may not resemble the decision-making processes
we model as social scientists.
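The selection concern can be illustrated with a stylized simulation. It assumes a constant treatment effect, an unobservable that raises outcomes, and a targeting rule (a stand-in for an ML-based decision) that partly detects that unobservable; all names and parameter values are hypothetical.

```python
import random


def diff_in_means(outcomes, treated):
    """Naive estimate: mean outcome of the treated minus mean outcome of the untreated."""
    t = [y for y, d in zip(outcomes, treated) if d]
    c = [y for y, d in zip(outcomes, treated) if not d]
    return sum(t) / len(t) - sum(c) / len(c)


TRUE_EFFECT = 1.0  # hypothetical treatment effect, identical for everyone

rng = random.Random(7)
n = 10000
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved by the researcher


def simulate(assign):
    """Assign treatment with the given rule, generate outcomes, return the naive estimate."""
    treated = [assign(ui) for ui in u]
    outcomes = [TRUE_EFFECT * d + ui + rng.gauss(0, 1) for d, ui in zip(treated, u)]
    return diff_in_means(outcomes, treated)


# Coin-flip assignment versus a rule that treats units whose noisy signal of u is high.
random_estimate = simulate(lambda ui: rng.random() < 0.5)
targeted_estimate = simulate(lambda ui: ui + rng.gauss(0, 0.5) > 0)
```

Under random assignment the naive comparison is close to the assumed effect of 1, while under the targeted rule it is inflated because the treated units differ systematically on the unobservable `u`.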
References
Gans, Joshua S., Avi Goldfarb, and Mara Lederman. 2017. “Exit, Tweets and
Loyalty.” NBER Working Paper no. 23046, Cambridge, MA.
Glaeser, Edward L., Scott Duke Kominers, Michael Luca, and Nikhil Naik. 2018.
“Big Data and Big Cities: The Promises and Limitations of Improved Measures of
Urban Life.” Economic Inquiry 56 (1): 114–37.
Kleinberg, Jon, Jens Ludwig, Sendhil Mullainathan, and Ziad Obermeyer. 2015.
“Prediction Policy Problems.” American Economic Review 105 (5): 491–95.
22
Artificial Intelligence, Labor, Productivity, and the Need for Firm-Level Data
Manav Raj and Robert Seamans
22.1 Introduction
There have recently been dramatic increases in the technical capabilities of
artificial intelligence (AI).1 For example, in March 2016, Google’s DeepMind
used its AI to beat Korean Go master Lee Se-dol,2 and in January 2017, an AI
system called DeepStack beat humans at the complex poker game Texas Hold ’Em.3
The Electronic Frontier Foundation (EFF) has tracked the rapid progress of AI
in performing tasks at human-like levels of capability in domains including
voice recognition, translation, visual image recognition, and others.4 These
advancements have led to both excitement about the capability of new technology
to boost economic growth and concern about the fate of human workers in a world
in which computer algorithms can perform many of the functions that a human can
(e.g., Frey and Osborne 2017; Furman 2016b).
Indicative of this excitement and interest in the area, recent academic
research, using national-level data on worldwide robotics shipments, suggests
that robotics may have been responsible for about one-tenth of the increase in
the gross domestic product (GDP) between 1993 and 2007 (Graetz and Michaels
2015). Moreover, according to the 2016 Economic Report of the President,
worldwide demand for robotics nearly doubled between 2010 and 2014, and the
number and share of robotics-oriented patents have also increased (CEA 2016).
Thus, robots may now be contributing even more to GDP growth than in the past.
Manav Raj is a PhD student in the Management and Organizations Department at the Stern School of Business, New York University. Robert Seamans is associate professor of management and organizations at the Stern School of Business, New York University.
For acknowledgments, sources of research support, and disclosure of the authors’ material financial relationships, if any, please see http://www.nber.org/chapters/c14037.ack.
1. Artificial intelligence is a loose term used to describe a range of advanced technologies that exhibit human-like intelligence, including machine learning, autonomous robotics and vehicles, computer vision, language processing, virtual agents, and neural networks.
2. https://www.nytimes.com/2016/03/10/world/asia/google-alphago-lee-se-dol.html.
3. https://www.scientificamerican.com/article/time-to-fold-humans-poker-playing-ai-beats-pros-at-texas-hold-rsquo-em/.
4. https://www.eff.org/ai/metrics.
However, even as these technologies may be contributing to GDP growth at a
national level, we lack an understanding of how and when they contribute to
firm-level productivity, under what conditions they complement or substitute
for labor, how they affect new firm formation, and how they shape regional
economies. We lack an understanding of these issues because, to date, there is
a lack of firm-level data on the use of robotics and AI. Such data will be
important to collect to answer these questions and to inform policymakers about
the role of these new technologies in our economy and society.
This chapter describes high-level findings about the effects of robotics on the
economy while highlighting the few articles addressing the impact of AI,
describes shortcomings of the existing data, and argues for more systematic
data collection at the firm level. We echo a recent National Academies of
Science Report (NAS 2017) calling for more data collection on the effects of
automation, including both artificial intelligence and robotics, on the
economy. More generally, collection of and access to granular data allows for
better analysis of complex questions, and provides a “scientific safeguard” via
replication work done by multiple sets of researchers (Lane 2003).
22.2 Existing Empirical Work
While there is little empirical work on the effects of either AI or robots,
there are comparatively more studies on robots, likely owing to their physical
nature, which makes them easier to track over time and location. Initial
studies of the effect of robots on productivity and labor provide a mixed view.
Using robot shipment data at the country, industry, and year level from the
International Federation of Robotics (IFR), Graetz and Michaels (2015) find
large effects on productivity growth. Looking at national-level data on robot
shipments across seventeen countries, Graetz and Michaels show that robots may
be responsible for roughly one-tenth of the increases in the gross domestic
product of these countries between 1993 and 2007 and may have increased
productivity growth by more than 15 percent. This is a significant effect;
according to the authors, it is comparable to the impact of the adoption of
steam engines on British labor productivity in the nineteenth century. They
also find evidence that, on average, wages increase with robot use, but hours
worked drop for low-skilled and middle-skilled workers.
In another study using IFR data, Acemoglu and Restrepo (2017) examine
the impact of the increase in industrial robot usage on regional US labor
markets between 1990 and 2007. Using the distribution of robots at the industry
level in other advanced countries as an instrument, the authors find that
industrial robot adoption in the United States was negatively correlated with
employment and wages during this time period. They estimate that each
additional robot reduced employment by six workers and that one new robot per
thousand workers reduced wages by 0.5 percent. The authors note that the
effects are most pronounced in manufacturing, particularly in routine manual
and blue-collar occupations, and for workers without a college degree. Further,
they find no positive effects on employment due to the adoption of robotics in
any industry.
The European Commission Report on Robotics and Employment (EC 2016) examined
the use of industrial robots in Europe. The report relies on robotics data from
the European Manufacturing Survey, a sample of 3,000 manufacturing firms in
seven European countries, which has been periodically administered since 2001,
most recently in 2012. Using this data, the authors find that the use of
industrial robots is likelier in larger companies, firms utilizing batch
production, and firms that are export oriented. The study finds no evidence
that the use of industrial robots has any direct effect on employment, though
firms utilizing robotics do have significantly higher levels of labor
productivity.
More broadly, existing work on automation and employment has suggested that
automation can either substitute for or complement labor. Frey and Osborne
(2017) argue that almost half of total US employment is at risk of being
automated over the next two decades. Similarly, Brynjolfsson and McAfee (2014)
suggest that, due to the automation of cognitive tasks, new technologies may
increasingly serve as substitutes rather than complements. On the other hand,
other research has found that positive technology shocks have historically
increased job opportunities and employment overall (e.g., Alexopoulos and Cohen
2016).
Regardless of the effect of automation on employment in the directly impacted
industry, technology adoption may have positive upstream and downstream effects
on labor. Autor and Salomons (2017) show that, while employment seems to fall
within an industry as industry-specific productivity increases, positive
spillovers to other sectors more than offset the negative own-industry
employment effect. Further, Bessen (2017) finds that new technologies should
have a positive effect on employment if they improve productivity in markets
where there is a large amount of unmet demand. In the context of robotics and
automation, Bessen suggests that new computer technology is associated with
employment declines in manufacturing, where demand has generally been met, but
is correlated with employment growth in less saturated, nonmanufacturing
industries. Similarly, Mandel (2017), studying the effects of e-commerce, finds
that job losses at brick-and-mortar department stores were more than made up
for by new opportunities at
fulfillment and call centers. Dauth et al. (2017) combine German labor market
data with IFR robot shipment data and find that, while each additional
industrial robot leads to the loss of two manufacturing jobs, enough new jobs
are created in the service industry to offset and in some cases overcompensate
for the negative employment effect in manufacturing.
There has been less systematic work on the effect of AI on the economy.
Two notable exceptions are studies by Frey and Osborne (2017) and the