by Ajay Agrawal
McKinsey Global Institute (MGI). Frey and Osborne (2017) attempt to
determine what jobs may be particularly susceptible to automation and
to provide an idea of how large an impact automation could have on the
US labor force. The authors focus particularly on machine learning and its
application to mobile robotics, and propose a model to predict the extent of
computerization’s impact on nonroutine tasks, noting potential engineering
bottlenecks at tasks involving high levels of perception or manipulation,
creative intelligence, and social intelligence. After categorizing tasks
by their susceptibility to automation, Frey and Osborne map these tasks to
the O*NET job survey, which provides open-ended descriptions of skills
and responsibilities involved in an occupation over time. Integrating this
data set with employment and wage data from the Bureau of Labor Statistics
(BLS) allows the authors to propose certain subsets of the labor market that
may be at high, medium, or low risk of automation. The study finds that
47 percent of US employment is at high risk of computerization. It should
be noted that this study is at an aggregate level and does not examine how
firms may react, any labor-saving innovations that could arise, or potential
productivity or economic growth.
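The aggregation step described above, in which occupation-level risk is combined with employment counts to obtain the share of employment in each risk category, can be sketched as follows. This is a minimal illustration, not Frey and Osborne's implementation: the occupation names, probabilities, and employment figures are made up, and the 0.3 and 0.7 cutoffs are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical records: (occupation, automation probability, employment).
# All values are invented for illustration; they are not published estimates.
OCCUPATIONS = [
    ("Occupation A", 0.95, 400),
    ("Occupation B", 0.50, 250),
    ("Occupation C", 0.10, 350),
]


def risk_category(probability, low_cutoff=0.3, high_cutoff=0.7):
    """Bucket a probability into low, medium, or high risk (cutoffs are assumptions)."""
    if probability > high_cutoff:
        return "high"
    if probability < low_cutoff:
        return "low"
    return "medium"


def employment_shares(occupations):
    """Return the share of total employment that falls in each risk category."""
    totals = defaultdict(float)
    for _name, probability, employment in occupations:
        totals[risk_category(probability)] += employment
    grand_total = sum(totals.values())
    return {category: count / grand_total for category, count in totals.items()}


shares = employment_shares(OCCUPATIONS)
for category in ("high", "medium", "low"):
    print(f"{category}: {shares[category]:.0%}")
```

With these invented figures, 40 percent of employment falls in the high-risk category. The point of the sketch is only that the headline share is an employment-weighted count of occupations above a cutoff, so it is sensitive to both the cutoff and the occupation-level scores.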
Frey and Osborne’s work has also been applied by researchers in other
countries. Mapping Frey and Osborne’s occupation-level findings to German
labor market data, Brzeski and Burk (2015) suggest that 59 percent of
German jobs may be highly susceptible to automation, while Pajarinen and
Rouvinen (2014), conducting the same analysis in Finland, suggest that
35.7 percent of Finnish jobs are at high risk of automation.
The Organisation for Economic Co-operation and Development (OECD)
similarly set out to estimate the automatability of jobs across twenty-one
OECD countries, applying Frey and Osborne’s work to a task-based approach.
The OECD report argues that certain tasks will be displaced and that the
extent to which bundles of tasks differ within occupations and across countries
may make certain occupations less prone to automation than Frey and
Osborne predicted. Relying upon the task categorization done by Frey and
Osborne, the authors map task susceptibility to automation to US data from
the Programme for the International Assessment of Adult Competencies
(PIAAC), a microlevel data source containing indicators on socioeconomic
characteristics, skills, job-related information, job tasks, and competencies
at the individual level. They then construct a model using the PIAAC to
create a predicted susceptibility to automation based on the observables
AI, Labor, Productivity, and the Need for Firm- Level Data 557
in the PIAAC data to mirror the automatability score that Frey and Osborne
created. This model is then applied at the worker level across all the PIAAC
data to predict how susceptible occupations may be to automation. By conducting
the analysis at the individual level, the OECD argues that it is better
able to account for task variation between individuals within the same
occupation. As a result, the report suggests that Frey and Osborne overestimated
the extent to which occupations would be susceptible to automation. The
OECD report argues that only 9 percent of jobs in the United States and
across OECD countries will be highly susceptible to automation. The report
goes on to discuss variation across OECD countries, suggesting that the
share ranges from 6 percent (in Korea) to 12 percent (in Austria).
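A toy comparison may help show why scoring individuals rather than occupations can lower the estimated high-risk share. The sketch below does not reproduce the OECD's estimation model; it assumes that worker-level automatability scores are already available, and the worker records and the 0.7 cutoff are hypothetical.

```python
# Hypothetical worker records: (occupation, predicted automatability of that worker's job).
WORKERS = [
    ("clerk", 0.90), ("clerk", 0.75), ("clerk", 0.60), ("clerk", 0.65),
    ("nurse", 0.20), ("nurse", 0.35),
]
HIGH_RISK_CUTOFF = 0.7  # assumed threshold for "high risk"


def occupation_level_share(workers, cutoff=HIGH_RISK_CUTOFF):
    """Assign every worker the mean score of their occupation, then count high risk."""
    scores_by_occupation = {}
    for occupation, score in workers:
        scores_by_occupation.setdefault(occupation, []).append(score)
    mean_score = {
        occupation: sum(scores) / len(scores)
        for occupation, scores in scores_by_occupation.items()
    }
    flagged = sum(1 for occupation, _ in workers if mean_score[occupation] > cutoff)
    return flagged / len(workers)


def worker_level_share(workers, cutoff=HIGH_RISK_CUTOFF):
    """Count only the individual workers whose own score exceeds the cutoff."""
    flagged = sum(1 for _, score in workers if score > cutoff)
    return flagged / len(workers)


print(f"occupation-level: {occupation_level_share(WORKERS):.2f}")
print(f"worker-level:     {worker_level_share(WORKERS):.2f}")
```

In this invented example the occupation-level approach flags all four clerks because their average score exceeds the cutoff, whereas the worker-level approach flags only the two clerks whose own task mix is highly automatable.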
Mann and Püttmann (2017) take a different approach to analyzing the
effects of automation on employment. In their study, the authors rely on
information from granted patents. They apply a machine-learning
algorithm to all US patents granted from 1976 to 2014 to identify patents
related to automation (an automation patent is defined as a “device that
operates independently from human intervention and fulfills a task with
reasonable completion”). They then link the automation patents to the
industries in which they are likely to be used, and identify the areas of the
United States with which these industries are associated. By comparing economic
indicators with the density of automation patents used in an area, Mann
and Püttmann find that although automation causes manufacturing employment
to fall, it increases employment in the service sector and, overall, has a
positive impact on employment.
In June 2017, the McKinsey Global Institute published an independent
discussion paper examining trends in investment in artificial intelligence, the
prevalence of AI adoption, and how AI is being deployed by companies that
have started to use the technology (MGI Report 2017). For the purpose of
their report, the authors adopted a fairly narrow definition of AI, focusing
only on AI technology that is programmed to conduct one set task. The
MGI report took a multifaceted approach to its investigation: it
surveyed executives at over 3,000 international firms, interviewed industry
experts, and analyzed investment flows using third-party venture capital,
private equity, and mergers and acquisitions data. Using the data collected,
the MGI report attempts to answer questions regarding adoption by sector,
size, and geography; to look at performance implications of adoption; and
to examine potential impacts on the labor market. Though the findings are
presented at an aggregate level, much of the data, particularly the survey
of executives, were collected at the firm level, allowing for further inquiry
if one had access.
In addition to these published works, other researchers have begun to
examine the effect of AI on occupations by looking at its impact on individual
abilities and skills. Brynjolfsson, Mitchell, and Rock (forthcoming)
apply a rubric from Brynjolfsson and Mitchell (2017) that evaluates the
558 Manav Raj and Robert Seamans
potential for applying machine learning to tasks to the set of work activities
and tasks in the US Department of Labor’s O*NET occupational database.
With this analysis, they create a “Suitability for Machine Learning” measure
for labor inputs in the United States. Similar research by Felten, Raj, and
Seamans (forthcoming) uses data tracking progress in artificial intelligence,
aggregated by the Electronic Frontier Foundation (EFF) across a variety
of artificial intelligence metrics, together with the set of fifty-two abilities
in the O*NET occupational database to identify the impact of artificial
intelligence on each of the abilities and to create an occupation-level score
measuring the potential impact of AI on the occupation. Because the data
from the EFF are separated by AI metric, this work allows for the investigation
and simulation of progress in different kinds of AI technology, such as
image recognition, speech recognition, and the ability to play abstract strategy
games, among others.
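One simple way to move from ability-level impacts to an occupation-level score is an importance-weighted average, sketched below. The weighting scheme is an assumption made for illustration and is not necessarily the one used by Felten, Raj, and Seamans; the impact values, importance weights, and occupation names are hypothetical.

```python
# Hypothetical AI impact on each ability (0 = no impact, 1 = maximal impact).
ABILITY_IMPACT = {
    "oral_comprehension": 0.6,
    "manual_dexterity": 0.2,
    "deductive_reasoning": 0.4,
}

# Hypothetical importance of each ability to each occupation.
OCCUPATION_WEIGHTS = {
    "occupation_x": {"oral_comprehension": 3, "manual_dexterity": 1, "deductive_reasoning": 1},
    "occupation_y": {"oral_comprehension": 1, "manual_dexterity": 4},
}


def occupation_score(ability_weights, ability_impact):
    """Importance-weighted average of ability-level AI impact for one occupation."""
    total_weight = sum(ability_weights.values())
    if total_weight == 0:
        raise ValueError("occupation has no weighted abilities")
    weighted_sum = sum(
        weight * ability_impact[ability] for ability, weight in ability_weights.items()
    )
    return weighted_sum / total_weight


def with_progress(ability_impact, ability, increase):
    """Return a copy of the impact table with one ability's impact raised (a scenario)."""
    updated = dict(ability_impact)
    updated[ability] = updated[ability] + increase
    return updated


baseline = {
    name: occupation_score(weights, ABILITY_IMPACT)
    for name, weights in OCCUPATION_WEIGHTS.items()
}
scenario_impact = with_progress(ABILITY_IMPACT, "manual_dexterity", 0.5)
scenario = {
    name: occupation_score(weights, scenario_impact)
    for name, weights in OCCUPATION_WEIGHTS.items()
}
print(baseline)
print(scenario)
```

The `with_progress` helper stands in for the kind of simulation the text mentions: raising the impact on one ability shifts the scores of occupations in proportion to how heavily they rely on that ability.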
The current body of empirical literature surrounding robotics and AI
adoption is growing but is still thin, and despite often trying to answer
similar questions, different studies have found disparate results. These
discrepancies highlight the need for further inquiry, replication studies, and
more complete and detailed data.
22.3 The Need for Firm-Level Data
While there is generally a paucity of data examining the adoption, use,
and effects of both AI and robotics, there is currently less information
available regarding AI. There are no public data sets on the utilization or
adoption of AI at either the macro or micro level. The most complete source of
information, the MGI study, is proprietary and inaccessible to the general
public or the academic community.
The most comprehensive and widely used data set examining the diffusion
of robotics is the International Federation of Robotics (IFR) Robot Shipment
Data. The IFR has been recording information regarding worldwide robot
stock and shipment figures since 1993. The IFR collects these data from its
members, who are typically large robot manufacturers such as FANUC,
KUKA, and Yaskawa. The data are broken up by country, year, industry,
and technological application, which allows for analysis of the
industry-specific impacts of technology adoption. However, the IFR data set has
shortcomings. The IFR defines an industrial robot as an “automatically
controlled, reprogrammable, multipurpose manipulator, programmable in three
or more axes, which can be either fixed in place or mobile for use in industrial
automation applications.”5 This definition limits the set of industrial robots
and ensures that the IFR does not collect any information on dedicated
industrial robots that serve one purpose. Further, some of the robots are
not classified by industry, detailed data are only available for industrial robots
(and not robots in service, transportation, warehousing, or other sectors),
and geographical information is often aggregated (e.g., data exist for North
America as a category rather than the United States, or an individual state
within the United States).
5. https://ifr.org/standardization.
Another issue with the IFR data is the difficulty of integrating it with
other data sources. The IFR utilizes its own industry classifications when
organizing the data, rather than relying on broadly used identifiers such as
the North American Industry Classification System (NAICS). Mapping
IFR data to other data sets (such as BLS or census data) first requires
cross-referencing IFR classifications to other identifiers. Industry-level data also
cannot be used to answer micro-oriented questions about the impacts of and
reactions to technology adoption at the firm level.
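The cross-referencing step can be sketched as a crosswalk from IFR industry labels to NAICS codes, followed by aggregation of employment to the IFR level. Everything in the example is an assumption made for illustration: the IFR labels, the mapping itself (which is not an official concordance), and the robot and employment counts.

```python
# Hypothetical crosswalk from IFR industry labels to three-digit NAICS codes.
CROSSWALK = {
    "food_and_beverages": ["311", "312"],
    "electronics": ["334"],
}

# Hypothetical robot stock by IFR industry; "unspecified" has no industry classification.
ROBOT_STOCK = {"food_and_beverages": 50, "electronics": 120, "unspecified": 30}

# Hypothetical employment by NAICS code, standing in for BLS or census data.
EMPLOYMENT_BY_NAICS = {"311": 1500, "312": 500, "334": 1000}


def robots_per_thousand_workers(robot_stock, employment_by_naics, crosswalk):
    """Aggregate employment up to IFR industries and compute robot density.

    Returns the density for each mapped industry and a list of IFR labels
    that could not be matched to any NAICS code.
    """
    density = {}
    unmapped = []
    for industry, robots in robot_stock.items():
        codes = crosswalk.get(industry)
        if not codes:
            unmapped.append(industry)
            continue
        workers = sum(employment_by_naics.get(code, 0) for code in codes)
        if workers > 0:
            density[industry] = 1000 * robots / workers
    return density, unmapped


density, unmapped = robots_per_thousand_workers(ROBOT_STOCK, EMPLOYMENT_BY_NAICS, CROSSWALK)
print(density)
print("unmapped IFR categories:", unmapped)
```

Aggregating NAICS employment up to the coarser IFR category avoids having to apportion robots across finer industries, but it also means the result inherits the IFR's level of aggregation, and any robots without an industry classification drop out of the calculation.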
While the IFR data are useful for some purposes, particularly examining
the adoption of robotics by industry and country, their aggregated nature
obscures differences occurring within industries and across regions, making
it difficult to uncover when and how robots might serve as substitutes for
or complements to labor, and masking the differential effects of adoption
within industries or countries. Additional data are needed to address the issues
raised above and to replicate existing studies. In particular, the National
Academy of Sciences Report (NAS 2017) highlights the need for data on computer
capital broken down at the firm and occupation level, on skill changes over time
by field, and on organizational processes as they relate to technology
adoption.
The European Manufacturing Survey (EMS) has been organized and executed
periodically by a number of research organizations and universities
across Europe since 2001, and is currently one of the few firm-level data
sets examining the adoption of robotics. The overall objective of the EMS is
to provide empirical evidence regarding the use and impact of technological
innovation in manufacturing at the firm level. The EMS accomplishes this
via a survey of a random sample of manufacturing firms with at least twenty
employees across seven European countries (Austria, France, Germany,
Spain, Sweden, Switzerland, and the Netherlands). While some aspects of
the survey vary across countries, the core set of questions inquires about
whether the firm uses robots, the intensity of robot usage, and reinvestment
in new robot technology. Data currently exist for five survey rounds
(2001–2002, 2003–2004, 2006–2007, 2009–2010, and 2012–2013) and have been
used in reports created by the European Commission to analyze the use of
robotics and its impact on labor patterns, including wages, productivity,
and offshoring.
As of now, the EMS appears to be one of the few data sources that are
capturing the use of robots and automation at the firm level. This provides
opportunities to analyze microeffects of robotics technology on firm
productivity and labor, and to analyze firm decision-making following adoption.
However, the EMS has its own limitations. The survey only considers
industrial robots, and the core questionnaire only asks three questions
regarding the use of robots in a factory setting. The survey is performed
at the firm rather than establishment level, and the sample size of 3,000
is quite small. In contrast, the Census’s Annual Survey of Manufactures
(ASM) surveys 50,000 establishments annually and 300,000 every five years.6
Finally, similar to many other existing data sets, the EMS is purely focused
on the manufacturing industry and does not address technology adoption
at smaller firms with fewer than twenty employees.
22.4 Additional Firm-Level Research Questions
Firm-level data on the use of AI would allow researchers to address a
host of questions including, but not limited to: the extent to which, and
under what conditions, AI complements or substitutes for labor; how AI affects
firm- or establishment-level productivity; which types of firms are more or
less likely to invest in AI; how market structure affects a firm’s incentives to
invest in AI; and how adoption is affecting firm strategies. As the nature of
work itself changes with increased adoption, researchers can also investigate
how firm management has been affected, particularly at the lower and
middle level.
Additionally, there are many important policy questions that cannot be
answered without disaggregated data. Some of these questions are related
to the need to reevaluate how individuals are trained prior to entering the
workforce. Without an understanding of the changes in worker experience
resulting from technology adoption, it will be difficult to craft appropriate
worker education, job training, and retraining programs. Further, issues
related to inequality could be examined, particularly in relation to the
“digital divide” and the effects of technology adoption on different
demographics. There are also unanswered questions regarding the differential
effects of adoption on regional economies. For example, the effects of AI
on labor may be pronounced in some regions because industries, and even
occupations within those industries, tend to be geographically clustered
(Feldman and Kogler 2010). Thus, to the extent that AI or robots substitute
for labor in certain industries or occupations, regions that rely heavily on
those industries and occupations for jobs and local tax revenue may suffer.
Moreover, following the recent financial crisis, unemployment insurance
reserves in some states have been slow to recover (Furman 2016a). Data on
the regional adoption of AI could be used to simulate the extent to which
future adoption may increase unemployment and whether unemployment
insurance reserves are adequately funded.
6. The census surveys all 300,000 manufacturing establishments every five years, and a rotating subsample of about 50,000 every year. See: https://www.census.gov/programs-surveys/asm/about.html.
Finally, these new technologies may have implications for entrepreneurs.
Entrepreneurs may lack knowledge of how best to integrate robotics with a
workforce and often face financing constraints that make it harder for them
to adopt capital-intensive technologies. In the case of AI, entrepreneurs may
lack data sets on customer behavior, which are needed to train AI systems.
Firm-level surveys on the use of AI will help us develop a better understanding
of these and related issues.
22.5 Strategies for Collecting More Data
Micro-level data regarding the adoption of AI, robots, and other types
of automation can be created in a variety of ways, the most comprehensive
of which would be via a census. Census data would provide information for
the entire population of relevant establishments, and while the information
provided would be narrow, quality is likely to be high. Additionally, data
from the Census Bureau would be highly integrable with other government
data sources, such as employment or labor statistics from the BLS. Data
could be collected as a stand-alone inquiry, similar to the Management
and Organizational Practices Survey (MOPS) (see Bloom et al. 2017), or by
adding questions to existing surveys, similar to work done by Brynjolfsson
and McElheran (2016), which involved adding questions on data-driven
decision-making to an existing census survey.