The Economics of Artificial Intelligence
deep learning are by and large innovations that require a significant level of
human planning and that apply to a relatively narrow domain of
problem-solving (e.g., face recognition, playing Go, or picking up a particular
object). While it is, of course, possible that further breakthroughs will lead to
a technology that can meaningfully mimic the nature of human subjective
intelligence and emotion, the recent advances that have attracted scientific
and commercial attention are well removed from these domains.
Second, though most economic and policy analysis of AI draws out
consequences from the last two decades of automation to consider the future
economic impact of AI (e.g., in job displacement for an ever-increasing
number of tasks), it is important to emphasize that there is a sharp difference
between the advances in robotics that were a primary focus of applications
of AI research during the first decade of the twenty-first century and the
potential applications of deep learning that have come to the fore over the
last few years.
As we suggested earlier, current advances in robotics are by and large
associated with applications that are highly specialized and that are focused
126 Iain M. Cockburn, Rebecca Henderson, and Scott Stern
on end-user applications rather than on the innovation process itself, and
these advances do not seem as of yet to have translated to a more generally
applicable IMI. Robotics is therefore an area where we might focus
on the impact of innovation (improved performance) and diffusion (more
widespread application) in terms of job displacement versus job enhancement.
We see limited evidence as yet of widespread applications of robotics
outside industrial automation, or of the scale of improvements in the ability
to sense, react to, and manipulate the physical environment that the use of
robotics outside manufacturing probably requires. But there are exceptions:
developments in the capabilities of “pick and place” robots and rapid progress
in autonomous vehicles point to the possibility for robotics to escape
manufacturing and become much more broadly used. Advances in robotics
may well reveal this area of AI to be a GPT, as defined by the classic criteria.
Some research tools/IMIs based on algorithms have transformed the
nature of research in some fields, but have lacked generality. These types
of algorithmic research tools, based on a static set of program instructions,
are valuable IMIs, but do not appear to have wide applicability outside a
specific domain and do not qualify as GPTs. For example, while far from
perfect, powerful algorithms to scan brain images (so-called functional
magnetic resonance imaging [fMRI]) have transformed our understanding of the
human brain, not only through the knowledge they have generated, but also
by establishing an entirely new paradigm and protocol for brain research.
However, despite its role as a powerful IMI, fMRI lacks the type of general
purpose applicability that has been associated with the most important
GPTs. In contrast, the latest advances in deep learning have the potential to
be both a general purpose IMI and a classic GPT.
Table 4.1 summarizes these ideas.
How might the promise of deep learning as a general purpose IMI be
realized? Deep learning promises to be an enormously powerful new tool
that allows for the unstructured “prediction” of physical or logical events
in contexts where algorithms based on a static set of program instructions
(such as classic statistical methods) perform poorly. The development of this
new approach to prediction enables a new approach to undertaking scientific
and technical research. Rather than focusing on small well-characterized
data sets or testing settings, it is now possible to proceed by identifying large
pools of unstructured data that can be used to dynamically develop highly
accurate predictions of technical and behavioral phenomena. In pioneering
an unstructured approach to predictive drug candidate selection that brings
together a vast array of previously disparate clinical and biophysical data,
for example, Atomwise may fundamentally reshape the “ideas production
function” in drug discovery.
If advances in deep learning do represent the arrival of a general purpose
IMI, it is clear that there are likely to be very significant long-run economic,
social, and technological consequences. First, as this new IMI diffuses across
many application sectors, the resulting explosion in technological
opportunities and increased productivity of research and development (R&D)
seem likely to generate economic growth that can eclipse any near-term
impact of AI on jobs, organizations, and productivity. A more subtle
implication of this point is that “past is not prologue”: even if automation over
the recent past has resulted in job displacement (e.g., Acemoglu and Restrepo
2017), AI is likely to have at least as important an impact through its ability to
enhance the potential for “new tasks” (as in Acemoglu and Restrepo 2018).
The Impact of Artificial Intelligence on Innovation 127

Table 4.1
General purpose technologies versus methods of invention

                               General purpose technology
                               NO                             YES
Invention of a          NO     Industrial robots              “Sense & react” robots
method of invention            (e.g., Fanuc R2000)            (e.g., autonomous vehicles)
                        YES    Statically coded algorithmic   Deep learning
                               tools (e.g., fMRI)

Second, the arrival of a general purpose IMI is a sufficiently uncommon
occurrence that its impact on economic growth, and on society more
broadly, could be profound. There have been only a handful of previous
general purpose IMIs and each of these has had an enormous impact,
not primarily through its direct effects (e.g., spectacles, in the case of the
invention of optical lenses), but through its ability to reshape the ideas
production function itself (e.g., telescopes and microscopes). It would
therefore be helpful to understand the extent to which deep learning is causing,
or will cause, researchers to significantly shift or reorient their approach in
order to enhance research productivity (in the spirit of Jones [2009]).
Finally, if deep learning does indeed prove to be a general purpose IMI,
it will be important to develop institutions and a policy environment that
are conducive to enhancing innovation through this approach, and to do so
in a way that promotes competition and social welfare. A central concern
here may be the interplay between a key input required for deep learning
(large unstructured databases that provide information about physical or
logical events) and the nature of competition. While the underlying
algorithms for deep learning are in the public domain (and can be and are being
improved on rapidly), the data pools that are essential to generate
predictions may be public or private, and access to them will depend on
organizational boundaries, policy, and institutions. Because the performance
of deep learning algorithms depends critically on the training data that
they are created from, it may be possible, in a particular application area,
for a specific company (either an incumbent or start-up) to gain a
significant, persistent innovation advantage through its control over data that is
independent of traditional economies of scale or demand-side network
effects. This “competition for the market” is likely to have several
consequences. First, it creates incentives for duplicative racing to establish a data
advantage in particular application sectors (say, search, autonomous
driving, or cytology) followed by the establishment of durable barriers to entry
that may be of significant concern for competition policy. Perhaps even
more important, this kind of behavior could result in a balkanization of
data within each sector, not only reducing innovative productivity within the
sector, but also reducing spillovers back to the deep learning GPT sector, and
to other application sectors. This suggests that the proactive development
of institutions and policies that encourage competition, data sharing, and
openness is likely to be an important determinant of economic gains from
the development and application of deep learning.
Our discussion so far has been largely speculative, and it would be useful
to know whether our claim that deep learning may be both a general purpose
IMI and a GPT, while symbolic logic and robotics are probably not, has
any empirical basis. We turn in the next section to a preliminary examination
of the evolution of AI as revealed by bibliometric data, with an eye toward
answering this question.
4.5 Data
This analysis draws upon two distinct data sets, one that captures a set of
AI publications from Thomson Reuters Web of Science, and another that
identifies a set of AI patents issued by the US Patent and Trademark Office
(USPTO). In this section, we provide detail on the assembly of these data
sets and summary statistics for variables in the sample.
As previously discussed, peer-reviewed and public domain literature on
AI points to the existence of three distinct fields within AI: robotics,
learning systems, and symbol systems, each composed of numerous subfields. To
track development of each of these using these data, we began by identifying
the publications and patents falling into each of these three fields based on
keywords. Appendix table 4A.1 lists the terms we used to define each field
and identify the papers and patents belonging to it.2 In short, the robotics
field includes approaches in which a system engages with and responds to
environmental conditions; the symbolic systems field attempts to represent
complex concepts through logical manipulation of symbolic
representations; and the learning systems field processes data through analytical
programs modeled on neurologic systems.
4.5.1 Publication Sample and Summary Statistics
Our analysis focuses on journal articles and book publications through
the Web of Science from 1955 to 2015. We conducted a keyword search
utilizing the keywords described in appendix table 4A.1 (we tried several
variants of these keywords and alternative algorithmic approaches, but this
did not result in a meaningful difference in the publication set). We are able
to gather detailed information about each publication, including
publication year, journal information, topical information, as well as author and
institutional affiliations.
2. Ironically enough, we relied upon human intelligence rather than machine learning to develop this classification system and apply it to this data set.
This search yields 98,124 publications. We then code each publication into
one of the three main fields of AI, as described earlier. Overall, relative to an
initial data set of 98,124, we are able to uniquely classify 95,840 publications
as symbolic systems, learning systems, robotics, or “general” AI (we drop
papers that involve combinations of these three fields). Table 4.2 reports the
summary statistics for this sample.
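The coding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword lists below are hypothetical placeholders (the actual terms are those in appendix table 4A.1), the function name classify is invented here, and the rule for assigning a record to “general” AI is an assumption, since the text does not spell it out.

```python
from typing import Optional

# Hypothetical placeholder keywords, NOT the terms from appendix table 4A.1.
FIELD_KEYWORDS = {
    "symbolic systems": ["expert system", "symbolic reasoning"],
    "learning systems": ["neural network", "machine learning"],
    "robotics": ["robot"],
}


def classify(text: str) -> Optional[str]:
    """Return the single matching field, "general AI", or None if dropped."""
    lowered = text.lower()
    matched = [
        field
        for field, keywords in FIELD_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    if len(matched) == 1:
        return matched[0]
    # Assumption: a record that names AI but no specific field is "general" AI.
    if not matched and "artificial intelligence" in lowered:
        return "general AI"
    # Records that combine fields (or match nothing) are dropped.
    return None
```

A record whose text mentions both a robot and a neural network matches two fields and is therefore dropped, mirroring the exclusion of papers that combine fields.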
Of the 95,840 publications in the sample, 11,938 (12.5 percent) are
classified as symbolic systems, 58,853 (61.4 percent) as learning, and 20,655
(21.6 percent) as robotics, with the remainder being in the general field of
“artificial intelligence.” To derive a better understanding of the factors that
have shaped the evolution of AI, we create indicators for variables of interest
including organization type (private versus academic), location type (US
domestic versus international), and application type (computer science
versus other application area, in addition to individual subject spaces, e.g.,
biology, materials science, medicine, physics, and economics).
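The field shares quoted in this paragraph follow directly from the counts. The short Python check below recomputes them; the variable names are illustrative, and the count for the residual “artificial intelligence” category is not stated in the text but is implied by subtraction.

```python
# Counts as reported in the text.
TOTAL = 95_840
counts = {
    "symbolic systems": 11_938,
    "learning systems": 58_853,
    "robotics": 20_655,
}

# Share of each field, in percent, rounded to one decimal place.
shares = {field: round(100 * n / TOTAL, 1) for field, n in counts.items()}

# Residual category: not reported directly, derived here by subtraction.
general_ai_count = TOTAL - sum(counts.values())
```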
We identify organization type as academic if the organization of one of
the authors on the publication is an academic institution; 81,998
publications (85.5 percent) and 13,842 (14.4 percent) are produced by academic and
private-sector authors, respectively. We identify publication location as US
domestic if one of the authors on the publication lists the United States as
his or her primary location; 22,436 publications (25 percent of the sample)
are produced domestically.
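Both indicators use an “at least one author” rule, which can be expressed compactly. The sketch below is a minimal illustration; the Author record and its field values ("academic", "private", "US") are hypothetical and do not reflect the actual Web of Science schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Author:
    org_type: str  # hypothetical coding: "academic" or "private"
    country: str   # hypothetical coding of the author's primary location


def is_academic(authors: List[Author]) -> bool:
    """Academic if at least one author's organization is academic."""
    return any(a.org_type == "academic" for a in authors)


def is_us_domestic(authors: List[Author]) -> bool:
    """US domestic if at least one author lists the United States."""
    return any(a.country == "US" for a in authors)


example = [Author("private", "US"), Author("academic", "DE")]
```

Under this rule the example publication counts as both academic and US domestic, even though no single author is both.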
We also differentiate between subject matter. Forty-four percent of the
publications are classified as computer science, with 56 percent classified as
other applications. Summary statistics on the other applications are
provided in table 4.3. The other subjects with the largest number of
publications in the sample include telecommunications (5.5 percent), mathematics
(4.2), neurology (3.8), chemistry (3.7), physics (3.4), biology (3.4), and
medicine (3.1).

Table 4.2
Publication data summary statistics

                          Mean     Std. dev.   Min.    Max.
Publication year          2007     6.15        1990    2015
Symbolic systems          .12      .33         0       1
Learning systems          .61      .48         0       1
Robotics                  .21      .41         0       1
Artificial intelligence   .06      .23         0       1
Computer science          .44      .50         0       1
Other applications        .56      .50         0       1
US domestic               .25      .43         0       1
International             .75      .43         0       1
Observations              95,840

Table 4.3
Distribution of publications across subjects

                          Mean     Std. dev.
Biology                   .034     .18
Economics                 .028     .16
Physics                   .034     .18
Medicine                  .032     .18
Chemistry                 .038     .19
Mathematics               .042     .20
Materials science         .029     .17
Neurology                 .038     .19
Energy                    .015     .12
Radiology                 .015     .12
Telecommunications        .055     .23
Computer science          .44      .50
Observations              95,840

Finally, we create indicator variables to document publication quality
including journal quality (top ten, top twenty-five, and top fifty journals
by impact factor)3 and a count variable for cumulative citation counts. Less
than 1 percent of publications are in a top ten journal, with 2 percent and
10 percent in top twenty-five and top fifty journals, respectively. The average
citation count for a publication in the sample is 4.9.
4.5.2 Patent Sample and Summary Statistics
We undertake a similar approach for gathering a data set of AI patents.
We start with the public-use file of USPTO patents (Marco, Carley, et al.
2015; Marco, Myers, et al. 2015), and filter the data in two ways. First,
we assemble a subset of data by filtering the USPTO Historical Masterfile
on the US Patent Classification System (USPC) number.4 Specifically,
USPC numbers 706 and 901 represent “artificial intelligence” and “robots,”
respectively. Within USPC 706, there are numerous subclasses including
“fuzzy logic hardware,” “plural processing systems,” “machine learning,”
and “knowledge processing systems,” to name a few. We then use the USPC
subclass to identify patents in AI fields of symbolic systems, learning
systems, and robotics. We drop patents prior to 1990, providing a sample of
7,347 patents through 2014.
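The first filtering step can be illustrated with a small Python sketch. It is an assumption-laden example rather than the procedure actually used: the record layout and field names (patent_id, uspc_class, year) are hypothetical and do not reflect the real Historical Masterfile schema, and the text does not say which date (for example, filing or issue) defines the 1990 cutoff.

```python
# USPC classes named in the text.
AI_CLASSES = {706: "artificial intelligence", 901: "robots"}


def filter_ai_patents(records, first_year=1990, last_year=2014):
    """Keep records in USPC 706 or 901 dated within the sample window."""
    return [
        r
        for r in records
        if r["uspc_class"] in AI_CLASSES
        and first_year <= r["year"] <= last_year
    ]


# Made-up records for illustration only.
sample = [
    {"patent_id": "A", "uspc_class": 706, "year": 1995},
    {"patent_id": "B", "uspc_class": 706, "year": 1985},  # before 1990
    {"patent_id": "C", "uspc_class": 901, "year": 2014},
    {"patent_id": "D", "uspc_class": 382, "year": 2000},  # other class
]
kept_ids = [r["patent_id"] for r in filter_ai_patents(sample)]
```

Assigning the retained patents to symbolic systems, learning systems, or robotics would additionally require the USPC subclass, which this sketch leaves out.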
Second, we assemble another subset of AI patents by conducting a title
3. The rankings are collected from Guide2Research, found here: http:// www .guide2research