The Economics of Artificial Intelligence


by Ajay Agrawal

deep learning are by and large innovations that require a significant level of human planning and that apply to a relatively narrow domain of problem-solving (e.g., face recognition, playing Go, picking up a particular object, etc.). While it is, of course, possible that further breakthroughs will lead to a technology that can meaningfully mimic the nature of human subjective intelligence and emotion, the recent advances that have attracted scientific and commercial attention are well removed from these domains.

Second, though most economic and policy analysis of AI draws out consequences from the last two decades of automation to consider the future economic impact of AI (e.g., in job displacement for an ever-increasing number of tasks), it is important to emphasize that there is a sharp difference between the advances in robotics that were a primary focus of applications of AI research during the first decade of the twenty-first century and the potential applications of deep learning that have come to the fore over the last few years.

126 Iain M. Cockburn, Rebecca Henderson, and Scott Stern

As we suggested earlier, current advances in robotics are by and large associated with applications that are highly specialized and that are focused on end-user applications rather than on the innovation process itself, and these advances do not seem as of yet to have translated to a more generally applicable IMI. Robotics is therefore an area where we might focus on the impact of innovation (improved performance) and diffusion (more widespread application) in terms of job displacement versus job enhancement. We see limited evidence as yet of widespread applications of robotics outside industrial automation, or of the scale of improvements in the ability to sense, react to, and manipulate the physical environment that the use of robotics outside manufacturing probably requires. But there are exceptions: developments in the capabilities of “pick and place” robots and rapid progress in autonomous vehicles point to the possibility for robotics to escape manufacturing and become much more broadly used. Advances in robotics may well reveal this area of AI to be a GPT, as defined by the classic criteria.

Some research tools/IMIs based on algorithms have transformed the nature of research in some fields, but have lacked generality. These types of algorithmic research tools, based on a static set of program instructions, are a valuable IMI, but do not appear to have wide applicability outside a specific domain and do not qualify as GPTs. For example, while far from perfect, powerful algorithms to scan brain images (so-called functional magnetic resonance imaging [fMRI]) have transformed our understanding of the human brain, not only through the knowledge they have generated, but also by establishing an entirely new paradigm and protocol for brain research. However, despite its role as a powerful IMI, fMRI lacks the type of general purpose applicability that has been associated with the most important GPTs. In contrast, the latest advances in deep learning have the potential to be both a general purpose IMI and a classic GPT.

Table 4.1 summarizes these ideas.

How might the promise of deep learning as a general purpose IMI be realized? Deep learning promises to be an enormously powerful new tool that allows for the unstructured “prediction” of physical or logical events in contexts where algorithms based on a static set of program instructions (such as classic statistical methods) perform poorly. The development of this new approach to prediction enables a new approach to undertaking scientific and technical research. Rather than focusing on small, well-characterized data sets or testing settings, it is now possible to proceed by identifying large pools of unstructured data that can be used to dynamically develop highly accurate predictions of technical and behavioral phenomena. In pioneering an unstructured approach to predictive drug candidate selection that brings together a vast array of previously disparate clinical and biophysical data, for example, Atomwise may fundamentally reshape the “ideas production function” in drug discovery.

The Impact of Artificial Intelligence on Innovation 127

Table 4.1  General purpose technologies versus methods of invention

                         General purpose technology
Invention of a method
of invention             NO                              YES
-----------------------  ------------------------------  ------------------------------
NO                       Industrial robots               “Sense & react” robots
                         (e.g., Fanuc R2000)             (e.g., autonomous vehicles)
YES                      Statically coded algorithmic    Deep learning
                         tools (e.g., fMRI)

If advances in deep learning do represent the arrival of a general purpose IMI, it is clear that there are likely to be very significant long-run economic, social, and technological consequences. First, as this new IMI diffuses across many application sectors, the resulting explosion in technological opportunities and increased productivity of research and development (R&D) seem likely to generate economic growth that can eclipse any near-term impact of AI on jobs, organizations, and productivity. A more subtle implication of this point is that “past is not prologue”: even if automation over the recent past has resulted in job displacement (e.g., Acemoglu and Restrepo 2017), AI is likely to have at least as important an impact through its ability to enhance the potential for “new tasks” (as in Acemoglu and Restrepo 2018).

Second, the arrival of a general purpose IMI is a sufficiently uncommon occurrence that its impact on economic growth, and on society more broadly, could be profound. There have been only a handful of previous general purpose IMIs and each of these has had an enormous impact, not primarily through their direct effects (e.g., spectacles, in the case of the invention of optical lenses), but through their ability to reshape the ideas production function itself (e.g., telescopes and microscopes). It would therefore be helpful to understand the extent to which deep learning is causing, or will cause, researchers to significantly shift or reorient their approach in order to enhance research productivity (in the spirit of Jones [2009]).

Finally, if deep learning does indeed prove to be a general purpose IMI, it will be important to develop institutions and a policy environment that are conducive to enhancing innovation through this approach, and to do so in a way that promotes competition and social welfare. A central concern here may be the interplay between a key input required for deep learning (large unstructured databases that provide information about physical or logical events) and the nature of competition. While the underlying algorithms for deep learning are in the public domain (and can be, and are being, improved on rapidly), the data pools that are essential to generate predictions may be public or private, and access to them will depend on organizational boundaries, policy, and institutions. Because the performance of deep learning algorithms depends critically on the training data that they are created from, it may be possible, in a particular application area, for a specific company (either an incumbent or start-up) to gain a significant, persistent innovation advantage through its control over data that is independent of traditional economies of scale or demand-side network effects. This “competition for the market” is likely to have several consequences. First, it creates incentives for duplicative racing to establish a data advantage in particular application sectors (say, search, autonomous driving, or cytology) followed by the establishment of durable barriers to entry that may be of significant concern for competition policy. Perhaps even more important, this kind of behavior could result in a balkanization of data within each sector, not only reducing innovative productivity within the sector, but also reducing spillovers back to the deep learning GPT sector, and to other application sectors. This suggests that the proactive development of institutions and policies that encourage competition, data sharing, and openness is likely to be an important determinant of economic gains from the development and application of deep learning.

Our discussion so far has been largely speculative, and it would be useful to know whether our claim that deep learning may be both a general purpose IMI and a GPT, while symbolic logic and robotics are probably not, has any empirical basis. We turn in the next section to a preliminary examination of the evolution of AI as revealed by bibliometric data, with an eye toward answering this question.

4.5 Data

This analysis draws upon two distinct data sets, one that captures a set of AI publications from Thomson Reuters Web of Science, and another that identifies a set of AI patents issued by the US Patent and Trademark Office (USPTO). In this section, we provide detail on the assembly of these data sets and summary statistics for variables in the sample.

As previously discussed, peer-reviewed and public domain literature on AI points to the existence of three distinct fields within AI: robotics, learning systems, and symbol systems, each composed of numerous subfields. To track development of each of these using this data, we began by identifying the publications and patents falling into each of these three fields based on keywords. Appendix table 4A.1 lists the terms we used to define each field and identify the papers and patents belonging to it.[2] In short, the robotics field includes approaches in which a system engages with and responds to environmental conditions; the symbolic systems field attempts to represent complex concepts through logical manipulation of symbolic representations; and the learning systems field processes data through analytical programs modeled on neurologic systems.

4.5.1 Publication Sample and Summary Statistics

Our analysis focuses on journal articles and book publications through the Web of Science from 1955 to 2015. We conducted a keyword search utilizing the keywords described in appendix table 4A.1 (we tried several variants of these keywords and alternative algorithmic approaches, but this did not result in a meaningful difference in the publication set). We are able to gather detailed information about each publication, including publication year, journal information, topical information, as well as author and institutional affiliations.

2. Ironically enough, we relied upon human intelligence rather than machine learning to develop this classification system and apply it to this data set.

This search yields 98,124 publications. We then code each publication into one of the three main fields of AI, as described earlier. Overall, relative to an initial data set of 98,124, we are able to uniquely classify 95,840 publications as symbolic systems, learning systems, robotics, or “general” AI (we drop papers that involve combinations of these three fields). Table 4.2 reports the summary statistics for this sample.
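
The coding rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword lists are hypothetical placeholders (the actual terms are those listed in appendix table 4A.1), the function names are invented for this sketch, and treating a match on the phrase "artificial intelligence" alone as “general” AI is a guess about how that residual category was formed.

```python
from typing import Optional, Set

# Hypothetical placeholder keywords; the real lists are in appendix table 4A.1.
FIELD_KEYWORDS = {
    "symbolic systems": ["symbolic reasoning", "expert system"],
    "learning systems": ["neural network", "machine learning"],
    "robotics": ["robot"],
}
GENERAL_TERM = "artificial intelligence"  # assumed marker for "general" AI


def matched_fields(text: str) -> Set[str]:
    """Return every field whose keywords appear in the text."""
    lowered = text.lower()
    return {
        field
        for field, terms in FIELD_KEYWORDS.items()
        if any(term in lowered for term in terms)
    }


def classify(text: str) -> Optional[str]:
    """Return a unique field, "general AI", or None if the record is dropped."""
    fields = matched_fields(text)
    if len(fields) == 1:
        return next(iter(fields))
    if len(fields) > 1:
        return None  # combinations of fields are dropped from the sample
    if GENERAL_TERM in text.lower():
        return "general AI"
    return None  # no keyword match: not part of the sample
```

A record that mentions both a robot and a neural network matches two fields and is therefore dropped, which mirrors the step that reduces 98,124 search results to 95,840 uniquely classified publications.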

Of the 95,840 publications in the sample, 11,938 (12.5 percent) are classified as symbolic systems, 58,853 (61.4 percent) as learning, and 20,655 (21.6 percent) as robotics, with the remainder being in the general field of “artificial intelligence.” To derive a better understanding of the factors that have shaped the evolution of AI, we create indicators for variables of interest including organization type (private versus academic), location type (US domestic versus international), and application type (computer science versus other application area, in addition to individual subject spaces, e.g., biology, materials science, medicine, physics, and economics).
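
The field shares quoted in this paragraph follow directly from the counts, and the arithmetic can be reproduced as a quick check. The variable and function names below are chosen for this sketch only, and the residual “general” AI count is derived here by subtraction rather than reported in the text.

```python
TOTAL_PUBLICATIONS = 95_840

# Counts as reported in the text.
FIELD_COUNTS = {
    "symbolic systems": 11_938,
    "learning systems": 58_853,
    "robotics": 20_655,
}


def share_percent(count: int, total: int = TOTAL_PUBLICATIONS) -> float:
    """Share of the sample in percent, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {field: share_percent(count) for field, count in FIELD_COUNTS.items()}

# Derived, not reported: publications left in the general field.
general_ai_count = TOTAL_PUBLICATIONS - sum(FIELD_COUNTS.values())

print(shares)
print(general_ai_count)
```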

We identify organization type as academic if the organization of one of the authors on the publication is an academic institution; 81,998 publications (85.5 percent) and 13,842 (14.4 percent) are produced by academic and private-sector authors, respectively. We identify publication location as US domestic if one of the authors on the publication lists the United States as his or her primary location; 22,436 publications (25 percent of the sample) are produced domestically.
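
Both indicators use an "at least one author" rule, which a short Python sketch makes concrete. The record layout, field names, and label strings are assumptions made for illustration; the text does not describe how the underlying Web of Science records are stored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Author:
    organization_type: str  # hypothetical labels: "academic" or "private"
    primary_country: str


def is_academic(authors: List[Author]) -> int:
    """1 if at least one author is at an academic institution, else 0."""
    return int(any(a.organization_type == "academic" for a in authors))


def is_us_domestic(authors: List[Author]) -> int:
    """1 if at least one author lists the United States as primary location."""
    return int(any(a.primary_country == "United States" for a in authors))


paper = [
    Author("private", "United States"),
    Author("academic", "Canada"),
]
print(is_academic(paper), is_us_domestic(paper))
```

Under this rule a paper with one academic coauthor and one industry coauthor counts as academic, so the private-sector category contains only publications with no academic author.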

We also differentiate between subject matter. Forty-four percent of the publications are classified as computer science, with 56 percent classified as other applications. Summary statistics on the other applications are provided in table 4.3. The other subjects with the largest number of publications in the sample include telecommunications (5.5 percent), mathematics (4.2), neurology (3.8), chemistry (3.7), physics (3.4), biology (3.4), and medicine (3.1).

Table 4.2  Publication data summary statistics

                          Mean    Std. dev.   Min.    Max.
Publication year          2007    6.15        1990    2015
Symbolic systems          .12     .33         0       1
Learning systems          .61     .48         0       1
Robotics                  .21     .41         0       1
Artificial intelligence   .06     .23         0       1
Computer science          .44     .50         0       1
Other applications        .56     .50         0       1
US domestic               .25     .43         0       1
International             .75     .43         0       1
Observations              95,840

Table 4.3  Distribution of publications across subjects

                          Mean    Std. dev.
Biology                   .034    .18
Economics                 .028    .16
Physics                   .034    .18
Medicine                  .032    .18
Chemistry                 .038    .19
Mathematics               .042    .20
Materials science         .029    .17
Neurology                 .038    .19
Energy                    .015    .12
Radiology                 .015    .12
Telecommunications        .055    .23
Computer science          .44     .50
Observations              95,840
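
Most rows of tables 4.2 and 4.3 are 0/1 indicators, so each mean is a sample share and the standard deviation is tied to it by the usual formula for a binary variable, sqrt(p(1 - p)). The sketch below is a reading aid rather than the authors' procedure; the function name is invented here, and because the reported means are themselves rounded, differences in the last digit are to be expected for some rows.

```python
import math


def indicator_std(mean: float) -> float:
    """Standard deviation of a 0/1 variable whose share of ones is `mean`."""
    return math.sqrt(mean * (1 - mean))


# Means taken from tables 4.2 and 4.3.
for label, mean in [
    ("Computer science", 0.44),
    ("US domestic", 0.25),
    ("Telecommunications", 0.055),
]:
    print(label, round(indicator_std(mean), 2))
```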

Finally, we create indicator variables to document publication quality including journal quality (top ten, top twenty-five, and top fifty journals by impact factor)[3] and a count variable for cumulative citation counts. Less than 1 percent of publications are in a top ten journal, with 2 percent and 10 percent in top twenty-five and top fifty journals, respectively. The average citation count for a publication in the sample is 4.9.
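
One way to build the journal-quality indicators is sketched below. It assumes each publication carries its journal's rank by impact factor (a hypothetical field, with None for unranked journals) and that the tiers are nested, so a top ten journal also counts as top twenty-five and top fifty. The reported shares, which rise from under 1 percent to 10 percent, are consistent with nesting, but the text does not state it.

```python
from typing import Dict, Optional

# Tier names are invented for this sketch; cutoffs follow the text.
TIER_CUTOFFS = {"top_10": 10, "top_25": 25, "top_50": 50}


def journal_tier_indicators(impact_factor_rank: Optional[int]) -> Dict[str, int]:
    """Return 0/1 indicators for each journal tier, given a 1-based rank."""
    if impact_factor_rank is None:
        return {tier: 0 for tier in TIER_CUTOFFS}
    return {
        tier: int(impact_factor_rank <= cutoff)
        for tier, cutoff in TIER_CUTOFFS.items()
    }


print(journal_tier_indicators(7))
print(journal_tier_indicators(30))
```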

4.5.2 Patent Sample and Summary Statistics

We undertake a similar approach for gathering a data set of AI patents. We start with the public-use file of USPTO patents (Marco, Carley, et al. 2015; Marco, Myers, et al. 2015), and filter the data in two ways. First, we assemble a subset of data by filtering the USPTO Historical Masterfile on the US Patent Classification System (USPC) number.[4] Specifically, USPC numbers 706 and 901 represent “artificial intelligence” and “robots,” respectively. Within USPC 706, there are numerous subclasses including “fuzzy logic hardware,” “plural processing systems,” “machine learning,” and “knowledge processing systems,” to name a few. We then use the USPC subclass to identify patents in AI fields of symbolic systems, learning systems, and robotics. We drop patents prior to 1990, providing a sample of 7,347 patents through 2014.
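
The first filter can be sketched as a class-and-year test. The record layout and names below are hypothetical, and the sketch assumes the 1990 cutoff applies to a single year field on each record; the text does not say whether that year is the filing or the grant year. The mapping from USPC subclasses to the three AI fields is not given in the text and is not attempted here.

```python
from dataclasses import dataclass

# USPC classes named in the text.
AI_USPC_CLASSES = {706: "artificial intelligence", 901: "robots"}


@dataclass
class Patent:
    number: str      # hypothetical identifier
    uspc_class: int  # primary USPC class number
    year: int        # assumed single year field (filing vs. grant not specified)


def in_ai_patent_sample(patent: Patent, first_year: int = 1990,
                        last_year: int = 2014) -> bool:
    """True if the patent is in USPC 706 or 901 and within the year window."""
    in_class = patent.uspc_class in AI_USPC_CLASSES
    in_window = first_year <= patent.year <= last_year
    return in_class and in_window


patents = [
    Patent("A", 706, 1995),
    Patent("B", 901, 1988),  # dropped: before 1990
    Patent("C", 435, 2001),  # dropped: not an AI or robotics class
]
sample = [p.number for p in patents if in_ai_patent_sample(p)]
print(sample)
```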

Second, we assemble another subset of AI patents by conducting a title

3. The rankings are collected from Guide2Research, found here: http://www.guide2research
