The Economics of Artificial Intelligence


by Ajay Agrawal

.com/ journals/.

4. We utilized data from the Historical Patent Data Files. The complete (unfiltered) data sets from which we derived our data set are available here: https://www.uspto.gov/learning-and-resources/electronic-data-products/historical-patent-data-files.

The Impact of Artificial Intelligence on Innovation 131

Table 4.4    Patent data summary statistics

                           Mean     Std. dev.   Min.    Max.
Application year           2003     6.68        1982    2014
Patent year                2007     6.98        1990    2014
Symbolic systems           .29      .45         0       1
Learning systems           .28      .45         0       1
Robotics                   .41      .49         0       1
Artificial intelligence    .04      .19         0       1
Computer science           .77      .42         0       1
Other applications         .23      .42         0       1
US domestic firms          .59      .49         0       1
International firms        .41      .49         0       1
Org. type academic         .07      .26         0       1
Org. type private          .91      .29         0       1
Observations               13,615

search on patents, with the search terms being the same keywords used to identify academic publications in AI.5 This provides an additional 8,640 AI patents. We then allocate each patent into an AI field by associating the relevant search term with one of the overarching fields. For example, a patent that is found through the search term “neural network” is then classified as a “learning” patent. Some patents found through this search method will be duplicative of those identified by USPC search, that is, the USPC class will be 706 or 901. We drop those duplicates. Together these two subsets create a sample of 13,615 unique AI patents. Summary statistics are provided in table 4.4.
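The allocation and de-duplication steps described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: only the “neural network” to “learning” pairing and the USPC classes 706 and 901 come from the text, while the other keywords, the record layout, and the function name are assumptions made for the example.

```python
# Only "neural network" -> "learning" is stated in the text; the other
# entries are hypothetical placeholders for the authors' keyword list.
KEYWORD_TO_FIELD = {
    "neural network": "learning",
    "expert system": "symbolic",   # assumed for illustration
    "robot": "robotics",           # assumed for illustration
}

# USPC classes already captured by the class-based search (from the text).
USPC_AI_CLASSES = {"706", "901"}


def classify_keyword_hits(hits, uspc_patent_ids):
    """Assign each keyword-search hit to an AI field, dropping duplicates.

    hits: iterable of (patent_id, uspc_class, matched_keyword) tuples.
    uspc_patent_ids: set of patent ids already found by the USPC search.
    Returns a dict mapping patent_id -> field for the additional patents.
    """
    classified = {}
    for patent_id, uspc_class, keyword in hits:
        # A hit in class 706 or 901 duplicates the USPC-based sample.
        if uspc_class in USPC_AI_CLASSES or patent_id in uspc_patent_ids:
            continue
        # Keep the first field assigned to a patent; unknown keywords
        # fall back to the broad "ai" label.
        field = KEYWORD_TO_FIELD.get(keyword.lower(), "ai")
        classified.setdefault(patent_id, field)
    return classified
```

Keeping the first matched field per patent is a simplification chosen here; the text does not say how patents matching several keywords are handled.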

In contrast to the distribution of learning systems, symbolic systems, and robotics in the publication data, the three fields are more evenly distributed in the patent data: 3,832 (28 percent) learning system patents, 3,930 (29 percent) symbolic system patents, and 5,524 (40 percent) robotics patents. The remaining patents are broadly classified only as AI.

5. We utilized data from the Document ID Dataset that is complementary to patent assignment data available on the USPTO website. The complete (unfiltered) data sets from which we derived our data set are available here: https://www.uspto.gov/learning-and-resources/electronic-data-products/patent-assignment-dataset.

132 Iain M. Cockburn, Rebecca Henderson, and Scott Stern

Using ancillary data sets to the USPTO Historical Masterfile, we are able to integrate variables of interest related to organization type, location, and application space. For example, patent assignment data tracks ownership of patents across time. Our interest in this analysis relates to upstream innovative work, and for this reason we capture the initial patent assignee by organization for each patent in our sample. This data enables the creation of indicator variables for organization type and location. We create an indicator for academic organization type by searching the name of the assignee for words relating to academic institutions, for example, “university,” “college,” or “institution.” We do the same for private-sector organizations, searching for “corp.,” “business,” “inc.,” or “co.,” to name a few. We also search for the same words or abbreviations utilized in other languages, for example, “S.p.A.” Only 7 percent of the sample is awarded to academic organizations, while 91 percent is awarded to private entities. The remaining patents are assigned to government entities, for example, the US Department of Defense.
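A name-based indicator of this kind might look like the sketch below. The search terms are the examples quoted above, not the authors' full list. The matching rule, the precedence of academic over private terms, and the function names are assumptions made for illustration.

```python
import re

# Example terms quoted in the text; the authors' full lists are longer.
ACADEMIC_TERMS = ("university", "college", "institution")
PRIVATE_TERMS = ("corp.", "business", "inc.", "co.", "s.p.a.")


def _has_term(name, terms):
    """True if any term appears in the name as a standalone token."""
    lowered = name.lower()
    return any(
        re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", lowered)
        for term in terms
    )


def org_type_indicators(assignee_name):
    """Return 0/1 indicators for academic and private assignees.

    Assumption: academic terms take precedence when both kinds match.
    Names matching neither list (e.g., government entities) get 0 for both.
    """
    academic = _has_term(assignee_name, ACADEMIC_TERMS)
    private = (not academic) and _has_term(assignee_name, PRIVATE_TERMS)
    return {"academic": int(academic), "private": int(private)}
```

For example, `org_type_indicators("US Department of Defense")` returns zeros for both indicators, which corresponds to the residual government category.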

Similarly, we create indicator variables for patents assigned to US firms and international firms, based on the country of the assignee. The international firm data can also be more narrowly identified by specific country (e.g., Canada) or region (e.g., European Union). Fifty-nine percent of our patent sample is assigned to US domestic firms, while 41 percent is assigned to international firms. After the United States, firms from non-Chinese Asian nations account for 28 percent of patents in the sample. Firms from Canada are assigned 1.2 percent of the patents, and firms from China, 0.4 percent.
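The location indicators can be illustrated with a short sketch. The two-letter country codes, the small region lookup, and the function name are hypothetical; the text states only that the indicators are based on the assignee's country and can be narrowed to a country or region.

```python
# Illustrative lookup only; the codes and groupings are assumptions.
REGION_BY_COUNTRY = {
    "CA": "Canada",
    "CN": "China",
    "JP": "Asia (non-Chinese)",
    "KR": "Asia (non-Chinese)",
    "DE": "European Union",
    "FR": "European Union",
}


def location_indicators(assignee_country):
    """Return US/international indicators plus a coarse region label."""
    code = assignee_country.strip().upper()
    is_us = code == "US"
    if is_us:
        region = "United States"
    else:
        region = REGION_BY_COUNTRY.get(code, "Other")
    return {
        "us_domestic": int(is_us),
        "international": int(not is_us),
        "region": region,
    }
```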

Additionally, the USPTO data includes NBER classification and subclassification for each patent (Hall, Jaffe, and Trajtenberg 2001; Marco, Carley, et al. 2015). These subclassifications provide some granular detail about the application sector for which the patent is intended. We create indicator variables for NBER subclassifications related to chemicals (NBER subclass 11, 12, 13, 14, 15, 19), communications (21), computer hardware and software (22), computer peripherals (23), data and storage (24), business software (25), medical fields (31, 32, 33, and 39), electronics fields (41, 42, 43, 44, 45, 46, and 49), automotive fields (53, 54, 55), mechanical fields (51, 52, 59), and other fields (remaining). The vast majority of these patents (71 percent) are in NBER subclass 22, computer hardware and software. Summary statistics of the distribution of patents across application sectors are provided in table 4.5.

Table 4.5    Distribution of patents across application sectors

                                   Mean     Std. dev.
Chemicals                          .007     .08
Communications                     .044     .20
Computer hardware and software     .710     .45
Computer peripherals               .004     .06
Data and storage                   .008     .09
Business software                  .007     .09
All computer science               .773     .42
Medical                            .020     .14
Electronics                        .073     .26
Automotive                         .023     .15
Mechanical                         .075     .26
Other                              .029     .16
Observations                       13,615


4.6 Deep Learning as a GPT: An Exploratory Empirical Analysis

These data allow us to begin examining the claim that the technologies of deep learning may be the nucleus of a general purpose invention for the method of invention.
We begin in figures 4.1A and 4.1B with a simple description of the evolution over time of the three main fields identified in the corpus of patents and papers. The first insight is that the overall field of AI has experienced sharp growth since 1990. While there are only a small handful of papers (fewer than one hundred per year) at the beginning of the period, each of the three fields now generates more than one thousand papers per year. At the same time, there is a striking divergence in activity across fields: each starts from a similar base, but there is a steady increase in the deep learning publications relative to robotics and symbolic systems, particularly after 2009. Interestingly, at least through the end of 2014, there is more similarity in the patterns for all three fields in terms of patenting, with robotics patenting continuing to hold a lead over learning and symbolic systems. However, there does seem to be an acceleration of learning-oriented patents in the last few years of the sample, and so there may be a relative shift toward learning over the last few years, which will manifest itself over time as publication and examination lags work their way through.

Fig. 4.1A Publications by AI field over time

Fig. 4.1B Patents by AI field over time

Within the publication data, there are striking variations across geographies. Figure 4.2A shows the overall growth in learning publications for the United States versus rest-of-world, and figure 4.2B maps the fraction of publications within each geography that are learning related. In the United States, learning is far more variable. Prior to 2000 the United States has a roughly equivalent share of learning-related publications, but the United States then falls significantly behind, only catching up again around 2013. This is consistent with the suggestion in qualitative histories of AI that learning research has had a “faddish” quality in the United States, with the additional insight that the rest of the world (notably Canada) seems to have taken advantage of this inconsistent focus in the United States to develop capabilities and comparative advantage in this field.

Fig. 4.2A Academic institution publication fraction by AI field


Fig. 4.2B Fraction of learning publications by US versus world

With these broad patterns in mind, we turn to our key empirical exercise: whether late in the first decade of the twenty-first century deep learning shifted more toward “application-oriented” research than either robotics or symbolic systems. We begin in figure 4.3 with a simple graph that examines the number of publications over time (across all three fields) in computer science journals versus application-oriented outlets. While there has actually been a stagnation (even a small decline) in the overall number of AI publications in computer science journals, there has been a dramatic increase in the number of AI-related publications in application-oriented outlets. By the end of 2015, we estimate that nearly two-thirds of all publications in AI were in fields beyond computer science.

In figure 4.4 we then look at this division by field. Several patterns are worthy of note. First, as earlier, we can see the relative growth through 2009 of publications in learning versus the two other fields. Also, consistent with more qualitative accounts of the fields, we see the relative stagnation of symbolic systems research relative to robotics and learning. After 2009, however, there is a significant increase in application publications in both robotics and learning, with the learning boost both steeper and more long-lived. Over the course of just seven years, learning-oriented application publications more than double in number, and now represent just under 50 percent of all AI publications.6
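The 2015 figures rest on the extrapolation described in footnote 6: counts observed through September 30 are scaled by 4/3 to cover the remaining three months. A minimal sketch of that arithmetic follows. The function name and the input count are hypothetical and are not drawn from the data.

```python
from fractions import Fraction


def annualize(partial_count, months_observed=9):
    """Scale a partial-year count linearly to a twelve-month estimate.

    With nine months observed, the multiplier is 12/9 = 4/3, matching
    the linear multiplier described in footnote 6.
    """
    multiplier = Fraction(12, months_observed)
    return float(partial_count * multiplier)


# Hypothetical example: 900 publications observed in nine months.
estimated_full_year = annualize(900)
```

A linear multiplier assumes publications arrive at a constant monthly rate, so any seasonality in indexing or publication dates would bias the estimate.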

6. The precise number of publications for 2015 is estimated from the experience of the first nine months (the Web of Science data run through September 30, 2015). We apply a linear multiplier for the remaining three months (i.e., estimating each category by 4/3).

Fig. 4.3 Publications in computer science versus application journals

Fig. 4.4 Publications in computer science versus application journals by AI field

These patterns are, if anything, even more striking if one disaggregates them by the geographic origin of the publication. In figure 4.5, we chart rates of publication in computer science versus applications for the United States as compared to the rest of the world. The striking upward swing in AI application papers that begins in 2009 turns out to be overwhelmingly driven by publications ex United States, though US researchers begin a period of catch-up at an accelerating pace toward the final few years of the sample.


Fig. 4.5 Learning publications in computer science versus applications by United States versus ROW

Finally, we look at how publications have varied across application sectors over time. In table 4.6, we examine the number of publications by application field in each of the three areas of AI across two three-year cohorts (2004–2006 and 2013–2015). There are a number of patterns of interest. First and most important, in a range of application fields including medicine, radiology, and economics, there is a large relative increase in learning-oriented publications relative to robotics and symbolic systems. A number of other sectors, including neuroscience and biology, realize a large increase in learning-oriented research as well as in other AI fields. There are also some more basic fields such as mathematics that have experienced a relative decline in publications (indeed, learning-oriented publications in mathematics experienced a small absolute decline, a striking difference relative to most other fields in the sample). Overall, though it would be useful to identify more precisely the type of research that is being conducted and what is happening at the level of particular subfields, these results are consistent with our broader hypothesis that, alongside the overall growth of AI, learning-oriented research may represent a general purpose technology that is now beginning to be exploited far more systematically across a wide range of application sectors. (See table 4.7.)

Together, these preliminary findings provide some direct empirical evidence for at least one of our hypotheses: learning-oriented AI seems to have some of the signature hallmarks of a general purpose technology. Bibliometric indicators of innovation show that it is rapidly developing, and is being applied in many sectors, and these application sectors themselves include some of the most technologically dynamic parts of the economy.

Table 4.6    Publications across sectors by AI field, 2004–2006 versus 2013–2015

[Table body not recoverable from the extracted text. Legible sector labels: Comp. Sci., Telecom., Radiology, Energy, Neuro., Materials, Math, Chemistry, Medicine, Physics, Economics.]
