
The Economics of Artificial Intelligence


by Ajay Agrawal


  .com/ journals/.

  4. We utilized data from the Historical Patent Data Files. The complete (unfiltered) data sets from which we derived our data set are available here: https://www.uspto.gov/learning-and-resources/electronic-data-products/historical-patent-data-files.

  The Impact of Artificial Intelligence on Innovation 131

  Table 4.4  Patent data summary statistics

                            Mean    Std. dev.   Min.    Max.
  Application year          2003    6.68        1982    2014
  Patent year               2007    6.98        1990    2014
  Symbolic systems          .29     .45         0       1
  Learning systems          .28     .45         0       1
  Robotics                  .41     .49         0       1
  Artificial intelligence   .04     .19         0       1
  Computer science          .77     .42         0       1
  Other applications        .23     .42         0       1
  US domestic firms         .59     .49         0       1
  International firms       .41     .49         0       1
  Org. type academic        .07     .26         0       1
  Org. type private         .91     .29         0       1
  Observations              13,615

  search on patents, with the search terms being the same keywords used to identify academic publications in AI.5 This provides an additional 8,640 AI patents. We then allocate each patent into an AI field by associating the relevant search term with one of the overarching fields. For example, a patent that is found through the search term “neural network” is then classified as a “learning” patent. Some patents found through this search method will be duplicative of those identified by USPC search, that is, the USPC class will be 706 or 901. We drop those duplicates. Together these two subsets create a sample of 13,615 unique AI patents. Summary statistics are provided in table 4.4.
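  The allocation and de-duplication steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout (`patent_id`, `uspc_class`, `title`) and every keyword mapping other than “neural network” are assumptions made for the example.

```python
# Hypothetical keyword-to-field map. Only "neural network" -> "learning"
# comes from the text; the other entries are illustrative assumptions.
KEYWORD_TO_FIELD = {
    "neural network": "learning",
    "expert system": "symbolic",  # assumed example
    "robot": "robotics",          # assumed example
}

# USPC classes that the text says mark a keyword hit as a duplicate.
USPC_AI_CLASSES = {"706", "901"}


def classify_by_keyword(title):
    """Return the AI field of the first matching keyword, or None."""
    text = title.lower()
    for keyword, field in KEYWORD_TO_FIELD.items():
        if keyword in text:
            return field
    return None


def build_sample(uspc_patents, keyword_patents):
    """Combine USPC-class hits with keyword hits, dropping duplicates.

    A keyword hit is treated as a duplicate when its USPC class is 706 or
    901, or when its patent_id is already in the sample.
    """
    sample = {p["patent_id"]: p for p in uspc_patents}
    for patent in keyword_patents:
        if patent["uspc_class"] in USPC_AI_CLASSES:
            continue  # already captured by the USPC class search
        if patent["patent_id"] in sample:
            continue
        field = classify_by_keyword(patent["title"])
        if field is not None:
            sample[patent["patent_id"]] = {**patent, "field": field}
    return sample
```

  Substring matching on the title is the simplest choice for a sketch; a real search would need to decide which text fields to query and how to handle a patent that matches keywords from more than one field.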

  In contrast to the distribution of learning systems, symbolic systems, and robotics in the publication data, the three fields are more evenly distributed in the patent data: 3,832 (28 percent) learning system patents, 3,930 (29 percent) symbolic system patents, and 5,524 (40 percent) robotics patents. The remaining patents are broadly classified only as AI.
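  As a quick arithmetic check, the shares implied by the counts in this paragraph can be recomputed from the sample size of 13,615. The snippet below is only a sketch; the dictionary labels are chosen for the example, and the "remaining" row is derived by subtraction rather than taken from the text.

```python
TOTAL_PATENTS = 13_615  # sample size reported in the text

# Field counts as reported in the paragraph above.
FIELD_COUNTS = {
    "learning systems": 3_832,
    "symbolic systems": 3_930,
    "robotics": 5_524,
}


def field_shares(counts, total):
    """Return each field's share of the total in percent, plus the remainder."""
    shares = {field: 100 * n / total for field, n in counts.items()}
    remaining = total - sum(counts.values())
    shares["AI only (remaining)"] = 100 * remaining / total
    return shares


for field, pct in field_shares(FIELD_COUNTS, TOTAL_PATENTS).items():
    print(f"{field}: {pct:.1f} percent")
```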

  Using ancillary data sets to the USPTO Historical Masterfile, we are able to integrate variables of interest related to organization type, location, and application space. For example, patent assignment data tracks ownership of patents across time. Our interest in this analysis relates to upstream innovative work, and for this reason we capture the initial patent assignee by organization for each patent in our sample. This data enables the creation of indicator variables for organization type and location. We create an indicator for academic organization type by searching the name of the assignee for words relating to academic institutions, for example, “university,” “college,”

  5. We utilized data from the Document ID Dataset that is complementary to patent assignment data available on the USPTO website. The complete (unfiltered) data sets from which we derived our data set are available here: https://www.uspto.gov/learning-and-resources/electronic-data-products/patent-assignment-dataset.

  132 Iain M. Cockburn, Rebecca Henderson, and Scott Stern

  or “institution.” We do the same for private-sector organizations, searching for “corp.,” “business,” “inc.,” or “co.,” to name a few. We also search for the same words or abbreviations utilized in other languages, for example, “S.p.A.” Only 7 percent of the sample is awarded to academic organizations, while 91 percent is awarded to private entities. The remaining patents are assigned to government entities, for example, the US Department of Defense.
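  A name-based indicator of this kind might look like the sketch below. The term lists contain only the examples quoted in the text (the authors note that their own lists are longer), and the rule that an academic match takes precedence over a private-sector match is an assumption made for the example, not something the text states.

```python
import re

# Terms quoted in the text; the authors' full lists are not given.
ACADEMIC_TERMS = ("university", "college", "institution")
PRIVATE_TERMS = ("corp.", "business", "inc.", "co.", "s.p.a.")


def _contains_term(name, term):
    """True if `term` occurs in `name` as a stand-alone token."""
    pattern = r"(?<![a-z0-9])" + re.escape(term) + r"(?![a-z0-9])"
    return re.search(pattern, name) is not None


def org_type_indicators(assignee_name):
    """Return 0/1 indicators for academic and private organization type.

    Assumption: an academic match wins over a private match. Names that
    match neither list (for example, government entities) get 0 for both.
    """
    name = assignee_name.lower()
    academic = any(_contains_term(name, t) for t in ACADEMIC_TERMS)
    private = (not academic) and any(_contains_term(name, t) for t in PRIVATE_TERMS)
    return {"academic": int(academic), "private": int(private)}
```

  Matching on whole tokens avoids false positives such as “inc.” inside “Zinc.”, but the literal “corp.” still misses spellings such as “Corporation”, which is one reason a production list would need more entries.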

  Similarly, we create indicator variables for patents assigned to US firms and international firms, based on the country of the assignee. The international firm data can also be more narrowly identified by specific country (e.g., Canada) or region (e.g., European Union). Fifty-nine percent of our patent sample is assigned to US domestic firms, while 41 percent is assigned to international firms. After the United States, firms from Asian nations other than China account for 28 percent of patents in the sample. Firms from Canada are assigned 1.2 percent of the patents, and firms from China, 0.4 percent.

  Additionally, the USPTO data includes NBER classification and subclassification for each patent (Hall, Jaffe, and Trajtenberg 2001; Marco, Carley, et al. 2015). These subclassifications provide some granular detail about the application sector for which the patent is intended. We create indicator variables for NBER subclassifications related to chemicals (NBER subclasses 11, 12, 13, 14, 15, and 19), communications (21), computer hardware and software (22), computer peripherals (23), data and storage (24), business software (25), medical fields (31, 32, 33, and 39), electronics fields (41, 42, 43, 44, 45, 46, and 49), automotive fields (53, 54, and 55), mechanical fields (51, 52, and 59), and other fields (all remaining subclasses). The vast majority of these patents (71 percent) are in NBER subclass 22, computer hardware and software. Summary statistics of the distribution of patents across application sectors are provided in table 4.5.
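  The subclass-to-sector mapping listed above translates directly into a lookup table. The sketch below uses the subclass codes from the text; the function names are hypothetical, and the "all computer science" grouping is an inference from table 4.5, where the five subclass shares 21 through 25 sum to the reported .773.

```python
# NBER subclass codes per application sector, as listed in the text.
NBER_SECTORS = {
    "chemicals": {11, 12, 13, 14, 15, 19},
    "communications": {21},
    "computer hardware and software": {22},
    "computer peripherals": {23},
    "data and storage": {24},
    "business software": {25},
    "medical": {31, 32, 33, 39},
    "electronics": {41, 42, 43, 44, 45, 46, 49},
    "automotive": {53, 54, 55},
    "mechanical": {51, 52, 59},
}

# Assumed grouping behind the "All computer science" row of table 4.5.
COMPUTER_SCIENCE_SECTORS = {
    "communications",
    "computer hardware and software",
    "computer peripherals",
    "data and storage",
    "business software",
}


def sector_for(subclass):
    """Map an NBER subclass code to its application sector."""
    for sector, codes in NBER_SECTORS.items():
        if subclass in codes:
            return sector
    return "other"


def is_computer_science(subclass):
    """Indicator for the aggregate computer science grouping."""
    return sector_for(subclass) in COMPUTER_SCIENCE_SECTORS


def sector_shares(subclasses):
    """Share of patents in each sector for a list of subclass codes."""
    counts = {}
    for code in subclasses:
        sector = sector_for(code)
        counts[sector] = counts.get(sector, 0) + 1
    return {sector: n / len(subclasses) for sector, n in counts.items()}
```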

  Table 4.5  Distribution of patents across application sectors

                                   Mean    Std. dev.
  Chemicals                        .007    .08
  Communications                   .044    .20
  Computer hardware and software   .710    .45
  Computer peripherals             .004    .06
  Data and storage                 .008    .09
  Business software                .007    .09
  All computer science             .773    .42
  Medical                          .020    .14
  Electronics                      .073    .26
  Automotive                       .023    .15
  Mechanical                       .075    .26
  Other                            .029    .16
  Observations                     13,615
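  Because every row of table 4.5 is the mean of a 0/1 indicator, each standard deviation is pinned down by its mean: for a share p, the standard deviation is the square root of p(1 - p). The sketch below checks a few rows against that identity. The tolerance of 0.01 is an assumption chosen to absorb rounding in the published figures.

```python
import math


def indicator_std(mean):
    """Standard deviation of a 0/1 indicator with the given mean."""
    return math.sqrt(mean * (1.0 - mean))


# (mean, reported std. dev.) pairs copied from table 4.5.
TABLE_4_5_ROWS = {
    "Computer hardware and software": (0.710, 0.45),
    "All computer science": (0.773, 0.42),
    "Medical": (0.020, 0.14),
    "Electronics": (0.073, 0.26),
}

for label, (mean, reported_sd) in TABLE_4_5_ROWS.items():
    implied = indicator_std(mean)
    print(f"{label}: implied {implied:.3f}, reported {reported_sd:.2f}")
```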


  4.6 Deep Learning as a GPT: An Exploratory Empirical Analysis

  These data allow us to begin examining the claim that the technologies of deep learning may be the nucleus of a general purpose invention for the method of invention.

  We begin in figures 4.1A and 4.1B with a simple description of the evolution over time of the three main fields identified in the corpus of patents and papers. The first insight is that the overall field of AI has experienced sharp growth since 1990. While there are only a small handful of papers (fewer than one hundred per year) at the beginning of the period, each of the three fields now generates more than one thousand papers per year. At the same time, there is a striking divergence in activity across fields: each starts from a similar base, but there is a steady increase in the deep learning publications relative to robotics and symbolic systems, particularly after 2009. Interestingly, at least through the end of 2014, there is more similarity in the patterns for all three fields in terms of patenting, with robotics patenting continuing to hold a lead over learning and symbolic systems. However, there does seem to be an acceleration of learning-oriented patents in the last few years of the sample, and so there may be a relative shift toward learning over the last few years, which will manifest itself over time as publication and examination lags work their way through.

  Fig. 4.1A Publications by AI field over time

  Fig. 4.1B Patents by AI field over time

  Within the publication data, there are striking variations across geographies. Figure 4.2A shows the overall growth in learning publications for the United States versus the rest of the world, and figure 4.2B maps the fraction of publications within each geography that are learning related. In the United States, the share of learning-related publications is far more variable. Prior to 2000 the United States has a roughly equivalent share of learning-related publications, but the United States then falls significantly behind, only catching up again around 2013. This is consistent with the suggestion in qualitative histories of AI that learning research has had a “faddish” quality in the United States, with the additional insight that the rest of the world (notably Canada) seems to have taken advantage of this inconsistent focus in the United States to develop capabilities and comparative advantage in this field.

  Fig. 4.2A Academic institution publication fraction by AI field


  Fig. 4.2B Fraction of learning publications by US versus world

  With these broad patterns in mind, we turn to our key empirical exercise: whether late in the first decade of the twenty-first century deep learning shifted more toward “application-oriented” research than either robotics or symbolic systems. We begin in figure 4.3 with a simple graph that examines the number of publications over time (across all three fields) in computer science journals versus application-oriented outlets. While there has actually been a stagnation (even a small decline) in the overall number of AI publications in computer science journals, there has been a dramatic increase in the number of AI-related publications in application-oriented outlets. By the end of 2015, we estimate that nearly two-thirds of all publications in AI were in fields beyond computer science.

  In figure 4.4 we then look at this division by field. Several patterns are worthy of note. First, as earlier, we can see the relative growth through 2009 of publications in learning versus the two other fields. Also, consistent with more qualitative accounts of the fields, we see the relative stagnation of symbolic systems research relative to robotics and learning. After 2009, however, there is a significant increase in application publications in both robotics and learning, and the learning boost is both steeper and more long-lived. Over the course of just seven years, learning-oriented application publications more than double in number, and now represent just under 50 percent of all AI publications.6
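  The 2015 figures rest on the extrapolation described in footnote 6: counts observed through September 30 are scaled by 4/3 to approximate a full year. A minimal sketch of that scaling follows; the function names and the counts used in the example are hypothetical, not values from the data.

```python
from fractions import Fraction


def annualize_partial_year(count, months_observed=9):
    """Scale a partial-year count to a 12-month estimate.

    With nine observed months the multiplier is 12/9 = 4/3, matching the
    linear multiplier described in footnote 6.
    """
    multiplier = Fraction(12, months_observed)
    return float(count * multiplier)


def annualize_counts(counts, months_observed=9):
    """Apply the same multiplier to every category in a dict of counts."""
    return {
        category: annualize_partial_year(n, months_observed)
        for category, n in counts.items()
    }


# Hypothetical nine-month counts, for illustration only.
print(annualize_counts({"learning": 300, "robotics": 150}))
```

  A linear multiplier assumes publications arrive evenly across the year; if indexing lags or year-end publication cycles matter, the 2015 estimates would be biased accordingly.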

  These patterns are, if anything, even more striking if one disaggregates them by the geographic origin of the publication. In figure 4.5, we chart rates of publication in computer science versus applications for the United States as compared to the rest of the world. The striking upward swing in AI application papers that begins in 2009 turns out to be overwhelmingly driven by publications from outside the United States, though US researchers begin a period of catch-up at an accelerating pace toward the final few years of the sample.

  6. The precise number of publications for 2015 is estimated from the experience of the first nine months (the Web of Science data run through September 30, 2015). We apply a linear multiplier for the remaining three months (i.e., estimating each category by 4/3).

  Fig. 4.3 Publications in computer science versus application journals

  Fig. 4.4 Publications in computer science versus application journals by AI field


  Fig. 4.5 Learning publications in computer science versus applications by United States versus ROW

  Finally, we look at how publications have varied across application sectors over time. In table 4.6, we examine the number of publications by application field in each of the three areas of AI across two three-year cohorts (2004–2006 and 2013–2015). There are a number of patterns of interest. First and most important, in a range of application fields including medicine, radiology, and economics, there is a large increase in learning-oriented publications relative to robotics and symbolic systems. A number of other sectors, including neuroscience and biology, realize a large increase in both learning-oriented research and other AI fields. There are also some more basic fields such as mathematics that have experienced a relative decline in publications (indeed, learning-oriented publications in mathematics experienced a small absolute decline, a striking difference relative to most other fields in the sample). Overall, though it would be useful to identify more precisely the type of research that is being conducted and what is happening at the level of particular subfields, these results are consistent with our broader hypothesis that, alongside the overall growth of AI, learning-oriented research may represent a general purpose technology that is now beginning to be exploited far more systematically across a wide range of application sectors. (See table 4.7.)

  Together, these preliminary findings provide some direct empirical evidence for at least one of our hypotheses: learning-oriented AI seems to have some of the signature hallmarks of a general purpose technology. Bibliometric indicators of innovation show that it is rapidly developing and is being applied in many sectors, and these application sectors themselves include some of the most technologically dynamic parts of the economy.

  [Table 4.6: Publications across sectors by AI field, 2004–2006 versus 2013–2015. Recoverable sector labels include Comp. Sci., Telecom., Radiology, Energy, Neuro., Materials, Math, Chemistry, Medicine, Physics, and Economics. The cell values were scrambled in extraction and cannot be matched to rows and columns.]

 
