
The Economics of Artificial Intelligence


by Ajay Agrawal


deep learning are by and large innovations that require a significant level of human planning and that apply to a relatively narrow domain of problem-solving (e.g., face recognition, playing Go, picking up a particular object, etc.). While it is, of course, possible that further breakthroughs will lead to a technology that can meaningfully mimic the nature of human subjective intelligence and emotion, the recent advances that have attracted scientific and commercial attention are well removed from these domains.

Second, though most economic and policy analysis of AI draws out consequences from the last two decades of automation to consider the future economic impact of AI (e.g., in job displacement for an ever-increasing number of tasks), it is important to emphasize that there is a sharp difference between the advances in robotics that were a primary focus of applications of AI research during the first decade of the twenty-first century and the potential applications of deep learning that have come to the fore over the last few years.

As we suggested earlier, current advances in robotics are by and large associated with applications that are highly specialized and that are focused on end-user applications rather than on the innovation process itself, and these advances do not seem as of yet to have translated to a more generally applicable IMI. Robotics is therefore an area where we might focus on the impact of innovation (improved performance) and diffusion (more widespread application) in terms of job displacement versus job enhancement. We see limited evidence as yet of widespread applications of robotics outside industrial automation, or of the scale of improvements in the ability to sense, react to, and manipulate the physical environment that the use of robotics outside manufacturing probably requires. But there are exceptions: developments in the capabilities of “pick and place” robots and rapid progress in autonomous vehicles point to the possibility for robotics to escape manufacturing and become much more broadly used. Advances in robotics may well reveal this area of AI to be a GPT, as defined by the classic criteria.

Some research tools/IMIs based on algorithms have transformed the nature of research in some fields, but have lacked generality. These types of algorithmic research tools, based on a static set of program instructions, are a valuable IMI, but do not appear to have wide applicability outside a specific domain and do not qualify as GPTs. For example, while far from perfect, powerful algorithms to scan brain images (so-called functional magnetic resonance imaging [fMRI]) have transformed our understanding of the human brain, not only through the knowledge they have generated, but also by establishing an entirely new paradigm and protocol for brain research. However, despite its role as a powerful IMI, fMRI lacks the type of general purpose applicability that has been associated with the most important GPTs. In contrast, the latest advances in deep learning have the potential to be both a general purpose IMI and a classic GPT.

Table 4.1 summarizes these ideas.

How might the promise of deep learning as a general purpose IMI be realized? Deep learning promises to be an enormously powerful new tool that allows for the unstructured “prediction” of physical or logical events in contexts where algorithms based on a static set of program instructions (such as classic statistical methods) perform poorly. The development of this new approach to prediction enables a new approach to undertaking scientific and technical research. Rather than focusing on small well-characterized data sets or testing settings, it is now possible to proceed by identifying large pools of unstructured data that can be used to dynamically develop highly accurate predictions of technical and behavioral phenomena. In pioneering an unstructured approach to predictive drug candidate selection that brings together a vast array of previously disparate clinical and biophysical data, for example, Atomwise may fundamentally reshape the “ideas production function” in drug discovery.
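To make this contrast concrete, the sketch below compares a fixed, hand-coded decision rule with a predictor learned from a large pool of simulated high-dimensional data; the data, the scikit-learn model, and all names here are illustrative assumptions rather than the method of any system discussed in the chapter.

```python
# Illustrative sketch only: a static, hand-coded rule versus a predictor
# learned from a (simulated) pool of high-dimensional data. The data-generating
# process and model choice are assumptions for illustration.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Hypothetical "unstructured" inputs: many noisy measurements whose relation
# to the outcome is not known in closed form.
X = rng.normal(size=(5000, 100))
y = (np.tanh(X[:, :10].sum(axis=1)) + 0.1 * rng.normal(size=5000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Static rule: a fixed set of program instructions written in advance.
static_rule = (X_test[:, 0] > 0).astype(int)
print("static rule accuracy:", (static_rule == y_test).mean())

# Learned predictor: its parameters are fit from the data pool itself.
model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
model.fit(X_train, y_train)
print("learned model accuracy:", model.score(X_test, y_test))
```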

Table 4.1    General purpose technologies versus methods of invention

                                            General purpose technology
                                            NO                                YES
  Invention of a method      NO             Industrial robots                 “Sense & react” robots
  of invention                              (e.g., Fanuc R2000)               (e.g., autonomous vehicles)
                             YES            Statically coded algorithmic      Deep learning
                                            tools (e.g., fMRI)

If advances in deep learning do represent the arrival of a general purpose IMI, it is clear that there are likely to be very significant long-run economic, social, and technological consequences. First, as this new IMI diffuses across many application sectors, the resulting explosion in technological opportunities and increased productivity of research and development (R&D) seem likely to generate economic growth that can eclipse any near-term impact of AI on jobs, organizations, and productivity. A more subtle implication of this point is that “past is not prologue”: even if automation over the recent past has resulted in job displacement (e.g., Acemoglu and Restrepo 2017), AI is likely to have at least as important an impact through its ability to enhance the potential for “new tasks” (as in Acemoglu and Restrepo 2018).

Second, the arrival of a general purpose IMI is a sufficiently uncommon occurrence that its impact could be profound for economic growth and its broader impact on society. There have been only a handful of previous general purpose IMIs and each of these has had an enormous impact, not primarily through their direct effects (e.g., spectacles, in the case of the invention of optical lenses), but through their ability to reshape the ideas production function itself (e.g., telescopes and microscopes). It would therefore be helpful to understand the extent to which deep learning is causing, or will cause, researchers to significantly shift or reorient their approach in order to enhance research productivity (in the spirit of Jones [2009]).

Finally, if deep learning does indeed prove to be a general purpose IMI, it will be important to develop institutions and a policy environment that is conducive to enhancing innovation through this approach, and to do so in a way that promotes competition and social welfare. A central concern here may be the interplay between a key input required for deep learning—large unstructured databases that provide information about physical or logical events—and the nature of competition. While the underlying algorithms for deep learning are in the public domain (and can be, and are being, rapidly improved upon), the data pools that are essential to generate predictions may be public or private, and access to them will depend on organizational boundaries, policy, and institutions. Because the performance of deep learning algorithms depends critically on the training data that they are created from, it may be possible, in a particular application area, for a specific company (either an incumbent or a start-up) to gain a significant, persistent innovation advantage through its control over data, an advantage that is independent of traditional economies of scale or demand-side network effects. This “competition for the market” is likely to have several consequences. First, it creates incentives for duplicative racing to establish a data advantage in particular application sectors (say, search, autonomous driving, or cytology), followed by the establishment of durable barriers to entry that may be of significant concern for competition policy. Perhaps even more important, this kind of behavior could result in a balkanization of data within each sector, not only reducing innovative productivity within the sector, but also reducing spillovers back to the deep learning GPT sector and to other application sectors. This suggests that the proactive development of institutions and policies that encourage competition, data sharing, and openness is likely to be an important determinant of economic gains from the development and application of deep learning.

Our discussion so far has been largely speculative, and it would be useful to know whether our claim that deep learning may be both a general purpose IMI and a GPT, while symbolic logic and robotics are probably not, has any empirical basis. We turn in the next section to a preliminary examination of the evolution of AI as revealed by bibliometric data, with an eye toward answering this question.

  4.5 Data

This analysis draws upon two distinct data sets, one that captures a set of AI publications from Thomson Reuters Web of Science, and another that identifies a set of AI patents issued by the US Patent and Trademark Office (USPTO). In this section, we provide detail on the assembly of these data sets and summary statistics for variables in the sample.

As previously discussed, peer-reviewed and public domain literature on AI points to the existence of three distinct fields within AI: robotics, learning systems, and symbol systems, each composed of numerous subfields. To track development of each of these using this data, we began by identifying the publications and patents falling into each of these three fields based on keywords. Appendix table 4A.1 lists the terms we used to define each field and identify the papers and patents belonging to it.2 In short, the robotics field includes approaches in which a system engages with and responds to environmental conditions; the symbolic systems field attempts to represent complex concepts through logical manipulation of symbolic representations; and the learning systems field processes data through analytical programs modeled on neurologic systems.

  4.5.1 Publication Sample and Summary Statistics

Our analysis focuses on journal articles and book publications through the Web of Science from 1955 to 2015. We conducted a keyword search utilizing the keywords described in appendix table 4A.1 (we tried several variants of these keywords and alternative algorithmic approaches, but this did not result in a meaningful difference in the publication set). We are able to gather detailed information about each publication, including publication year, journal information, topical information, as well as author and institutional affiliations.

2. Ironically enough, we relied upon human intelligence rather than machine learning to develop this classification system and apply it to this data set.

This search yields 98,124 publications. We then code each publication into one of the three main fields of AI, as described earlier. Overall, relative to an initial data set of 98,124, we are able to uniquely classify 95,840 publications as symbolic systems, learning systems, robotics, or “general” AI (we drop papers that involve combinations of these three fields). Table 4.2 reports the summary statistics for this sample.
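A minimal sketch of this coding step is below; the keyword lists are placeholders standing in for the terms in appendix table 4A.1, and the treatment of the residual “general” AI category is a simplification.

```python
# Illustrative keyword-based field coding; the keyword lists are placeholders
# standing in for the terms in appendix table 4A.1.
FIELD_KEYWORDS = {
    "robotics": ["robot", "robotics", "autonomous vehicle"],
    "learning systems": ["neural network", "deep learning", "machine learning"],
    "symbolic systems": ["expert system", "knowledge representation", "symbolic reasoning"],
}

def classify_publication(title_and_abstract):
    """Return a single field label, 'general AI' if no field-specific term
    matches, or None for combinations of fields (which are dropped)."""
    text = title_and_abstract.lower()
    matched = [field for field, terms in FIELD_KEYWORDS.items()
               if any(term in text for term in terms)]
    if len(matched) == 1:
        return matched[0]
    if not matched:
        return "general AI"   # retrieved only by generic AI keywords (simplification)
    return None               # spans multiple fields: excluded from the sample

sample = ["Deep learning for robot grasping",            # combination -> dropped
          "A neural network approach to face recognition"]
print([classify_publication(s) for s in sample])          # [None, 'learning systems']
```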

Of the 95,840 publications in the sample, 11,938 (12.5 percent) are classified as symbolic systems, 58,853 (61.4 percent) as learning, and 20,655 (21.6 percent) as robotics, with the remainder being in the general field of “artificial intelligence.” To derive a better understanding of the factors that have shaped the evolution of AI, we create indicators for variables of interest including organization type (private versus academic), location type (US domestic versus international), and application type (computer science versus other application area, in addition to individual subject spaces, e.g., biology, materials science, medicine, physics, economics, etc.).

We identify organization type as academic if the organization of one of the authors on the publication is an academic institution; 81,998 publications (85.5 percent) and 13,842 (14.4 percent) are produced by academic and private-sector authors, respectively. We identify publication location as US domestic if one of the authors on the publication lists the United States as his or her primary location; 22,436 publications (25 percent of the sample) are produced domestically.
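A minimal sketch of how such indicators might be constructed is shown below, assuming a hypothetical pandas DataFrame with author affiliation and country columns; treating every non-academic publication as private sector is a simplification of the definitions in the text.

```python
# Hedged sketch: column names and record layout are hypothetical assumptions.
import pandas as pd

def add_indicators(pubs):
    # Academic if any listed author affiliation is an academic institution.
    pubs["academic"] = pubs["author_affiliations"].apply(
        lambda affs: int(any(a.get("is_academic", False) for a in affs)))
    # Treating the remainder as private sector is a simplification.
    pubs["private_sector"] = 1 - pubs["academic"]
    # US domestic if any author lists the United States as a primary location.
    pubs["us_domestic"] = pubs["author_countries"].apply(
        lambda countries: int("US" in countries))
    pubs["international"] = 1 - pubs["us_domestic"]
    return pubs

# Two toy records for illustration.
toy = pd.DataFrame({
    "author_affiliations": [[{"name": "MIT", "is_academic": True}],
                            [{"name": "Acme AI", "is_academic": False}]],
    "author_countries": [["US", "CA"], ["DE"]],
})
print(add_indicators(toy)[["academic", "private_sector", "us_domestic", "international"]])
```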

We also differentiate between subject matter. Forty-four percent of the publications are classified as computer science, with 56 percent classified as other applications. Summary statistics on the other applications are provided in table 4.3. The other subjects with the largest number of publications in the sample include telecommunications (5.5 percent), mathematics (4.2), neurology (3.8), chemistry (3.7), physics (3.4), biology (3.4), and medicine (3.1).

Table 4.2    Publication data summary statistics

                              Mean      Std. dev.    Min.     Max.
  Publication year            2007      6.15         1990     2015
  Symbolic systems            .12       .33          0        1
  Learning systems            .61       .48          0        1
  Robotics                    .21       .41          0        1
  Artificial intelligence     .06       .23          0        1
  Computer science            .44       .50          0        1
  Other applications          .56       .50          0        1
  US domestic                 .25       .43          0        1
  International               .75       .43          0        1
  Observations                95,840

Table 4.3    Distribution of publications across subjects

                         Mean      Std. dev.
  Biology                .034      .18
  Economics              .028      .16
  Physics                .034      .18
  Medicine               .032      .18
  Chemistry              .038      .19
  Mathematics            .042      .20
  Materials science      .029      .17
  Neurology              .038      .19
  Energy                 .015      .12
  Radiology              .015      .12
  Telecommunications     .055      .23
  Computer science       .44       .50
  Observations           95,840

Finally, we create indicator variables to document publication quality including journal quality (top ten, top twenty-five, and top fifty journals by impact factor)3 and a count variable for cumulative citation counts. Less than 1 percent of publications are in a top ten journal, with 2 percent and 10 percent in top twenty-five and top fifty journals, respectively. The average citation count for a publication in the sample is 4.9.
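The quality indicators can be sketched in the same way, again assuming hypothetical column names (a journal_rank position in the impact-factor ranking and a cumulative citations count):

```python
# Hedged sketch: journal_rank and citations are assumed, illustrative columns.
import pandas as pd

def add_quality_indicators(pubs):
    # Dummy variables for appearing in a top-10 / top-25 / top-50 journal.
    for cutoff in (10, 25, 50):
        pubs[f"top{cutoff}_journal"] = (pubs["journal_rank"] <= cutoff).astype(int)
    return pubs

toy = pd.DataFrame({"journal_rank": [3, 40, 120], "citations": [12, 4, 0]})
print(add_quality_indicators(toy))
print("mean citations:", toy["citations"].mean())
```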

  4.5.2 Patent Sample and Summary Statistics

We undertake a similar approach for gathering a data set of AI patents. We start with the public-use file of USPTO patents (Marco, Carley, et al. 2015; Marco, Myers, et al. 2015), and filter the data in two ways. First, we assemble a subset of data by filtering the USPTO Historical Masterfile on the US Patent Classification System (USPC) number.4 Specifically, USPC numbers 706 and 901 represent “artificial intelligence” and “robots,” respectively. Within USPC 706, there are numerous subclasses including “fuzzy logic hardware,” “plural processing systems,” “machine learning,” and “knowledge processing systems,” to name a few. We then use the USPC subclass to identify patents in AI fields of symbolic systems, learning systems, and robotics. We drop patents prior to 1990, providing a sample of 7,347 patents through 2014.
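As a rough illustration of this filtering step, the sketch below keeps USPC classes 706 and 901, drops pre-1990 grants, and tags each patent with a field; the column names and the subclass-to-field mapping are hypothetical assumptions, since the exact assignment is not reproduced here.

```python
# Hedged sketch of USPC-based filtering; column names and the subclass mapping
# are illustrative, not the authors' actual assignment.
import pandas as pd

AI_CLASSES = {"706", "901"}   # 706 = "artificial intelligence", 901 = "robots"

# Partial, illustrative mapping from USPC 706 subclass titles to AI fields.
SUBCLASS_TITLE_TO_FIELD = {
    "machine learning": "learning systems",
    "knowledge processing systems": "symbolic systems",
}

def filter_ai_patents(patents):
    """Keep USPC 706/901 patents granted in 1990 or later and tag each with a field."""
    ai = patents[patents["uspc_class"].isin(AI_CLASSES)].copy()
    ai = ai[ai["grant_year"] >= 1990]
    ai["field"] = [
        "robotics" if cls == "901"
        else SUBCLASS_TITLE_TO_FIELD.get(sub, "general AI")
        for cls, sub in zip(ai["uspc_class"], ai["uspc_subclass_title"])
    ]
    return ai

toy = pd.DataFrame({
    "uspc_class": ["706", "901", "382"],
    "uspc_subclass_title": ["machine learning", "", "image analysis"],
    "grant_year": [2001, 1995, 2005],
})
print(filter_ai_patents(toy))
```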

  Second, we assemble another subset of AI patents by conducting a title

3. The rankings are collected from Guide2Research, found here: http://www.guide2research

 
