The Economics of Artificial Intelligence
5.4 A Combinatorial-Based Knowledge Production Function with Team Production: An Extended Model
Our basic model assumes that researchers working alone combine the
knowledge to which they have access, A, to discover new knowledge. In
reality, new discoveries are increasingly being made by research teams (Jones
2009; Nielsen 2012; Agrawal, Goldfarb, and Teodoridis 2016). Assuming
5. K elements in Kauffman's original notation.
162 Ajay Agrawal, John McHale, and Alexander Oettl
initially no redundancy in the knowledge that individual members bring to
the team—that is, collective team knowledge is the sum of the knowledge
of the individual team members—combining individual researchers into
teams can greatly expand the knowledge base from which new combinations
of existing knowledge can be made. This also opens up the possibility
of a positive interaction between factors that facilitate the operation of
larger teams and factors that raise the size of the fishing-out/complexity
parameter, θ. New meta technologies such as deep learning can be more
effective in a world where they are operating on a larger knowledge base
due to the ability of researchers to more effectively pool their knowledge by
forming larger teams.
In this section we thus extend the basic model to allow for new knowledge
to be discovered by research teams. For a team with m members and no
overlap in the knowledge of its members, the total knowledge access for the
team is simply mA. (We later relax the assumption of no knowledge overlap
within a team.) Innovations occur as a result of the team combining existing
knowledge to produce new knowledge. Knowledge can be combined by
the team a ideas at a time, where a = 0, 1, . . . , mA. For a given team j with
m members, the total number of possible combinations of units of existing
knowledge (including singletons and the null set) given their combined
knowledge access is
(16)   Z_j = \sum_{a=0}^{mA} \binom{mA}{a} = 2^{mA}.
Assuming again for convenience that A and Z can be treated as continuous,
the per-period translation of potential combinations into valuable new
knowledge by a team is again given by the (asymptotic) constant elasticity
discovery function
(17)   \dot{A}_j = \frac{Z_j^\theta - 1}{\theta} = \frac{(2^{mA})^\theta - 1}{\theta} \quad \text{for } 0 < \theta \le 1,
       \dot{A}_j = \ln Z_j = \ln(2^{mA}) = \ln(2)\, mA \quad \text{for } \theta = 0,
where use is again made of L'Hôpital's rule for the limiting case of θ = 0.
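As an illustrative check (ours, not part of the chapter), the team-level discovery function in equations (16) and (17) can be sketched in a few lines of Python; for small θ the constant-elasticity form converges to the logarithmic form, consistent with the L'Hôpital limit:

```python
import math

def Z(m, A):
    # Equation (16): potential combinations for a team of m members,
    # each contributing non-overlapping knowledge access A.
    return 2.0 ** (m * A)

def discovery(m, A, theta):
    # Equation (17): per-period new knowledge produced by the team.
    if theta == 0:
        return math.log(2) * m * A  # L'Hopital limit
    return (Z(m, A) ** theta - 1) / theta

# For small theta the CES form approaches the theta = 0 (log) form:
print(discovery(3, 2.0, 1e-8))  # close to ln(2)*3*2
print(discovery(3, 2.0, 0))
```

The parameter values here (m = 3, A = 2) are arbitrary illustrations.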
The number of researchers in the economy at a point in time is again L_A
(which we now assume is measured discretely). Research teams can potentially
be formed from any possible combination of the L_A researchers. For
each of these potential teams, an entrepreneur can coordinate the team.
However, for a potential team with m members to form, the entrepreneur
must have relationships with all m members. The need for a relationship
thus places a constraint on feasible teams. The probability of a relationship
existing between the entrepreneur and any given researcher is ρ, and thus
the probability of relationships existing between all members of a team of
size m is ρ^m. Using the formula for a binomial expansion, the expected total
number of feasible teams is
Artificial Intelligence and Recombinant Growth 163
Fig. 5.3 Example of how the distribution of team size varies with ρ

(18)   S = \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m = (1+\rho)^{L_A}.
The average feasible team size is then given by
(19)   \bar{m} = \frac{\sum_{m=0}^{L_A} m \binom{L_A}{m} \rho^m}{\sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m}.
Factorizing the numerator and substituting in the denominator using equation
(18), we obtain a simple expression for the average feasible team size:
(20)   \bar{m} = \frac{\sum_{m=0}^{L_A} m \binom{L_A}{m} \rho^m}{(1+\rho)^{L_A}} = \frac{\rho L_A (1+\rho)^{L_A - 1}}{(1+\rho)^{L_A}} = \frac{\rho}{1+\rho} L_A.
Figure 5.3 shows an example of the full distribution of team sizes (with
L_A = 50) for two different values of ρ. An increase in ρ (i.e., an improvement
in the capability to form teams) will push the distribution to the right and
increase the average team size.
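The closed form in equation (20) is easy to verify numerically; the following sketch (our own illustration, not from the chapter) sums the binomial distribution of equation (19) directly and compares it with the closed form:

```python
import math

def avg_feasible_team_size(L_A, rho):
    # Equation (19): expected size over all feasible teams,
    # computed by direct summation over the binomial weights.
    num = sum(m * math.comb(L_A, m) * rho ** m for m in range(L_A + 1))
    den = sum(math.comb(L_A, m) * rho ** m for m in range(L_A + 1))
    return num / den

L_A, rho = 50, 0.1  # illustrative values
direct = avg_feasible_team_size(L_A, rho)
closed = rho / (1 + rho) * L_A  # equation (20)
print(direct, closed)  # both approximately 4.545
```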
We can now write down the form that the knowledge production function
would take if all possible research teams could form (ignoring for the
moment any stepping-on-toes effects):
(21)   \dot{A} = \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta} \quad \text{for } 0 < \theta \le 1.
We next allow for the fact that only a fraction of the feasible teams will
actually form. Recognising obvious time constraints on the ability of a given
researcher to be part of multiple research teams, we impose the constraint
that each researcher can only be part of one team. However, we assume the
size of any team that successfully forms is drawn from the same distribution
over sizes as the potential teams. Therefore, the expected average team size is
also given by equation (20). With this restriction, we can solve for the total
number of teams, N, from the equation L_A = N[\rho/(1+\rho)]L_A, which implies
N = (1+\rho)/\rho.
Given the assumption that the distribution of actual team sizes is drawn
from the same distribution as the feasible team sizes, the aggregate knowledge
production function (assuming θ > 0) is then given by
(22)   \dot{A} = \frac{(1+\rho)/\rho}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta}
            = \frac{1}{\rho(1+\rho)^{L_A - 1}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta},
where the first term is the actual number of teams as a fraction of the potentially
feasible number of teams. For θ = 0 the aggregate knowledge production
function takes the form
(23)   \dot{A} = \frac{1}{\rho(1+\rho)^{L_A - 1}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, m \ln(2) A
            = \frac{1}{\rho(1+\rho)^{L_A - 1}} \, \ln(2) A \left( \rho L_A (1+\rho)^{L_A - 1} \right)
            = L_A \ln(2) A.
To see intuitively how an increase in ρ could affect aggregate knowledge
discovery when θ > 0, note that from equation (20) an increase in ρ will
increase the average team size of the teams that form. From equation (16),
we see that for a given knowledge access by an individual researcher, the
number of potential combinations increases exponentially with the size of
the team, m (see figure 5.4). This implies that combining two teams of size
m′ to create a team of size 2m′ will more than double the new knowledge
output of the team. Hence, there is a positive interaction between ρ and θ.
On the other hand, when θ = 0, combining the two teams will exactly double
the new knowledge output given the linearity of the relationship between
team size and knowledge output. In this case, the aggregate knowledge is
invariant to the distribution of team sizes.
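The invariance claim for θ = 0 can be checked directly from equation (23); in this sketch (our own illustration, with arbitrary parameter values) aggregate output is the same for any ρ:

```python
import math

def aggregate_theta0(L_A, rho, A):
    # Equation (23): aggregate knowledge production at theta = 0,
    # computed by direct summation over the team-size distribution.
    scale = 1.0 / (rho * (1 + rho) ** (L_A - 1))
    total = sum(math.comb(L_A, m) * rho ** m * m * math.log(2) * A
                for m in range(L_A + 1))
    return scale * total

L_A, A = 20, 5.0
for rho in (0.05, 0.5, 2.0):
    print(aggregate_theta0(L_A, rho, A))  # all equal L_A*ln(2)*A
```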
Fig. 5.4 Team knowledge production and team size

To see this formally, note that from equation (23) we know that when θ = 0,
the partial derivative of Ȧ with respect to ρ must be zero, since ρ does not
appear in the final form of the knowledge production function. This results
from the balancing of two effects as ρ increases. The first (negative) effect is
that the number of teams as a share of the potentially possible teams falls.
The second (positive) effect is that the amount of new knowledge production
if all possible teams do form rises. We can now ask what happens if we raise
θ to a strictly positive value. The first of these effects is unchanged. But the
second effect will be stronger provided that the knowledge production of a
team for any given team size rises with θ. A sufficient condition for this to
be true is that
(24)   A > \frac{(1/\theta)^{1/\theta}}{\ln(2)\, m} \quad \text{for all } m > 0.
We assume that the starting size of the knowledge stock is large enough so
that this condition holds. Moreover, the partial derivative of Ȧ with respect
to ρ will be larger the larger is the value of θ. We show these effects for a
particular example in figure 5.5.
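To illustrate the positive interaction numerically (a sketch under our own illustrative parameter choices, not taken from the chapter), team output from equation (17) for a fixed team size rises with θ:

```python
def team_output(m, A, theta):
    # Equation (17) for theta > 0.
    return ((2.0 ** (m * A)) ** theta - 1) / theta

m, A = 4, 2.0  # illustrative values
outputs = [team_output(m, A, t) for t in (0.1, 0.3, 0.5, 0.9)]
print(outputs)  # strictly increasing in theta
```

Because a larger ρ raises average team size, and larger teams pay off disproportionately when θ is larger, the cross effect of ρ and θ is positive.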
The possibilities of knowledge overlap at the level of the team and duplication
of knowledge outputs between teams create additional complications.
To allow for stepping-on-toes effects, it is useful to first rewrite equation
(22) as
(25)   \dot{A} = \left(1 + \frac{1}{\rho}\right) \frac{1}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta}.
We introduce two stepping-on-toes effects. First, we allow for knowledge
overlap within teams to introduce the potential for redundancy of knowledge.
A convenient way to introduce this effect is to assume that the overlap
Fig. 5.5 Relationships between new knowledge production, ρ, and θ
reduces the effective average team size in the economy from the viewpoint
of generating new knowledge. More specifically, we assume the effective
team size is given by

(26)   m^e = \lambda \bar{m} = \lambda \frac{\rho}{1+\rho} L_A,

where 0 ≤ λ ≤ 1. The extreme case of λ = 0 (full overlap) has each team acting
as if it had effectively a single member; the opposite extreme of λ = 1
(no overlap) has no knowledge redundancy at the level of the team. Second,
we allow for the possibility that new ideas are duplicated across teams. The
effective number of non-idea-duplicating teams is given by

(27)   N^e = N^{1-\sigma} = \left(1 + \frac{1}{\rho}\right)^{1-\sigma},

where 0 ≤ σ ≤ 1. The extreme case of σ = 0 (no duplication) implies that
the effective number of teams is equal to the actual number of teams; the
extreme case of σ = 1 (full duplication) implies that a single team produces
the same number of new ideas as the full set of teams.
We can now add the stepping-on-toes effects—knowledge redundancy
within teams and discovery duplication between teams—to yield the general
form of the knowledge production function for θ > 0:
(28)   \dot{A} = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \frac{1}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{\lambda mA})^\theta - 1}{\theta}.
If we take the limit of equation (28) as θ goes to zero, we reproduce the
limiting case of the knowledge production function. Ignoring integer constraints
on L_A, this knowledge production function again has the form of
the Romer/Jones function:
(29)   \dot{A} = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \frac{1}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \ln(2)\, \lambda m A
            = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \frac{1}{(1+\rho)^{L_A}} \, \lambda \ln(2) A \left( \rho L_A (1+\rho)^{L_A - 1} \right)
            = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \lambda \frac{\rho}{1+\rho} \ln(2) L_A A.
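As a final consistency check (a sketch we add under arbitrary illustrative parameters, not part of the chapter), the general function of equation (28) approaches the closed form of equation (29) as θ goes to zero:

```python
import math

def aggregate_general(L_A, rho, A, theta, lam, sigma):
    # Equation (28): stepping-on-toes version; lam is the within-team
    # overlap parameter (lambda), sigma the between-team duplication parameter.
    pre = (1 + 1 / rho) ** (1 - sigma) / (1 + rho) ** L_A
    total = sum(math.comb(L_A, m) * rho ** m
                * ((2.0 ** (lam * m * A)) ** theta - 1) / theta
                for m in range(L_A + 1))
    return pre * total

L_A, rho, A, lam, sigma = 20, 0.5, 3.0, 0.7, 0.4
closed = ((1 + 1 / rho) ** (1 - sigma)
          * lam * rho / (1 + rho) * math.log(2) * L_A * A)  # equation (29)
print(aggregate_general(L_A, rho, A, 1e-6, lam, sigma))
print(closed)
```

Setting lam = 1 and sigma = 0 recovers the no-stepping-on-toes case of equation (22).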
We note finally the presence of the relationship parameter ρ in the knowledge
production equation. This can be taken to reflect in part the importance
of (social) relationships in the forming of research teams. Advances
in computer-based technologies such as email and file sharing (as well as
policies and institutions) could also affect this parameter (see, e.g., Agrawal
and Goldfarb [2008] on the effects of the introduction of precursors to
today's internet on collaboration between researchers). Although not the
main focus of this chapter, being able to incorporate the effects of changes
in collaboration technologies increases the richness of the framework for
considering the determinants of the efficiency of knowledge production.
5.5 Discussion
5.5.1 Something New under the Sun? Deep Learning as a New Tool for Discovery
Two key observations motivate the model developed above. First, using
the analogy of finding a needle in a haystack, significant obstacles to discovery
in numerous domains of science and technology result from highly
nonlinear relationships of cause and effect in high-dimensional data. Second,
advances in algorithms such as deep learning (combined with increased
availability of data and computing power) offer the potential to find relevant
knowledge and predict combinations that will yield valuable new discoveries.
Even a cursory review of the scientific and engineering literatures indicates
that needle-in-the-haystack problems are pervasive in many frontier
fields of innovation, especially in areas where matter is manipulated at the
molecular or submolecular level. In the field of genomics, for example, complex
genotype-phenotype interactions make it difficult to identify therapies
that yield valuable improvements in human health or agricultural productivity.
In the field of drug discovery, complex interactions between drug
compounds and biological systems present an obstacle to identifying promising
new drug therapies. And in the field of material sciences, including
nanotechnology, complex interactions between the underlying physical and
chemical mechanisms increase the challenge of predicting the performance
of potential new materials, with potential applications ranging from new
materials to prevent traumatic brain injury to lightweight materials for use
in transportation to reduce dependence on carbon-based fuels (National
Science and Technology Council 2011).
The apparent speed with which deep learning is being applied in these and
other fields suggests it represents a breakthrough general purpose meta
technology for predicting valuable new combinations in highly complex spaces.
Although an in-depth discussion of the technical advances underlying deep
learning is beyond the scope of this chapter, two aspects are worth highlighting.
First, previous generations of machine learning were constrained by the
need to extract features (or explanatory variables) by hand before statistical
analysis. A major advance in machine learning involves the use of "representation
learning" to automatically extract the relevant features.6 Second,
the development and optimization of multilayer neural networks allows
for substantial improvement in the ability to predict outcomes in high-dimensional
spaces with complex nonlinear interactions (LeCun, Bengio,
and Hinton 2015). A recent review of the use of deep learning in computational
biology, for instance, notes that the "rapid increase in biological data
dimensions and acquisition rates is challenging conventional analysis strategies,"
and that "[m]odern machine learning methods, such as deep learning,
promise to leverage very large data sets for finding hidden structure within
them, and for making accurate predictions" (Angermueller et al. 2016, 1).
Another review of the use of deep learning in computational chemistry
highlights how deep learning has a "ubiquity and broad applicability to a
wide range of challenges in the field, including quantitative activity relationship,
virtual screening, protein structure prediction, quantum chemistry,