The Economics of Artificial Intelligence
5.4 A Combinatorial-Based Knowledge Production Function with Team Production: An Extended Model
Our basic model assumes that researchers working alone combine the
knowledge to which they have access, A, to discover new knowledge. In
reality, new discoveries are increasingly being made by research teams (Jones
2009; Nielsen 2012; Agrawal, Goldfarb, and Teodoridis 2016). Assuming
5. K elements in Kauffman's original notation.
162 Ajay Agrawal, John McHale, and Alexander Oettl
initially no redundancy in the knowledge that individual members bring to
the team—that is, collective team knowledge is the sum of the knowledge
of the individual team members—combining individual researchers into
teams can greatly expand the knowledge base from which new combinations
of existing knowledge can be made. This also opens up the possibility
of a positive interaction between factors that facilitate the operation of
larger teams and factors that raise the size of the fishing-out/complexity
parameter, θ. New meta technologies such as deep learning can be more
effective in a world where they are operating on a larger knowledge base
due to the ability of researchers to more effectively pool their knowledge by
forming larger teams.
In this section we thus extend the basic model to allow for new knowledge
to be discovered by research teams. For a team with m members and no
overlap in the knowledge of its members, the total knowledge access for the
team is simply mA. (We later relax the assumption of no knowledge overlap
within a team.) Innovations occur as a result of the team combining existing
knowledge to produce new knowledge. Knowledge can be combined by
the team a ideas at a time, where a = 0, 1, . . . , mA. For a given team j with
m members, the total number of possible combinations of units of existing
knowledge (including singletons and the null set) given their combined
knowledge access is
(16)   Z_j = \sum_{a=0}^{mA} \binom{mA}{a} = 2^{mA}.
Assuming again for convenience that A and Z can be treated as continuous,
the per-period translation of potential combinations into valuable new
knowledge by a team is again given by the (asymptotic) constant elasticity
discovery function
(17)   \dot{A}_j = \frac{Z_j^\theta - 1}{\theta} = \frac{(2^{mA})^\theta - 1}{\theta} \quad \text{for } 0 < \theta \le 1,
       \dot{A}_j = \ln Z_j = \ln(2^{mA}) = \ln(2)\, mA \quad \text{for } \theta = 0,
where use is again made of L'Hôpital's rule for the limiting case of θ = 0.
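As an illustrative check (ours, not part of the chapter), the team-level discovery function in equations (16) and (17) can be sketched in a few lines of Python; for small θ the constant-elasticity form converges to the logarithmic form, consistent with the L'Hôpital limit:

```python
import math

def Z(m, A):
    # Equation (16): potential combinations for a team of m members,
    # each contributing non-overlapping knowledge access A.
    return 2.0 ** (m * A)

def discovery(m, A, theta):
    # Equation (17): per-period new knowledge produced by the team.
    if theta == 0:
        return math.log(2) * m * A  # L'Hopital limit
    return (Z(m, A) ** theta - 1) / theta

# For small theta the CES form approaches the theta = 0 (log) form:
print(discovery(3, 2.0, 1e-8))  # close to ln(2)*3*2
print(discovery(3, 2.0, 0))
```

The parameter values here (m = 3, A = 2) are arbitrary illustrations.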
The number of researchers in the economy at a point in time is again L_A
(which we now assume is measured discretely). Research teams can potentially
be formed from any possible combination of the L_A researchers. For
each of these potential teams, an entrepreneur can coordinate the team.
However, for a potential team with m members to form, the entrepreneur
must have relationships with all m members. The need for a relationship
thus places a constraint on feasible teams. The probability of a relationship
existing between the entrepreneur and any given researcher is ρ, and thus
the probability of relationships existing between all members of a team of
size m is ρ^m. Using the formula for a binomial expansion, the expected total
number of feasible teams is
Artificial Intelligence and Recombinant Growth 163
Fig. 5.3 Example of how the distribution of team size varies with ρ

(18)   S = \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m = (1+\rho)^{L_A}.
The average feasible team size is then given by
(19)   \bar{m} = \frac{\sum_{m=0}^{L_A} m \binom{L_A}{m} \rho^m}{\sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m}.
Factorizing the numerator and substituting in the denominator using equation
(18), we obtain a simple expression for the average feasible team size:
(20)   \bar{m} = \frac{\sum_{m=0}^{L_A} m \binom{L_A}{m} \rho^m}{(1+\rho)^{L_A}} = \frac{\rho L_A (1+\rho)^{L_A - 1}}{(1+\rho)^{L_A}} = \frac{\rho}{1+\rho} L_A.
Figure 5.3 shows an example of the full distribution of team sizes (with
L_A = 50) for two different values of ρ. An increase in ρ (i.e., an improvement
in the capability to form teams) will push the distribution to the right and
increase the average team size.
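The closed form in equation (20) is easy to verify numerically; the following sketch (our own illustration, not from the chapter) sums the binomial distribution of equation (19) directly and compares it with the closed form:

```python
import math

def avg_feasible_team_size(L_A, rho):
    # Equation (19): expected size over all feasible teams,
    # computed by direct summation over the binomial weights.
    num = sum(m * math.comb(L_A, m) * rho ** m for m in range(L_A + 1))
    den = sum(math.comb(L_A, m) * rho ** m for m in range(L_A + 1))
    return num / den

L_A, rho = 50, 0.1  # illustrative values
direct = avg_feasible_team_size(L_A, rho)
closed = rho / (1 + rho) * L_A  # equation (20)
print(direct, closed)  # both approximately 4.545
```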
We can now write down the form that the knowledge production function
would take if all possible research teams could form (ignoring for the
moment any stepping-on-toes effects):
(21)   \dot{A} = \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta} \quad \text{for } 0 < \theta \le 1.
We next allow for the fact that only a fraction of the feasible teams will
actually form. Recognising obvious time constraints on the ability of a given
researcher to be part of multiple research teams, we impose the constraint
that each researcher can only be part of one team. However, we assume the
size of any team that successfully forms is drawn from the same distribution
over sizes as the potential teams. Therefore, the expected average team size is
also given by equation (20). With this restriction, we can solve for the total
number of teams, N, from the equation L_A = N[\rho/(1+\rho)]L_A, which implies
N = (1+\rho)/\rho.
Given the assumption that the distribution of actual team sizes is drawn
from the same distribution as the feasible team sizes, the aggregate knowledge
production function (assuming θ > 0) is then given by
(22)   \dot{A} = \frac{(1+\rho)/\rho}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta}
            = \frac{1}{\rho(1+\rho)^{L_A - 1}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta},
where the first term is the actual number of teams as a fraction of the potentially
feasible number of teams. For θ = 0 the aggregate knowledge production
function takes the form
(23)   \dot{A} = \frac{1}{\rho(1+\rho)^{L_A - 1}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, m \ln(2) A
            = \frac{1}{\rho(1+\rho)^{L_A - 1}} \, \ln(2) A \left( \rho L_A (1+\rho)^{L_A - 1} \right)
            = L_A \ln(2) A.
To see intuitively how an increase in ρ could affect aggregate knowledge
discovery when θ > 0, note that from equation (20) an increase in ρ will
increase the average team size of the teams that form. From equation (16),
we see that for a given knowledge access by an individual researcher, the
number of potential combinations increases exponentially with the size of
the team, m (see figure 5.4). This implies that combining two teams of size
m′ to create a team of size 2m′ will more than double the new knowledge
output of the team. Hence, there is a positive interaction between ρ and θ.
On the other hand, when θ = 0, combining the two teams will exactly double
the new knowledge output given the linearity of the relationship between
team size and knowledge output. In this case, the aggregate knowledge is
invariant to the distribution of team sizes.
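The invariance claim for θ = 0 can be checked directly from equation (23); in this sketch (our own illustration, with arbitrary parameter values) aggregate output is the same for any ρ:

```python
import math

def aggregate_theta0(L_A, rho, A):
    # Equation (23): aggregate knowledge production at theta = 0,
    # computed by direct summation over the team-size distribution.
    scale = 1.0 / (rho * (1 + rho) ** (L_A - 1))
    total = sum(math.comb(L_A, m) * rho ** m * m * math.log(2) * A
                for m in range(L_A + 1))
    return scale * total

L_A, A = 20, 5.0
for rho in (0.05, 0.5, 2.0):
    print(aggregate_theta0(L_A, rho, A))  # all equal L_A*ln(2)*A
```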
Fig. 5.4 Team knowledge production and team size

To see this formally, note that from equation (23) we know that when θ = 0,
the partial derivative of Ȧ with respect to ρ must be zero, since ρ does not
appear in the final form of the knowledge production function. This results
from the balancing of two effects as ρ increases. The first (negative) effect is
that the number of teams as a share of the potentially possible teams falls.
The second (positive) effect is that the amount of new knowledge production
if all possible teams do form rises. We can now ask what happens if we raise
θ to a strictly positive value. The first of these effects is unchanged. But the
second effect will be stronger provided that the knowledge production of a
team for any given team size rises with θ. A sufficient condition for this to
be true is that
(24)   A > \frac{(1/\theta)^{1/\theta}}{\ln(2)\, m} \quad \text{for all } m > 0.
We assume that the starting size of the knowledge stock is large enough so
that this condition holds. Moreover, the partial derivative of Ȧ with respect
to ρ will be larger the larger is the value of θ. We show these effects for a
particular example in figure 5.5.
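To illustrate the positive interaction numerically (a sketch under our own illustrative parameter choices, not taken from the chapter), team output from equation (17) for a fixed team size rises with θ:

```python
def team_output(m, A, theta):
    # Equation (17) for theta > 0.
    return ((2.0 ** (m * A)) ** theta - 1) / theta

m, A = 4, 2.0  # illustrative values
outputs = [team_output(m, A, t) for t in (0.1, 0.3, 0.5, 0.9)]
print(outputs)  # strictly increasing in theta
```

Because a larger ρ raises average team size, and larger teams pay off disproportionately when θ is larger, the cross effect of ρ and θ is positive.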
The possibilities of knowledge overlap at the level of the team and duplication
of knowledge outputs between teams create additional complications.
To allow for stepping-on-toes effects, it is useful to first rewrite equation
(22) as
(25)   \dot{A} = \left(1 + \frac{1}{\rho}\right) \frac{1}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{mA})^\theta - 1}{\theta}.
We introduce two stepping-on-toes effects. First, we allow for knowledge
overlap within teams to introduce the potential for redundancy of knowledge.
A convenient way to introduce this effect is to assume that the overlap
Fig. 5.5 Relationships between new knowledge production, ρ, and θ
reduces the effective average team size in the economy from the viewpoint
of generating new knowledge. More specifically, we assume the effective
team size is given by

(26)   m^e = \lambda \bar{m} = \lambda \frac{\rho}{1+\rho} L_A,

where 0 ≤ λ ≤ 1. The extreme case of λ = 0 (full overlap) has each team acting
as if it had effectively a single member; the opposite extreme of λ = 1
(no overlap) has no knowledge redundancy at the level of the team. Second,
we allow for the possibility that new ideas are duplicated across teams. The
effective number of non-idea-duplicating teams is given by

(27)   N^e = N^{1-\sigma} = \left(1 + \frac{1}{\rho}\right)^{1-\sigma},

where 0 ≤ σ ≤ 1. The extreme case of σ = 0 (no duplication) implies that
the effective number of teams is equal to the actual number of teams; the
extreme case of σ = 1 (full duplication) implies that a single team produces
the same number of new ideas as the full set of teams.
We can now add the stepping-on-toes effects—knowledge redundancy
within teams and discovery duplication between teams—to yield the general
form of the knowledge production function for θ > 0:
(28)   \dot{A} = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \frac{1}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \, \frac{(2^{\lambda mA})^\theta - 1}{\theta}.
If we take the limit of equation (28) as θ goes to zero, we reproduce the
limiting case of the knowledge production function. Ignoring integer constraints
on L_A, this knowledge production function again has the form of
the Romer/Jones function:
(29)   \dot{A} = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \frac{1}{(1+\rho)^{L_A}} \sum_{m=0}^{L_A} \binom{L_A}{m} \rho^m \ln(2)\, \lambda m A
            = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \frac{1}{(1+\rho)^{L_A}} \, \lambda \ln(2) A \left( \rho L_A (1+\rho)^{L_A - 1} \right)
            = \left(1 + \frac{1}{\rho}\right)^{1-\sigma} \lambda \frac{\rho}{1+\rho} \ln(2) L_A A.
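As a final consistency check (a sketch we add under arbitrary illustrative parameters, not part of the chapter), the general function of equation (28) approaches the closed form of equation (29) as θ goes to zero:

```python
import math

def aggregate_general(L_A, rho, A, theta, lam, sigma):
    # Equation (28): stepping-on-toes version; lam is the within-team
    # overlap parameter (lambda), sigma the between-team duplication parameter.
    pre = (1 + 1 / rho) ** (1 - sigma) / (1 + rho) ** L_A
    total = sum(math.comb(L_A, m) * rho ** m
                * ((2.0 ** (lam * m * A)) ** theta - 1) / theta
                for m in range(L_A + 1))
    return pre * total

L_A, rho, A, lam, sigma = 20, 0.5, 3.0, 0.7, 0.4
closed = ((1 + 1 / rho) ** (1 - sigma)
          * lam * rho / (1 + rho) * math.log(2) * L_A * A)  # equation (29)
print(aggregate_general(L_A, rho, A, 1e-6, lam, sigma))
print(closed)
```

Setting lam = 1 and sigma = 0 recovers the no-stepping-on-toes case of equation (22).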
We note finally the presence of the relationship parameter ρ in the knowledge
production equation. This can be taken to reflect in part the importance
of (social) relationships in the forming of research teams. Advances
in computer-based technologies such as email and file sharing (as well as
policies and institutions) could also affect this parameter (see, e.g., Agrawal
and Goldfarb [2008] on the effects of the introduction of precursors to
today's internet on collaboration between researchers). Although not the
main focus of this chapter, being able to incorporate the effects of changes
in collaboration technologies increases the richness of the framework for
considering the determinants of the efficiency of knowledge production.
5.5 Discussion
5.5.1 Something New under the Sun? Deep Learning as a New Tool for Discovery
Two key observations motivate the model developed above. First, using
the analogy of finding a needle in a haystack, significant obstacles to discovery
in numerous domains of science and technology result from highly
nonlinear relationships of cause and effect in high-dimensional data. Second,
advances in algorithms such as deep learning (combined with increased
availability of data and computing power) offer the potential to find relevant
knowledge and predict combinations that will yield valuable new discoveries.
Even a cursory review of the scientific and engineering literatures indicates
that needle-in-the-haystack problems are pervasive in many frontier
fields of innovation, especially in areas where matter is manipulated at the
molecular or submolecular level. In the field of genomics, for example, complex
genotype-phenotype interactions make it difficult to identify therapies
that yield valuable improvements in human health or agricultural productivity.
In the field of drug discovery, complex interactions between drug
compounds and biological systems present an obstacle to identifying promising
new drug therapies. And in the field of material sciences, including
nanotechnology, complex interactions between the underlying physical and
chemical mechanisms increase the challenge of predicting the performance
of potential new materials, with potential applications ranging from new
materials to prevent traumatic brain injury to lightweight materials for use
in transportation to reduce dependence on carbon-based fuels (National
Science and Technology Council 2011).
The apparent speed with which deep learning is being applied in these and
other fields suggests it represents a breakthrough general purpose meta
technology for predicting valuable new combinations in highly complex spaces.
Although an in-depth discussion of the technical advances underlying deep
learning is beyond the scope of this chapter, two aspects are worth highlighting.
First, previous generations of machine learning were constrained by the
need to extract features (or explanatory variables) by hand before statistical
analysis. A major advance in machine learning involves the use of "representation
learning" to automatically extract the relevant features.6 Second,
the development and optimization of multilayer neural networks allows
for substantial improvement in the ability to predict outcomes in high-dimensional
spaces with complex nonlinear interactions (LeCun, Bengio,
and Hinton 2015). A recent review of the use of deep learning in computational
biology, for instance, notes that the "rapid increase in biological data
dimensions and acquisition rates is challenging conventional analysis strategies,"
and that "[m]odern machine learning methods, such as deep learning,
promise to leverage very large data sets for finding hidden structure within
them, and for making accurate predictions" (Angermueller et al. 2016, 1).
Another review of the use of deep learning in computational chemistry
highlights how deep learning has a "ubiquity and broad applicability to a
wide range of challenges in the field, including quantitative activity relationship,
virtual screening, protein structure prediction, quantum chemistry,