The Economics of Artificial Intelligence

by Ajay Agrawal, Joshua Gans, and Avi Goldfarb


the game moves back to 1. With probability λ_i, the decision maker gains this knowledge. The decision maker can then take an action, uncertainty is resolved and payoffs are realized, and we move to a new decision stage (back to 1). If no action is taken, a period of time elapses and the current decision stage continues.

3. The decision maker chooses whether to apply judgment to the other state. If an action is chosen, uncertainty is resolved and payoffs are realized and we move to a new decision stage (back to 1).

4. If judgment is chosen, with probability 1 − λ_{−i}, they do not find out the payoffs for the risky action in that state, a period of time elapses, and the game moves back to 1. With probability λ_{−i}, the decision maker gains this knowledge. The decision maker then chooses an action, uncertainty is resolved and payoffs are realized, and we move to a new decision stage (back to 1).
  8. The experience frame is considered in Agrawal, Gans, and Goldfarb (2018a).


Table 3.1   Model parameters

Parameter   Description
S           Known payoff from the safe action
R           Potential payoff from the risky action in a given state (high realization)
r           Potential payoff from the risky action in a given state (low realization)
θ_i         Label of state i ∈ {1, 2}
μ           Probability of state 1
v           Prior probability that the payoff in a given state is R
λ_i         Probability that the decision maker learns the payoff to the risky action in θ_i if judgment is applied for one period
δ           Discount factor

When prediction is available, it will become available prior to the beginning of a decision stage. The various parameters are listed in table 3.1.
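For readers who want to experiment with the model numerically, the parameters of table 3.1 can be collected in a small Python container. The numerical values below are purely illustrative (they are not taken from the chapter) and are chosen only to satisfy the assumptions (A1) and (A2) referenced in the text.

```python
from dataclasses import dataclass

@dataclass
class ModelParameters:
    """Parameters of the two-state decision problem in table 3.1 (illustrative values only)."""
    S: float = 6.0       # known payoff from the safe action
    R: float = 10.0      # potential payoff from the risky action (high realization)
    r: float = 0.0       # potential payoff from the risky action (low realization)
    mu: float = 0.55     # probability of state 1
    v: float = 0.3       # prior probability that the payoff in a given state is R
    lam: float = 0.9     # probability that judgment reveals the risky payoff in one period
    delta: float = 0.99  # discount factor

p = ModelParameters()
assert p.v * p.R + (1 - p.v) * p.r <= p.S    # (A1): the safe action is the default choice
assert p.mu * p.r + (1 - p.mu) * p.R <= p.S  # implied by (A2), as used in the proof of proposition 1
```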

Suppose that the decision maker focuses on judging the optimal action (i.e., assessing the payoff) for θ_i. Then the expected present discounted payoff from applying judgment is

\[
\lambda_i\delta\,(vR + (1-v)S) + (1-\lambda_i)\delta\,\lambda_i\delta\,(vR + (1-v)S) + \sum_{t=2}^{\infty}(1-\lambda_i)^t\delta^t\,\lambda_i\delta\,(vR + (1-v)S) = \frac{\lambda_i\delta}{1-(1-\lambda_i)\delta}\,(vR + (1-v)S).
\]

The decision maker eventually can learn what to do and will earn a higher payoff than without judgment, but will trade this off against a delay in the payoff.

This calculation presumes that the decision maker knows the state (that θ_i is true) prior to engaging in judgment. If this is not the case, then the expected present discounted payoff to judgment on, say, θ_1 alone is

\[
\frac{\lambda_1\delta}{1-(1-\lambda_1)\delta}\Big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)\max\{\mu r + (1-\mu)(vR + (1-v)r),\,S\}\Big)
= \frac{\lambda_1\delta}{1-(1-\lambda_1)\delta}\Big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)S\Big),
\]

where the last step follows from equation (A1). To make the exposition simpler, we suppose that λ_1 = λ_2 = λ. In addition, let λ̂ = λδ/(1 − (1 − λ)δ); λ̂ can be given a similar interpretation to λ, the quality of judgment.
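As a quick sanity check on this closed form, the truncated series can be compared with λ̂(vR + (1 − v)S) directly; the sketch below uses illustrative parameter values only.

```python
# Minimal sketch: the discounted sum over judgment attempts equals
# lam_hat * (v*R + (1 - v)*S), with lam_hat = lam*delta / (1 - (1 - lam)*delta).
lam, delta = 0.9, 0.99        # quality of judgment and discount factor (illustrative)
v, R, S = 0.3, 10.0, 6.0      # prior on the high payoff, risky high payoff, safe payoff

prize = v * R + (1 - v) * S   # expected payoff once the risky payoff in the state is known

# Success on attempt t (t = 1, 2, ...) happens with probability (1 - lam)**(t - 1) * lam
# and pays delta**t * prize.
direct = sum((1 - lam) ** (t - 1) * lam * delta ** t * prize for t in range(1, 10_000))

lam_hat = lam * delta / (1 - (1 - lam) * delta)
print(lam_hat)                 # effective, delay-adjusted quality of judgment
print(direct, lam_hat * prize) # the two numbers agree up to truncation error
```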

If the strategy were to apply judgment on one state only and then make a decision, this would be the relevant payoff to consider. However, because judgment is possible in both states, there are several cases to consider.

First, the decision maker might apply judgment to both states in sequence. In this case, the expected present discounted payoff is

\[
\hat{\lambda}^2\Big(v^2R + v(1-v)\max\{\mu R + (1-\mu)r,\,S\} + v(1-v)\max\{\mu r + (1-\mu)R,\,S\} + (1-v)^2S\Big) = \hat{\lambda}^2\big(v^2R + (1-v^2)S\big),
\]

  where the last step follows from equation (A1).

Second, the decision maker might apply judgment to, say, θ_1 first and then, contingent on the outcome there, apply judgment to θ_2. If the decision maker chooses to pursue judgment on θ_2 if the outcome for θ_1 is that the risky action is optimal, the payoff becomes

\[
\hat{\lambda}\Big(v\hat{\lambda}\big(vR + (1-v)\max\{\mu R + (1-\mu)r,\,S\}\big) + (1-v)\max\{\mu r + (1-\mu)(vR + (1-v)r),\,S\}\Big)
= \hat{\lambda}\big(v\hat{\lambda}(vR + (1-v)S) + (1-v)S\big).
\]

If the decision maker chooses to pursue judgment on θ_2 after determining that the outcome for θ_1 is that the safe action is optimal, the payoff becomes

\[
\hat{\lambda}\Big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)\hat{\lambda}\big(v\max\{\mu r + (1-\mu)R,\,S\} + (1-v)S\big)\Big)
= \hat{\lambda}\Big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)\hat{\lambda}S\Big).
\]

Note that this option is dominated by not applying further judgment at all if the outcome for θ_1 is that the safe action is optimal.

Given this, we can prove the following:

  Proposition 1: Under (A1) and (A2), and in the absence of any signal

  about the state, (a) judging both states and (b) continuing after the discovery

  that the safe action is preferred in a state are never optimal.

Proof: Note that judging two states is optimal if

\[
\hat{\lambda} > \frac{S}{v\max\{\mu r + (1-\mu)R,\,S\} + (1-v)S}
\]
and
\[
\hat{\lambda} > \frac{\mu R + (1-\mu)(vR + (1-v)r)}{vR + (1-v)\max\{\mu R + (1-\mu)r,\,S\}}.
\]

As (A2) implies that μr + (1 − μ)R ≤ S, the first condition reduces to λ̂ > 1, which can never hold since λ̂ < 1. Thus, (a) judging two states is dominated by judging one state and continuing to explore only if the risky action is found to be optimal in that state.

Turning to the strategy of continuing to apply judgment only if the safe action is found to be preferred in a state, we can compare this to the payoff from applying judgment to one state and then acting immediately. Note that

\[
\hat{\lambda}\Big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)\hat{\lambda}S\Big) > \hat{\lambda}\Big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)S\Big).
\]

This can never hold, since λ̂ < 1 implies λ̂S < S, proving that (b) is dominated.

The intuition is similar to Propositions 1 and 2 in Bolton and Faure-Grimaud (2009). In particular, applying judgment is only useful if it is going to lead the decision maker to switch to the risky action. Thus, it is never worthwhile to unconditionally explore a second state, as doing so may not change the action taken. Similarly, if judging one state reveals that the safe action continues to be optimal in that state, then, given the remaining uncertainty about which state will arise, the risky action will never be chosen even if its payoff in the second state becomes known. Further judgment is therefore not worthwhile, and it is better to choose immediately at that point rather than delay the inevitable.
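A rough numerical cross-check of the proposition, using made-up parameter values that satisfy (A1) and (A2); the payoff expressions are the closed forms displayed above.

```python
# Compare the candidate judgment-only strategies for one illustrative parameterization.
R, r, S = 10.0, 0.0, 6.0
v, mu = 0.3, 0.55
lam, delta = 0.9, 0.99
lam_hat = lam * delta / (1 - (1 - lam) * delta)

risky_good = mu * R + (1 - mu) * (v * R + (1 - v) * r)  # act risky after good news about state 1

no_judgment = S
J1 = lam_hat * (v * max(risky_good, S) + (1 - v) * S)
J2 = lam_hat * (v * lam_hat * (v * R + (1 - v) * S) + (1 - v) * S)
judge_both = lam_hat ** 2 * (v ** 2 * R + (1 - v ** 2) * S)  # unconditional judgment of both states
continue_after_safe = lam_hat * (v * max(risky_good, S) + (1 - v) * lam_hat * S)

print(f"no judgment        : {no_judgment:.4f}")
print(f"J1                 : {J1:.4f}")
print(f"J2                 : {J2:.4f}")
print(f"judge both states  : {judge_both:.4f}  (dominated by J2, per part (a))")
print(f"continue after safe: {continue_after_safe:.4f}  (dominated by J1, per part (b))")
```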

Given this proposition, there are only two strategies that are potentially optimal (in the absence of prediction). One strategy (we will term here J1) is where judgment is applied to one state and, if the risky action is optimal, that action is taken immediately; otherwise, the safe default is taken immediately. The state where judgment is applied first is the state most likely to arise. This will be state 1 if μ > 1/2. This strategy might be chosen if

\[
\hat{\lambda}\big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)S\big) > S
\]
\[
\Leftrightarrow\; \hat{\lambda} > \hat{\lambda}_{J1} \equiv \frac{S}{v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)S},
\]

which clearly requires that μR + (1 − μ)(vR + (1 − v)r) > S (otherwise λ̂_{J1} = 1 and the condition cannot be met).

The other strategy (we will term here J2) is where judgment is applied to one state and, if the risky action is optimal, judgment is then applied to the next state; otherwise, the safe default is taken immediately. Note that J2 is preferred to J1 if

\[
\hat{\lambda}\big(v\hat{\lambda}(vR + (1-v)S) + (1-v)S\big) > \hat{\lambda}\big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)S\big)
\]
\[
\Leftrightarrow\; \hat{\lambda}\,v(vR + (1-v)S) > v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\}
\]
\[
\Leftrightarrow\; \hat{\lambda} > \frac{\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\}}{vR + (1-v)S}.
\]

This is intuitive. Basically, it is only when the efficiency of judgment is sufficiently high that more judgment is applied. However, for this inequality to be relevant, J2 must also be preferred to the status quo yielding a payoff of S. Thus, J2 is not dominated if


\[
\hat{\lambda} > \hat{\lambda}_{J2} \equiv \max\left\{\frac{\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\}}{vR + (1-v)S},\;\; \frac{\sqrt{S\,(4v^2R + S(1 + 2v - 3v^2))} - (1-v)S}{2v\,(vR + (1-v)S)}\right\},
\]

where the first term is the range where J2 dominates J1, while the second term is where J2 dominates S alone; so for J2 to be optimal, it must exceed both. Note also that as μ → (S − r)/(R − r) (its highest possible level consistent with [A1] and [A2]), then λ̂_{J2} → 1.
If μR + (1 − μ)(vR + (1 − v)r) > S, note that

\[
\hat{\lambda}_{J2} > \hat{\lambda}_{J1} \;\Leftrightarrow\; \frac{\mu R + (1-\mu)(vR + (1-v)r)}{vR + (1-v)S} > \frac{S}{v\big(\mu R + (1-\mu)(vR + (1-v)r)\big) + (1-v)S}
\]
\[
\Leftrightarrow\; (1-v)S\big(\mu R + (1-\mu)(vR + (1-v)r) - S\big) > v\Big(RS - \big(\mu R + (1-\mu)(vR + (1-v)r)\big)^2\Big),
\]

which may not hold for v sufficiently high. However, it can be shown that when λ̂_{J2} = λ̂_{J1}, the two terms of λ̂_{J2} are equal, and the second term exceeds the first when λ̂_{J2} ≤ λ̂_{J1}. This implies that in the range where λ̂_{J2} < λ̂_{J1}, J2 dominates J1.

This analysis implies there are two types of regimes with judgment only. If λ̂_{J2} > λ̂_{J1}, then easier decisions (with high λ̂) involve using J2, the next tranche of decisions (with intermediate λ̂) use J1, while the remainder involve no exercise of judgment at all. On the other hand, if λ̂_{J2} < λ̂_{J1}, then the easier decisions involve using J2 while the remainder do not involve judgment at all.
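The two thresholds can be computed directly. The sketch below, again with illustrative parameter values, evaluates λ̂_{J1} and λ̂_{J2} from the expressions above and classifies a few values of λ̂ into the regimes just described.

```python
import math

# Illustrative parameters only (chosen to satisfy (A1) and (A2)).
R, r, S = 10.0, 0.0, 6.0
v, mu = 0.3, 0.55

risky_good = mu * R + (1 - mu) * (v * R + (1 - v) * r)
lam_hat_J1 = S / (v * max(risky_good, S) + (1 - v) * S)
lam_hat_J2 = max(
    max(risky_good, S) / (v * R + (1 - v) * S),
    (math.sqrt(S * (4 * v**2 * R + S * (1 + 2 * v - 3 * v**2))) - (1 - v) * S)
    / (2 * v * (v * R + (1 - v) * S)),
)

def regime(lam_hat: float) -> str:
    """Which judgment-only strategy is used for a given effective quality lam_hat."""
    if lam_hat > lam_hat_J2:
        return "J2"
    if lam_hat_J2 > lam_hat_J1 and lam_hat > lam_hat_J1:
        return "J1"
    return "no judgment"

print(f"lam_hat_J1 = {lam_hat_J1:.4f}, lam_hat_J2 = {lam_hat_J2:.4f}")
for lh in (0.90, 0.955, 0.96, 0.99):
    print(f"lam_hat = {lh:.3f} -> {regime(lh)}")
```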

  3.4.2 Prediction in the Absence of Judgment

  Next, we consider the model with prediction but no judgment. Suppose

  that there exists an AI that can, if deployed, identify the state prior to a

  decision being made. In other words, prediction, if it occurs, is perfect; an

  assumption we will relax in a later section. Initially, suppose there is no

  judgment mechanism to determine what the optimal action is in each state.

  Recall that, in the absence of prediction or judgment, (A1) ensures that

  the safe action will be chosen. If the decision maker knows the state, then

  the risky action in a given state is chosen if

  vR + (1 – v) r > S.

This contradicts (A1). Thus, the expected payoff is

\[
V_P = S,
\]

which is the same outcome as if there were no judgment or prediction.

  3.4.3 Prediction and Judgment Together

  Both prediction and judgment can be valuable on their own. The question

  we next wish to consider is whether they are complements or substitutes.

  While perfect prediction allows you to choose an action based on the


actual rather than expected state, it also affords the same opportunity with

respect to judgment. As judgment is costly, it is useful not to waste considering what action might be taken in a state that does not arise. This was

  not possible when there was no prediction. But if you receive a prediction

  regarding the state, you can then apply judgment exclusively to actions in

  relation to that state. To be sure, that judgment still involves a cost, but at

  the same time does not lead to any wasted cognitive resources.

Given this, if the decision maker were to apply judgment after the state is predicted, their expected discounted payoff would be

\[
V_{PJ} = \max\{\hat{\lambda}\,(vR + (1-v)S),\; S\}.
\]

This represents the highest expected payoff possible (net of the costs of judgment). A necessary condition for both prediction and judgment to be optimal is that λ̂ ≥ λ̂_{PJ} ≡ S/[vR + (1 − v)S]. Note that λ̂_{PJ} ≤ λ̂_{J1}, λ̂_{J2}.
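A small sketch of these values and of the threshold ordering, under the same illustrative parameter assumptions as in the earlier sketches.

```python
import math

# Illustrative parameters, as before.
R, r, S = 10.0, 0.0, 6.0
v, mu = 0.3, 0.55
lam, delta = 0.9, 0.99
lam_hat = lam * delta / (1 - (1 - lam) * delta)

V_P = S                                          # prediction alone: the safe action is still taken
V_PJ = max(lam_hat * (v * R + (1 - v) * S), S)   # judgment applied only to the predicted state
lam_hat_PJ = S / (v * R + (1 - v) * S)

risky_good = mu * R + (1 - mu) * (v * R + (1 - v) * r)
lam_hat_J1 = S / (v * max(risky_good, S) + (1 - v) * S)
lam_hat_J2 = max(
    max(risky_good, S) / (v * R + (1 - v) * S),
    (math.sqrt(S * (4 * v**2 * R + S * (1 + 2 * v - 3 * v**2))) - (1 - v) * S)
    / (2 * v * (v * R + (1 - v) * S)),
)

print(f"V_P = {V_P:.4f}, V_PJ = {V_PJ:.4f}")
print(f"lam_hat_PJ = {lam_hat_PJ:.4f} <= lam_hat_J1 = {lam_hat_J1:.4f} and lam_hat_J2 = {lam_hat_J2:.4f}")
```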

  3.4.4 Complements or Substitutes?

To evaluate whether prediction and judgment are complements or substitutes, we adopt the following parameterization for the effectiveness of prediction: we assume that with probability e an AI yields a prediction, while otherwise the decision must be made in its absence (with judgment only). With this parameterization, we can prove the following:

Proposition 2: In the range of λ where λ̂ < λ̂_{J2}, e and λ are complements; otherwise they are substitutes.

Proof: Step 1. Is λ̂_{J2} > R/[2(vR + (1 − v)S)]? First, note that

\[
\frac{\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\}}{vR + (1-v)S} > \frac{R}{2\,(vR + (1-v)S)}
\;\Leftrightarrow\;
\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} > \tfrac{1}{2}R.
\]

Note that by (A2) and since μ > 1/2, S > μR + (1 − μ)r > (1/2)R, so this inequality always holds.

Second, note that

\[
\frac{\sqrt{S\,(4v^2R + S(1 + 2v - 3v^2))} - (1-v)S}{2v\,(vR + (1-v)S)} > \frac{R}{2\,(vR + (1-v)S)}
\;\Leftrightarrow\;
S\,(4v^2R + S(1 + 2v - 3v^2)) > \big(vR + (1-v)S\big)^2
\;\Leftrightarrow\;
2S(2S - R) > v\big(R^2 - 6RS + 4S^2\big),
\]

which holds as the left-hand side is always positive while the right-hand side is always negative.

Step 2: Suppose that μR + (1 − μ)(vR + (1 − v)r) ≤ S; then J1 is never optimal. In this case, the expected payoff is

\[
eV_{PJ} + (1-e)V_{J2} = e\hat{\lambda}\big(vR + (1-v)S\big) + (1-e)\hat{\lambda}\Big(v\hat{\lambda}\big(vR + (1-v)S\big) + (1-v)S\Big).
\]

The mixed partial derivative of this expression with respect to (e, λ̂) is v(R − 2λ̂(vR + (1 − v)S)). This is positive if R/[2(vR + (1 − v)S)] ≥ λ̂. By Step 1, this implies that for λ̂ < λ̂_{J2}, prediction and judgment are complements; otherwise, they are substitutes.

Step 3: Suppose that μR + (1 − μ)(vR + (1 − v)r) > S. Note that for λ̂_{J1} ≤ λ̂ < λ̂_{J2}, J1 is preferred to J2. In this case, the expected payoff to prediction and judgment is

\[
e\hat{\lambda}\big(vR + (1-v)S\big) + (1-e)\hat{\lambda}\Big(v\max\{\mu R + (1-\mu)(vR + (1-v)r),\,S\} + (1-v)S\Big).
\]

The mixed partial derivative of this expression with respect to (e, λ̂) is v(R − max{μR + (1 − μ)(vR + (1 − v)r), S}) > 0. By Step 1, this implies that for λ̂ < λ̂_{J2}, prediction and judgment are complements; otherwise, they are substitutes.

The intuition is as follows. When λ̂ < λ̂_{J2}, then, in the absence of prediction, either no judgment is applied or, alternatively, strategy J1 (with one round of judgment) is optimal; e parameterizes the degree of difference between the expected value with both prediction and judgment and the expected value without prediction, with an increase in λ increasing both. However, with one round of judgment, the increase when judgment is used alone is less than that when both are used together. Thus, when λ̂ < λ̂_{J2}, prediction and judgment are complements.

By contrast, when λ̂ > λ̂_{J2}, then strategy J2 (with two rounds of judgment) is used in the absence of prediction. In this case, increasing λ increases the expected payoff from judgment alone disproportionately more, because judgment is applied on both states, whereas under prediction and judgment it is only applied on one. Thus, improving the quality of judgment reduces the returns to prediction. And so, when λ̂ > λ̂_{J2}, prediction and judgment are substitutes.
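As a final numerical cross-check of Proposition 2, the mixed partial can be approximated by a finite difference of the overall expected payoff, using the best judgment-only strategy in the no-prediction branch; parameter values remain illustrative.

```python
import math

R, r, S = 10.0, 0.0, 6.0
v, mu = 0.3, 0.55
A = v * R + (1 - v) * S
risky_good = mu * R + (1 - mu) * (v * R + (1 - v) * r)

def value_judgment_only(lh: float) -> float:
    """Best of: no judgment, J1, J2 (Proposition 1 rules out the other strategies)."""
    J1 = lh * (v * max(risky_good, S) + (1 - v) * S)
    J2 = lh * (v * lh * A + (1 - v) * S)
    return max(S, J1, J2)

def value(e: float, lh: float) -> float:
    """Expected payoff when the AI delivers a prediction with probability e."""
    return e * max(lh * A, S) + (1 - e) * value_judgment_only(lh)

def cross_difference(lh: float, de: float = 0.01, dl: float = 0.001) -> float:
    """Discrete analogue of the mixed partial in (e, lam_hat), evaluated at e = 0.5."""
    e = 0.5
    return (value(e + de, lh + dl) - value(e + de, lh)
            - value(e, lh + dl) + value(e, lh)) / (de * dl)

lam_hat_J2 = max(
    max(risky_good, S) / A,
    (math.sqrt(S * (4 * v**2 * R + S * (1 + 2 * v - 3 * v**2))) - (1 - v) * S) / (2 * v * A),
)
print(f"lam_hat_J2 = {lam_hat_J2:.4f}")
print(f"lam_hat = 0.90 (below): cross difference = {cross_difference(0.90):+.3f}  -> complements")
print(f"lam_hat = 0.98 (above): cross difference = {cross_difference(0.98):+.3f}  -> substitutes")
```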