the game moves back to 1. With probability λ_i, the decision maker gains this knowledge. The decision maker can then take an action, uncertainty is resolved and payoffs are realized, and we move to a new decision stage (back to 1). If no action is taken, a period of time elapses and the current decision stage continues.
3. The decision maker chooses whether to apply judgment to the other state. If an action is chosen, uncertainty is resolved and payoffs are realized, and we move to a new decision stage (back to 1).
4. If judgment is chosen, with probability 1 − λ_{−i}, they do not find out the payoffs for the risky action in that state, a period of time elapses, and the game moves back to 1. With probability λ_{−i}, the decision maker gains this knowledge. The decision maker then chooses an action, uncertainty is resolved and payoffs are realized, and we move to a new decision stage (back to 1).
(Footnote 8: The experience frame is considered in Agrawal, Gans, and Goldfarb 2018a.)
Table 3.1    Model parameters

Parameter    Description
S            Known payoff from the safe action
R            Potential payoff from the risky action in a given state (the high outcome)
r            Potential payoff from the risky action in a given state (the low outcome)
θ_i          Label of state i ∈ {1, 2}
μ            Probability of state 1
v            Prior probability that the payoff in a given state is R
λ_i          Probability that the decision maker learns the payoff to the risky action in state θ_i if judgment is applied for one period
δ            Discount factor
When prediction is available, it will become available prior to the beginning of a decision stage. The various parameters are listed in table 3.1.
Suppose that the decision maker focuses on judging the optimal action (i.e., assessing the payoff) for θ_i. Then the expected present discounted payoff from applying judgment is

\lambda_i\delta\big(vR+(1-v)S\big)+(1-\lambda_i)\lambda_i\delta^2\big(vR+(1-v)S\big)+\sum_{t=2}^{\infty}(1-\lambda_i)^t\lambda_i\delta^{t+1}\big(vR+(1-v)S\big)
\qquad=\frac{\lambda_i\delta}{1-(1-\lambda_i)\delta}\big(vR+(1-v)S\big).
The decision maker can eventually learn what to do and will earn a higher payoff than without judgment, but trades this off against a delay in receiving the payoff.
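As a quick numerical check of the closed form above, the following Python sketch (with assumed illustrative parameter values, not taken from the chapter) sums the series directly and compares it with λδ/(1 − (1 − λ)δ) times the expected payoff:

```python
# Numerical check of the judgment payoff series (parameter values are illustrative assumptions, not from the chapter).
lam, delta = 0.6, 0.9        # lambda_i: chance judgment succeeds in a period; delta: discount factor
v, R, S = 0.3, 100.0, 60.0   # prior that the risky payoff is R; high risky payoff; safe payoff

payoff_once_known = v * R + (1 - v) * S   # expected payoff after the risky payoff in state theta_i is learned

# Judgment first succeeds in period t+1 with probability (1 - lam)^t * lam, discounted by delta^(t+1).
series = sum((1 - lam) ** t * lam * delta ** (t + 1) * payoff_once_known for t in range(500))

# Closed form from the text: lambda*delta / (1 - (1 - lambda)*delta) times (vR + (1 - v)S).
closed_form = lam * delta / (1 - (1 - lam) * delta) * payoff_once_known

print(series, closed_form)   # agree up to truncation of the infinite sum
```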
This calculation presumes that the decision maker knows the state (that θ_i is true) prior to engaging in judgment. If this is not the case, then the expected present discounted payoff to judgment on, say, θ_1 alone is

\frac{\lambda_1\delta}{1-(1-\lambda_1)\delta}\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)\max\{\mu r+(1-\mu)(vR+(1-v)r),\,S\}\Big)
\qquad=\frac{\lambda_1\delta}{1-(1-\lambda_1)\delta}\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)S\Big),
where the last step follows from equation (A1). To make exposition simpler, we suppose that λ_1 = λ_2 = λ. In addition, let λ̂ = λδ/(1 − (1 − λ)δ); λ̂ can be given a similar interpretation to λ, the quality of judgment.
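For concreteness, here is a minimal sketch of λ̂ and of the one-state judgment payoff when the state is unknown. The parameter values are assumptions chosen so that the safe action is the default, in the spirit of (A1) and (A2):

```python
# Sketch of lambda-hat and the one-state judgment payoff when the state is unknown.
# Parameter values are assumptions chosen so that vR + (1-v)r <= S and mu*R + (1-mu)*r <= S.
def lam_hat(lam, delta):
    """Effective quality of judgment: lambda*delta / (1 - (1 - lambda)*delta)."""
    return lam * delta / (1 - (1 - lam) * delta)

def judge_one_state_unknown(mu, v, R, r, S, lam, delta):
    # With prob v the judged state's risky payoff is found to be R, with prob 1 - v it is r;
    # either way the state itself is still unknown, so the risky action is weighed against S.
    risky_other = v * R + (1 - v) * r                  # expected risky payoff in the unjudged state
    risky_if_good = mu * R + (1 - mu) * risky_other
    risky_if_bad = mu * r + (1 - mu) * risky_other
    return lam_hat(lam, delta) * (v * max(risky_if_good, S) + (1 - v) * max(risky_if_bad, S))

print(judge_one_state_unknown(mu=0.55, v=0.3, R=100.0, r=10.0, S=60.0, lam=0.6, delta=0.9))
```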
If the strategy were to apply judgment on one state only and then make
a decision, this would be the relevant payoff to consider. However, because
judgment is possible in both states, there are several cases to consider.
First, the decision maker might apply judgment to both states in sequence.
In this case, the expected present discounted payoff is
\hat\lambda^2\Big(v^2R+v(1-v)\max\{\mu R+(1-\mu)r,\,S\}+v(1-v)\max\{\mu r+(1-\mu)R,\,S\}+(1-v)^2S\Big)=\hat\lambda^2\big(v^2R+(1-v^2)S\big),

where the last step follows from equation (A1).
Second, the decision maker might apply judgment to, say, θ_1 first and then, contingent on the outcome there, apply judgment to θ_2. If the decision maker chooses to pursue judgment on θ_2 if the outcome for θ_1 is that the risky action is optimal, the payoff becomes
\hat\lambda\Big(v\hat\lambda\big(vR+(1-v)\max\{\mu R+(1-\mu)r,\,S\}\big)+(1-v)\max\{\mu r+(1-\mu)(vR+(1-v)r),\,S\}\Big)
\qquad=\hat\lambda\big(v\hat\lambda(vR+(1-v)S)+(1-v)S\big).
If the decision maker chooses to pursue judgment on θ_2 after determining that the outcome for θ_1 is that the safe action is optimal, the payoff becomes
\hat\lambda\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)\hat\lambda\big(v\max\{\mu r+(1-\mu)R,\,S\}+(1-v)S\big)\Big)
\qquad=\hat\lambda\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)\hat\lambda S\Big).
Note that this option is dominated by not applying further judgment at all if the outcome for θ_1 is that the safe action is optimal.
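To keep the candidate strategies straight, the following sketch (same assumed parameter values as above) computes the reduced-form payoffs derived in this subsection; the labels anticipate the strategy names J1 and J2 introduced below:

```python
# Reduced-form payoffs of the candidate judgment-only strategies (assumed illustrative values, as above).
mu, v, R, r, S = 0.55, 0.3, 100.0, 10.0, 60.0
lam, delta = 0.6, 0.9
lh = lam * delta / (1 - (1 - lam) * delta)        # lambda-hat

risky_unknown = v * R + (1 - v) * r               # expected risky payoff in an unjudged state

# Judge one state, then act immediately (strategy J1 below).
j1 = lh * (v * max(mu * R + (1 - mu) * risky_unknown, S) + (1 - v) * S)

# Judge one state, continue to the other only if the risky action is found optimal (strategy J2 below).
j2 = lh * (v * lh * (v * R + (1 - v) * S) + (1 - v) * S)

# Judge both states unconditionally before acting (shown above to be dominated).
both = lh ** 2 * (v ** 2 * R + (1 - v ** 2) * S)

print(j1, j2, both, S)
```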
Given this we can prove the following:
Proposition 1: Under (A1) and (A2), and in the absence of any signal
about the state, (a) judging both states and (b) continuing after the discovery
that the safe action is preferred in a state are never optimal.
Proof: Note that judging two states is optimal if

\hat\lambda>\frac{S}{v\max\{\mu r+(1-\mu)R,\,S\}+(1-v)S}
\hat\lambda>\frac{\mu R+(1-\mu)(vR+(1-v)r)}{vR+(1-v)\max\{\mu R+(1-\mu)r,\,S\}}.

As (A2) implies that μr + (1 − μ)R ≤ S, the first condition reduces to λ̂ > 1. Thus, (a) judging two states is dominated by judging one state and continuing to explore only if the risky action is found to be optimal in that state.
Turning to the strategy of continuing to apply judgment only if the
safe action is found to be preferred in a state, we can compare this to the
payoff from applying judgment to one state and then acting immediately.
Note that
\hat\lambda\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)\hat\lambda S\Big)>\hat\lambda\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)S\Big).
This can never hold, proving that (b) is dominated.
The intuition is similar to Propositions 1 and 2 in Bolton and Faure-Grimaud (2009). In particular, applying judgment is only useful if it is going to lead the decision maker to switch to the risky action. Thus, it is never worthwhile to unconditionally explore a second state, as doing so may not change the action taken. Similarly, if judging one state reveals that the safe action continues to be optimal in that state, then, given the uncertainty about which state will arise, the risky action will never be chosen even if the payoff to the risky action in the second state becomes known. Hence, further judgment is not worthwhile, and it is better to choose immediately at that point rather than delay the inevitable.
Given this proposition, there are only two strategies that are potentially optimal (in the absence of prediction). One strategy (which we will term here J1) is where judgment is applied to one state and, if the risky action is optimal, that action is taken immediately; otherwise, the safe default is taken immediately. The state where judgment is applied first is the state most likely to arise; this will be state 1 if μ > 1/2. This strategy might be chosen if

\hat\lambda\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)S\Big)>S
\iff\hat\lambda>\hat\lambda_{J1}\equiv\frac{S}{v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)S},

which clearly requires that μR + (1 − μ)(vR + (1 − v)r) > S.
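A short sketch of this threshold, under the same assumed parameter values:

```python
# The J1 threshold: judgment on one state beats acting immediately only above lambda-hat_J1 (assumed values).
mu, v, R, r, S = 0.55, 0.3, 100.0, 10.0, 60.0

risky_unknown = v * R + (1 - v) * r
value_if_judged = v * max(mu * R + (1 - mu) * risky_unknown, S) + (1 - v) * S

lam_hat_J1 = S / value_if_judged
print(lam_hat_J1)   # below 1 only when mu*R + (1 - mu)*(vR + (1 - v)r) > S
```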
The other strategy (which we will term here J2) is where judgment is applied to one state and, if the risky action is optimal, judgment is then applied to the next state; otherwise, the safe default is taken immediately. Note that J2 is preferred to J1 if

\hat\lambda\big(v\hat\lambda(vR+(1-v)S)+(1-v)S\big)>\hat\lambda\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)S\Big)
\iff\hat\lambda\,v(vR+(1-v)S)>v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}
\iff\hat\lambda>\frac{\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}}{vR+(1-v)S}.
This is intuitive. Basically, it is only when the efficiency of judgment is sufficiently high that more judgment is applied. However, for this inequality to be relevant, J2 must also be preferred to the status quo yielding a payoff of S. Thus, J2 is not dominated if
\hat\lambda>\hat\lambda_{J2}\equiv\max\left\{\frac{\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}}{vR+(1-v)S},\ \frac{\sqrt{S\big(4v^2R+S(1+2v-3v^2)\big)}-(1-v)S}{2v\big(vR+(1-v)S\big)}\right\},

where the first term is the range where J2 dominates J1, while the second term is where J2 dominates S alone; so for J2 to be optimal, λ̂ must exceed both. Note also that as μ → (S − r)/(R − r) (its highest possible level consistent with [A1] and [A2]), then λ̂_J2 → 1.
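The following sketch computes λ̂_J2 as the maximum of its two terms and cross-checks the second term against the quadratic condition it solves, namely λ̂(vλ̂(vR + (1 − v)S) + (1 − v)S) = S (assumed values as before):

```python
# The J2 threshold as the maximum of its two terms (assumed illustrative values).
import math

mu, v, R, r, S = 0.55, 0.3, 100.0, 10.0, 60.0

risky_unknown = v * R + (1 - v) * r
term_vs_J1 = max(mu * R + (1 - mu) * risky_unknown, S) / (v * R + (1 - v) * S)

# Positive root of v(vR + (1-v)S) x^2 + (1-v)S x - S = 0, i.e. where J2's payoff just equals S.
a = v * (v * R + (1 - v) * S)
b = (1 - v) * S
term_vs_S = (math.sqrt(S * (4 * v ** 2 * R + S * (1 + 2 * v - 3 * v ** 2))) - b) / (2 * a)

lam_hat_J2 = max(term_vs_J1, term_vs_S)
print(term_vs_J1, term_vs_S, lam_hat_J2)

# Cross-check: at x = term_vs_S the J2 payoff x*(v*x*(vR + (1-v)S) + (1-v)S) equals S.
x = term_vs_S
print(x * (v * x * (v * R + (1 - v) * S) + (1 - v) * S))   # approximately S
```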
If μR + (1 − μ)(vR + (1 − v)r) > S, note that

\hat\lambda_{J2}>\hat\lambda_{J1}
\iff\frac{\mu R+(1-\mu)(vR+(1-v)r)}{vR+(1-v)S}>\frac{S}{v\big(\mu R+(1-\mu)(vR+(1-v)r)\big)+(1-v)S}
\iff(1-v)S\big(\mu R+(1-\mu)(vR+(1-v)r)-S\big)>v\big(RS-(\mu R+(1-\mu)(vR+(1-v)r))^2\big),
which may not hold for v sufficiently high. However, it can be shown that when λ̂_J2 = λ̂_J1, the two terms of λ̂_J2 are equal, and the second term exceeds the first when λ̂_J2 < λ̂_J1. This implies that in the range where λ̂_J2 < λ̂_J1, J2 dominates J1.
This analysis implies there are two types of regimes with judgment only. If λ̂_J2 > λ̂_J1, then easier decisions (with high λ̂) involve using J2, the next tranche of decisions (with intermediate λ̂) use J1, while the remainder involve no exercise of judgment at all. On the other hand, if λ̂_J2 < λ̂_J1, then the easier decisions involve using J2 while the remainder do not involve judgment at all.
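The two regimes can be illustrated by directly comparing the payoffs of no judgment, J1, and J2 at different levels of λ̂ (assumed parameter values as before, for which λ̂_J2 > λ̂_J1):

```python
# Which judgment-only regime applies at different levels of lambda-hat (assumed values; here lambda-hat_J2 > lambda-hat_J1).
def judgment_only(lam_hat, mu, v, R, r, S):
    risky_unknown = v * R + (1 - v) * r
    j1 = lam_hat * (v * max(mu * R + (1 - mu) * risky_unknown, S) + (1 - v) * S)
    j2 = lam_hat * (v * lam_hat * (v * R + (1 - v) * S) + (1 - v) * S)
    best = max(S, j1, j2)
    label = "no judgment" if best == S else ("J1" if best == j1 else "J2")
    return label, best

for lh in (0.5, 0.95, 0.998):
    print(lh, judgment_only(lh, mu=0.55, v=0.3, R=100.0, r=10.0, S=60.0))
```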
3.4.2 Prediction in the Absence of Judgment
Next, we consider the model with prediction but no judgment. Suppose
that there exists an AI that can, if deployed, identify the state prior to a
decision being made. In other words, prediction, if it occurs, is perfect; an
assumption we will relax in a later section. Initially, suppose there is no
judgment mechanism to determine what the optimal action is in each state.
Recall that, in the absence of prediction or judgment, (A1) ensures that
the safe action will be chosen. If the decision maker knows the state, then
the risky action in a given state is chosen if
vR + (1 – v) r > S.
This contradicts (A1). Thus, the expected payoff is
V_P = S,

which is the same outcome as when there is no judgment or prediction.
3.4.3 Prediction and Judgment Together
Both prediction and judgment can be valuable on their own. The question
we next wish to consider is whether they are complements or substitutes.
While perfect prediction allows you to choose an action based on the actual rather than expected state, it also affords the same opportunity with respect to judgment. As judgment is costly, it is useful not to waste effort considering what action might be taken in a state that does not arise. This was not possible when there was no prediction. But if you receive a prediction regarding the state, you can then apply judgment exclusively to actions in relation to that state. To be sure, that judgment still involves a cost, but at the same time it does not lead to any wasted cognitive resources.
Given this, if the decision maker were to apply judgment after the state is predicted, their expected discounted payoff would be

V_{PJ}=\max\big\{\hat\lambda\big(vR+(1-v)S\big),\,S\big\}.

This represents the highest expected payoff possible (net of the costs of judgment). A necessary condition for both prediction and judgment to be optimal is that λ̂ ≥ λ̂_PJ ≡ S/[vR + (1 − v)S]. Note that λ̂_PJ ≤ λ̂_J1, λ̂_J2.
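A minimal sketch of V_PJ and the threshold λ̂_PJ under the same assumed values:

```python
# Value of prediction plus judgment and the associated threshold (assumed illustrative values).
v, R, S = 0.3, 100.0, 60.0
lam, delta = 0.6, 0.9
lh = lam * delta / (1 - (1 - lam) * delta)

V_PJ = max(lh * (v * R + (1 - v) * S), S)   # judge only the predicted state, or fall back to the safe action
lam_hat_PJ = S / (v * R + (1 - v) * S)      # judgment after a prediction is worthwhile only above this

print(V_PJ, lam_hat_PJ)
```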
3.4.4 Complements or Substitutes?
To evaluate whether prediction and judgment are complements or substitutes, we adopt the following parameterization for the effectiveness of prediction: we assume that with probability e an AI yields a prediction, while otherwise, the decision must be made in its absence (with judgment only). With this parameterization, we can prove the following:

Proposition 2: In the range of λ̂ where λ̂ < λ̂_J2, e and λ̂ are complements; otherwise, they are substitutes.
Proof: Step 1. Is λ̂_J2 > R/[2(vR + (1 − v)S)]? First, note that

\frac{\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}}{vR+(1-v)S}>\frac{R}{2\big(vR+(1-v)S\big)}
\iff\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}>\tfrac{1}{2}R.

Note that by (A2) and since μ > 1/2, S > μR + (1 − μ)r > (1/2)R, so this inequality always holds.
Second, note that

\frac{\sqrt{S\big(4v^2R+S(1+2v-3v^2)\big)}-(1-v)S}{2v\big(vR+(1-v)S\big)}>\frac{R}{2\big(vR+(1-v)S\big)}
\iff S\big(4v^2R+S(1+2v-3v^2)\big)>\big(vR+(1-v)S\big)^2
\iff 2S(2S-R)>v\big(R^2-6RS+4S^2\big),

which holds as the left-hand side is always positive while the right-hand side is always negative.
Step 2: Suppose that μR + (1 − μ)(vR + (1 − v)r) ≤ S; then J1 is never optimal. In this case, the expected payoff is

eV_{PJ}+(1-e)V_{J2}=e\hat\lambda\big(vR+(1-v)S\big)+(1-e)\hat\lambda\Big(v\hat\lambda\big(vR+(1-v)S\big)+(1-v)S\Big).

The mixed partial derivative of this expression with respect to (e, λ̂) is v(R − 2λ̂(vR + (1 − v)S)). This is positive if R/[2(vR + (1 − v)S)] ≥ λ̂. By Step 1, this implies that for λ̂ < λ̂_J2, prediction and judgment are complements; otherwise, they are substitutes.
Step 3: Suppose that μR + (1 − μ)(vR + (1 − v)r) > S. Note that for λ̂_J1 ≤ λ̂ < λ̂_J2, J1 is preferred to J2. In this case, the expected payoff to prediction and judgment is

e\hat\lambda\big(vR+(1-v)S\big)+(1-e)\hat\lambda\Big(v\max\{\mu R+(1-\mu)(vR+(1-v)r),\,S\}+(1-v)S\Big).

The mixed partial derivative of this expression with respect to (e, λ̂) is v(R − max{μR + (1 − μ)(vR + (1 − v)r), S}) > 0. By Step 1, this implies that for λ̂ < λ̂_J2, prediction and judgment are complements; otherwise, they are substitutes.
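As a numerical illustration of the proof's comparative statics (not part of the chapter's argument), the sketch below approximates the cross-partial of the expected payoff in (e, λ̂) by finite differences, using whichever no-prediction strategy is best at each λ̂; a positive value suggests complements, a negative value substitutes:

```python
# Finite-difference check of the cross-partial in (e, lambda-hat); all parameter values are assumptions.
mu, v, R, r, S = 0.55, 0.3, 100.0, 10.0, 60.0

def value(e, lh):
    risky_unknown = v * R + (1 - v) * r
    with_prediction = max(lh * (v * R + (1 - v) * S), S)     # judge only the predicted state
    j1 = lh * (v * max(mu * R + (1 - mu) * risky_unknown, S) + (1 - v) * S)
    j2 = lh * (v * lh * (v * R + (1 - v) * S) + (1 - v) * S)
    without_prediction = max(S, j1, j2)                       # best judgment-only strategy
    return e * with_prediction + (1 - e) * without_prediction

def cross_partial(e, lh, h=1e-4):
    return (value(e + h, lh + h) - value(e + h, lh - h)
            - value(e - h, lh + h) + value(e - h, lh - h)) / (4 * h * h)

for lh in (0.90, 0.998):   # below and above lambda-hat_J2 for these values
    print(lh, cross_partial(0.5, lh))   # positive suggests complements, negative suggests substitutes
```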
The intuition is as follows. When λ̂ < λ̂_J2, then, in the absence of prediction, either no judgment is applied or, alternatively, strategy J1 (with one round of judgment) is optimal; e parameterizes the degree of difference between the expected value with both prediction and judgment and the expected value without prediction, with an increase in λ̂ increasing both. However, with one round of judgment, the increase when judgment is used alone is less than when both are used together. Thus, when λ̂ < λ̂_J2, prediction and judgment are complements.
By contrast, when λ̂ > λ̂_J2, then strategy J2 (with two rounds of judgment) is used in the absence of prediction. In this case, increasing λ̂ increases the expected payoff from judgment alone disproportionately more because judgment is applied on both states, whereas under prediction and judgment it is only applied on one. Thus, improving the quality of judgment reduces the returns to prediction. And so, when λ̂ >