3
Prediction, Judgment, and Complexity
A Theory of Decision-Making and Artificial Intelligence
Ajay Agrawal, Joshua Gans, and Avi Goldfarb

Ajay Agrawal is the Peter Munk Professor of Entrepreneurship at the Rotman School of Management, University of Toronto, and a research associate of the National Bureau of Economic Research. Joshua Gans is professor of strategic management and holder of the Jeffrey S. Skoll Chair of Technical Innovation and Entrepreneurship at the Rotman School of Management, University of Toronto (with a cross appointment in the Department of Economics), and a research associate of the National Bureau of Economic Research. Avi Goldfarb holds the Rotman Chair in Artificial Intelligence and Healthcare and is professor of marketing at the Rotman School of Management, University of Toronto, and is a research associate of the National Bureau of Economic Research.

Our thanks to Andrea Prat, Scott Stern, Hal Varian, and participants at the AEA (Chicago), NBER Summer Institute (2017), NBER Economics of AI Conference (Toronto), Columbia Law School, Harvard Business School, MIT, and University of Toronto for helpful comments. Responsibility for all errors remains our own. The latest version of this chapter is available at joshuagans.com. For acknowledgments, sources of research support, and disclosure of the authors’ material financial relationships, if any, please see http://www.nber.org/chapters/c14010.ack.
3.1 Introduction
There is widespread discussion regarding the impact of machines on
employment (see Autor 2015). In some sense, the discussion mirrors a long-
standing literature on the impact of the accumulation of capital equipment
on employment; specifically, whether capital and labor are substitutes or
complements (Acemoglu 2003). But the recent discussion is motivated by
the integration of software with hardware and whether the role of machines
goes beyond physical tasks to mental ones as well (Brynjolfsson and McAfee
2014). Because mental tasks were seen as ever-present and essential,
human comparative advantage in them was taken to be the main reason why, at
least in the long term, capital accumulation would complement employment
by enhancing labor productivity in those tasks.
The computer revolution has blurred the line between physical and mental
tasks. For instance, the invention of the spreadsheet in the late 1970s
fundamentally changed the role of bookkeepers. Prior to that invention,
there was a time-intensive task involving the recomputation of outcomes in
spreadsheets as data or assumptions changed. That human task was substi-
tuted by the spreadsheet software that could produce the calculations more
quickly, cheaply, and frequently. However, at the same time, the spreadsheet
made the jobs of accountants, analysts, and others far more productive.
In the accounting books, capital was substituting for labor, but the mental
productivity of labor was being changed. Thus, the impact on employment
critically depended on whether there were tasks that “computers cannot do.”
These assumptions persist in models today. Acemoglu and Restrepo
(2017) observe that capital substitutes for labor in certain tasks while at the
same time technological progress creates new tasks. They make what they
call a “natural assumption” that only labor can perform the new tasks as
they are more complex than previous ones.1 Benzell et al. (2015) consider
the impact of software more explicitly. Their environment has two types of
labor—high-tech (who can, among other things, code) and low-tech (who
are empathetic and can handle interpersonal tasks). In this environment,
it is the low-tech workers who cannot be replaced by machines, while the
high-tech ones are employed initially to create the code that will eventually
displace their kind. The results of the model depend, therefore, not only on a class
of worker who cannot be substituted directly for capital, but also on the
inability of workers themselves to substitute between classes.
In this chapter, our approach is to delve into the weeds of what is hap-
pening currently in the field of artificial intelligence (AI). The recent wave
of developments in AI all involve advances in machine learning. Those
advances allow for automated and cheap prediction; that is, providing a
forecast (or nowcast) of a variable of interest from available data (Agrawal,
Gans, and Goldfarb 2018b). In some cases, prediction has enabled full auto-
mation of tasks—for example, self-driving vehicles where the process of
data collection, prediction of behavior and surroundings, and actions are
all conducted without a human in the loop. In other cases, prediction is a
standalone tool—such as image recognition or fraud detection—that may
or may not lead to further substitution of human users of such tools by
machines. Thus far, substitution between humans and machines has focused
mainly on cost considerations. Are machines cheaper, more reliable, and
more scalable (in their software form) than humans? This chapter, however,
considers the role of prediction in decision-making explicitly and from that
examines the complementary skills that may be matched with prediction
within a task.
1. To be sure, their model is designed to examine how automation of tasks causes a change in factor prices that biases innovation toward the creation of new tasks that labor is more suited to.
Our focus, in this regard, is on what we term judgment. While judgment
is a term with broad meaning, here we use it to refer to a very specific skill.
To see this, consider a decision. That decision involves choosing an action, x, from a set, X. The payoff (or reward) from that action is defined by a function, u(x, θ), where θ is a realization of an uncertain state drawn from a distribution, F(θ). Suppose that, prior to making a decision, a prediction (or signal), s, can be generated that results in a posterior, F(θ | s). Thus, the decision maker would solve

$$\max_{x \in X} \int u(x, \theta)\, dF(\theta \mid s).$$
In other words, a standard problem of choice under uncertainty. In this
standard world, the role of prediction is to improve decision-making. The
payoff, or utility function, is known.
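To make the formal problem concrete, the following minimal sketch (ours, not the chapter's) evaluates that maximization for a discrete state space, where the posterior F(θ | s) is represented as a probability vector; the actions, states, payoffs, and posterior below are all hypothetical.

```python
import numpy as np

# A discrete sketch of max_{x in X} ∫ u(x, θ) dF(θ | s):
# the payoff function u is known (as in the standard setup), and the
# prediction s has already been folded into the posterior over states.
actions = ["x1", "x2"]              # the action set X
posterior = np.array([0.7, 0.3])    # F(θ | s) over two states θ1, θ2

# u[i, j] = u(actions[i], θj); hypothetical numbers.
u = np.array([
    [5.0, -10.0],   # x1 pays off well in θ1, badly in θ2
    [1.0,   1.0],   # x2 is a safe, state-independent action
])

expected_u = u @ posterior          # ∫ u(x, θ) dF(θ | s) for each x
best_action = actions[int(np.argmax(expected_u))]
print(expected_u, "->", best_action)  # [0.5 1. ] -> x2
```

A sharper prediction, say a posterior of (0.9, 0.1), would flip the choice to x1, which is exactly the sense in which prediction improves decision-making in this standard world.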
To create a role for judgment, we depart from this standard set-up in
statistical decision theory and ask how a decision maker comes to know the
function, u(x, θ)? We assume that this is not simply given or a primitive of the
decision-making model. Instead, it requires a human to undertake a costly
process that allows the mapping from (x, θ) to a particular payoff value, u, to be
discovered. This is a reasonable assumption given that, beyond some rudimentary
experimentation in closed environments, there is no current way for
an AI to impute a utility function that resides with humans. Additionally, this
process separates the costs of providing the mapping for each pair, (x, θ).
(Actually, we focus, without loss of generality, on situations where u(x, θ) ≠
u(x) for all θ, and presume that if a payoff to an action is state independent, that payoff is known.) In other words, while prediction can obtain a signal
of the underlying state, judgment is the process by which the payoffs from
actions that arise based on that state can be determined. We assume that
this process of determining payoffs requires human understanding of the
situation: it is not a prediction problem.
For intuition on the difference between prediction and judgment, consider
the example of credit card fraud. A bank observes a credit card transaction.
That transaction is either legitimate or fraudulent. The decision is whether
to approve the transaction. If the bank knows for sure that the transaction
is legitimate, the bank will approve it. If the bank knows for sure that it is
fraudulent, the bank will refuse the transaction. Why? Because the bank
knows the payoff of approving a legitimate transaction is higher than the
payoff of refusing that transaction. Things get more interesting if the bank
is uncertain about whether the transaction is legitimate. The uncertainty
means that the bank also needs to know the payoff from refusing a legitimate
transaction and from approving a fraudulent transaction. In our model,
judgment is the process of determining these payoffs. It is a costly activity,
in the sense that it requires time and effort.
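The fraud example can be written down in the same form. The sketch below (again ours, with made-up payoff numbers) shows how the four payoffs that judgment must determine combine with the predicted probability that a transaction is legitimate to yield an approve/refuse rule.

```python
def approve(p_legit: float,
            u_approve_legit: float = 1.0,    # payoff: approve a legitimate transaction
            u_approve_fraud: float = -20.0,  # payoff: approve a fraudulent one
            u_refuse_legit: float = -0.5,    # payoff: refuse a legitimate customer
            u_refuse_fraud: float = 0.0) -> bool:
    """Approve iff the expected payoff of approving beats refusing.

    p_legit is the prediction; the four payoff arguments are what
    judgment must supply. All numbers here are hypothetical.
    """
    eu_approve = p_legit * u_approve_legit + (1 - p_legit) * u_approve_fraud
    eu_refuse = p_legit * u_refuse_legit + (1 - p_legit) * u_refuse_fraud
    return eu_approve >= eu_refuse

# With these payoffs the bank approves only when p_legit >= 20/21.5 ≈ 0.93:
for p in (0.99, 0.95, 0.90):
    print(p, approve(p))  # True, True, False
```

Note how certainty makes half of the judgment unnecessary: at p_legit = 1 only the payoffs in the legitimate state matter, which is why the text observes that judgment becomes interesting precisely when the bank is uncertain.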
As the new developments regarding AI all involve making prediction
more readily available, we ask: How do judgment and its endogenous application
change the value of prediction? Are prediction and judgment substitutes
or complements? Does the value of prediction change monotonically
with the difficulty of applying judgment? In complex environments
(as they relate to automation, contracting, and the boundaries of the firm),
how do improvements in prediction affect the value of judgment?
We proceed by first providing supportive evidence for our assumption that
recent developments in AI overwhelmingly impact the costs of prediction.
We then use the example of radiology to provide a context for understand-