The Economics of Artificial Intelligence


by Ajay Agrawal



3

Prediction, Judgment, and Complexity

A Theory of Decision-Making and Artificial Intelligence

Ajay Agrawal, Joshua Gans, and Avi Goldfarb

  3.1 Introduction

There is widespread discussion regarding the impact of machines on employment (see Autor 2015). In some sense, the discussion mirrors a long-standing literature on the impact of the accumulation of capital equipment on employment; specifically, whether capital and labor are substitutes or complements (Acemoglu 2003). But the recent discussion is motivated by the integration of software with hardware and whether the role of machines goes beyond physical tasks to mental ones as well (Brynjolfsson and McAfee 2014). As mental tasks were seen as always being present and essential, human comparative advantage in these was seen as the main reason why, at least in the long term, capital accumulation would complement employment by enhancing labor productivity in those tasks.

Ajay Agrawal is the Peter Munk Professor of Entrepreneurship at the Rotman School of Management, University of Toronto, and a research associate of the National Bureau of Economic Research. Joshua Gans is professor of strategic management and holder of the Jeffrey S. Skoll Chair of Technical Innovation and Entrepreneurship at the Rotman School of Management, University of Toronto (with a cross appointment in the Department of Economics), and a research associate of the National Bureau of Economic Research. Avi Goldfarb holds the Rotman Chair in Artificial Intelligence and Healthcare and is professor of marketing at the Rotman School of Management, University of Toronto, and is a research associate of the National Bureau of Economic Research.

Our thanks to Andrea Prat, Scott Stern, Hal Varian, and participants at the AEA (Chicago), NBER Summer Institute (2017), NBER Economics of AI Conference (Toronto), Columbia Law School, Harvard Business School, MIT, and University of Toronto for helpful comments. Responsibility for all errors remains our own. The latest version of this chapter is available at joshuagans.com. For acknowledgments, sources of research support, and disclosure of the authors' material financial relationships, if any, please see http://www.nber.org/chapters/c14010.ack.

The computer revolution has blurred the line between physical and mental tasks. For instance, the invention of the spreadsheet in the late 1970s

  fundamentally changed the role of bookkeepers. Prior to that invention,

  there was a time- intensive task involving the recomputation of outcomes in

spreadsheets as data or assumptions changed. That human task was substituted by the spreadsheet software that could produce the calculations more

  quickly, cheaply, and frequently. However, at the same time, the spreadsheet

  made the jobs of accountants, analysts, and others far more productive.

  In the accounting books, capital was substituting for labor, but the mental

  productivity of labor was being changed. Thus, the impact on employment

  critically depended on whether there were tasks the “computers cannot do.”

  These assumptions persist in models today. Acemoglu and Restrepo

  (2017) observe that capital substitutes for labor in certain tasks while at the

  same time technological progress creates new tasks. They make what they

  call a “natural assumption” that only labor can perform the new tasks as

  they are more complex than previous ones.1 Benzell et al. (2015) consider

  the impact of software more explicitly. Their environment has two types of

labor—high-tech (who can, among other things, code) and low-tech (who are empathetic and can handle interpersonal tasks). In this environment, it is the low-tech workers who cannot be replaced by machines while the high-tech ones are employed initially to create the code that will eventually

  displace their kind. The results of the model depend, therefore, on a class

  of worker who cannot be substituted directly for capital, but also on the

  inability of workers themselves to substitute between classes.

In this chapter, our approach is to delve into the weeds of what is happening currently in the field of artificial intelligence (AI). The recent wave

  of developments in AI all involve advances in machine learning. Those

  advances allow for automated and cheap prediction; that is, providing a

  forecast (or nowcast) of a variable of interest from available data (Agrawal,

Gans and Goldfarb 2018b). In some cases, prediction has enabled full automation of tasks—for example, self-driving vehicles, where the process of

  data collection, prediction of behavior and surroundings, and actions are

  all conducted without a human in the loop. In other cases, prediction is a

  standalone tool—such as image recognition or fraud detection—that may

  or may not lead to further substitution of human users of such tools by

  machines. Thus far, substitution between humans and machines has focused

  mainly on cost considerations. Are machines cheaper, more reliable, and

  more scalable (in their software form) than humans? This chapter, however,

  considers the role of prediction in decision- making explicitly and from that

  examines the complementary skills that may be matched with prediction

  within a task.

  1. To be sure, their model is designed to examine how automation of tasks causes a change in factor prices that biases innovation toward the creation of new tasks that labor is more suited to.


Our focus, in this regard, is on what we term judgment. While judgment is a term with broad meaning, here we use it to refer to a very specific skill. To see this, consider a decision. That decision involves choosing an action, x, from a set, X. The payoff (or reward) from that action is defined by a function, u(x, θ), where θ is a realization of an uncertain state drawn from a distribution, F(θ). Suppose that, prior to making a decision, a prediction (or signal), s, can be generated that results in a posterior, F(θ | s). Thus, the decision maker would solve

max_{x ∈ X} ∫ u(x, θ) dF(θ | s).
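As a minimal numerical sketch of this choice problem (the states, actions, payoffs, and posterior below are made up for illustration, not taken from the chapter), the decision maker simply averages payoffs over the posterior and picks the best action:

```python
# Two states theta in {0, 1} and two actions x in {0, 1}.
# u[x][theta] is the (hypothetical) payoff u(x, theta).
u = [
    [1.0, -2.0],  # action 0: good if theta = 0, costly if theta = 1
    [0.0,  0.0],  # action 1: a safe, state-independent outside option
]

# Posterior F(theta | s) after observing the prediction signal s.
posterior = [0.75, 0.25]

def expected_payoff(x):
    """E[u(x, theta) | s]: average the payoffs over the posterior."""
    return sum(p * payoff for p, payoff in zip(posterior, u[x]))

# The decision maker solves max over x of E[u(x, theta) | s].
best_x = max(range(len(u)), key=expected_payoff)
print(expected_payoff(0), expected_payoff(1), best_x)  # 0.25 0.0 0
```

Better prediction (a posterior concentrated on the true state) raises the value of choosing the risky action when it is in fact good, which is the sense in which prediction improves decision-making here.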

In other words, this is a standard problem of choice under uncertainty. In this standard world, the role of prediction is to improve decision-making. The payoff, or utility function, is known.

To create a role for judgment, we depart from this standard set-up in statistical decision theory and ask how a decision maker comes to know the function, u(x, θ). We assume that this is not simply given or a primitive of the decision-making model. Instead, it requires a human to undertake a costly process that allows the mapping from (x, θ) to a particular payoff value, u, to be discovered. This is a reasonable assumption given that, beyond some rudimentary experimentation in closed environments, there is no current way for an AI to impute a utility function that resides with humans. Additionally, the costs of providing the mapping are separate for each pair, (x, θ). (Actually, we focus, without loss of generality, on situations where u(x, θ) ≠ u(x) for all θ, and presume that if a payoff to an action is state independent, that payoff is known.) In other words, while prediction can obtain a signal of the underlying state, judgment is the process by which the payoffs from actions that arise based on that state can be determined. We assume that this process of determining payoffs requires human understanding of the situation: it is not a prediction problem.

For intuition on the difference between prediction and judgment, consider

  the example of credit card fraud. A bank observes a credit card transaction.

  That transaction is either legitimate or fraudulent. The decision is whether

  to approve the transaction. If the bank knows for sure that the transaction

  is legitimate, the bank will approve it. If the bank knows for sure that it is

  fraudulent, the bank will refuse the transaction. Why? Because the bank

  knows the payoff of approving a legitimate transaction is higher than the

  payoff of refusing that transaction. Things get more interesting if the bank

  is uncertain about whether the transaction is legitimate. The uncertainty

  means that the bank also needs to know the payoff from refusing a legitimate

  transaction and from approving a fraudulent transaction. In our model,

judgment is the process of determining these payoffs. It is a costly activity, in the sense that it requires time and effort.
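Under illustrative payoff numbers (made up here, once judgment has been exercised to determine them), the bank's approval decision reduces to an expected-payoff comparison:

```python
# Hypothetical payoffs the bank discovers through judgment.
U_APPROVE_LEGIT = 1.0     # approve a legitimate transaction
U_APPROVE_FRAUD = -20.0   # approve a fraudulent transaction
U_REFUSE_LEGIT  = -2.0    # refuse a legitimate transaction
U_REFUSE_FRAUD  = 0.0     # refuse a fraudulent transaction

def approve(p_legit):
    """Approve iff the expected payoff of approving beats refusing,
    given the predicted probability the transaction is legitimate."""
    eu_approve = p_legit * U_APPROVE_LEGIT + (1 - p_legit) * U_APPROVE_FRAUD
    eu_refuse  = p_legit * U_REFUSE_LEGIT  + (1 - p_legit) * U_REFUSE_FRAUD
    return eu_approve > eu_refuse

print(approve(1.0))  # True: a surely legitimate transaction is approved
print(approve(0.0))  # False: a surely fraudulent transaction is refused
print(approve(0.9))  # the uncertain case, decided by all four payoffs
```

Note that the certainty cases use only a comparison of two payoffs, while the uncertain case requires knowing all four: this is exactly why uncertainty makes the costly judgment of the off-diagonal payoffs necessary.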

As the new developments regarding AI all involve making prediction more readily available, we ask, how does judgment and its endogenous application change the value of prediction? Are prediction and judgment substitutes or complements? Does the value of prediction change monotonically with the difficulty of applying judgment? In complex environments (as they relate to automation, contracting, and the boundaries of the firm), how do improvements in prediction affect the value of judgment?

We proceed by first providing supportive evidence for our assumption that

  recent developments in AI overwhelmingly impact the costs of prediction.

  We then use the example of radiology to provide a context for understand-

 
