by Ajay Agrawal
gate.net/publication/321902459_Behavior_Revealed_in_Mobile_Phone_Usage_Predicts_Loan_Repayment.
Blei, D. M. 2012. “Probabilistic Topic Models.” Communications of the ACM 55
(4): 77.
Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3 (Jan): 993–1022.
Chapelle, O., and L. Li. 2011. “An Empirical Evaluation of Thompson Sampling.” Proceedings of the Conference on Neural Information Processing Systems. https://papers.nips.cc/paper/4321-an-empirical-evaluation-of-thompson-sampling.
Chernozhukov, V., D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, and W. Newey. 2017. “Double/Debiased/Neyman Machine Learning of Treatment Effects.” January. Cornell University Library. http://arxiv.org/abs/1701.08687.
Chernozhukov, V., C. Hansen, and M. Spindler. 2015. “Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach.” January. www.doi.org/10.1146/annurev-economics-012315-015826.
Dimakopoulou, M., S. Athey, and G. Imbens. 2017. “Estimation Considerations in Contextual Bandits.” Cornell University Library. https://arxiv.org/abs/1711.07077.
Doudchenko, N., and G. W. Imbens. 2016. “Balancing, Regression, Difference-in-Differences and Synthetic Control Methods: A Synthesis.” Technical report, National Bureau of Economic Research.
The Impact of Machine Learning on Economics 545
Dudik, M., D. Erhan, J. Langford, and L. Li. 2014. “Doubly Robust Policy Evaluation and Optimization.” Statistical Science 29 (4): 485–511.
Dudik, M., J. Langford, and L. Li. 2011. “Doubly Robust Policy Evaluation and Learning.” International Conference on Machine Learning.
Duflo, E. 2017. “The Economist as Plumber.” NBER Working Paper no. 23213, Cambridge, MA.
Eckles, D., B. Karrer, J. Ugander, L. Adamic, I. Dhillon, Y. Koren, R. Ghani, P. Senator, J. Bradley, and R. Parekh. 2016. “Design and Analysis of Experiments in Networks: Reducing Bias from Interference.” Journal of Causal Inference 1–62. www.doi.org/10.1515/jci-2015-0021.
Egami, N., C. Fong, J. Grimmer, M. Roberts, and B. Stewart. 2016. “How to Make Causal Inferences Using Text.” Working paper. https://polmeth.polisci.wisc.edu/Papers/ais.pdf.
Engstrom, R., J. Hersh, and D. Newhouse. 2017. “Poverty from Space: Using High-Resolution Satellite Imagery for Estimating Economic Well-Being (English).” Policy Research Working Paper no. WPS 8284, Washington, DC, World Bank Group.
Feraud, R., R. Allesiardo, T. Urvoy, and F. Clerot. 2016. “Random Forest for the Contextual Bandit Problem.” Proceedings of Machine Learning Research 51:93–101.
Glaeser, E. L., A. Hillis, S. D. Kominers, and M. Luca. 2016. “Predictive Cities Crowdsourcing City Government: Using Tournaments to Improve Inspection Accuracy.” American Economic Review 106 (5): 114–18.
Glaeser, E. L., S. D. Kominers, M. Luca, and N. Naik. 2015. “Big Data and Big Cities: The Promises and Limitations of Improved Measures of Urban Life.” NBER Working Paper no. 21778, Cambridge, MA.
Glaeser, E. L., S. D. Kominers, M. Luca, and N. Naik. 2018. “Big Data and Big Cities: The Promises and Limitations of Improved Measures of Urban Life.” Economic Inquiry 56 (1): 114–37.
Goel, S., J. M. Rao, and R. Shroff. 2016. “Precinct or Prejudice? Understanding Racial Disparities in New York City’s Stop-and-Frisk Policy.” Annals of Applied Statistics 10 (1): 365–94.
Goldenshluger, A., and A. Zeevi. 2013. “A Linear Response Bandit Problem.” Stochastic Systems 3 (1): 230–61.
Goldman, M., and J. M. Rao. 2014. “Experiments as Instruments: Heterogeneous Position Effects in Sponsored Search Auctions.” The Third Conference on Auctions, Market Mechanisms and Their Applications. www.doi.org/10.4108/eai.8-8-2015.2261043.
Gopalan, P., J. M. Hofman, and D. M. Blei. 2015. “Scalable Recommendation with Hierarchical Poisson Factorization.” Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 326–35.
Hahn, J. 1998. “On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects.” Econometrica: 315–31.
Hartford, J., G. Lewis, and M. Taddy. 2016. “Counterfactual Prediction with Deep Instrumental Variables Networks.” Working paper. https://arxiv.org/pdf/1612.09596.pdf.
Imai, K., and M. Ratkovic. 2013. “Estimating Treatment Effect Heterogeneity in Randomized Program Evaluation.” Annals of Applied Statistics 7 (1): 443–70.
Imbens, G. W., and D. B. Rubin. 2015. Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge: Cambridge University Press.
Imbens, G. W., and J. M. Wooldridge. 2009. “Recent Developments in the Econometrics of Program Evaluation.” Journal of Economic Literature 47 (1): 5–86.
546 Susan Athey
Jean, N., M. Burke, M. Xie, W. M. Davis, D. B. Lobell, and S. Ermon. 2016. “Combining Satellite Imagery and Machine Learning to Predict Poverty.” Science 353 (6301): 790–94.
Jiang, N., and L. Li. 2016. “Doubly Robust Off-Policy Value Evaluation for Reinforcement Learning.” Proceedings of the 33rd International Conference on Machine Learning, vol. 48, 652–61.
Kallus, N. 2017. “Balanced Policy Evaluation and Learning.” Cornell University Library. https://arxiv.org/abs/1705.07384.
Kitagawa, T., and A. Tetenov. 2015. “Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice.” Technical report, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Kleinberg, J., J. Ludwig, S. Mullainathan, and Z. Obermeyer. 2015. “Prediction Policy Problems.” American Economic Review 105 (5): 491–95.
Kleinberg, J., S. Mullainathan, and M. Raghavan. 2016. “Inherent Trade-Offs in the Fair Determination of Risk Scores.” Cornell University Library. https://arxiv.org/abs/1609.05807.
Komarova, T., D. Nekipelov, and E. Yakovlev. 2015. “Estimation of Treatment Effects from Combined Data: Identification versus Data Security.” In Economic Analysis of the Digital Economy, edited by A. Goldfarb, S. M. Greenstein, and C. Tucker, 279–308. Chicago: University of Chicago Press.
Künzel, S., J. Sekhon, P. Bickel, and B. Yu. 2017. “Meta-Learners for Estimating Heterogeneous Treatment Effects Using Machine Learning.” Cornell University Library. https://arxiv.org/abs/1706.03461.
Laffont, J.-J., H. Ossard, and Q. Vuong. 1995. “Econometrics of First-Price Auctions.” Econometrica: Journal of the Econometric Society 63 (4): 953–80.
Lewis, R. A., and J. M. Rao. 2015. “The Unfavorable Economics of Measuring the Returns to Advertising.” Quarterly Journal of Economics 130 (4): 1941–73.
Li, L., S. Chen, J. Kleban, and A. Gupta. 2014. “Counterfactual Estimation and Optimization of Click Metrics for Search Engines.” Cornell University Library. https://arxiv.org/abs/1403.1891.
Li, L., W. Chu, J. Langford, T. Moon, and X. Wang. 2012. “An Unbiased Offline Evaluation of Contextual Bandit Algorithms with Generalized Linear Models.” Journal of Machine Learning Research Workshop and Conference Proceedings 26:19–36.
Li, L., W. Chu, J. Langford, and R. Schapire. 2010. “A Contextual-Bandit Approach to Personalized News Article Recommendation.” International World Wide Web Conference. https://dl.acm.org/citation.cfm?doid=1772690.1772758.
Li, L., Y. Lu, and D. Zhou. 2017. “Provably Optimal Algorithms for Generalized Linear Contextual Bandits.” International Conference on Machine Learning. https://arxiv.org/abs/1703.00048.
McFadden, D. 1973. “Conditional Logit Analysis of Qualitative Choice Behavior.”
In Frontiers in Econometrics, edited by P. Zarembka. New York: Wiley.
Mullainathan, S., and J. Spiess. 2017. “Machine Learning: An Applied Econometric Approach.” Journal of Economic Perspectives 31 (2): 87–106.
Naik, N., S. D. Kominers, R. Raskar, E. L. Glaeser, and C. A. Hidalgo. 2017. “Computer Vision Uncovers Predictors of Physical Urban Change.” Proceedings of the National Academy of Sciences 114 (29): 7571–76.
Naik, N., J. Philipoom, R. Raskar, and C. Hidalgo. 2014. “Streetscore: Predicting the Perceived Safety of One Million Streetscapes.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 779–85.
Ng, A. Y., and S. J. Russell. 2000. “Algorithms for Inverse Reinforcement Learning.”
In Proceedings of the Seventeenth International Conference on Machine Learning, 663–70. https://dl.acm.org/citation.cfm?id=657801.
Peysakhovich, A., and A. Lada. 2016. “Combining Observational and Experimental Data to Find Heterogeneous Treatment Effects.” Cornell University Library. http://arxiv.org/abs/1611.02385.
Robinson, P. M. 1988. “Root-n-Consistent Semiparametric Regression.” Econometrica: Journal of the Econometric Society 56 (4): 931–54.
Roth, A. E. 2002. “The Economist as Engineer: Game Theory, Experimentation, and Computation as Tools for Design Economics.” Econometrica 70 (4): 1341–78.
Ruiz, F. J., S. Athey, and D. M. Blei. 2017. “Shopper: A Probabilistic Model of Consumer Choice with Substitutes and Complements.” Cornell University Library. https://arxiv.org/abs/1711.03560.
Scott, S. L. 2010. “A Modern Bayesian Look at the Multi-Armed Bandit.” Applied Stochastic Models in Business and Industry 26 (6): 639–58.
Strehl, A., J. Langford, L. Li, and S. Kakade. 2010. “Learning from Logged Implicit Exploration Data.” Proceedings of the 23rd International Conference on Neural Information Processing Systems, vol. 2, 2217–25.
Swaminathan, A., and T. Joachims. 2015. “Batch Learning from Logged Bandit Feedback through Counterfactual Risk Minimization.” Journal of Machine Learning Research 16 (Sep.): 1731–55.
Thomas, P., and E. Brunskill. 2016. “Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning.” Proceedings of the 33rd International Conference on Machine Learning, vol. 48, 2139–48.
Tibshirani, R., and T. Hastie. 1987. “Local Likelihood Estimation.” Journal of the American Statistical Association 82 (398): 559–67.
Ugander, J., B. Karrer, L. Backstrom, and J. Kleinberg. 2013. “Graph Cluster Randomization.” In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’13), 329. New York: ACM Press. ISBN 9781450321747. doi: 10.1145/2487575.2487695. http://dl.acm.org/citation.cfm?doid=2487575.2487695.
van der Laan, M. J., and D. Rubin. 2006. “Targeted Maximum Likelihood Learning.” Working Paper no. 213, UC Berkeley Division of Biostatistics.
Varian, H. R. 2014. “Big Data: New Tricks for Econometrics.” Journal of Economic Perspectives 28 (2): 3–27.
Wager, S., and S. Athey. Forthcoming. “Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests.” Journal of the American Statistical Association.
Wan, M., D. Wang, M. Goldman, M. Taddy, J. Rao, J. Liu, D. Lymberopoulos, and J. McAuley. 2017. “Modeling Consumer Preferences and Price Sensitivities from Large-Scale Grocery Shopping Transaction Logs.” In Proceedings of the 26th International Conference on World Wide Web, 1103–12.
White, H. 1992. Artificial Neural Networks: Approximation and Learning Theory. Hoboken, NJ: Blackwell Publishers.
Yeomans, M., A. K. Shah, and J. Kleinberg. 2016. “Making Sense of Recommendations.” Working paper, Department of Economics, Harvard University. https://scholar.harvard.edu/files/sendhil/files/recommenders55.pdf.
Zeileis, A., T. Hothorn, and K. Hornik. 2008. “Model-Based Recursive Partitioning.” Journal of Computational and Graphical Statistics 17 (2): 492–514.
Zubizarreta, J. R. 2015. “Stable Weights That Balance Covariates for Estimation with Incomplete Outcome Data.” Journal of the American Statistical Association 110 (511): 910–22.
548 Mara Lederman
Comment Mara Lederman
Athey provides a comprehensive, accessible, and exciting summary of the
impact that machine learning (ML) is having, and will continue to have, on
the field of economics. It is a thorough, thoughtful, and optimistic chapter
that makes clear the distinct strengths of ML and of traditional
econometrics-based techniques for causal inference. It highlights both the
opportunities to combine these approaches and the sorts of tasks and problems
that are likely to remain in each domain. The chapter contains several useful
and practical examples that illustrate the application of ML techniques to
questions and problems of interest to economists, including allocating health
care procedures, pricing, and measuring the impact of advertising.
At a broad level, the chapter has four main sections. The chapter begins
by offering straightforward definitions of unsupervised and supervised ML.
Athey puts it quite simply: unsupervised ML uses algorithms to identify
observations that are similar in their covariates, while supervised ML uses
algorithms to predict an outcome variable from observations on covariates.
It is important to emphasize, and I will return to this, that the observations
and variables that ML algorithms can handle often do not look like the
typical quantitative data that economists use in empirical analysis. Both
unsupervised and supervised machine-learning techniques can be applied
to text, images, and video. For example, unsupervised ML algorithms can be
used to identify similar videos (without needing to specify in advance what
makes these videos similar) or similar restaurant reviews (again, without
needing to specify which reviews are positive or negative or what words or
phrases make a review positive or negative). Supervised ML algorithms
can be used to predict variables such as the sentiment of a tweet or the slant
of a newspaper article, without having to specify ex ante what the relevant
covariates are.
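As a minimal sketch of the distinction Athey draws, the toy Python example below first groups observations using only their covariates (unsupervised) and then predicts a labeled outcome from the same covariates (supervised). The six observations, the two covariates, the star ratings, and the choice of k-means and one-nearest-neighbor are all illustrative assumptions, not data or methods taken from the chapter.

```python
from math import dist

# Hypothetical covariates for six restaurant reviews:
# (price level, share of positive words). Illustrative numbers only.
X = [(1.0, 0.9), (1.2, 0.8), (0.9, 0.85), (3.0, 0.2), (3.2, 0.1), (2.9, 0.15)]


def kmeans(points, centers, iters=10):
    """Unsupervised: group observations by similarity in covariates alone."""
    labels = []
    for _ in range(iters):
        # Assign every observation to its nearest cluster center.
        labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        # Move each center to the mean of the observations assigned to it.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[j])
        centers = new_centers
    return labels, centers


# No outcome variable is used here; only the covariates enter.
clusters, _ = kmeans(X, centers=[X[0], X[3]])

# Hypothetical observed outcomes (star ratings) for the same six reviews.
y = [5, 4, 5, 1, 2, 1]


def predict_1nn(train_X, train_y, x_new):
    """Supervised: predict the outcome of x_new from its nearest labeled neighbor."""
    nearest = min(range(len(train_X)), key=lambda i: dist(train_X[i], x_new))
    return train_y[nearest]


predicted_rating = predict_1nn(X, y, (1.1, 0.82))
```

The clustering step never sees `y`, which is the sense in which it is unsupervised; the prediction step exists only because labeled outcomes are available.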
The chapter then discusses a number of ways in which off-the-shelf ML
techniques can be directly integrated into traditional economics research.
For example, both unsupervised and supervised ML can be used to create
variables that can be used in standard econometric analyses. In addition,
ML techniques can be directly applied to what Kleinberg et al. (2015) call
“prediction policy problems.” These are policy problems or decisions that
inherently involve a prediction component and, in these cases, ML tech-
niques may be superior to other statistical methodologies. These problems
may involve novel sources of so-called “big data,” such as satellite image
data used in Glaeser et al. (2018), but need not. They are simply policy
problems in which the predicted value of an unknown variable acts as an input
into a decision.
Mara Lederman is associate professor of strategic management at Rotman School of Management, University of Toronto.
For acknowledgments, sources of research support, and disclosure of the author’s material financial relationships, if any, please see http://www.nber.org/chapters/c14036.ack.
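The structure of a prediction policy problem, in which a predicted value feeds a decision, can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the restaurant names, the inspection histories, the budget, and the smoothed violation rate (a stand-in for whatever ML predictor one would actually fit) are assumptions, not taken from Kleinberg et al. (2015) or Athey's chapter.

```python
def predict_risk(past_violations, past_inspections):
    """Stand-in predictor: smoothed historical violation rate.

    Any supervised ML model could replace this function; the decision
    step below only needs a predicted value per unit.
    """
    return (past_violations + 1) / (past_inspections + 2)


# Hypothetical history: name -> (violations found, inspections carried out).
restaurants = {"A": (0, 5), "B": (4, 5), "C": (2, 4), "D": (1, 6)}

# Hypothetical constraint: only two inspections can be carried out.
budget = 2

# Prediction step: estimate the unknown variable (violation risk).
scores = {name: predict_risk(v, n) for name, (v, n) in restaurants.items()}

# Decision step: the predictions are the input to the allocation rule.
to_inspect = sorted(scores, key=scores.get, reverse=True)[:budget]
```

The decision rule itself is trivial; the quality of the allocation depends on how well the predicted risk tracks the unknown true risk, which is why better prediction can improve policy in problems of this kind.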
The third and most substantial section of the chapter discusses the growing
literature at the intersection of machine learning, statistics, and econometrics.
As Athey puts it, this literature is developing novel methodologies that
“harness the strengths of ML algorithms to solve causal inference
problems.” Athey provides details on a number of recent contributions
in this area, highlighting the parts of the estimation approaches that are
improved by ML and the parts that continue to rely on traditional econo-
metric approaches and assumptions. Athey predicts that these techniques
will soon become commonly used in applied empirical work in economics.
Finally, the chapter concludes with a discussion of some of the broader
effects that ML might have on the economics profession, beyond its impact on
the way we do empirical research. These include the types of questions
economists will ask, the degree of cross-disciplinary collaboration, the
production function for research, and the emergence of the “economist as an
engineer,” working with business and government to implement policies and
experiments in a digital environment.