The Economics of Artificial Intelligence


by Ajay Agrawal


in the interference graph. It was hoped that, by including this term in the pricing formula, the auction would be able to offer higher prices to and buy the rights of stations that pose particularly difficult problems by interfering with many other stations.

  How AI and Machine Learning Can Impact Market Design 575

  high probability. But how can we know the distribution and how can such

  an algorithm be found?

The FCC auction used a feasibility checker developed by a team of Auctionomics researchers at the University of British Columbia, led by Professor Kevin Leyton-Brown. There were many steps in the development, as reported by Newman, Fréchette, and Leyton-Brown (forthcoming), but here we emphasize the role of machine learning. Auctionomics' goal was to be able to solve 99 percent of the problem instances in one minute or less.

The development effort began by simulating the planned auction to generate feasibility problems like those that might be encountered in a real auction. Running many simulations generated about 1.4 million problem instances that could be used for training and testing a feasibility-checking algorithm. The first step of the analysis was to formulate the problems as mixed integer programs and test standard commercial software—CPLEX and Gurobi—to see how close those could come to meeting the performance objectives. The answer was: not close. Using a 100-second cutoff, Gurobi could solve only about 10 percent of the problems and CPLEX only about 25 percent. These results were not nearly good enough for decent performance in a real-time auction.

Next, the same problems were formulated as satisfiability problems and tested using seventeen research solvers that had participated in recent SAT-solving tournaments. These were better, but none could solve as many as two-thirds of the problems within the same 100-second cutoff. The goal remained 99 percent in sixty seconds.
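To make the feasibility question concrete: repacking asks whether every remaining station can be assigned a channel such that no two stations joined in the interference graph share one, which is a graph-coloring problem. The toy backtracking checker below is our illustration of that structure, not the production solver, which relied on tuned SAT solvers:

```python
def repack_feasible(stations, channels, interferes):
    """Backtracking check: can every station be assigned a channel so that
    no two interfering stations share one?  (A toy illustration of the
    feasibility problem, not the tuned SAT-based production checker.)"""
    assignment = {}

    def extend(i):
        if i == len(stations):
            return True                      # every station got a channel
        s = stations[i]
        for c in channels:
            # c is usable only if no already-assigned neighbor holds it
            if all(assignment.get(t) != c for t in interferes.get(s, ())):
                assignment[s] = c
                if extend(i + 1):
                    return True
                del assignment[s]            # backtrack and try another channel
        return False

    return extend(0)

# Three mutually interfering stations need three distinct channels.
triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
print(repack_feasible(["A", "B", "C"], [1, 2], triangle))     # False
print(repack_feasible(["A", "B", "C"], [1, 2, 3], triangle))  # True
```

Exhaustive search like this is exponential in the worst case, which is exactly why the project described next turned to heavily engineered heuristic solvers.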

The next step was to use automated algorithm configuration, a procedure developed by Hutter, Hoos, and Leyton-Brown (2011) and applied in this setting by Leyton-Brown and his students at the University of British Columbia. The idea is to start with a highly parameterized algorithm for solving satisfiability problems4 and to train a random forest model of the algorithm's performance, given the parameters. To do that, we first ran simulated auctions with what we regarded as plausible behavior by the bidders to generate a large data set of representative problems. Then, we solved those problems using a variety of different parameter settings to determine the distribution of solution times for each vector of parameters. This generated a data set with parameters and performance measures. Two of the most interesting performance characteristics were the median run time and

4. There are no known algorithms for NP-complete problems that are guaranteed to be fast, so the best existing algorithms are all heuristics. These algorithms weight various characteristics of the problem to decide about such things as the order in which to check different branches of a search tree. These weights are among the parameters that can be set and adapted to work well for a particular class of problems, such as those that arise in the incentive auction application. The particular software algorithm that we used was CLASP, which had more than 100 exposed parameters that could be modified.

  576 Paul R. Milgrom and Steven Tadelis

the fraction of instances solved within one minute. Then, using a Bayesian model, we incorporated uncertainty in which the experimenter "believes" that the actual performance is normally distributed with a mean determined by the random forest and a variance that depends on the distance of the parameter vector from the nearest points in the data set. Next, the system identifies the parameter vector that maximizes the expected improvement in performance, given the mean and variance of the prior and the performance of the best-known parameter vector. Finally, the system tests the actual performance for the identified parameters and adds that as an observation to the data set. Proceeding iteratively, the system identifies more parameters to test, investigates them, and adds them to the data to improve the model's accuracy until the time budget is exhausted.
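As a rough sketch of this loop, the code below mimics sequential model-based configuration with a deliberately crude surrogate. The `runtime` function, the nearest-neighbor `surrogate`, and the random candidate sampling are all our stand-ins (the real system trained a random forest over CLASP's parameters), but the expected-improvement criterion and the test-then-add iteration follow the description above:

```python
import math
import random
from statistics import NormalDist

def runtime(params):
    # Stand-in for "solve the benchmark instances with these parameters";
    # in the real system this measured the solver's run-time distribution.
    x, y = params
    return (x - 0.3) ** 2 + (y - 0.7) ** 2 + random.gauss(0, 0.01)

def surrogate(params, data):
    """Crude stand-in for the random-forest model: predict the mean of the
    nearest observations, with uncertainty growing in the distance to them."""
    dists = sorted((math.dist(params, p), t) for p, t in data)
    near = dists[:3]
    mu = sum(t for _, t in near) / len(near)
    sigma = 0.05 + near[0][0]          # farther from the data -> less certain
    return mu, sigma

def expected_improvement(params, data, best):
    # Standard EI for minimization under a normal predictive belief.
    mu, sigma = surrogate(params, data)
    z = (best - mu) / sigma
    nd = NormalDist()
    return (best - mu) * nd.cdf(z) + sigma * nd.pdf(z)

random.seed(0)
# Seed the model with a few measured configurations.
data = [(p, runtime(p)) for p in [(random.random(), random.random()) for _ in range(5)]]
for _ in range(30):                    # iterate until the "time budget" runs out
    best = min(t for _, t in data)
    candidates = [(random.random(), random.random()) for _ in range(200)]
    pick = max(candidates, key=lambda p: expected_improvement(p, data, best))
    data.append((pick, runtime(pick)))  # test the chosen configuration, add it

best_params, best_time = min(data, key=lambda d: d[1])
print(best_params, best_time)
```

The essential pattern is that each iteration spends real solving time only on the single configuration the model currently considers most promising.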

Eventually, this machine-learning method leads to diminishing returns to time invested. One can then create a new data set from the instances on which the parameterized algorithm was "slow," for example, taking more than fifteen seconds to solve. By training a new algorithm on those instances, and running the two parameterized algorithms in parallel, the machine-learning techniques led to dramatic improvements in performance.
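Running the two configured solvers side by side and taking whichever finishes first can be sketched as a simple parallel portfolio; the solver names and delays here are invented placeholders:

```python
import concurrent.futures as cf
import time

def make_solver(name, delay):
    """Stand-in for a configured solver; `delay` mimics its run time on
    one instance.  (Illustrative only; the real portfolio ran two tuned
    configurations of an actual SAT solver.)"""
    def solve(instance):
        time.sleep(delay)
        return name, f"solved {instance}"
    return solve

def portfolio_solve(instance, solvers, timeout=60):
    """Run all solvers on the same instance; return the first answer."""
    with cf.ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = [pool.submit(s, instance) for s in solvers]
        done, _ = cf.wait(futures, timeout=timeout,
                          return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result() if done else None

general = make_solver("general-config", 0.20)     # fast on typical instances
hard = make_solver("hard-instance-config", 0.05)  # trained on the slow cases
print(portfolio_solve("instance-17", [general, hard]))
```

The portfolio's run time on any instance is the minimum of its members' run times, which is why retraining a second configuration on the slow instances pays off so well.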

For the actual auction, several other problem-specific tricks were also applied to contribute to the speed-up. For example, to some extent it proved possible to decompose the full problem into smaller problems, to reuse old solutions as starting points for a search, to store partial solutions that might help guide solutions of further problems, and so on. In the end, the full set of techniques and tricks resulted in a very fast feasibility checker that solved all but a tiny fraction of the relevant problems within the allotted time.

  23.3 Using AI to Promote Trust in Online Marketplaces

Online marketplaces such as eBay, Taobao, Airbnb, and many others have grown dramatically since their inception just over two decades ago, providing businesses and individuals with previously unavailable opportunities to purchase or profit from online trading. Wholesalers and retailers can market their goods or get rid of excess inventory; consumers can easily search marketplaces for whatever is on their mind, alleviating the need for businesses to invest in their own e-commerce websites; individuals transform items they no longer use into cash; and more recently, the so-called "gig economy" comprises marketplaces that allow individuals to share their time or assets across different productive activities and earn extra income.

The amazing success of online marketplaces was not fully anticipated, primarily because of the hazards of anonymous trade and asymmetric information. Namely, how can strangers who have never transacted with one another, and who may be thousands of miles apart, be willing to trust each other? Trust on both sides of the market is essential for parties to be willing to transact and for a marketplace to succeed. The early success of eBay is


often attributed to the innovation of introducing its famous feedback and reputation mechanism, which was adopted in one form or another by practically every other marketplace that came after eBay. These online feedback and reputation mechanisms provide a modern-day version of more ancient reputation mechanisms used in the physical marketplaces that were the medieval trade fairs of Europe (see Milgrom, North, and Weingast 1990).

Still, recent studies have shown that online reputation measures of marketplace sellers, which are based on buyer-generated feedback, don't accurately reflect their actual performance. Indeed, a growing literature has shown that user-generated feedback mechanisms are often biased, suffer from "grade inflation," and can be prone to manipulation by sellers.5 For example, the average percent positive for sellers on eBay is about 99.4 percent, with a median of 100 percent. This makes it challenging to interpret the true levels of satisfaction on online marketplaces.

A natural question emerges: Can online marketplaces use the treasure trove of data they collect to measure the quality of a transaction and predict which sellers will provide better service to their buyers? It has become widely known that all online marketplaces, as well as other web-based services, collect vast amounts of data as part of the process of trade. Some refer to this as the "exhaust data" generated by the millions of transactions, searches, and browsing sessions that occur on these marketplaces daily. By leveraging this data, marketplaces can create an environment that promotes trust, not unlike the ways in which trust-fostering institutions emerged in the medieval trade fairs of Europe. The scope for market design goes far beyond mainstream applications like setting rules of bidding and reserve prices for auctions or designing tiers of services, and in our view, includes the design of mechanisms that help foster trust in marketplaces. What follows are two examples from recent research that show some of the many ways that marketplaces can apply AI to the data they generate to help create more trust and better experiences for their customers.

  23.3.1 Using AI to Assess the Quality of Sellers

One of the ways that online marketplaces help participants build trust is by letting them communicate through online messaging platforms. For example, on eBay buyers can contact sellers to ask them questions about their products, which may be particularly useful for used or unique products for which buyers may want more refined information than is listed. Similarly, Airbnb allows potential renters to send messages to hosts and ask questions about the property that may not be answered in the original listing. Using Natural Language Processing (NLP), a mature area in AI, marketplaces can mine the data generated by these messages in order to better predict the kinds of features that customers value. However, there may also be subtler ways to apply AI to manage the quality of marketplaces. The messaging platforms are not restricted to pretransaction inquiries, but also allow the parties to send messages to each other after the transaction has been completed. An obvious question then emerges: How could a marketplace analyze the messages sent between buyers and sellers after the transaction to infer something about the quality of the transaction that feedback doesn't seem to capture?

5. On bias and grade inflation see, for example, Nosko and Tadelis (2015), Zervas, Proserpio, and Byers (2015), and Filippas, Horton, and Golden (2017). On seller manipulation of feedback scores see, for example, Mayzlin, Dover, and Chevalier (2014) and Xu et al. (2015).

This question was posed and answered in a recent paper by Masterov, Mayer, and Tadelis (2015) using internal data from eBay's marketplace. The analysis they performed was divided into two stages. In the first stage, the goal was to see if NLP could identify transactions that went bad when there was an independent indication that the buyer was unhappy. To do this, they collected internal data from transactions in which messages were sent from the buyer to the seller after the transaction was completed, and matched it with another internal data source that recorded actions by buyers indicating that the buyer had a poor experience with the transaction. Actions that indicate an unhappy buyer include claiming that the item was not received, claiming that the item was significantly not as described, or leaving negative or neutral feedback, to name a few.

The simple NLP approach they use creates a "poor-experience" indicator as the target (dependent variable) that the machine-learning model will try to predict, and uses the messages' content as the independent variables. In its simplest form and as a proof of concept, a regular expression search was used that included a standard list of negative words such as "annoyed," "dissatisfied," "damaged," or "negative feedback" to identify a message as negative. If none of the designated terms appeared, then the message was considered neutral. Using this classification, they grouped transactions into three distinct types: (a) no posttransaction messages from buyer to seller, (b) one or more negative messages, or (c) one or more neutral messages with no negative messages.
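A minimal version of this classification step might look as follows; the word list is the illustrative sample quoted above, not the full lexicon used in the paper:

```python
import re

# Illustrative negative-word list (the paper used a longer standard list).
NEGATIVE = re.compile(r"annoyed|dissatisfied|damaged|negative feedback",
                      re.IGNORECASE)

def classify_transaction(messages):
    """Bucket a transaction by its posttransaction buyer messages:
    'none' (no messages), 'negative' (at least one message matches the
    negative list), or 'neutral' (messages, but none negative)."""
    if not messages:
        return "none"
    if any(NEGATIVE.search(m) for m in messages):
        return "negative"
    return "neutral"

print(classify_transaction([]))                           # none
print(classify_transaction(["Item arrived damaged!"]))    # negative
print(classify_transaction(["What color is the case?"]))  # neutral
```

The "poor-experience" indicator built from buyers' actions then serves as the label against which this classification is validated.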

Figure 23.2, which appears in Masterov, Mayer, and Tadelis (2015), describes the distribution of transactions with the different message classifications together with their association with poor experiences. The x-axis of figure 23.2 shows that approximately 85 percent of transactions fall into the benign first category of no posttransaction messages. Buyers sent at least one message in the remaining 15 percent of all transactions, evenly split between negative and neutral messages. The top of the y-axis shows the poor experience rate for each message type. When no messages are exchanged, only 4 percent of buyers report a poor experience. Whenever a neutral message is sent, the rate of poor experiences jumps to 13 percent, and if the message's content was negative, over one-third of buyers express a poor experience.

Fig. 23.2 Message content and poor experiences on eBay
Source: Masterov et al. 2015. ©2015 Association for Computing Machinery, Inc. Reprinted by permission. https://doi.org/10.1145/2764468.2764499.

In the second stage of the analysis, Masterov, Mayer, and Tadelis (2015)

used the fact that negative messages are associated with poor experiences to construct a novel measure of seller quality, based on the idea that sellers who receive a higher frequency of negative messages are worse sellers. For example, imagine that seller A and seller B both sold 100 items and that seller A had five transactions with at least one negative message, while seller B had eight such transactions. The implied quality score of seller A is then 0.05 while that of seller B is 0.08, and the premise is that seller B is a worse seller than seller A. Masterov, Mayer, and Tadelis (2015) show that the relationship between this ratio, which is calculated for every seller at any point in time using aggregated negative messages from past sales, and the likelihood that a current transaction will result in a poor experience, is monotonically increasing.
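The ratio itself is straightforward to compute; this sketch reproduces the seller A / seller B example from the text (the record layout and field names are our own):

```python
from collections import Counter

def quality_scores(transactions):
    """Fraction of each seller's past sales that drew at least one
    negative buyer message (higher = worse, per the paper's premise)."""
    sold = Counter(t["seller"] for t in transactions)
    negative = Counter(t["seller"] for t in transactions if t["negative_msg"])
    return {s: negative[s] / sold[s] for s in sold}

# The example from the text: out of 100 sales each, seller A has 5
# transactions with a negative message and seller B has 8.
sales = ([{"seller": "A", "negative_msg": i < 5} for i in range(100)]
         + [{"seller": "B", "negative_msg": i < 8} for i in range(100)])
print(quality_scores(sales))   # {'A': 0.05, 'B': 0.08}
```

In production such a score would be recomputed from each seller's full message history as of the current transaction, as the text describes.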

This simple exercise is a proof of concept showing that, by using the message data and a simple natural language processing AI procedure, they were able to better predict which sellers will create poor experiences than one can infer from the very inflated feedback data. eBay is not unique in allowing the parties to exchange messages, and the lessons from this research are easily generalizable to other marketplaces. The key is that there is information in communication between market participants, and past communication can help identify and predict the sellers or products that will cause buyers poor experiences and negatively impact the overall trust in the marketplace.

23.3.2 Using AI to Create a Market for Feedback

Aside from the fact that feedback is often inflated, as described earlier, another problem with feedback is that many buyers choose not to leave feedback at all. In fact, through the lens of mainstream economic theory, it is surprising that a significant fraction of online consumers leave feedback. After all, it is a selfless act that requires time, and it creates a classic free-rider problem. Furthermore, because potential buyers are attracted to buy from sellers or products that already have an established good track record, this creates a "cold-start" problem: new sellers (or products) with no feedback will face a barrier to entry in that buyers will be hesitant to give them a fair shot. How could we solve these free-rider and cold-start problems?

These questions were analyzed in a recent paper by Li, Tadelis, and Zhou (2016) using a unique and novel implementation of a market for feedback on the huge Chinese marketplace Taobao, which let sellers pay buyers to leave them feedback. Naturally, one may be concerned about allowing sellers to pay for feedback, as it seems like a practice in which they will only pay for good feedback and suppress any bad feedback, which would not add any value in promoting trust. However, Taobao implemented a clever use of NLP to solve this problem: it is the platform, using an NLP AI model, that decides whether feedback is relevant, not the seller who pays for the feedback. Hence, the reward to the buyer for leaving feedback was actually managed by the marketplace, and was handed out for informative feedback rather than for positive feedback.

Specifically, in March 2012, Taobao launched a "Rebate-for-Feedback"

 
