in the interference graph. It was hoped that, by including this term in the pricing formula, the auction would be able to offer higher prices to, and buy the rights of, stations that pose particularly difficult problems by interfering with many other stations.
high probability. But how can we know the distribution and how can such
an algorithm be found?
The FCC auction used a feasibility checker developed by a team of Auctionomics researchers at the University of British Columbia, led by Professor Kevin Leyton-Brown. There were many steps in the development, as reported by Newman, Fréchette, and Leyton-Brown (forthcoming), but here we emphasize the role of machine learning. Auctionomics’ goal was to be able to solve 99 percent of the problem instances in one minute or less.
The development effort began by simulating the planned auction to generate feasibility problems like those that might be encountered in a real auction. Running many simulations generated about 1.4 million problem instances that could be used for training and testing a feasibility-checking algorithm. The first step of the analysis was to formulate the problems as mixed-integer programs and test standard commercial software—CPLEX and Gurobi—to see how close those could come to meeting the performance objectives. The answer was: not close. Using a 100-second cutoff, Gurobi could solve only about 10 percent of the problems and CPLEX only about 25 percent. These results were not nearly good enough for decent performance in a real-time auction.
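To make the formulation concrete, here is a minimal sketch of a repacking feasibility check written as a mixed-integer program with Gurobi's Python API. It is not the FCC's actual formulation: the inputs (`stations`, `domain`, and a set of co-channel `interference` triples) are illustrative, and real interference constraints are richer.

```python
import gurobipy as gp
from gurobipy import GRB

def repack_feasible(stations, domain, interference, time_limit=100):
    """domain[s]: channels station s may use; interference: set of
    (s, t, c) triples meaning s and t cannot both be put on channel c."""
    m = gp.Model("repacking")
    m.Params.TimeLimit = time_limit   # mirror the 100-second cutoff
    m.Params.OutputFlag = 0
    # x[s, c] = 1 if station s is assigned channel c
    x = m.addVars([(s, c) for s in stations for c in domain[s]],
                  vtype=GRB.BINARY, name="x")
    # each station receives exactly one channel from its domain
    m.addConstrs((x.sum(s, "*") == 1 for s in stations), name="assign")
    # interfering stations cannot share a blocked channel
    m.addConstrs((x[s, c] + x[t, c] <= 1
                  for (s, t, c) in interference
                  if (s, c) in x and (t, c) in x), name="interference")
    m.optimize()
    if m.Status == GRB.OPTIMAL:
        return True        # a feasible repacking exists
    if m.Status == GRB.INFEASIBLE:
        return False       # provably no repacking
    return None            # timed out: unresolved within the cutoff
```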
Next, the same problems were formulated as satisfiability problems and tested using seventeen research solvers that had participated in recent SAT-solving tournaments. These were better, but none could solve as many as two-thirds of the problems within the same 100-second cutoff. The goal remained 99 percent in sixty seconds.
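For comparison, the same check can be encoded as a satisfiability problem. The sketch below uses the python-sat library with a Glucose solver as a stand-in for the tournament solvers the team tested; the encoding (one Boolean variable per station-channel pair) is the textbook one, not necessarily the one used in the auction.

```python
from pysat.solvers import Glucose3

def repack_feasible_sat(stations, domain, interference):
    var = {}
    def v(s, c):  # one Boolean variable per (station, channel) pair
        return var.setdefault((s, c), len(var) + 1)

    solver = Glucose3()
    for s in stations:
        chans = list(domain[s])
        # at least one channel per station ...
        solver.add_clause([v(s, c) for c in chans])
        # ... and at most one (pairwise "at most one" encoding)
        for i in range(len(chans)):
            for j in range(i + 1, len(chans)):
                solver.add_clause([-v(s, chans[i]), -v(s, chans[j])])
    # interfering stations may not share channel c
    for (s, t, c) in interference:
        if c in domain[s] and c in domain[t]:
            solver.add_clause([-v(s, c), -v(t, c)])
    return solver.solve()  # True iff some assignment satisfies all clauses
```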
The next step was to use automated algorithm configuration, a procedure developed by Hutter, Hoos, and Leyton-Brown (2011) and applied in this setting by Leyton-Brown and his students at the University of British Columbia. The idea is to start with a highly parameterized algorithm for solving satisfiability problems4 and to train a random forest model of the algorithm’s performance, given the parameters. To do that, we first ran simulated auctions with what we regarded as plausible behavior by the bidders to generate a large data set of representative problems. Then, we solved those problems using a variety of different parameter settings to determine the distribution of solution times for each vector of parameters. This generated a data set with parameters and performance measures. Two of the most interesting performance characteristics were the median run time and the fraction of instances solved within one minute. Then, using a Bayesian model, we incorporated uncertainty: the experimenter “believes” that the actual performance is normally distributed with a mean determined by the random forest and a variance that depends on the distance of the parameter vector from the nearest points in the data set. Next, the system identifies the parameter vector that maximizes the expected improvement in performance, given the mean and variance of the prior and the performance of the best-known parameter vector. Finally, the system tests the actual performance for the identified parameters and adds that as an observation to the data set. Proceeding iteratively, the system identifies more parameters to test, investigates them, and adds them to the data to improve the model accuracy until the time budget is exhausted.

4. There are no known algorithms for NP-complete problems that are guaranteed to be fast, so the best existing algorithms are all heuristics. These algorithms weight various characteristics of the problem to decide about such things as the order in which to check different branches of a search tree. These weights are among the parameters that can be set and adapted to work well for a particular class of problems, such as those that arise in the incentive auction application. The particular software algorithm that we used was CLASP, which had more than 100 exposed parameters that could be modified.
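A schematic sketch of the configuration loop just described, under simplifying assumptions: the random forest supplies the predicted runtime, uncertainty is proxied by distance to the nearest evaluated configuration, and the next configuration to test maximizes expected improvement. `run_solver` (which would time the configured solver on training instances) and `sample_config` (which returns a random parameter vector as a NumPy array) are hypothetical stand-ins, not the actual SMAC implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

def expected_improvement(mu, sigma, best):
    # EI for minimization: expected amount by which a candidate
    # beats the best runtime observed so far
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def configure(run_solver, sample_config, n_iters=100):
    # seed the model with a few random configurations
    X = [sample_config() for _ in range(10)]
    y = [run_solver(theta) for theta in X]
    for _ in range(n_iters):
        model = RandomForestRegressor(n_estimators=100).fit(X, y)
        cands = np.array([sample_config() for _ in range(1000)])
        mu = model.predict(cands)                 # predicted runtimes
        # proxy for uncertainty: distance to nearest evaluated config
        dists = np.min(np.linalg.norm(
            cands[:, None, :] - np.array(X)[None, :, :], axis=2), axis=1)
        ei = expected_improvement(mu, dists, min(y))
        theta = cands[int(np.argmax(ei))]         # most promising candidate
        X.append(theta)                           # test it for real and
        y.append(run_solver(theta))               # add it to the data set
    return X[int(np.argmin(y))]                   # best configuration found
```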
Eventually, this machine-learning method leads to diminishing returns to the time invested. One can then create a new data set from the instances on which the parameterized algorithm was “slow,” for example, taking more than fifteen seconds to solve. Training a new algorithm on those instances and running the two parameterized algorithms in parallel led to dramatic further improvements in performance.
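In code, the parallel-portfolio idea amounts to racing the two configured solvers and taking whichever finishes first. A minimal sketch, with `solve_general` and `solve_slow` as placeholders for the two configurations:

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def portfolio_solve(instance, timeout=60):
    pool = ThreadPoolExecutor(max_workers=2)
    futures = [pool.submit(solve_general, instance),  # main configuration
               pool.submit(solve_slow, instance)]     # "slow"-instance configuration
    done, _ = wait(futures, timeout=timeout, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # let the slower run finish in the background
    for f in done:
        return f.result()      # answer from whichever solver finished first
    return None                # neither solver finished within the budget
```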
For the actual auction, several other problem-specific tricks were also applied to contribute to the speed-up. For example, to some extent it proved possible to decompose the full problem into smaller problems, to reuse old solutions as starting points for a search, to store partial solutions that might help guide solutions of further problems, and so on. In the end, the full set of techniques and tricks resulted in a very fast feasibility checker that solved all but a tiny fraction of the relevant problems within the allotted time.
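As a toy illustration of the solution-reuse trick, one might first test cached assignments from earlier problems before invoking the solver at all; `check_assignment` and `solve` are hypothetical helpers, not the checker's actual interface.

```python
solution_cache = []  # channel assignments that solved earlier problems

def feasible_with_cache(instance):
    # an assignment found for an earlier, related problem may already
    # satisfy the new one, so try the cached assignments first
    for assignment in solution_cache:
        if check_assignment(instance, assignment):
            return True
    result = solve(instance)           # fall back to the full solver
    if result is not None:
        solution_cache.append(result)  # remember it for later problems
        return True
    return False
```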
23.3 Using AI to Promote Trust in Online Marketplaces
Online marketplaces such as eBay, Taobao, Airbnb, and many others have grown dramatically since their inception just over two decades ago, providing businesses and individuals with previously unavailable opportunities to purchase or profit from online trading. Wholesalers and retailers can market their goods or get rid of excess inventory; consumers can easily search marketplaces for whatever is on their mind, alleviating the need for businesses to invest in their own e-commerce websites; individuals transform items they no longer use into cash; and more recently, the so-called “gig economy” comprises marketplaces that allow individuals to share their time or assets across different productive activities and earn extra income.
The amazing success of online marketplaces was not fully anticipated,
primarily because of the hazards of anonymous trade and asymmetric infor-
mation. Namely, how can strangers who have never transacted with one
another, and who may be thousands of miles apart, be willing to trust each
other? Trust on both sides of the market is essential for parties to be willing
to transact and for a marketplace to succeed. The early success of eBay is
often attributed to the innovation of introducing its famous feedback and
reputation mechanism, which was adopted in one form or another by practi-
cally every other marketplace that came after eBay. These online feedback
and reputation mechanisms provide a modern- day version of more ancient
reputation mechanisms used in the physical marketplaces that were the
medieval trade fairs of Europe (see Milgrom, North, and Weingast 1990).
Still, recent studies have shown that online reputation measures of marketplace sellers, which are based on buyer-generated feedback, don’t accurately reflect their actual performance. Indeed, a growing literature has shown that user-generated feedback mechanisms are often biased, suffer from “grade inflation,” and can be prone to manipulation by sellers.5 For example, the average percent positive for sellers on eBay is about 99.4 percent, with a median of 100 percent. This makes it challenging to interpret the true levels of satisfaction on online marketplaces.

5. On bias and grade inflation see, for example, Nosko and Tadelis (2015), Zervas, Proserpio, and Byers (2015), and Filippas, Horton, and Golden (2017). On seller manipulation of feedback scores see, for example, Mayzlin, Dover, and Chevalier (2014) and Xu et al. (2015).
A natural question emerges: Can online marketplaces use the treasure trove of data they collect to measure the quality of a transaction and predict which sellers will provide better service to their buyers? It has become widely known that all online marketplaces, as well as other web-based services, collect vast amounts of data as part of the process of trade. Some refer to this as the “exhaust data” generated by the millions of transactions, searches, and browsing sessions that occur on these marketplaces daily. By leveraging these data, marketplaces can create an environment that promotes trust, not unlike the ways in which institutions that emerged in the medieval trade fairs of Europe helped foster trust. The scope for market design goes far beyond mainstream applications like setting rules of bidding and reserve prices for auctions or designing tiers of services, and in our view, includes the design of mechanisms that help foster trust in marketplaces. What follows are two examples from recent research that show some of the many ways that marketplaces can apply AI to the data they generate to help create more trust and better experiences for their customers.
23.3.1 Using AI to Assess the Quality of Sellers
One of the ways that online marketplaces help participants build trust is by letting them communicate through online messaging platforms. For example, on eBay buyers can contact sellers to ask them questions about their products, which may be particularly useful for used or unique products for which buyers may want to get more refined information than is listed. Similarly, Airbnb allows potential renters to send messages to hosts and ask questions about the property that may not be answered in the original listing.
Using Natural Language Processing (NLP), a mature area in AI, marketplaces can mine the data generated by these messages in order to better predict the kinds of features that customers value. However, there may also be subtler ways to apply AI to manage the quality of marketplaces. The messaging platforms are not restricted to pretransaction inquiries, but also allow the parties to send messages to each other after the transaction has been completed. An obvious question then emerges: How could a marketplace analyze the messages sent between buyers and sellers after the transaction to infer something about the quality of the transaction that feedback doesn’t seem to capture?
This question was posed and answered in a recent paper by Masterov, Mayer, and Tadelis (2015) using internal data from eBay’s marketplace. The analysis they performed was divided into two stages. In the first stage, the goal was to see if NLP can identify transactions that went bad when there was an independent indication that the buyer was unhappy. To do this, they collected internal data from transactions in which messages were sent from the buyer to the seller after the transaction was completed, and matched it with another internal data source that recorded actions by buyers indicating that the buyer had a poor experience with the transaction. Actions that indicate an unhappy buyer include claiming that the item was not received, claiming that the item was significantly not as described, or leaving negative or neutral feedback, to name a few.
The simple NLP approach they use creates a “poor-experience” indicator as the target (dependent variable) that the machine-learning model will try to predict, and uses the messages’ content as the independent variables. In its simplest form and as a proof of concept, a regular expression search was used that included a standard list of negative words such as “annoyed,” “dissatisfied,” “damaged,” or “negative feedback” to identify a message as negative. If none of the designated terms appeared, then the message was considered neutral. Using this classification, they grouped transactions into three distinct types: (a) no posttransaction messages from buyer to seller, (b) one or more negative messages, or (c) one or more neutral messages with no negative messages.
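As a rough illustration, the classification just described can be reproduced in a few lines; the word list below is a stand-in for the paper's actual list.

```python
import re

# illustrative word list; the paper's actual list is longer
NEGATIVE = re.compile(
    r"annoyed|dissatisfied|damaged|negative feedback", re.IGNORECASE)

def classify(messages):
    """Classify one transaction's post-transaction buyer messages."""
    if not messages:
        return "no_messages"                      # type (a)
    if any(NEGATIVE.search(m) for m in messages):
        return "negative"                         # type (b)
    return "neutral"                              # type (c)
```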
Figure 23.2, which appears in Masterov, Mayer, and Tadelis (2015), describes the distribution of transactions with the different message classifications together with their association with poor experiences. The x-axis of figure 23.2 shows that approximately 85 percent of transactions fall into the benign first category of no posttransaction messages. Buyers sent at least one message in the remaining 15 percent of all transactions, evenly split between negative and neutral messages. The y-axis shows the poor-experience rate for each message type. When no messages are exchanged, only 4 percent of buyers report a poor experience. Whenever a neutral message is sent, the rate of poor experiences jumps to 13 percent, and if the message’s content was negative, over one-third of buyers express a poor experience.
Fig. 23.2 Message content and poor experiences on eBay
Source: Masterov et al. 2015. ©2015 Association for Computing Machinery, Inc. Reprinted by permission. https://doi.org/10.1145/2764468.2764499.

In the second stage of the analysis, Masterov, Mayer, and Tadelis (2015)
used the fact that negative messages are associated with poor experiences
to construct a novel measure of seller quality based on the idea that sellers
who receive a higher frequency of negative messages are worse sellers. For
example, imagine that seller A and seller B both sold 100 items and that seller
A had five transactions with at least one negative message, while seller B had
eight such transactions. The implied quality score of seller A is then 0.05
while that of seller B is 0.08, and the premise is that seller B is a worse seller
than seller A. Masterov, Mayer, and Tadelis (2015) show that the relation-
ship between this ratio, which is calculated for every seller at any point in
time using aggregated negative messages from past sales, and the likelihood
that a current transaction will result in a poor experience, is monotonically
increasing.
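The implied quality score is straightforward to compute from transaction records. A minimal sketch, with illustrative field names:

```python
from collections import defaultdict

def quality_scores(transactions):
    """transactions: records with illustrative fields `seller` and
    `has_negative_message`; returns the share of each seller's past
    sales that drew at least one negative message (higher = worse)."""
    sold = defaultdict(int)
    negative = defaultdict(int)
    for t in transactions:
        sold[t["seller"]] += 1
        negative[t["seller"]] += t["has_negative_message"]
    # e.g., seller A: 5/100 = 0.05; seller B: 8/100 = 0.08 (worse)
    return {s: negative[s] / sold[s] for s in sold}
```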
This simple exercise is a proof of concept that shows that by using the message data and a simple natural language processing AI procedure, they were able to better predict which sellers will create poor experiences than one can infer from the very inflated feedback data. eBay is not unique in allowing the parties to exchange messages, and the lessons from this research are easily generalizable to other marketplaces. The key is that there is information in the communication between market participants, and past communication can help identify and predict the sellers or products that will cause buyers poor experiences and negatively impact the overall trust in the marketplace.
23.3.2 Using AI to Create a Market for Feedback
Aside from the fact that feedback is often inflated as described earlier, another problem with feedback is that many buyers choose not to leave feedback at all. In fact, through the lens of mainstream economic theory, it is surprising that a significant fraction of online consumers leave feedback. After all, it is a selfless act that requires time, and it creates a classic free-rider problem. Furthermore, because potential buyers are attracted to buy from sellers or products that already have an established good track record, this creates a “cold-start” problem: new sellers (or products) with no feedback will face a barrier to entry in that buyers will be hesitant to give them a fair shot. How could we solve these free-rider and cold-start problems?
These questions were analyzed in a recent paper by Li, Tadelis, and Zhou (2016) using a unique and novel implementation of a market for feedback on the huge Chinese marketplace Taobao, where sellers were allowed to pay buyers to leave them feedback. Naturally, one may be concerned about allowing sellers to pay for feedback, as it seems like a practice in which they will pay only for good feedback and suppress any bad feedback, which would not add any value in promoting trust. However, Taobao implemented a clever use of NLP to solve this problem: it is the platform, using an NLP AI model, that decides whether feedback is relevant, not the seller who pays for the feedback. Hence, the reward to the buyer for leaving feedback was actually managed by the marketplace, and was handed out for informative feedback rather than for positive feedback.
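The paper does not disclose Taobao's actual model, but a simple stand-in conveys the idea: a text classifier trained on hand-labeled examples scores feedback for informativeness, and the platform pays the rebate whenever the text is predicted informative, whether or not it is positive. The bag-of-words pipeline below is an illustrative sketch, not the production system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_informativeness_model(texts, labels):
    """texts: feedback strings; labels: 1 = informative, 0 = not
    (hand-labeled training examples are assumed to exist)."""
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                          LogisticRegression(max_iter=1000))
    return model.fit(texts, labels)

def rebate_eligible(model, feedback_text, threshold=0.5):
    # pay for predicted-informative feedback, positive or negative alike
    return model.predict_proba([feedback_text])[0, 1] >= threshold
```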
Specifically, in March 2012, Taobao launched a “Rebate-for-Feedback”