prices vary with respect to quantity or quality), and
3. third degree (group pricing based on membership).
Fully personalized pricing is unrealistic, but prices based on fine-grained
features of consumers may well be feasible, so the line between third degree
and first degree is becoming somewhat blurred. Shiller (2013) and Dubé and
Misra (2017) have investigated how much consumer surplus can be extracted
using ML models.
Second-degree price discrimination can also be viewed as pricing by
group membership, but recognizing the endogeneity of group membership
and behavior. Machine learning using observational data will be of limited
help in designing such pricing schemes. However, reinforcement learning
techniques such as multiarmed bandits may also be helpful.
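To make the bandit idea concrete, here is a minimal sketch of price experimentation with an epsilon-greedy multiarmed bandit. The candidate prices, the epsilon value, and the simulated demand function are hypothetical placeholders chosen for illustration, not something taken from the sources cited above.

```python
import random

# Hypothetical candidate prices (the "arms" of the bandit).
PRICES = [4.99, 5.99, 6.99, 7.99]
EPSILON = 0.1                     # probability of trying a random price
counts = [0] * len(PRICES)        # times each price has been offered
revenue = [0.0] * len(PRICES)     # cumulative revenue earned at each price

def simulated_purchase(price):
    """Stand-in for real customer behavior: purchase probability falls with price."""
    return random.random() < max(0.0, 1.0 - 0.1 * price)

def choose_price_index():
    """Epsilon-greedy choice: try untried prices first, explore occasionally,
    otherwise pick the price with the highest average revenue so far."""
    if 0 in counts:
        return counts.index(0)
    if random.random() < EPSILON:
        return random.randrange(len(PRICES))
    return max(range(len(PRICES)), key=lambda i: revenue[i] / counts[i])

for _ in range(10_000):           # each iteration is one customer arrival
    i = choose_price_index()
    counts[i] += 1
    if simulated_purchase(PRICES[i]):
        revenue[i] += PRICES[i]

for price, c, r in zip(PRICES, counts, revenue):
    print(f"price {price:.2f}: offered {c} times, average revenue {r / c:.3f}")
```

The experimentation itself generates the data needed to price well, which is exactly what observational data alone cannot provide.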
According to most noneconomists, the only thing worse than price dif-
ferentiation is price discrimination! However, most economists recognize
that price differentiation is often beneficial from both an efficiency and an
equity point of view. Price differentiation allows markets to be served that
would otherwise not be served and often those unserved markets involve
low-income consumers.
DellaVigna and Gentzkow (2017) suggest that “the uniform pricing we
document significantly increases the prices paid by poorer households rela-
tive to the rich." This effect can be substantial. The authors show that "con-
sumers of [food] stores in the lowest income decile pay about 0.7 percent
higher prices than they would pay under flexible pricing, but consumers of
stores in the top income decile pay about 9.0 percent lower prices than under
flexible pricing."
16.3.5 Returns to Scale
There are at least three types of returns to scale that could be relevant for
machine learning.
1. Classical supply-side returns to scale (decreasing average cost).
2. Demand-side returns to scale (network effects).
3. Learning by doing (improvement in quality or decrease in cost due to
experience).
Supply-Side Marginal Returns
It might seem like software is the paradigm case of supply-side returns
to scale: there is a large fixed cost of developing the software, and a small
variable cost of distributing it. But if we compare this admittedly simple
model to the real world, there is an immediate problem.
Software development is not a one-time operation; almost all software is
updated and improved over time. Mobile phone operating systems are a case
in point: there are often monthly releases of bug fixes and security improve-
ments coupled with yearly releases of major upgrades.
Note how different this is from physical goods—true, there are bug fixes
for mechanical problems in a car, but the capabilities of the car remain more
or less constant over time. A notable exception is the Tesla brand, where new
updated operating systems are released periodically.
As more and more products become network enabled we can expect to
see this happen more often. Your TV, which used to be a static device, will
be able to learn new tricks. Many TVs now have voice interaction, and we
can expect that machine learning will continue to advance in this area. This
means that your TV will become more and more adept at communication
and likely will become better at discerning your preferences for various sorts
of content. The same goes for other appliances—their capabilities will no
longer be fixed at time of sale, but will evolve over time.
This raises interesting economic questions about the distinction between
goods and services. When someone buys a mobile phone, a TV, or a car,
they are not just buying a static good, but rather a device that allows them
to access a whole panoply of services. This, in turn, raises a whole range of
questions about pricing and product design.
Demand-Side Returns to Scale
Demand-side economies of scale, or network effects, come in different
varieties. There are direct network effects, where the value of a product or
service to an incremental adopter depends on the total number of other adopt-
ers, and there are indirect network effects where there are two or more types
of complementary adopters. Users prefer an operating system with lots of
applications and developers prefer operating systems with lots of users.
Direct network effects could be relevant to choices of programming lan-
guages used in machine-learning systems, but the major languages are open
source. Similarly, it is possible that prospective users might prefer cloud
providers that have a lot of other users. However, it seems to me that this is
no different than many other industries. Automobile purchasers may well
have a preference for popular brands since dealers, repair shops, parts, and
mechanics are readily available.
There is a concept that is circulating among lawyers and regulators called
“data network effects.” The model is that a firm with more customers can
collect more data and use this data to improve its product. This is often
true—the prospect of improving operations is what makes ML attractive—
but it is hardly novel. And it is certainly not a network effect! This is essen-
tially a supply-side effect known as “learning by doing” (also known as the
“experience curve” or “learning curve”). The classical exposition is Arrow
(1962); Spiegel and Hendel (2014) contain some up-to-date citations and a
compelling example.
Learning by Doing
Learning by doing is generally modeled as a process where unit costs
decline (or quality increases) as cumulative production or investment
increases. The rough rule of thumb is that a doubling of output leads to a
unit cost decline of 10 to 25 percent. Though the reasons for this efficiency
increase are not firmly established, the important point is that learning by
doing requires intention and investment by the firm, as described in Stiglitz
and Greenwald (2014).
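As a worked illustration of that rule of thumb (the exponents below follow arithmetically from the 10 to 25 percent figure; they are not taken from Arrow or from Spiegel and Hendel), the experience curve is commonly written as

```latex
% Experience (learning) curve: unit cost c as a function of cumulative output Q.
% c_0 is the unit cost at a reference cumulative output Q_0; b > 0 is the learning elasticity.
c(Q) = c_0 \left(\frac{Q}{Q_0}\right)^{-b},
\qquad
\frac{c(2Q)}{c(Q)} = 2^{-b}.
% A 10 percent cost decline per doubling means 2^{-b} = 0.90, so b = \log_2(1/0.90) \approx 0.15;
% a 25 percent decline means 2^{-b} = 0.75, so b = \log_2(1/0.75) \approx 0.42.
```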
This distinguishes learning by doing from demand-side or supply-side
network effects that are typically thought to be more or less automatic.
This is not really true either; entire books have been written about strategic
behavior in the presence of network effects. But there is an important dif-
ference between learning by doing and so-called “data network effects.” A
company can have huge amounts of data, but if it does nothing with the
data it produces no value.
In my experience the problem is not lack of resources but lack of skills.
A company that has data but no one to analyze it is in a poor position to
take advantage of that data. If there is no existing expertise internally, it is
hard to make intelligent choices about what skills are needed and how to
fi nd and hire people with those skills. Hiring good people has always been a
critical issue for competitive advantage. But since the widespread availability
of data is comparatively recent, this problem is particularly acute. Automo-
bile companies can hire people who know how to build automobiles, since
that is part of their core competency. They may or may not have sufficient
internal expertise to hire good data scientists, which is why we can expect
to see heterogeneity in productivity as this new skill percolates through the
labor markets. Bessen (2016, 2017) has written perceptively about this issue.
16.3.6 Algorithmic Collusion
It has been known for decades that there are many equilibria in repeated
games. The central result in this area is the so-called “folk theorem,” which
says that virtually any outcome can be achieved as an equilibrium in a
repeated game. For various formulations of this result, see the surveys by
Fudenberg (1992) and Pearce (1992).
Interaction of oligopolists can be viewed as a repeated game, and in this
case particular attention is focused on collusive outcomes. There are very
simple strategies that can be used to facilitate collusion.
Rapid Response Equilibrium. For example, consider the classic example
of two gas stations across the street from each other who can change prices
quickly and serve a fixed population of consumers. Initially, they are both
pricing above marginal cost. If one drops its price by a penny, the other
quickly matches the price. In this case, both gas stations are worse off because
they are selling at a lower price. Hence, there is no reward to price cutting
and high prices prevail. Strategies of this sort may have been used in online
competition, as described in Varian (2000). Borenstein (1997) documents
related behavior in the context of airfare pricing.
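A minimal numerical sketch of the rapid-response logic follows. The prices, the zero marginal cost, and the fixed customer count are hypothetical; the point is only that a price cut that is matched immediately leaves both stations worse off.

```python
# Two stations share a fixed population of customers; prices are in cents
# so the arithmetic stays exact, and marginal cost is set to zero.
CUSTOMERS = 1000

def profits(p1, p2):
    """The lower-priced station captures the whole market; equal prices split it."""
    if p1 < p2:
        return p1 * CUSTOMERS, 0
    if p2 < p1:
        return 0, p2 * CUSTOMERS
    return p1 * CUSTOMERS // 2, p2 * CUSTOMERS // 2

HIGH = 200   # both stations charging $2.00
print("both price high:       ", profits(HIGH, HIGH))   # (100000, 100000)

# Station 1 cuts by a penny; station 2 matches at once, so the outcome that
# actually prevails is both stations at the lower price.
CUT = HIGH - 1
print("cut, instantly matched:", profits(CUT, CUT))      # (99500, 99500)
```

Since the deviation never yields even a temporary gain in market share, cutting price is pointless and the high price persists.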
Repeated Prisoner’s Dilemma. In the early 1980s, Robert Axelrod (1984)
conducted a prisoner’s dilemma tournament. Researchers submitted algo-
rithmic strategies that were played against each other repeatedly. The winner
by a large margin was a simple strategy submitted by Anatol Rapoport called
“tit for tat.” In this strategy, each side starts out cooperating (charging high
prices). If either player defects (cuts its price), the other player matches.
Axelrod then constructed a tournament where strategies reproduced accord-
ing to their payoffs in the competition. He found that the best-performing
strategies were very similar to tit for tat. This suggests that artificial agents
might learn to play cooperative strategies in a classic duopoly game.
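The tournament logic is easy to sketch. The payoff numbers below are a standard prisoner's dilemma recast as high/low pricing, not Axelrod's actual tournament payoffs; they show that two tit-for-tat players sustain the cooperative (high-price) outcome, while an unconditional defector does not.

```python
# Payoffs per round, (row player, column player); hypothetical duopoly values:
# both price high -> 3 each, both price low -> 1 each,
# undercutting a high-pricer -> 5, being undercut -> 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate (price high) first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print("tit-for-tat vs tit-for-tat:  ", play(tit_for_tat, tit_for_tat))    # (600, 600)
print("tit-for-tat vs always-defect:", play(tit_for_tat, always_defect))  # (199, 204)
```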
NASDAQ Price Quotes. In the early 1990s, price quotes in the NASDAQ
were made in eighths of a dollar rather than cents. So if a bid was two-eighths
and an ask was three-eighths, a transaction would occur with the buyer
paying three-eighths and the seller receiving two-eighths. The differ-
ence between the bid and the ask was the “inside spread,” which compen-
sated the traders for risk bearing and maintaining the capital necessary to
participate in the market. Note that the bigger the inside spread, the larger
the compensation to the market makers doing the trading.
In the mid-1990s, two economists, William Christie and Paul Schultz,
examined trades for the top seventy listed companies in NASDAQ and
found to their surprise that there were virtually no transactions made at odd-
eighth prices. The authors concluded that “our results most likely reflected
an understanding or implicit agreement among the market makers to avoid
the use of odd-eighth price fractions when quoting these stocks” (Christie
and Schultz 1995, 203).
A subsequent investigation was launched by the Department of Justice
(DOJ), which was eventually settled by a $1.01 billion fine that, at the time,
was the largest fine ever paid in an antitrust case.
As these examples illustrate, it appears to be possible for implicit (or per-
haps explicit) cooperation to occur in the context of repeated interaction—
what Axelrod refers to as the “evolution of cooperation.”
Recently, issues of this sort have reemerged in the context of “algorith-
mic collusion.” In June 2017, the Organisation for Economic Co-operation
and Development (OECD) held a roundtable on algorithms and collusion
as a part of their work on competition in the digital economy. See OECD
(2017) for a background paper and Ezrachi and Stucke (2017) for a repre-
sentative contribution to the roundtable.
There are a number of interesting research questions that arise in this con-
text. The folk theorem shows that collusive outcomes can be an equilibrium
of a repeated game, but does not describe a specific algorithm that leads to
such an outcome. It is known that very simplistic algorithms, such as finite
automata with a small number of states, cannot discover all equilibria (see
Rubinstein 1986).
There are auction-like mechanisms that can be used to approximate mo-
nopoly outcomes; see Segal (2003) for an example. However, I have not seen
similar mechanisms in an oligopoly context.
16.4 Structure of ML-Provision Industries
So far we have looked at industries that use machine learning, but it is also
of interest to look at companies that provide machine learning.
As noted above, it is likely that ML vendors will offer several related ser-
vices. One question that immediately arises is how easy it will be to switch
among providers. Technologies such as containers have been developed
specifically to make it easy to port applications from one cloud provider to
another. Open-source implementations such as Docker and Kubernetes are
readily available. Lock-in will not be a problem for small- and medium-size
applications, but of course, there could be issues involving large and complex
applications that require customization.
Computer hardware also exhibits at least constant returns to scale due to
the ease of replicating hardware installations at the level of the chip, mother-
board, racks, or data centers themselves. The classic replication argument
for constant returns applies here since the basic way to increase capacity is
to just replicate what has been done before: add more cores to the processors,
add more boards to racks, add more racks to the data center, and build more
data centers.
I have suggested earlier that cloud computing is more cost effective for
most users than building a data center from scratch. What is interesting is
that companies that require lots of data processing power have been able
to replicate their existing infrastructure and sell the additional capacity to
other, smaller entities. The result is an industry structure somewhat different
than an economist might have imagined. Would an auto company build
excess capacity that it could then sell off to other companies? This is not
unheard of, but it is rare. Again it is the general purpose nature of comput-
ing that enables this model.
16.4.1 Pricing of ML Services
As with any other information-based industry, software is costly to pro-
duce and cheap to reproduce. As noted above, computer hardware also
exhibits at least constant returns to scale due to the ease of replicating
hardware installations at the level of the chip, motherboard, racks, or data
centers themselves.
If services become highly standardized, then it is easy to fall into Bertrand-
like price cutting. Even in these early days, ML pricing appears to be
intensely competitive. For example, image recognition services cost about a
tenth of a cent per image at all major cloud providers. Presumably, we will
see vendors try to differentiate themselves along dimensions of speed and
capabilities. Those firms that can provide better services may be able to
charge premium prices, to the extent that users are willing to pay for premium
service. However, current speeds and accuracy are very high and it is unclear
how users value further improvement in these dimensions.
16.5 Policy Questions
We have already discussed issues involving data ownership, data access,
differential pricing, returns to scale, and algorithmic collusion, all of which
have significant policy aspects. The major policy areas remaining are secu-
rity and privacy. I start with a few remarks about security.
16.5.1 Security
One important question that arises with respect to security is whether
firms have appropriate incentives in this regard. In a classic article, Ander-
son (1993) compares US and UK policy with respect to automatic teller
machines (ATMs). In the United States, the user was right unless the bank
could prove them wrong, while in the United Kingdom, the bank was right
unless the user could prove them wrong. The result of this liability assign-
ment was that US banks invested in security practices such as security cam-
eras, while the UK banks didn’t bother with such elementary precautions.
This industry indicates how important liability assignment is in creating
appropriate incentives for investment in security. The law and economics
analysis of tort law is helpful in understanding the implications of different
liability assignments and what optimal assignments might look like.
One principle that emerges is that of the “due care” standard. If a firm fol-
lows certain standard procedures such as installing security fixes within a few
days of their being released, implementing two-factor authentication, edu-