The Economics of Artificial Intelligence


edited by Ajay Agrawal, Joshua Gans, and Avi Goldfarb


This raises at least a few other public policy avenues to be explored. For example, given the public goods nature of data, there may be circumstances in which public investment in data creation, and public ownership of the data thus created, is worth exploring, particularly when private creation of such data would lead to antitrust concerns.


17 Privacy, Algorithms, and Artificial Intelligence

Catherine Tucker

Imagine the following scenario. You are late for a hospital appointment and searching frantically for a parking spot. You know that you often forget where you parked your car, so you use an app you downloaded called “Find my Car.” The app takes a photo of your car and then geocodes the photo, enabling you to easily find the right location when you come to retrieve your car. The app accurately predicts when it should provide a prompt. This all sounds very useful. However, this example illustrates a variety of privacy concerns in a world of artificial intelligence.

1. Data Persistence: This data, once created, may persist longer than the human who created it, given the low costs of storing such data.

2. Data Repurposing: It is not clear how such data could be used in the future. Once created, such data can be indefinitely repurposed. For example, in a decade’s time parking habits may be part of the data used by health insurance companies to allocate an individual to a risk premium.

3. Data Spillovers: There are potential spillovers for others who did not take the photo. The photo may record other people, and they may be identifiable through facial recognition, or incidentally captured cars may be identifiable through license plate databases. These other people did not choose to create the data, but my choice to create data may have spillovers for them in the future.
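
To make the scenario concrete, the following is a minimal sketch of the kind of record such an app might persist once it geocodes a photo; the field names, coordinates, and identifiers are hypothetical illustrations, not taken from any real app.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ParkingRecord:
    """Illustrative record a hypothetical car-finding app might store after geocoding a photo."""
    photo_path: str        # the image itself may capture bystanders (spillovers)
    latitude: float        # geocode attached to the photo
    longitude: float
    captured_at: datetime  # timestamp enables long-run behavioral profiling (persistence)
    device_id: str         # ties the record back to an identifiable individual (repurposing)

record = ParkingRecord(
    photo_path="photos/parking_2017-08-31.jpg",  # hypothetical path
    latitude=42.3601,                            # hypothetical coordinates
    longitude=-71.0942,
    captured_at=datetime.now(timezone.utc),
    device_id="device-1234",                     # hypothetical identifier
)
print(record)
```

Each field maps onto one of the three concerns above: the record outlives the parking event, can be repurposed once linked to an identity, and the underlying photo may capture people who never consented to its creation.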

Catherine Tucker is the Sloan Distinguished Professor of Management Science at MIT Sloan School of Management and a research associate of the National Bureau of Economic Research. For acknowledgments, sources of research support, and disclosure of the author’s material financial relationships, if any, please see http://www.nber.org/chapters/c14011.ack.


This article will discuss these concerns in detail, after considering how the theory of the economics of privacy relates to artificial intelligence (AI).

17.1 The Theory of Privacy in Economics and Artificial Intelligence

  17.1.1 Current Models of Economics and Privacy and Their Flaws

The economics of privacy has long been plagued by a lack of clarity about how to model privacy over data. Most theoretical economic models treat privacy as an intermediate good (Varian 1996; Farrell 2012). This implies that an individual’s desire for data privacy will depend on how they anticipate that data will affect future economic outcomes. If, for example, this data leads a firm to charge higher prices based on the behavior it observes in the data, a consumer may desire privacy. If a datum may lead a firm to intrude on their time, then again a consumer may desire privacy.
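
To fix ideas, this intermediate-good view can be summarized in a minimal sketch; the notation below is illustrative rather than drawn from the cited models.

```latex
% Minimal sketch of privacy as an intermediate good (illustrative notation,
% not taken from Varian 1996 or Farrell 2012).
\[
  \text{individual } i \text{ shares datum } d
  \quad\Longleftrightarrow\quad
  \mathbb{E}\big[u_i\big(y_i(d)\big)\big] \;\ge\; \mathbb{E}\big[u_i\big(y_i(\emptyset)\big)\big],
\]
% where y_i(d) denotes the future economic outcomes (prices charged, intrusions
% on time) that i anticipates when a firm observes d, and y_i(\emptyset) the
% outcomes when the datum is withheld. In this framing, privacy is valued only
% through the anticipated difference in outcomes, not as an end in itself.
```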

However, this framing contrasts with, or at the very least places its emphasis differently from, how many policymakers and even consumers think about privacy policy and choice.

First, much of the policy debate involves whether or not consumers are capable of making the right choice surrounding the decision to provide data, and whether “notice and consent” provides sufficient information to consumers so they can make the right choice. Work such as McDonald and Cranor (2008) emphasizes that even ten years ago it was unrealistic to think that consumers would have time to properly inform themselves about how their data may be used, as reading through privacy policies would take an estimated 244 hours each year. Since that study, the number of devices (thermostats, smartphones, apps, cars) collecting data has increased dramatically, suggesting that it is, if anything, even more implausible now that a consumer has the time to actually understand the choice they are making in each of these instances.
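
A back-of-the-envelope reconstruction shows the order of magnitude involved; the inputs below are hypothetical round numbers chosen only to illustrate how an estimate of this size arises, not the parameters actually used by McDonald and Cranor (2008).

```latex
% Illustrative arithmetic only; inputs are hypothetical, not the study's parameters.
\[
  1{,}460 \ \text{policies per year} \times 10 \ \text{minutes per policy}
  = 14{,}600 \ \text{minutes} \approx 243 \ \text{hours per year}.
\]
% Visiting only a handful of new sites per day and actually reading each policy
% already consumes time on the order of the 244-hour figure cited above.
```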

Relatedly, even if customers are assumed to have been adequately informed, a new “behavioral” literature on privacy shows that well-documented effects from behavioral economics, such as the endowment effect or “anchoring,” may also distort the ways customers make decisions surrounding their data (Acquisti, Taylor, and Wagman 2016). Such distortions may create scope for policy interventions of the “nudge” type that help consumers make better decisions (Acquisti 2010).

Third, this theory presupposes that customers will only desire privacy if their data is actually used for something, rather than experiencing distaste at the idea of their data being collected. Indeed, in some of the earliest work on privacy in the internet era, Varian (1996) states, “I don’t really care if someone has my telephone number as long as they don’t call me during dinner and try to sell me insurance. Similarly, I don’t care if someone has my address, as long as they don’t send me lots of official-looking letters offering to refinance my house or sell me mortgage insurance.”


However, there is evidence to suggest that people do care about the mere fact of collection of their data, to the extent of changing their behavior, even if the chance of their suffering meaningfully adverse consequences from that collection is very small. Empirical analysis of people’s reactions to the knowledge that their search queries had been collected by the US National Security Agency (NSA) shows a significant shift in behavior even for searches whose data was never going to be used by the government to identify terrorists, but which were simply personally embarrassing (Marthews and Tucker 2014). Legally speaking, the Fourth Amendment of the US Constitution covers the “unreasonable seizure” as well as the “unreasonable search” of people’s “papers and effects,” suggesting that governments, and firms acting on a government’s behalf, cannot entirely ignore the seizure of data and focus only on whether a search is reasonable. Consequently, a growing consumer market has emerged for “data-light” and “end-to-end encrypted” communications and software solutions, where the firm collects far less data, or none at all, about consumers’ activities on its platform. These kinds of concerns suggest that the fact of data collection may matter as well as how the data is used.

Last, economic theory often assumes that while customers desire firms to have information that allows them to better match customers’ horizontally differentiated preferences, they do not desire firms to have information that might reveal their willingness to pay (Varian 1996). However, this idea that personalization in a horizontal sense may be sought by customers goes against popular reports of consumers finding personalization repugnant or creepy (Lambrecht and Tucker 2013). Instead, it appears that personalization of products using horizontally differentiated taste information is only acceptable or successful if accompanied by a sense of control or ownership over the data used, even where such control is ultimately illusory (Tucker 2014; Athey, Catalini, and Tucker 2017).

17.1.2 Artificial Intelligence and Privacy

Like “privacy,” artificial intelligence is often used loosely to mean many things. This article follows Agrawal, Gans, and Goldfarb (2016) and focuses on AI as being associated with reduced costs of prediction. The obvious effect that this will have on the traditional model of privacy is that more types of data will be used to predict a wider variety of economic objectives. Again, the desire (or lack of desire) for privacy will be a function of an individual’s anticipation of the consequences of their data being used in a predictive algorithm. If they anticipate that they will face worse economic outcomes if the AI uses their data, they may desire to restrict their data-sharing or data-creating behavior.

It may be that the simple dislike or distaste for data collection will transfer to the use of automated predictive algorithms to process that data. The sense of creepiness attached to the use of data, which leads to a desire for privacy, would then be transferred to the algorithms themselves. Indeed, there is some evidence of a similar behavioral process, where some customers only accept algorithmic prediction if it is accompanied by a sense of control (Dietvorst, Simmons, and Massey 2016).

In this way, the question of AI algorithms seems simply a continuation of the tension that has plagued earlier work in the economics of privacy. So a natural question is whether AI presents new or different problems. This article argues that many of the new questions concern forces that will constrain the ability of customers, in our traditional model of privacy, to make choices regarding the sharing of their data. I emphasize three themes that I think may distort this process in important and economically interesting ways.

17.2 Data Persistence, AI, and Privacy

Data persistence refers to the fact that once digital data is created, it is difficult to delete completely. This is true from a technical perspective (Adee 2015). Unlike analog records, which can be destroyed with reasonable ease, the intentional deletion of digital data requires resources, time, and care.

17.2.1 Unlike in Previous Eras, Data Created Now Is Likely to Persist

Cost constraints that used to mean that only the largest firms could afford to store extensive data, and even then for a limited time, have essentially disappeared.

Large shifts in the data-supply infrastructure have rendered the tools for gathering and analyzing large swaths of digital data commonplace. Cloud-based resources such as Amazon, Microsoft, and Rackspace mean these tools no longer depend on scale,1 and storage costs for data continue to fall, so that some speculate they may eventually approach zero.2 This allows ever-smaller firms to have access to powerful and inexpensive computing resources. This decrease in costs suggests that data may be stored indefinitely and can be used in predictive exercises should it be thought of as a useful predictor.

1. http://betanews.com/2014/06/27/comparing-the-top-three-cloud-storage-providers/.
2. http://www.enterprisestorageforum.com/storage-management/can-cloud-storage-costs-fall-to-zero-1.html.
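
As a rough illustration of how cheap indefinite retention has become, the sketch below prices fifty years of storage for one user’s data; the per-gigabyte rate and data volume are assumptions chosen for illustration, not quoted prices from any provider.

```python
# Back-of-the-envelope cost of retaining one user's data for decades.
# All inputs are hypothetical illustrations, not quoted provider prices.
GB_PER_USER = 5.0          # assumed data volume one service holds per user
PRICE_PER_GB_MONTH = 0.02  # assumed object-storage price in USD per GB-month
YEARS = 50                 # retain the data for much of a human lifetime

total_cost = GB_PER_USER * PRICE_PER_GB_MONTH * 12 * YEARS
print(f"Storing {GB_PER_USER:.0f} GB for {YEARS} years costs about ${total_cost:.2f}")
# Under these assumptions the total is on the order of tens of dollars,
# which is why deletion, not retention, is the decision that requires effort.
```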

The chief resource constraint on the deployment of big data solutions is a lack of human beings with the data-science skills to draw appropriate conclusions from analysis of large data sets (Lambrecht and Tucker 2017). As time and skills evolve, this constraint may become less pressing.

Digital persistence may be concerning from a privacy point of view because privacy preferences may change over time. The privacy preference that an individual may have felt when they created the data may be inconsistent with the privacy preference of their older self. This is something we documented in Goldfarb and Tucker (2012). We showed that while younger people tended to be more open with data, as they grew older their preference for withholding data grew. This was a stable effect that persisted across cohorts. It is not the case that young people today are unusually casual about data; all generations are more casual about data when younger, but this pattern was simply less visible previously because social media, and other ways of sharing and creating potentially embarrassing data, did not yet exist. This implies that one concern regarding AI and privacy is that it may use data that was created long ago, which in retrospect the individual regrets creating.

Data that was created at t = 0 may have seemed innocuous at the time, and in isolation may still be innocuous at time t + 1, but increased computing power may be able to derive much more invasive conclusions from aggregations of otherwise innocuous data at t + 1 relative to t. Second, there is a whole variety of data generated on individuals that individuals do not necessarily consciously choose to create. This includes not only incidental collection of data, such as being photographed by another party, but also data generated by the increased passive surveillance of public spaces, and by the use of cellphone technology without full appreciation of how much data about an individual and their location it discloses to third parties, including the government.

Though there has been substantial work bringing the insights of behavioral economics into the study of the economics of privacy, there has been less work on time-preference consistency, despite the fact that it is one of the oldest and most studied phenomena in behavioral economics (Strotz 1955; Rubinstein 2006). Introducing the potential for myopia or hyperbolic discounting into the way we model privacy choices over the creation of data therefore seems an important step. Even if the economist concerned rejects behavioral economics or myopia as an acceptable solution, at the very least it is useful to emphasize that privacy choices should be modeled not as decisions where the time between the creation of the data and the use of the data is trivial, but instead as decisions that may play out over an extended period of time.
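
One way to make this concrete is the standard quasi-hyperbolic (beta-delta) specification; applying it to the data-creation decision is an illustrative sketch, not a model taken from the papers cited above.

```latex
% Quasi-hyperbolic (beta-delta) discounting applied, illustratively, to the
% decision to create data at t = 0 whose privacy costs arrive later.
\[
  U_0 \;=\; u_0(\text{create data})
        \;+\; \beta \sum_{t=1}^{T} \delta^{t}\, u_t(\text{data persists}),
  \qquad 0 < \beta < 1, \quad 0 < \delta < 1.
\]
% With beta < 1 the individual overweights the immediate convenience u_0 relative
% to the delayed privacy costs u_t, so the choice made at t = 0 may be one that
% the individual's older self regrets.
```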

17.2.2 How Long Will Data’s Predictive Power Persist?

If we assume that any data created will probably persist, given low storage costs, it may be that the more important question for understanding the dynamics of privacy is the question of how long data’s predictive power persists.

It seems reasonable to think that much of the data created today does not have much predictive power tomorrow. This is something we investigated in Chiou and Tucker (2014), where we showed that the length of the data retention period to which search engines were restricted by the European Union (EU) did not appear to affect the success of their algorithms at generating useful search results. Here the success of a search result was measured by whether or not the user felt compelled to search again. This may make sense in the world of search engines, where many searches are either unique or focused on new events. On August 31, 2017, for example, the top trending search on Google was “Hurricane Harvey,” something that could not have been predicted on the basis of search behavior from more than a few weeks prior.3

However, there are some forms of data where it is reasonable to think that their predictive power will persist almost indefinitely. The most important example of this is the creation of genetic digital data. As Miller and Tucker (2017) point out, companies such as 23andme.com are creating large repositories of genetic data spanning more than 1.2 million people, and such genetic data has the unusual quality that it does not change over time.

While the internet browsing behavior of a twenty-year-old may not prove to be good for predicting their browsing behavior at age forty, the genetic data of a twenty-year-old will almost perfectly predict the genetic data of

 
