These regulations are not perfect, and they desperately need updating. Consumer complaints are often ignored, and there’s nothing explicitly keeping credit-scoring companies from using zip codes as proxies for race. Still, they offer a good starting point. First, we need to demand transparency. Each of us should have the right to receive an alert when a credit score is being used to judge or vet us. And each of us should have access to the information being used to compute that score. If it is incorrect, we should have the right to challenge and correct it.
Next, the regulations should expand to cover new types of credit companies, like Lending Club, which use newfangled e-scores to predict the risk that we’ll default on loans. They should not be allowed to operate in the shadows.
The Americans with Disabilities Act (ADA), which protects people with medical issues from being discriminated against at work, also needs an update. The law currently prohibits medical exams as part of an employment screening. But it should be expanded to take into account Big Data personality tests, health scores, and reputation scores. They all sneak around the law, and they shouldn’t be able to. One possibility already under discussion would extend the ADA’s protection to include “predicted” health outcomes down the road. In other words, if a genome analysis shows that a person has a high risk for breast cancer, or for Alzheimer’s, that person should not be denied job opportunities.
We must also expand the Health Insurance Portability and Accountability Act (HIPAA), which protects our medical information, in order to cover the medical data currently being collected by employers, health apps, and other Big Data companies. Any health-related data collected by brokers, such as Google searches for medical treatments, must also be protected.
If we want to bring out the big guns, we might consider moving toward the European model, which stipulates that any data collected must be approved by the user, as an opt-in. It also prohibits the reuse of data for other purposes. The opt-in condition is all too often bypassed by having a user click on an inscrutable legal box. But the “not reusable” clause is very strong: it makes it illegal to sell user data. This keeps it from the data brokers whose dossiers feed toxic e-scores and microtargeting campaigns. Thanks to this “not reusable” clause, the data brokers in Europe are much more restricted, assuming they follow the law.
Finally, models that have a significant impact on our lives, including credit scores and e-scores, should be open and available to the public. Ideally, we could navigate them at the level of an app on our phones. In a tight month, for example, a consumer could use such an app to compare the impact of unpaid phone and electricity bills on her credit score and see how much a lower score would affect her plans to buy a car. The technology already exists. It’s only the will we’re lacking.
On a summer day in 2013, I took the subway to the southern tip of Manhattan and walked to a large administrative building across from New York’s City Hall. I was interested in building mathematical models to help society—the opposite of WMDs. So I’d signed on as an unpaid intern in a data analysis group within the city’s Housing and Human Services Departments. The number of homeless people in the city had grown to sixty-four thousand, including twenty-two thousand children. My job was to help create a model that would predict how long a homeless family would stay in the shelter system and to pair each family with the appropriate services. The idea was to give people what they needed to take care of themselves and their families and to find a permanent home.
My job, in many ways, was to help come up with a recidivism model. Much like the analysts building the LSI–R model, I was interested in the forces that pushed people back to shelters and also those that led them to stable housing. Unlike the sentencing WMD, though, our small group was concentrating on using these findings to help the victims and to reduce homelessness and despair. The goal was to create a model for the common good.
On a separate but related project, one of the other researchers had found an extremely strong correlation, one that pointed to a solution. A certain group of homeless families tended to disappear from shelters and never return. These were the ones who had been granted vouchers under a federal affordable housing program called Section 8. This shouldn’t have been too surprising. If you provide homeless families with affordable housing, not too many of them will opt for the streets or squalid shelters.
Yet that conclusion might have been embarrassing to then-mayor Michael Bloomberg and his administration. With much fanfare, the city government had moved to wean families from Section 8. It instituted a new system called Advantage, which limited subsidies to three years. The idea was that the looming expiration of their benefits would push poor people to make more money and pay their own way. This proved optimistic, as the data made clear. Meanwhile, New York’s booming real estate market was driving up rents, making the transition even more daunting. Families without Section 8 vouchers streamed back into the shelters.
The researcher’s finding was not welcome. For a meeting with important public officials, our group prepared a PowerPoint presentation about homelessness in New York. After the slide with statistics about recidivism and the effectiveness of Section 8 was put up, an extremely awkward and brief conversation took place. Someone demanded the slide be taken down. The party line prevailed.
While Big Data, when managed wisely, can provide important insights, many of them will be disruptive. After all, it aims to find patterns that are invisible to human eyes. The challenge for data scientists is to understand the ecosystems they are wading into and to present not just the problems but also their possible solutions. A simple analysis of workflow data might highlight five workers who appear to be superfluous. But if the data team brings in an expert, that expert might help discover a more constructive version of the model, one that suggests jobs those people could fill in an optimized system and identifies the training they’d need to fill those positions. Sometimes the job of a data scientist is to know when you don’t know enough.
As I survey the data economy, I see loads of emerging mathematical models that might be used for good and an equal number that have the potential to be great—if they’re not abused. Consider the work of Mira Bernstein, a slavery sleuth. A Harvard PhD in math, she created a model to scan vast industrial supply chains, like the ones that put together cell phones, sneakers, or SUVs, to find signs of forced labor. She built her slavery model for a nonprofit company called Made in a Free World. Its goal is to use the model to help companies root out the slave-built components in their products. The idea is that companies will be eager to free themselves from this scourge, presumably because they oppose slavery, but also because association with it could devastate their brand.
Bernstein collected data from a number of sources, including trade data from the United Nations, statistics about the regions where slavery was most prevalent, and detailed information about the components going into thousands of industrial products, and incorporated it all into a model that could score a given product from a certain region for the likelihood that it was made using slave labor. “The idea is that the user would contact his supplier and say, ‘Tell me more about where you’re getting the following parts of your computers,’ ” Bernstein told Wired magazine. Like many responsible models, the slavery detector does not overreach. It merely points to suspicious places and leaves the last part of the hunt to human beings. Some of the companies find, no doubt, that the suspected supplier is legit. (Every model produces false positives.) That information comes back to Made in a Free World, where Bernstein can study the feedback.
Another model for the common good has emerged in the field of social work. It’s a predictive model that pinpoints households where children are most likely to suffer abuse. The model, developed by Eckerd, a child and family services nonprofit in the southeastern United States, launched in 2013 in Florida’s Hillsborough County, an area encompassing Tampa. In the previous two years, nine children in the area had died from abuse, including a baby who was thrown out a car window. The modelers included 1,500 child abuse cases in their database, including the fatalities. They found a number of markers for abuse, including a boyfriend in the home, a record of drug use or domestic violence, and a parent who had been in foster care as a child.
If this were a program to target potential criminals, you can see right away how unfair it could be. Having lived in a foster home or having an unmarried partner in the house should not be grounds for suspicion. What’s more, the model is much more likely to target the poor—and to give a pass to potential abuse in wealthy neighborhoods.
Yet if the goal is not to punish the parents, but instead to provide help to children who might need it, a potential WMD turns benign. It funnels resources to families at risk. And in the two years following implementation of the model, according to the Boston Globe, Hillsborough County suffered no fatalities from child abuse.
Models like this will abound in coming years, assessing our risk of osteoporosis or strokes, swooping in to help struggling students with calculus II, even predicting the people most likely to suffer life-altering falls. Many of these models, like some of the WMDs we’ve discussed, will arrive with the best intentions. But they must also deliver transparency, disclosing the input data they’re using as well as the results of their targeting. And they must be open to audits. These are powerful engines, after all. We must keep our eyes on them.
Data is not going away. Nor are computers—much less mathematics. Predictive models are, increasingly, the tools we will be relying on to run our institutions, deploy our resources, and manage our lives. But as I’ve tried to show throughout this book, these models are constructed not just from data but from the choices we make about which data to pay attention to—and which to leave out. Those choices are not just about logistics, profits, and efficiency. They are fundamentally moral.
If we back away from them and treat mathematical models as a neutral and inevitable force, like the weather or the tides, we abdicate our responsibility. And the result, as we’ve seen, is WMDs that treat us like machine parts in the workplace, that blackball employees and feast on inequities. We must come together to police these WMDs, to tame and disarm them. My hope is that they’ll be remembered, like the deadly coal mines of a century ago, as relics of the early days of this new revolution, before we learned how to bring fairness and accountability to the age of data. Math deserves much better than WMDs, and democracy does too.
* * *
*1 You might think that an evenhanded audit would push to eliminate variables such as race from the analysis. But if we’re going to measure the impact of a WMD, we need that data. Currently, most of the WMDs avoid directly tracking race. In many cases, it’s against the law. It is easier, however, to expose racial discrimination in mortgage lending than in auto loans, because mortgage lenders are required to ask for the race of the applicant, while auto lenders are not. If we include race in the analysis, as the computer scientist Cynthia Dwork has noted, we can quantify racial injustice where we find it. Then we can publicize it, debate the ethics, and propose remedies. Having said that, race is a social construct and as such is difficult to pin down even when you intend to, as any person of mixed race can tell you.
*2 Google has expressed interest in working to eliminate bias from its algorithm, and some Google employees briefly talked to me about this. One of the first things I tell them is to open the platform to more outside researchers.