The Naked Future

by Patrick Tucker


  Enter Andreas Olligschlaeger, a systems researcher and public policy scholar at Carnegie Mellon University in downtown Pittsburgh. He knew that certain variables make a neighborhood attractive for a drug dealer looking for new turf. One factor is the presence of commercially zoned space: passersby are much less likely to call police about potential drug dealing near a factory or warehouse than near their own homes (and there are also fewer people around at night). Another factor is seasonality. Drug dealing is mostly an outdoor activity and tends, like cherry trees, to blossom in the spring and flourish in the summer.

  Olligschlaeger also knew that the number of 911 calls related to weapons (shots fired), robberies, and assaults provided an indication of an emerging drug-dealing neighborhood. All these elements were clues as to where the pushers were going to go. The question became how to weight those variables. Was the presence of a potential competitor in one neighborhood more or less of a factor than a lot of residents hanging around? Exactly how big a role did seasonality play? And were the variables dependent or independent? Did an assault in one neighborhood affect the attractiveness of another neighborhood as a new drug-dealing spot, or did it not matter? At the time, Pittsburgh had a computer system called DMAP that allowed for the tracking of crimes across geographical space. But a straight averaging of these factors would likely result in a model that treated all the variables too equally. It would underfit, washing out the variables that mattered most.

  Classical statistics doesn’t lend itself well to modeling chaotic interactions with lots of moving parts, but artificial neural networks, which were a relatively recent innovation in 1991, were showing some interesting promise in the field of high-energy physics. An artificial neural network (aka neural net) is a mathematical program modeled on the way neurons exchange signals in the human brain. One of the core features of a neural net is that the weighting of variables changes as the system processes the problem repeatedly. In the same way that a kid shooting free throws from the same spot eventually becomes a great free-throw shooter, or the novice artist who does a thousand different sketches of hands becomes a better artist, neural nets learn by applying a particular set of tools to a particular problem over and over again. Though he was specializing in public policy at the time, Olligschlaeger is also the sort of guy who reads physics journals, which turned out to be a good thing.
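
  To make that weight-adjustment loop concrete, here is a minimal sketch of a tiny neural net in Python. Everything in it is an illustrative assumption: the data is synthetic, and the five inputs merely echo the kinds of factors described above (zoning, season, and three categories of 911 calls), not the study’s actual variables.

```python
# Toy neural net learning how to weight crime-related variables.
# All data is synthetic; the inputs are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-map-cell inputs, scaled to 0..1: commercial zoning,
# season, and counts of weapons / robbery / assault 911 calls.
X = rng.random((500, 5))
# Synthetic target: future drug-related call volume driven by unequal,
# interacting influences the net must discover for itself.
y = 0.9 * X[:, 2] * X[:, 1] + 0.6 * X[:, 0] + 0.4 * X[:, 4]

# One hidden layer; weights start random and change on every pass.
W1 = rng.normal(0.0, 0.5, (5, 8))
W2 = rng.normal(0.0, 0.5, (8, 1))
lr = 0.05
for epoch in range(5000):
    h = np.tanh(X @ W1)              # hidden-layer activations
    pred = (h @ W2).ravel()          # current forecast per cell
    err = pred - y
    # Backpropagation: push the error back through both weight layers.
    dW2 = h.T @ err[:, None] / len(y)
    dh = (err[:, None] @ W2.T) * (1.0 - h**2)
    dW1 = X.T @ dh / len(y)
    W2 -= lr * dW2
    W1 -= lr * dW1

print("mean absolute forecast error:", round(float(np.abs(err).mean()), 3))
```

  After thousands of passes the error shrinks and the weight matrices encode how heavily each input counts. Notably, nothing in those matrices explains itself, which is the transparency problem that comes up later in this story.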

  The movement of drug dealers around Pittsburgh had to be subject to mathematical laws just like the movement of particles. Olligschlaeger trained a neural net system on every 911 call related to assault, robbery, and weapons from 1990 to 1992, as well as six other variables (for a total of nine), and then ran fifteen thousand simulations. The system came up with a series of predictions for which 2,150-square-foot sections of the city would see an uptick in drug-related 911 calls. At the end of August 1992, Olligschlaeger made three maps: one showed the predictions made by the straight statistical model (regression forecasts); the other two were neural net based. The neural net model showed a clear 54 percent improvement over the straight averaging model.2

  The models all performed differently, and none predicted the actual number of calls perfectly. But the straight statistical regression model overestimated the number of calls by a great deal, so if the PD had used that model, it would have sent a lot of cops to quiet neighborhoods in anticipation of calls that would never come, leaving the real problem areas undercovered. The neural net, conversely, missed the relatively few calls that occurred in the southwestern portion of the city. Yet compared with the regression model, it was far less likely to send police to a place where there was definitely not going to be any action. It did a much better job of predicting not only where crime would not occur but also how many drug-related calls would occur in each map cell where they did happen. It provided far better value than straight guessing or even traditional statistical analysis. Unfortunately, that wasn’t good enough to convince the city of Pittsburgh, which never adopted neural nets as a crime-fighting tool.3

  You can’t blame the city hall bureaucrats for not buying the neural net concept. The connection between the input (data) and the output (prediction) was too opaque. Even though the predictions themselves were good, the lack of transparency as to how the net reached its conclusion made the entire system unattractive from a policy standpoint. In his paper on the subject, Olligschlaeger himself admitted this: “One disadvantage of neural networks is that there currently are no tests of statistical significance for the estimated weight structures. However, if the main goal of a model is to provide good forecasts rather than to analyze relationships between dependent and independent variables, then this should not be an issue.”

  Though the use of neural nets did not become standard practice, Olligschlaeger’s study represents a key evolutionary moment in what is today called predictive policing: the use of computational databases and statistics to identify emergent crime patterns and deploy cops preemptively.

  Skip ahead to 1994, when newly appointed New York City police commissioner William Bratton instituted what he called a “strategic re-engineering” of the city’s police department. The use of up-to-the-minute data, citywide crime statistics, and crime mapping would go on to bring down the city crime rate by 37 percent in three years. Bratton’s re-engineering became another important victory for predictive policing, but not a decisive one, because stats were only one portion of Bratton’s overall strategy. Today, many scholars credit tougher zero-tolerance and stop-and-frisk policies, coupled with the use of crime mapping, for bringing down New York City’s crime rate in the 1990s. These measures were not without controversy. New York’s aggressive law enforcement strategies under Bratton led to complaints and charges of harassment and overly aggressive tactics, particularly the stop-and-frisk provision, which targeted mainly minority youth.4

  The first unequivocal victory for predictive policing in practice occurred in 2003 in Richmond, Virginia. Criminologist Colleen McCue was using SPSS (the Statistical Package for the Social Sciences, now an IBM product) as part of her research into crime patterns. She realized that incidents of random gunfire around New Year’s Day in Richmond happened within a very specific time period, between 10 P.M. on New Year’s Eve and 2 A.M. on New Year’s Day. And these incidents occurred in very particular neighborhoods and under unique conditions. With these variables she built a model to show that by placing police where gunfire was most likely to occur, the department could dramatically cut down on gunfire complaints on New Year’s Eve, nab a lot of illegal firearms, and do it all with far fewer officers than it had used for patrol the year before.5
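
  The underlying analysis was a narrow time-and-place query over historical records. McCue worked in SPSS; the short pandas sketch below is an illustrative stand-in, with invented timestamps and neighborhood names.

```python
# Filter historical complaints to the New Year's window, then rank
# neighborhoods by volume. All records here are hypothetical.
import pandas as pd

calls = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2002-12-31 22:15", "2002-12-31 23:40", "2003-01-01 01:10",
        "2002-12-31 20:05", "2003-01-01 01:55", "2002-07-04 23:00",
    ]),
    "neighborhood": ["Southside", "Southside", "East End",
                     "Southside", "Southside", "East End"],
})

ts = calls["timestamp"].dt
# Keep only the 10 P.M.-2 A.M. New Year's window the data pointed to.
in_window = (
    ((ts.month == 12) & (ts.day == 31) & (ts.hour >= 22))
    | ((ts.month == 1) & (ts.day == 1) & (ts.hour < 2))
)

# Put the few targeted patrols where the historical counts are highest.
print(calls[in_window]["neighborhood"].value_counts())
```

  The real analysis ran over years of records and more variables, but the logic is the same: filter to the window, count by place, and station officers where the counts are highest.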

  Most police departments are run like regular businesses. Cops have precincts to report to and are scheduled in regular shifts. Cops go out on patrol to look for crimes in progress, but most of the job is responding to complaints and calls that have come in. The idea of sticking a lot of cops in one spot, in one time window, in advance of something that might happen was pretty revolutionary in 2003, but the department followed her lead.

  When the initiative, dubbed Project Exile, concluded, gunfire complaints were down by half compared with the previous year, gun seizures were up 246 percent, and the department had saved $15,000 in New Year’s overtime pay for officers. Complaints were down, guns came off the streets in droves, and more cops got the night off. It was a triple score.6

  Both Project Exile and the neural nets showed that they could get results. Yet where Olligschlaeger found resistance from city officials in Pittsburgh, Richmond police were eager to embrace Project Exile. The reason why says a lot about the way city governments work. Because Exile didn’t involve a neural net or any outrageously sophisticated modeling technique and was instead a straight statistical regression, the political decision makers could understand it. Neural nets are sometimes referred to as black box systems: it’s extremely difficult to see exactly how they reach their conclusions. What was a fascinating system scientifically was unusable as a decision-making tool for a lawmaker or police representative, someone who had to be able to show how and why he arrived at a particular decision, almost regardless of whether the decision was right or wrong. Yes, Exile proved extremely effective when applied to the problem of random gunfire, but the challenge of identifying emerging drug neighborhoods was rather more difficult and potentially of greater long-term significance.

  Project Exile simply capitalized on better record-keeping techniques. It worked on correlation. Using data to predict crime on the basis of cause would be a much more important test. It would come a few years later in Memphis.

  The Red Dot of Crime

  Over the last several decades, Memphis has followed the same path, straight down, as many other formerly prosperous U.S. metro regions. Property values and college graduation rates are abysmal. Poverty is high. Throughout the early 2000s, Memphis was consistently ranked among the top five worst U.S. cities for violent crime.

  Between 2006 and 2010, in spite of all of the above, crime went down 31 percent.

  The demographics of Memphis didn’t change in that time. The approximately twenty-three hundred men and women on the police force at that time were the same sort you find in any town where there’s too much to do and too few to do it. Here’s what changed: the department began handling its information differently thanks to Dr. Richard Janikowski, an associate professor in the Department of Criminology and Criminal Justice at the University of Memphis.

  Janikowski convinced local police head Larry Godwin to allow him to study the department’s arrest records. But Janikowski wasn’t looking for biographical sketches of the perpetrators; he was looking for marginalia, the circumstances behind each arrest, the where and when of crime.

  The biggest single finding, and by far the most controversial, was that the rising crime rate was closely connected to Section 8 housing: federally subsidized housing for qualified individuals below a certain income level. When Janikowski and his wife, housing expert Phyllis Betts, took a crime hot-spot map and layered it on top of the map of Section 8 housing, the pattern was unmistakable. Hanna Rosin, in her 2008 Atlantic article on Janikowski, described it this way: “On the merged map, dense violent-crime areas are shaded dark blue, and Section 8 addresses are represented by little red dots. All of the dark-blue areas are covered in little red dots, like bursts of gunfire. The rest of the city has almost no dots.”7

  When I asked Janikowski about it, he pointed out that the blue-area, red-dot analysis omitted some important data. “You know, the stuff didn’t overlap perfectly. There were high levels of correlation with it. Section 8 housing was part of what you see there. But it was also just heavy levels—a big concentration of poverty. And that’s a complex relationship that was occurring.”

  Today, we know with more certainty that the connection between Section 8 housing and rising crime is correlative, not causative. People who live in this housing are not more likely to commit crimes so much as they are more likely to move to low-rent neighborhoods where the probability of a crime rise is already high.8
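
  The overlay itself is mechanically simple: bin both layers into the same grid cells and measure how closely they track. The sketch below uses entirely synthetic coordinates and a made-up grid size, just to show the mechanics Janikowski describes.

```python
# Overlay two point layers on one grid and measure cell-level correlation.
# All coordinates are synthetic; the 10 x 10 grid is an assumption.
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical locations in a unit-square "city".
crime_pts = rng.random((2000, 2))
# Section 8 addresses drawn near a subset of crime points, with noise,
# so the layers correlate strongly but not perfectly.
s8_pts = np.clip(crime_pts[:500] + rng.normal(0, 0.05, (500, 2)), 0, 0.999)

def grid_counts(points, n=10):
    """Count points per cell of an n x n grid over the unit square."""
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                  bins=n, range=[[0, 1], [0, 1]])
    return counts.ravel()

# Per-cell correlation between the crime layer and the Section 8 layer.
r = np.corrcoef(grid_counts(crime_pts), grid_counts(s8_pts))[0, 1]
print(f"cell-level correlation: {r:.2f}")
```

  A high but imperfect correlation is exactly what the merged map showed: the red dots cover the dark-blue areas without the two layers overlapping completely.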

  This relationship between the likelihood of being a crime victim, being a crime assailant, and living in Section 8 was particularly complicated in Memphis, says Janikowski, where many traditional Section 8 units were in terrible shape and others were being torn down. “You’ve got lots of demolition that was occurring in what was the traditional inner city for various reasons. So you had movement there. You had a lot of movement of at-risk populations. And they all tended to cluster because, again, the price of housing.”

  Even if the chain of causation between housing vouchers and violent crime wasn’t clear, the relationship was still a useful guide for predicting where crime was going to occur. Janikowski had to make this case.

  He sat down at a local cafeteria with Memphis police director Larry Godwin, local district attorney Bill Gibbons, and representatives from the department’s Organized Crime Unit. Janikowski was blunt. He told them that to better focus their efforts and get more value for their money, they had to go back over arrest records and take a better look at when and where crimes were occurring.9

  Operation Blue CRUSH (Crime Reduction Utilizing Statistical History) was born. The system used IBM’s SPSS program and mapping software from Esri to better capture and disseminate crime data. When the initial test in an area of East Memphis called Hickory Hill proved successful at bringing down crime at less expense, the department increased the number of police working the tourist areas around Beale Street after 11 P.M. It then focused on the relatively rough-and-tumble area that is today called Legends Park but that at the time was a seventy-year-old, soon-to-be-condemned housing development called Dixie Homes.

  Blue CRUSH uses primarily rule-induction algorithms. In terms of complexity these lie somewhere between a neural net and a straight statistical regression. A rule-induction system is a learning program that comes up with its own rules for how different variables should be weighted, based on the training data its programmers have exposed it to (coming up with the rules is the induction part). It’s still a varying-weight model, but one with more traceable results.
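
  For a feel of what rule induction produces, here is a sketch using a decision tree, one common rule-inducing learner (not necessarily the exact algorithm behind Blue CRUSH). The feature names and data are hypothetical.

```python
# Induce readable if/then rules from (synthetic) training data.
from sklearn.tree import DecisionTreeClassifier, export_text
import numpy as np

rng = np.random.default_rng(1)
features = ["payday_week", "hour_after_dark", "vacant_units", "foot_traffic"]
X = rng.random((1000, 4))
# Synthetic label: a cell runs "hot" when darkness coincides with vacancy.
y = ((X[:, 1] > 0.6) & (X[:, 2] > 0.5)).astype(int)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Unlike a tangle of neural net weights, the induced rules print out as
# nested thresholds a commander can actually read.
print(export_text(tree, feature_names=features))
```

  That printable rule set is the traceability in practice: a department can show exactly which conditions triggered a deployment.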

  The Memphis PD also looked at many more variables than the nine (or so) factors that Olligschlaeger modeled. In addition to weather patterns, seasonality, and area demographics, they could also model lighting conditions, with a particular focus on garages and alleys. They looked at when big local employers issued paychecks, by time of the week, the month, and the year, and at what times of day people went to and left work.

  The same location optimization techniques that companies such as Esri provide to retail chains to find the best neighborhood to place a new store are also useful in mapping relationships between crime, economics, and physical space. “We can not only just manage what is this dot on the map that we call ‘burglary’ or ‘robbery,’ but how does that dot on the map interact with the demographics of the area, home values or population trends,” said Mike King, Esri’s national law enforcement manager. “If you’re in a predominantly blue-collar neighborhood that works at factories, what happens every other Friday when it’s payroll time? Do we see increases in alcohol-related events? Do we see increases in domestic violence?”

  Here’s why the way these models work matters to the naked future: as we develop the capacity to monitor more of these signals and incorporate more variables, the statistical tools required to make use of them will become simpler and more transparent. (It’s hard to conceive of practitioners today using a neural net, which is considered rather quaint.) As transparency increases, governmental decision makers will have an easier time accepting and supporting predictive policing programs. As more departments begin to use such programs, and share information about which variables and tools are most useful, these programs could get a lot better very quickly.

  Changes in area economics have emerged as a useful signal for predicting future crime, but it’s not a clean signal. If a sizable portion of the people in your neighborhood suddenly can’t afford to pay their phone bills, or are facing vehicle repossession, that can indicate a rise in potential offenders, since these people have clearly fallen on hard times. But a sudden rise in neighborhood inequality is also an indicator, since part of the neighborhood now perceives itself to be less well-off compared with its neighbors. Criminality, like envy, can be contagious.

  In 2005 a military base reconstruction project left many residents of a particular San Antonio neighborhood suddenly a lot better-off than their neighbors. A big base realignment and closing program resulted in a bump in demand for a very particular type of contractor. Neighborhoods that had been fairly uniform economically were suddenly divided into haves and have-nots. Cornell researchers Matthew Freedman and Emily G. Owens showed that “because of the targeted nature of the spending program, an important effect of this program was to increase the criminal opportunities of the average San Antonian.”10

  But suddenly losing your job can also make you more likely to become a victim of crime. Janikowski found that when a group of women in Memphis who couldn’t afford a landline were forced to make telephone calls from a pay phone on the side of a convenience store, their risk of suffering sexual assault increased.

  One of the strongest indicators that crime in a given neighborhood is about to jump is a wave of foreclosures. Foreclosed homes invite burglars, who ransack the residences for copper wiring and electrical equipment. Drug dealers seem to feel relatively comfortable working in a neighborhood whose residents have been pushed out by landlords and banks. Focusing on foreclosure clusters, and putting cops nearby as soon as a cluster appears, is broken-windows theory 2.0. Rather than reacting to neighborhood dereliction, it anticipates it.
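
  As a toy illustration of that trigger, the sketch below flags any grid cell whose monthly foreclosure count crosses a threshold; the counts and the threshold are invented.

```python
# Flag grid cells where foreclosures cluster. All numbers are invented.
import numpy as np

rng = np.random.default_rng(3)
foreclosures = rng.poisson(1.0, (10, 10))   # this month's count per cell
foreclosures[4, 7] = 9                      # a synthetic emerging cluster

THRESHOLD = 5                               # assumed alert level
for row, col in np.argwhere(foreclosures >= THRESHOLD):
    print(f"watch cell ({row}, {col}): {foreclosures[row, col]} foreclosures")
```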

  Within the broader variable of seasonality, there’s a lot of nuance. When the Memphis department focused a heavy police presence downtown during the end of summer, they were able to preempt a rash of burglaries and vehicle break-ins that would normally have been perpetrated by teenagers about to go back to school. In one week the PD dropped crime in that precinct by 37 percent compared with the previous year.11

  The data collection and the analysis that made these predictive insights possible are accelerating and becoming cheaper. Mobile computing and the Internet of Things are allowing officers in the field to collect and disseminate incident data, and better access data from one another, much faster.

  Today, police officers on the beat have the same rapidly evolving view of potential hot spots that headquarters dispatchers had a few years ago. Big command-and-control centers are moving away from situation rooms, where operators on headsets feed information to soldiers on the ground, and into a single console that patrol officers carry with them. Making that information-assimilation process work in a mobile environment is one of the key jobs of Mike King at Esri.

 
