These numbers are impressive, but it’s actually difficult to know for sure whether PredPol can take the credit. Toby Davies, a mathematician and crime scientist from UCL, told me: ‘It’s possible that merely encouraging police officers to go to places and get out of their cars and walk around, regardless of where, actually could lead to reductions [in crime] anyway.’
And there’s another issue here. The harder you look for crime, the more likely you are to find it, which means the act of sending police out could actually change the crime records themselves: ‘When police are in a place,’ Davies told me, ‘they detect more crime than they would have done otherwise. Even if an equal value of crime is happening in two places, the police will detect more in the place they were than the one that they weren’t.’
That means there is one very big potential downside of using a cops-on-the-dots tactic. By sending police into an area to fight crime on the back of the algorithm’s predictions, you can risk getting into a feedback loop.
If, say, a poorer neighbourhood had a high level of crime in the first instance, the algorithm may well predict that more crime will happen there in future. As a result, officers are sent to the neighbourhood, which means they detect more crime. Thus, the algorithm predicts more still, more officers are sent there, and so on it goes. These feedback loops are more likely to be a problem for crimes that are linked to poorer areas such as begging, vagrancy and low-level drug use.
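To make that mechanism concrete, here’s a minimal sketch in Python – emphatically not PredPol’s real model, just a toy of my own in which two neighbourhoods suffer exactly the same amount of crime, the patrol goes wherever the records look worse, and a patrolled area records a larger share of what actually happens there. All of the numbers (crime rates, detection fractions, the year of simulated days) are invented purely for illustration.

```python
import random

# Toy model (not PredPol's actual algorithm): two neighbourhoods with the
# same true rate of crime. The patrol is sent to whichever area has the
# worse-looking records, and a patrolled area records more of its crime.
TRUE_CRIMES_PER_DAY = 10        # identical in both neighbourhoods
DETECT_IF_PATROLLED = 0.9       # fraction of crime recorded when police attend
DETECT_IF_NOT = 0.5             # fraction recorded otherwise

recorded = {"A": 2, "B": 1}     # A starts with one extra entry in the records

for day in range(365):
    patrolled = max(recorded, key=recorded.get)   # 'cops on the dots'
    for area in recorded:
        rate = DETECT_IF_PATROLLED if area == patrolled else DETECT_IF_NOT
        recorded[area] += sum(random.random() < rate
                              for _ in range(TRUE_CRIMES_PER_DAY))

print(recorded)   # A ends up with roughly 80 per cent more recorded crime than B
```

Run it and neighbourhood A – which began with nothing more than a single extra entry in the records – ends up looking far more ‘criminal’ than B, despite the two being identical by construction.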
In the UK, where some sections of society regularly complain about a lack of police presence on the streets, focusing police attention on certain areas might not immediately seem unfair. But not everyone has a positive relationship with the police. ‘It is legitimate for people who see a police officer walking in front of their house every day to feel oppressed by that, even if no one’s doing any crimes, even if that police officer is literally just walking up and down,’ Davies told me. ‘You almost have a right not to be constantly under pressure, under the eye of the police.’
I’m rather inclined to agree.
Now, a well-tuned algorithm should be built so that it can take account of the tactics being used by the police. There are ways, theoretically at least, to ensure that the algorithm doesn’t disproportionately target particular neighbourhoods – like randomly sending police to medium-risk areas as well as high-risk ones. But, unfortunately, there’s no way to know for sure whether PredPol is managing to avoid these feedback loops entirely, or indeed whether it is operating fairly more generally, because PredPol is a proprietary algorithm, so the code isn’t available to the public and no one knows exactly how it works.
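One way to do that, sketched below as a variation on the toy model above (the value of epsilon is arbitrary and purely illustrative), is to inject a little randomness into where the patrol is sent, so that no area can monopolise police attention simply because it monopolised yesterday’s records.

```python
import random

def choose_patrol(recorded, epsilon=0.3):
    """Illustrative only: with probability epsilon, patrol a randomly chosen
    area rather than the one with the worst-looking records. The occasional
    random visit stops detection rates being permanently skewed towards a
    single neighbourhood, which damps the feedback loop described above."""
    if random.random() < epsilon:
        return random.choice(list(recorded))
    return max(recorded, key=recorded.get)
```

This is essentially the statistician’s explore-versus-exploit trade-off: sacrifice a little short-term efficiency in exchange for records that stay closer to the truth.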
PredPol is not the only software on the market. One competitor is HunchLab, which works by combining all sorts of statistics about an area: reported crimes, emergency calls, census data (as well as more eyebrow-raising metrics like moon phases). HunchLab doesn’t have an underlying theory. It doesn’t attempt to establish why crime occurs in some areas more than others; it simply reports on patterns it finds in the data. As a result, it can reliably predict more types of crime than PredPol (which has at its heart theories about how criminals create geographical patterns) – but, because HunchLab too is protected as intellectual property, it is virtually impossible from the outside to ensure it isn’t inadvertently discriminating against certain groups of people.39
Another opaque predictive algorithm is the Strategic Subject List used by the Chicago Police Department.40 This algorithm takes an entirely different approach from the others. Rather than focusing on geography, it tries to predict which individuals will be involved in gun crime. Using a variety of factors, it creates a ‘heat list’ of people it deems most likely to be involved in gun violence in the near future, either doing the shooting or being shot. The theory is sound: today’s victims are often tomorrow’s perpetrators. And the programme is well intentioned: officers visit people on the watch list to offer access to intervention programmes and help to turn their lives around.
But there are concerns that the Strategic Subject List might not be living up to its promise. One recent investigation by the non-profit RAND Corporation concluded that appearing on it actually made no difference to an individual’s likelihood of being involved in a shooting.41 It did, however, mean they were more likely to be arrested. Perhaps – the report concluded – this was because officers were simply treating the watch list as a list of suspects whenever a shooting occurred.
Predictive policing algorithms undoubtedly show promise, and the people responsible for creating them are undoubtedly doing so in good faith, with good intentions. But the concerns raised around bias and discrimination are legitimate. And for me, these questions are too fundamental to a just society for us simply to accept assurances that law enforcement agencies will use them in a fair way. It’s one of many examples of how badly we need independent experts and a regulatory body to ensure that the good an algorithm does outweighs the harm.
And the potential harms go beyond prediction. As we have already seen in a variety of other examples, there is a real danger that algorithms can add an air of authority to an incorrect result. And the consequences here can be dramatic. Just because the computer says something doesn’t make it so.
Who do you think you are?
Steve Talley was asleep at home in South Denver in 2014 when he heard a knock at the door.42 He opened it to find a man apologizing for accidentally hitting his car. The stranger asked Talley to step outside and take a look. He obliged. As he crouched down to assess the damage to his driver’s side door,43 a flash grenade went off. Three men dressed in black jackets and helmets appeared and knocked him to the ground. One man stood on his face. Another restrained his arms while a third repeatedly hit him with the butt of a gun.
Talley’s injuries would be extensive. By the end of the evening he had sustained nerve damage, blood clots and a broken penis.44 ‘I didn’t even know you could break a penis,’ he later told a journalist at The Intercept. ‘At one point I was actually screaming for the police. Then I realized these were cops who were beating me up.’45
Steve Talley was being arrested for two local bank robberies. During the second robbery a police officer had been assaulted, which is why, Talley thinks, he was treated so brutally during his arrest. ‘I told them they were crazy,’ he remembers shouting at the officers, ‘You’ve got the wrong guy!’
Talley wasn’t lying. His arrest was the result of his striking resemblance to the right guy – the real robber.
Although it was a maintenance man working in Talley’s building who initially tipped off the police after seeing photos on the local news, it would eventually be an FBI expert using facial recognition software46 who later examined the CCTV footage and concluded that ‘the questioned individual depicted appears to be Talley’.47
Talley had a cast-iron alibi, but thanks to the FBI expert’s testimony, it would still take over a year to clear his name entirely. In that time he was held in a maximum-security pod for almost two months until enough evidence surfaced to release him. As a result, he was unable to work, and by the time his ordeal was over he had lost his job, his home and access to his children. All as a direct result of that false identification.
Seeing double
Facial recognition algorithms are becoming commonplace in modern policing. These algorithms, presented with a photograph, footage or snapshot from a 3D camera, will detect a face, measure its characteristics and compare them to a database of known faces with the aim of determining the identity of the person pictured.
In Berlin, facial recognition algorithms capable of identifying known terrorism suspects are trained on the crowds that pass through railway stations.48 In the United States, these algorithms have led to more than four thousand arrests since 2010 for fraud and identity theft in the state of New York alone.49 And in the UK, cameras mounted on vehicles that look like souped-up Google StreetView cars now drive around automatically cross-checking our likenesses against a database of wanted people.50 These vans scored their first success in June 2017, after one drove past a man in south Wales for whom police had an outstanding arrest warrant.51
Our safety and security often depend on our ability to identify and recognize faces. But leaving that task in the hands of humans can be risky. Take passport officers, for instance. In one recent study, set up to mimic an airport security environment, these professional face recognizers failed to spot a person carrying the wrong ID a staggering 14 per cent of the time – and incorrectly rejected 6 per cent of perfectly valid matches.52 I don’t know about you, but I find those figures more than a little disconcerting when you consider the number of people passing through Heathrow every day.
As we shall see, facial recognition algorithms can certainly do better at the task than humans. But when they are used to hunt for criminals, where the consequences of misidentification are so serious, an important question arises. Just how easily could one person’s identity be confused with another’s? How many of us have a Steve Talley-style lookalike lurking out there somewhere?
One study from 2015 seems to suggest the chances of you having your own real-life doppelgänger (whether they’re a bank robber or otherwise) are vanishingly small. Teghan Lucas at the University of Adelaide painstakingly took eight facial measurements from photographs of four thousand people and failed to find a single match among them, leading her to conclude that the chances of two people having exactly the same face were less than one in a trillion.53 By that calculation, Talley wasn’t just ‘a bit’ unlucky. Taking into account that his particular one-in-a-trillion evil twin also lived nearby and happened to be a criminal, we could expect it to be tens of thousands of years before another ill-fated soul fell foul of the same miserable experience.
And yet there are reasons to suspect that those numbers don’t quite add up. While it’s certainly difficult to imagine meeting someone with the same face as yourself, anecdotal evidence suggests that unrelated twin strangers are much more common than Lucas’s research would imply.
Take Neil Douglas, who was boarding a plane to Ireland when he realized his double was sitting in his seat. The selfie they took, with a plane-load of passengers laughing along in the background, quickly went viral, and soon redheads with beards from across the world were sending in photos of their own to demonstrate that they too shared the likeness. ‘I think there was a small army of us at some point,’ Neil told the BBC.54
I even have my own story to add to the pile. When I was 22, a friend showed me a photo they’d seen on a local band’s Myspace page. It was a collage of pictures taken at a gig that I hadn’t attended, showing a number of people all enjoying themselves, one of whom looked eerily familiar. Just to be sure I hadn’t unwittingly blacked out one night and wandered off to a party I now had no recollection of attending, I emailed the lead singer in the band, who confirmed what I suspected: my synth-pop-loving doppelgänger had a better social life than me.
So that’s Talley, Douglas and me who each have at least one doppelgänger of our own, possibly more. We’re up to three in a population of 7.5 billion and we haven’t even started counting in earnest – and we’re already way above Lucas’s estimate of one in a trillion.
There is a reason for the discrepancy. It all comes down to the researcher’s definition of ‘identical’. Lucas’s study required that two people’s measurements match one another exactly. Even though Neil and his lookalike are incredibly similar, if one nostril or one earlobe were out by so much as a millimetre, they wouldn’t strictly count as doppelgängers according to her criteria.
But even when you’re comparing two images of the same person, exact measurements won’t reflect how each one of us is continually changing, through ageing, illness, tiredness, the expressions we’re pulling or how our faces are distorted by a camera angle. Try to capture the essence of a face in millimetres and you’ll find as much variation in one person’s face as you will between people. Put simply, measurements alone can’t distinguish one face from another.
Although they might not be perfectly identical, I can none the less easily imagine mixing up Neil and his twin-stranger in the photograph. Likewise in the Talley case – poor Steve didn’t even look that similar to the real robber, and yet the images were misinterpreted by FBI experts to the point where he was charged with a crime he didn’t commit and thrown into a maximum-security cell.
As the passport officers demonstrated, it’s astonishingly easy to confuse unfamiliar faces, even when they bear only a passing resemblance. Humans, it turns out, are simply not very good at recognizing strangers. It’s the reason why a friend of mine claimed she could barely sit through Christopher Nolan’s beautifully made film Dunkirk – because she struggled to distinguish between the actors. It’s why teenagers find it worthwhile to ‘borrow’ an older friend’s ID to buy alcohol. And it’s why the Innocence Project, a non-profit legal organization in the United States, estimates that eyewitness misidentification plays a role in more than 70 per cent of wrongful convictions.55
And yet, while an eyewitness might easily confuse Neil with his travel companion, his mother would surely have no problem picking out her son in the photo. When it comes to people we know, we are tremendously good at recognizing faces – even when it comes to real-life doppelgängers: a set of identical twins might be easily confused if they are only your acquaintances, but just as easily distinguished once you know them properly.
And herein lies a critical point: similarity is in the eye of the beholder. With no strict definition of similarity, you can’t measure how different two faces are and there is no threshold at which we can say that two faces are identical. You can’t define what it means to be a doppelgänger, or say how common a particular face is; nor – crucially – can you state a probability that two images were taken from the same individual.
This means that facial recognition, as a method of identification, is not like DNA, which sits proudly on a robust statistical platform. When DNA testing is used in forensics, the profiling focuses on particular chunks of the genome that are known to be highly variable between humans. The extent of that variation is key: if the DNA sequence in a sample of body tissue found at the scene of a crime matches the sequence in a swab from a suspect, it means you can calculate the probability that both came from the same individual. It also means you can state the exact chance that some unlucky soul just happened to have an identical DNA sequence at those points.56 The more markers you use, the lower your chances of a mismatch, and so, by choosing the number of markers to test, every judicial system in the world has complete power to decide on the threshold of doubt they’re willing to tolerate.57
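To see where that power comes from, here’s a hedged back-of-the-envelope sketch. It assumes each genetic marker matches an unrelated person with a known frequency and that the markers are roughly independent – real forensic calculations are considerably more careful about population structure and relatedness – and the frequencies themselves are invented for illustration.

```python
# Why adding DNA markers lets courts quote odds: if the markers are roughly
# independent, the chance of an unrelated person matching the whole profile
# by coincidence is the product of the individual match frequencies.
# These frequencies are made up for illustration, not real population data.
marker_match_frequencies = [0.08, 0.11, 0.07, 0.10, 0.09, 0.12]

random_match_probability = 1.0
for frequency in marker_match_frequencies:
    random_match_probability *= frequency

print(f"Chance an unrelated person matches all {len(marker_match_frequencies)} "
      f"markers by coincidence: about 1 in {1 / random_match_probability:,.0f}")
# Each extra marker multiplies in another small number, so the more markers
# you test, the more remote a coincidental match becomes.
```

Faces offer no equivalent list of frequencies to multiply together, which is precisely the problem.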
Even though our faces feel so intrinsically linked to who we are, without knowing the variation across humans the practice of identifying felons by their faces isn’t supported by rigorous science. When it comes to identifying people from photos – to quote a presentation given by an FBI forensics unit – ‘Lack of statistics means: conclusions are ultimately opinion based.’58
Unfortunately, using algorithms to do our facial recognition for us does not solve this conundrum, which is one very good reason to exercise caution when using them to pinpoint criminals. Resemblance and identity are not the same thing and never will be, however accurate the algorithms become.
And there’s another good reason to tread carefully with face-recognition algorithms. They’re not quite as good at recognizing faces as you might think.
One in a million?
The algorithms themselves work using one of two main approaches. The first kind builds a 3D model of your face, either by combining a series of 2D images or by scanning you using a special infrared camera. This is the method adopted by the Face ID system that Apple uses in its iPhones. These algorithms have worked out a way to get around the issues of different facial expressions and ageing by focusing on areas of the face that have rigid tissue and bone, like the curve of your eye socket or the ridge of your nose.
Apple has claimed that the chance of a random person being able to unlock your phone with Face ID is one in a million, but the algorithm is not flawless. It can be fooled by twins,59 siblings,60 and children on their parents’ phones. (Soon after the launch of Face ID, a video appeared of a ten-year-old boy who could hoodwink the facial recognition on his mother’s iPhone. She now deletes her texts if there is something she doesn’t want her son to look at.)61 There have also been reports that the algorithm can be tricked by a specially built 3D printed mask, with infrared images glued on for the eyes.62 All this means that while the algorithm might be good enough to unlock your phone, it probably isn’t yet reliable enough to be used to grant access to your bank accounts.
Nor are these 3D algorithms much use for scanning passport photos or CCTV footage. For that you need the second kind of algorithm, which sticks to 2D images and uses a statistical approach. These algorithms don’t directly concern themselves with landmarks that you or I could recognize as distinguishing features, but instead build a statistical description of the patterns of light and dark across the image. As with the algorithms built to recognize dogs in the ‘Medicine’ chapter, researchers realized that, rather than relying on humans to decide which patterns will work best, you can get the algorithm to learn the best combinations for itself, by trial and error on a vast dataset of faces. Typically, it’s done using neural networks. This kind of algorithm is where the big recent leaps forward in performance and accuracy have come from. That performance, though, comes with a cost. It isn’t always clear precisely how the algorithm decides whether one face is like another.
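To give a flavour of how such systems are typically used – and this is an illustrative sketch, not any particular vendor’s pipeline – the trained network boils each photograph down to a list of numbers, an ‘embedding’, and two photographs are declared to show the same person if their embeddings are close enough. Everything below (the stand-in embed function, the 128 dimensions, the 0.6 threshold) is an assumption made purely for illustration.

```python
import numpy as np

def embed(image: np.ndarray) -> np.ndarray:
    """Stand-in for a trained neural network. A real system would run the
    image through a deep model trained on a vast dataset of faces; this
    placeholder just returns a deterministic 128-number vector."""
    rng = np.random.default_rng(int(image.sum()) % 2**32)
    return rng.standard_normal(128)

def same_person(image_a: np.ndarray, image_b: np.ndarray,
                threshold: float = 0.6) -> bool:
    """Compare two face images by the similarity of their embeddings."""
    a, b = embed(image_a), embed(image_b)
    similarity = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return similarity > threshold   # where to set the threshold is a policy choice

# The crucial point from the text: the network decides for itself which
# patterns of light and dark matter, so it is hard to say *why* it judges
# two faces to be alike.
```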