Reports on DNA matches . . . include scientifically rigorous probabilities of the likelihood of finding the same DNA profile in a random, unrelated individual. The chances are typically far less than 1 in 10 billion for a full DNA profile from a single individual. It is that degree of improbability that forms the basis for the common perception that DNA testing is foolproof.15
Suppose that the crime-scene sample is profiled and compared with that of the suspect, and on the basis of the population database used by police, the RMP is determined to be 1 in 10 million. The prosecution tells the jury that the suspect is the likely offender because if we choose an unrelated person at random in the population, the chance that this individual would have the exact profile of that found at the crime scene is 1 in 10 million. But with 6 billion people on the earth, there could, on average, be 600 people with the identical profile. The defense can justifiably say that the suspect is 1 out of 600 people who could have the same DNA profile as that found at the crime scene.16 An RMP of 1 in 10 million does not necessarily mean that you will find one and only one such profile in a population of 10 million. On average an event with a probability of 1 in 10 million occurs once in every 10 million trials, but in some instances it might occur more than once in 10 million trials.17
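The arithmetic behind these figures can be sketched in a few lines. What follows is a minimal illustration, not a forensic method: it assumes that every person is an independent, unrelated "trial" with the same match probability, and it uses the Poisson distribution as an approximation for the number of matches. The function names are hypothetical.

```python
import math


def expected_matches(rmp: float, population: int) -> float:
    """Average number of people in `population` expected to share the profile."""
    return rmp * population


def prob_at_least(k: int, mean: float) -> float:
    """Poisson probability of k or more matches when the average count is `mean`."""
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below


rmp = 1 / 10_000_000  # the RMP used in the example above

world = expected_matches(rmp, 6_000_000_000)   # whole-earth population from the text
same_size = expected_matches(rmp, 10_000_000)  # a population of exactly 10 million

print(f"Expected matches among 6 billion people:  {world:.0f}")
print(f"Expected matches among 10 million people: {same_size:.0f}")
print(f"P(no match among 10 million):          {1 - prob_at_least(1, same_size):.3f}")
print(f"P(two or more matches in 10 million):  {prob_at_least(2, same_size):.3f}")
```

Under these assumptions a population of 10 million contains one matching profile only on average: there is roughly a 37 percent chance of finding none and roughly a 26 percent chance of finding two or more.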
But what about people who are related or who are from a relatively isolated or highly inbred population? Could there be an exact match for a DNA profile of 13 loci of two individuals who are not identical twins? To date, no such case has been recorded. According to Dan Krane,
The crux of the problem is simply that the RMP delivers pretty much what it says that it will (the chance that a randomly chosen, unrelated individual from a particular population has a perfectly matching DNA profile) and that it is completely silent on the chance that a close relative (or that one of a very large number of relatively close relatives) would have [identical] DNA profiles.18
Krane notes that the assumptions behind the product rule (random assortment of all alleles) do not apply for relatives of individuals. The chances for a coincidental match, then, even if small, are not zero.
Myth of the Infallibility of a Cold Hit
A Cold-Hit Match Made Against a Large Database Has the Same Weight as a Match Between a Person Suspected of a Crime and Evidence from a Crime Scene.
Generally there are two ways in which police seek to find DNA profile matches with crime-scene evidence. First, when they have a suspect, they obtain a biological sample from that individual and compare it with the profile derived from the crime-scene sample. If police get an exact match (all 26 alleles are identical), it usually comes with other evidence linking the suspect to the crime. Otherwise they would not have had reason to obtain the DNA profile of the suspect in the first place.
Second, when police have no suspect, they may compare the DNA profile from the crime scene with all the profiles that have been entered into a DNA offender database or a DNA database consisting of offenders, arrestees, and/or volunteers. This is a fishing expedition using computer technology to make comparisons between one DNA profile (from the crime scene) and the more than 8 million profiles that have been banked in the national Combined DNA Index System (CODIS).19 If they get a match in this case, it is called a “cold hit” because they are operating blindly, without any evidence linking a suspect to the crime or any a priori suspicion.
Do both of these kinds of DNA profile matches—a match that occurs by comparing a known suspect’s DNA with that of the crime scene and a match that occurs as a cold hit—merit the same statistical weight? Keith Devlin, a statistician at Stanford University, argues that a 13-locus match would be a definitive identification provided that “the match is arrived at by comparing a profile from a sample from the crime scene with a profile from a sample from a suspect who has been identified by means other than his or her DNA profile.”20 Devlin argues that the chance that the match is coincidental is higher, however, when a given sample is compared with many samples in a database. In cold-hit cases the investigation involves searching a database of hundreds of thousands or even millions of genetic profiles for a match. Each individual comparison increases the chance that a match will occur with an innocent person.
David Kaye uses the “birthday problem” in statistics to illustrate this point.21 If you are in a room with a group of people and you choose one, then the chance that the two of you have the same birthday is 1 out of 365. But if we ask what the chances are that you have the same birthday as anyone in the room, that will depend on how many people are in the room. Moreover, if you asked what the probability is of at least one birthday match (not necessarily yours) in the room, the probability would be even greater because every person in the room is being compared with every other person.
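The two questions in the birthday problem can be compared directly. The sketch below assumes 365 equally likely birthdays, no leap years, and no twins in the room; the function names are hypothetical.

```python
def prob_shares_your_birthday(others: int, days: int = 365) -> float:
    """Chance that at least one of `others` people has your particular birthday."""
    return 1.0 - (1.0 - 1.0 / days) ** others


def prob_any_shared_birthday(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday, whatever the day."""
    if people > days:
        return 1.0  # more people than days guarantees a shared birthday
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


for n in (2, 10, 23, 50):
    yours = prob_shares_your_birthday(n - 1)  # you compared with the other n - 1
    anyone = prob_any_shared_birthday(n)      # every pair in the room compared
    print(f"{n:>2} people: match with you {yours:.3f}, any match {anyone:.3f}")
```

With two people the two probabilities coincide at 1 in 365. As the room fills, the chance of any match pulls far ahead of the chance of a match with one particular person, which is the analogy to searching a crime-scene profile against a whole database.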
This example has been used to illustrate the point that RMPs can underestimate the chances of a coincidental match in a cold-hit case, where no other evidence but a DNA profile match is found. Even with their aggressive collection of DNA from citizenry, good practice guidelines adopted by the British police state clearly that because of chances of a coincidental match and other limitations of DNA evidence, individuals should not be convicted exclusively on DNA evidence (i.e., a cold match in a database).22
The National Academy of Sciences recognized that the method of determining the RMP from a suspect sample (where there is prior evidence of suspicion) should not be identical with that from a cold hit (where there is no prior evidence of suspicion). In the latter case the RMP should depend on the size of the database. The chance of finding a random match is greater with a very large database. The academy wrote: “If the only way that the person becomes a suspect is that his DNA profile turned up in a database, the calculations [of RMP] must be modified. . . . Multiply the match probability by the size of the database searched. This is the procedure we recommend.”23
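One way to read the academy's multiplication rule is as an approximation to the chance of at least one coincidental match. The notation is an assumption introduced here, not the academy's: p is the RMP and N is the number of profiles searched, each treated as an independent comparison with an unrelated person who is not the source of the crime-scene DNA.

```latex
% Chance that at least one of N independent comparisons matches by coincidence
P(\text{at least one coincidental match}) = 1 - (1 - p)^{N}

% The product Np is the expected number of coincidental matches and an upper bound:
1 - (1 - p)^{N} \le Np, \qquad 1 - (1 - p)^{N} \approx Np \quad \text{when } Np \ll 1
```

On this reading the product Np is a deliberately simple and slightly conservative figure: it never understates the exact probability, and the two are nearly equal whenever Np is small.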
Although it is true that the larger the database, the greater are the chances of finding a match, including a random match, for crime-scene DNA, it is also true that finding a single match of a suspect in a large database improves the chances that the suspect was at the crime scene because it rules out all the other people in the database.
The reliability of the calculation of the RMP is dependent on the reliability of the independence of the genetic loci used in the calculation. But the independence principle remains an assumption or idealization. Devlin has argued for an empirical method of calculating RMPs that requires large data sets and not simply 200 to 400 data points. If we use the product model for calculating RMPs, we could validate it by comparing its results with the frequency of matches found in a large database.
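The product model itself is simple to state in code. The sketch below is one common textbook formulation and is an assumption here, since the text does not spell out the formula: the genotype frequency at each locus is p squared for a homozygous locus and 2pq for a heterozygous one (Hardy-Weinberg proportions), and the per-locus figures are multiplied together as if the loci were independent. The allele frequencies are invented for illustration.

```python
def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency at a single locus.

    One allele frequency (homozygous locus) gives p**2;
    two allele frequencies (heterozygous locus) give 2*p*q.
    """
    if q is None:
        return p * p
    return 2.0 * p * q


def random_match_probability(loci):
    """Product rule: multiply per-locus frequencies, assuming independent loci."""
    rmp = 1.0
    for alleles in loci:
        rmp *= genotype_frequency(*alleles)
    return rmp


# Hypothetical 13-locus profile: every locus heterozygous, every allele at 10%.
# These frequencies are invented for illustration, not taken from any database.
profile = [(0.10, 0.10)] * 13

rmp = random_match_probability(profile)
print(f"RMP under the product rule: about 1 in {1 / rmp:.3g}")
```

The astronomically small result comes entirely from multiplying thirteen modest numbers together, which is why the independence assumption carries so much of the weight.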
One such test was run in 2005 on the Arizona convicted-offender database containing approximately 65,000 entries, which was analyzed for profile similarities. Approximately 1 in every 228 profiles in the database matched another profile in the database at 9 or more loci; approximately 1 in every 1,489 profiles matched at 10 loci; 1 in 16,374 profiles matched at 11 loci; and 1 in 32,747 matched at 12 loci (both were siblings). Devlin opined: “How big a population does it take to produce so many matches that appear to contradict so dramatically the astronomical, theoretical figures given by the naive application of the product rule?”24 About 1 in 1,489 profiles matched at 10 loci. If we calculated the RMP based on STR frequencies of a very conservative 1 of 5, the theoretical answer would be 1 in 11 million, a much lower probability than was actually found in Arizona. On the basis of this empirical result Devlin concludes:
It is not much of a leap to estimate that the FBI’s national CODIS database of 3,000,000 entries will contain not just one but several pairs that match on all 13 loci, contrary (and how!) to the prediction made by proponents of the currently much touted RMP that you can expect a single match only when you have on the order of 15 quadrillion profiles.25
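Because the Arizona analysis compared profiles in the database with one another, the relevant count is the number of pairs rather than the number of profiles. The short sketch below shows only how quickly that count grows for the two database sizes quoted above; it makes no claim about how many of those pairs should match. The function name is hypothetical.

```python
def distinct_pairs(n: int) -> int:
    """Number of distinct pairs that can be formed from n profiles."""
    return n * (n - 1) // 2


databases = (
    ("Arizona offender database (approximate)", 65_000),
    ("CODIS figure quoted by Devlin", 3_000_000),
)

for label, size in databases:
    print(f"{label}: {size:,} profiles, {distinct_pairs(size):,} pairs")
```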
The debates among statisticians and forensic scientists on RMPs play out in the courtroom as well. The same information can be packaged and presented differently to a panel of jurors, one framing that supports the prosecution and another that supports the defense.
Let us suppose that the calculated frequency of an individual’s 26 alleles is 1 in 6 billion. This means that when you multiply the frequencies of the individual alleles in the relevant population, the product of the frequencies yields a frequency of 1 in 6 billion. This could be presented to the jury as follows: “There is only one person in 6 billion with this DNA profile and that is our suspect, because there are only 6 billion people on the earth. If there were 12 billion we would have to conclude that there might be another person with the same DNA profile.”
But “1 in 6 billion” is a theoretical calculation based on databases that have not been chosen randomly to determine allele frequencies. So there is still a chance that more than one person on the planet will have the same DNA profile. If we had a DNA profile for every living person on the planet, we could ascertain definitively whether more than one exact profile match occurs.
In cold-hit matches the profile is uploaded to a database where, let us assume, one match is found. There are two ways of thinking about the probability of this being a coincidental match. On the one hand, if the database is very small, we might think that this could be a coincidental match because we have not seen a large-enough population from which to judge the profile. On the other hand, since the database is small, the likelihood of getting a coincidental match should be small because it increases with the size of the comparison population (a world database would increase the chances of a coincidental match). Even though the theoretical calculation gives us an RMP of 1 in 6 billion, we know that the assumptions behind the calculation do not take account of close family relations; those have to be analyzed using kinship statistics. Thompson notes, “These estimates understate the probability of a coincidental match in actual cases because they take no account of the possibility that the pool of possible suspects contains the relatives of the perpetrator, who would be more likely to have the same profile due to common ancestry.”26
Now suppose that the database from which police obtained the cold hit was very large. An actual DNA database of 6 million profiles that yielded one cold hit tells us that 5,999,999 people have been excluded from the crime-scene match. The larger the database, the more confidence we can have that our cold hit—with no other evidence—is not a false match because we are approaching the true population size. By imagining a database with 60 million people and one cold hit, we gain even more confidence, given that 59,999,999 people are excluded. But if we found 1 match in 60 million people, then there could be 100 matches in 6 billion people (1 match per 60 million, using a kind of inductive logic).
So another narrative that could be presented to the jury is that the chance of a coincidental match for a cold hit in a database of 60 million people is 1 out of 100. Telling a jury that there could be another 99 people on the planet with the same DNA profile presents a very different statistic that could change its psychology when it is trying to determine the grounds for “beyond a reasonable doubt.”
Would higher probability statistics in cold-hit cases make a difference in their probative value or how juries relate to the evidence? Thus far, in cold-hit cases the courts have opted for the RMP estimates from forensic statisticians over those from mathematical statisticians. Juries typically do not get to hear the controversy because it is often resolved before experts appear before the jury.
BOX 16.6 The Case of John Puckett
In 1972 a 22-year-old nurse was sexually assaulted and stabbed to death in San Francisco. More than 30 years later a swab that had been taken from the victim’s mouth in 1972 containing a degraded sperm sample and at least one other person’s DNA produced a partial DNA profile of 7 markers. When the profile was compared with California’s DNA databases of 338,000 profiles, it matched with that of John Puckett. Puckett, then 70, denied ever knowing the victim, and there was virtually no other evidence linking him to the crime, aside from the fact that he lived in San Francisco in 1972 and had a previous rape conviction. During his trial the jury was provided a random-match probability of 1 in 1.1 million, based on population statistics. During pretrial hearings Bicka Barlow of the San Francisco Public Defender’s Office argued that this figure did not take into account the size of the database. Following the NAS recommendation to multiply the RMP by the number of profiles in the database, she argued that the chances were in fact 1 in 3 that the database search had resulted in linking an innocent person to the crime. The judge did not allow this statistic to be presented to the jury. Puckett was convicted and sentenced to life in prison.
Source: Jason Felch and Maura Dolan, “DNA Matches Aren’t Always a Lock,” Los Angeles Times, May 4, 2008.
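The defense figure in the Puckett case can be reproduced from the two numbers given in the box. The sketch below is illustrative only: it assumes that every profile in the database is an independent comparison with an unrelated person who is not the source of the sample, and the variable names are hypothetical.

```python
rmp = 1 / 1_100_000      # random-match probability presented to the jury
database_size = 338_000  # profiles searched in California's databases

# NAS-style adjustment: multiply the RMP by the number of profiles searched.
np_estimate = rmp * database_size

# Exact chance of at least one coincidental match under the same assumptions.
exact = 1 - (1 - rmp) ** database_size

print(f"RMP x database size:         {np_estimate:.3f} (about 1 in {1 / np_estimate:.1f})")
print(f"Exact at-least-one estimate: {exact:.3f} (about 1 in {1 / exact:.1f})")
```

Under these assumptions the product comes to roughly 0.31, the figure the defense rounded to 1 in 3, and the exact expression is somewhat lower, about 0.26. Either number is very far from the 1 in 1.1 million that the jury heard.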
Increasingly, police are trolling their databases for partial matches of DNA profiles. This means that they might be interested in a cold hit with 20 out of 26 matched alleles. Even with fewer than 13 loci it is possible to generate an RMP small enough to sound convincing to a jury. In 1999 police in the United Kingdom found an exact match of 6 loci between the profile of crime-scene DNA from a burglary and a profile logged into the United Kingdom’s national databank. The frequency of a random match was calculated by law enforcement to be 1 in 37 million, which is persuasive evidence in a country of 60 million people. When the suspect was arrested, it soon became obvious that the match was a coincidence because the man was disabled and was physically incapable of carrying out the crime. That the match was coincidental could have been confirmed by testing more alleles in the biological samples.27
According to Thompson, “The British Home Office has reported that between 2001 and 2006, 27.6 percent of the matches reported from searches of the U.K. National DNA Database (NDNAD) were to more than one person in the database,”28 largely because police were uploading partial DNA samples where degradation of the crime-scene sample had taken place or because a number of individuals were entered into the database more than once. The current interest in familial DNA searching has resulted in greater interest among criminal investigators in partial matches. Although forensic scientists have made efforts to develop statistical models that predict the probability that a partial match of an individual implicates that person’s family as the source of the DNA, the results have been highly problematic and contested (see chapter 4).29
Myth of Infallible Rape Evidence
If the DNA of a Suspected Rapist Is Found in the Vaginal Smear of the Victim, Then the Suspect Must Be the Rapist.
DNA testing has been responsible for a high conviction rate in crimes involving rape. It is widely assumed that if a suspect’s DNA is found in a vaginal smear of the rape victim, then the suspect’s guilt has been established beyond a reasonable doubt. There are two separate issues. First, does the DNA of the suspect in the vaginal smear prove beyond a reasonable doubt that the suspect had sexual intercourse with the victim? Second, does the DNA match prove that the suspect raped the victim?
The answer to the first question must most probably be in the affirmative. It seems extremely unlikely that a suspect’s DNA could enter the vaginal canal without intercourse. The victim could surely set the record straight in such an event. That said, there was one case where a woman implanted a sperm sample in order to thwart law enforcement. In 1999 a convicted rapist named Anthony Turner smuggled a sample of his semen out of prison, concealed in a ketchup packet. Turner’s family members paid the woman $50 to use the sperm to stage a phony rape as a way of casting doubt on the DNA evidence that placed him in prison.30
The second question asks whether a DNA match implies a rape. There are cases where a victim has had multiple sexual partners, one or more of whom may have been consensual, where a vaginal smear by itself may not reveal the actual rapist. This is where forensic investigators can use elimination samples in mixed DNA samples where there have been consensual partners. It is also possible that in violent crimes against women involving more than one man, one of the perpetrators did not penetrate the victim or did not ejaculate. Thus the nonappearance of sperm is not by itself conclusive evidence that the suspect was not involved in violence or a rape against the woman. DNA evidence, separated from its context, is never solely definitive for either conviction or exoneration, although the burden for the former is much higher.
Myth of DNA Detection Equaling Physical Presence
When the DNA of a Suspect Is Found at a Crime Scene, Then the Suspect Must Have Been Present at the Crime.
The fact that an individual’s DNA is found at the scene of a crime does not indicate that he or she committed the crime in question or even was present at the crime scene. There are many ways in which a person’s DNA can wind up at the scene of a crime. As discussed in “Myth of DNA Consistency,” DNA recovered from a vaginal swab of a rape victim might be far more useful to investigators than DNA lifted from a cup or a cigarette butt.
Moreover, even if a person’s DNA is reportedly found at the scene of a crime, it is not necessarily the case that the person deposited it there. There is always the possibility that the DNA could have appeared as a result of secondary transfer, that the DNA could have been planted, or that the results of the DNA analysis were fabricated.
Secondary transfer refers to the phenomenon where DNA deposited on one item winds up on another. The individual does not have direct contact with that item (primary transfer); instead, his or her DNA is transferred by way of an intermediary, which could be either another person or another object. For example, if person A shakes person B’s hand, they are each likely to have trace amounts of the other’s DNA on their hand. If A then takes out a kitchen knife and cuts vegetables, it is quite possible that the DNA of both A and B could be found on the knife handle, even though B never touched the knife. Ironically, the potential for inadvertent transfer of DNA to muddy an investigation has increased over time as DNA testing techniques have become more sensitive and able to type the DNA of samples of only a few cells.
Genetic Justice Page 36