Error, Uncertainty and the Statistics of Forensic Science

Q: What is the prosecutor's fallacy?

The prosecutor's fallacy occurs when the probability that evidence matches given innocence is treated as equivalent to the probability of innocence given the evidence. The two are not the same. A one-in-a-million random-match probability does not mean the chance of innocence is one in a million, because the starting probability of guilt depends on other case facts, not just the match statistic.

Q: What is a likelihood ratio in forensic reporting?

A likelihood ratio compares how probable the evidence is under two competing propositions: that the item came from the suspect versus that it came from an unrelated person. A ratio greater than one supports the prosecution hypothesis; a ratio less than one supports the defence. It lets the analyst express evidential weight without stating a probability of guilt, which is a question for the jury.

Q: What is the difference between a false positive and a false negative in forensic testing?

A false positive means the test returns a match when no genuine match exists. A false negative means the test misses a real match. Both contribute to error rates, and their relative consequences differ depending on context. A false positive in criminal casework risks convicting the innocent; a false negative risks missing a genuine link.

Q: What is measurement uncertainty and why does it matter in forensic science?

Measurement uncertainty is the range within which the true value of a quantity is expected to lie, given the limitations of the instrument, analyst technique, and reference materials. Reporting a result without its uncertainty interval gives a false impression of precision. Two measurements that look different may overlap entirely once uncertainty is accounted for.

Q: Why is 'match' language being phased out in forensic reports?

The word 'match' implies a binary certainty that analytical methods cannot support. Even when two samples are analytically consistent, there is always some probability that unrelated sources could produce the same result. Probabilistic reporting replaces 'match' with a likelihood ratio or verbal equivalent that honestly reflects both the strength of the association and the possibility of a coincidence.

Forensic results carry error rates and measurement uncertainty that are rarely communicated clearly in court. This topic explains the likelihood-ratio framework, the prosecutor's fallacy, and why probabilistic reporting is replacing the language of 'match'.

Last updated: 19 Jun 2026

Forensic evidence is probabilistic, not absolute. Every result carries an error rate and a margin of measurement uncertainty, and courts have historically received those results without either figure attached. The likelihood ratio (LR) is the current standard framework for expressing evidential weight: it states how much more probable the observed evidence is under the prosecution's hypothesis than under the defence's hypothesis, without asserting guilt. Understanding this framework, together with the prosecutor's fallacy and base-rate reasoning, is essential for anyone who produces, presents, or evaluates forensic findings.

Every forensic result is an estimate. The DNA profile, the fibre comparison, the bloodstain angle, and the tool-mark impression each arrive with a margin of error, and that margin carries direct legal consequences. For most of the twentieth century, analysts wrote reports that sounded definitive: 'The samples match.' 'The fibres are consistent.' What those phrases meant in probabilistic terms was rarely stated, and courts rarely required it.

The consequences of that silence were serious. Wrongful convictions were secured partly on forensic testimony that overstated what the science could prove. Review bodies in the United States, the United Kingdom, Australia, and elsewhere found the same pattern: analysts presenting results without error rates, juries hearing statistics they could not evaluate, and defence teams lacking the tools to challenge numbers they did not understand. The discipline has been reforming since the 2009 US National Academy of Sciences report made those failures explicit.

This topic builds the statistical vocabulary a forensic scientist needs to report findings honestly. It covers error rates and what drives them, the distinction between measurement uncertainty and match probability, the likelihood-ratio framework that replaces binary conclusions with calibrated weight of evidence, and the classic courtroom fallacies that turn a genuine finding into a misleading one. Getting this right is both a scientific requirement and an ethical obligation to the justice system that relies on forensic testimony.

By the end of this topic you will be able to:

Distinguish between false positive and false negative error rates and explain how validation studies establish them for a forensic method.
Define measurement uncertainty and apply the ISO/IEC 17025 requirement to quantitative laboratory results.
Calculate and interpret a likelihood ratio, identifying what it asserts and what it deliberately leaves to the trier of fact.
Identify the prosecutor's fallacy and the defence fallacy, and explain how the LR framework prevents both errors.
Apply base-rate reasoning to a cold-hit DNA match and assess what additional information is required before the result supports identification.

Key terms

Error rate: The frequency with which a test or method produces an incorrect result under defined conditions. Includes both false positives (spurious matches) and false negatives (missed matches). A validated method has a published error rate; an unvalidated one does not.
Measurement uncertainty: The range around a reported value within which the true quantity plausibly lies, derived from instrument precision, calibration standards, replicate variability, and analyst technique. Reported as ± a value or as a confidence interval.
Likelihood ratio (LR): The ratio of the probability of the evidence given the prosecution hypothesis to the probability of the evidence given the defence hypothesis. An LR greater than one favours the prosecution; less than one favours the defence. It expresses evidential strength without usurping the jury's role of deciding guilt.
Prosecutor's fallacy: The logical error of treating P(evidence | innocence) as if it equals P(innocence | evidence). A tiny random-match probability does not by itself imply a high probability of guilt; prior probability and case context must be incorporated via Bayes' theorem.
Base-rate fallacy: Ignoring the background frequency (base rate) of a characteristic when interpreting a match. A characteristic found in only 1 in 10,000 people is still shared by many innocent individuals in a large population, so the absolute number of potential matches must be considered alongside the match probability.
Probabilistic reporting: A reporting framework that expresses forensic conclusions as a likelihood ratio, a verbal equivalent on an LR scale, or a posterior probability, rather than as binary match or no-match statements. Increasingly required by accreditation bodies and appellate courts.

What error rates mean and where they come from

An error rate is not an admission of incompetence. It is an inherent property of any real-world measurement process, where reagents, instruments, and analysts all introduce variability. The two varieties are false positives and false negatives. A false positive declares a match where none genuinely exists. A false negative misses a real match. Both occur in every forensic discipline, and their balance is a deliberate design choice: tune a method to be more sensitive and false negatives fall but false positives rise, and vice versa.
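The sensitivity trade-off described above can be illustrated with a small sketch. The comparison scores and thresholds are hypothetical values chosen for illustration, not data from any real method, and `error_rates` is a made-up helper name.

```python
# Hypothetical comparison scores (higher = more similar); not real casework data.
same_source_scores = [0.91, 0.84, 0.77, 0.69, 0.58]  # pairs known to share a source
diff_source_scores = [0.62, 0.47, 0.39, 0.28, 0.15]  # pairs known to differ


def error_rates(threshold):
    """Return (false_positive_rate, false_negative_rate) at a decision threshold."""
    # A different-source pair scoring at or above the threshold is a false positive.
    false_pos = sum(s >= threshold for s in diff_source_scores)
    # A same-source pair scoring below the threshold is a false negative.
    false_neg = sum(s < threshold for s in same_source_scores)
    return false_pos / len(diff_source_scores), false_neg / len(same_source_scores)


for threshold in (0.40, 0.60, 0.80):
    fpr, fnr = error_rates(threshold)
    print(f"threshold {threshold:.2f}: FPR = {fpr:.0%}, FNR = {fnr:.0%}")
```

Lowering the threshold makes the method more sensitive, so fewer true matches are missed but more non-matches are flagged; raising it does the reverse.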

Error rates are established through proficiency testing and validation studies. A laboratory submits its analysts to blind samples where the correct answer is known, and the fraction of incorrect calls becomes the observed error rate for that method under those conditions. The 2009 NAS report found that many pattern-comparison disciplines (bitemark, toolmark, hair microscopy, handwriting) lacked any rigorous validation and had no published error rate at all. That was not a minor gap: it meant the courtroom testimony lacked the one number that would let a juror know how much to trust it.
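As a rough sketch of how blind-test tallies become observed error rates, the snippet below divides incorrect calls by the number of trials and attaches an approximate 95% Wilson score interval, since a rate estimated from a finite study is itself uncertain. The counts are invented for illustration, and the function and variable names are assumptions rather than part of any published protocol.

```python
import math


def wilson_interval(errors, trials, z=1.96):
    """Approximate 95% Wilson score interval for an observed error proportion."""
    p = errors / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical blind-test tallies (assumed for illustration only).
different_source_trials = 2000  # ground truth: no common source
false_positives = 2             # analyst wrongly reported an association
same_source_trials = 1000       # ground truth: common source
false_negatives = 40            # analyst wrongly reported an exclusion

fpr = false_positives / different_source_trials
fnr = false_negatives / same_source_trials

fp_low, fp_high = wilson_interval(false_positives, different_source_trials)
fn_low, fn_high = wilson_interval(false_negatives, same_source_trials)

print(f"False positive rate: {fpr:.3%} (95% interval {fp_low:.3%} to {fp_high:.3%})")
print(f"False negative rate: {fnr:.3%} (95% interval {fn_low:.3%} to {fn_high:.3%})")
```

The interval matters as much as the point estimate: a study with few trials can report a low observed rate while remaining compatible with a much higher true rate.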

Discipline	Error-rate status (pre-2015)	Reform status
DNA profiling (STR)	Well-characterised; random match probability for a full profile is typically far below 1 in a million (a coincidence probability, not a laboratory error rate)	Probabilistic genotyping now standard
Fingerprint comparison	Largely undocumented until a 2011 FBI-sponsored black-box study found a 0.1% false positive rate in ideal conditions; later reviewed in the 2016 PCAST report	Ongoing validation; error rates now published in some labs
Hair microscopy	No validated error rate; FBI review (2012-2022) found widespread overstatements	Largely replaced by mitochondrial DNA
Bitemark analysis	No validated error rate; innocence cases reveal high false positives	Under serious challenge; PCAST found no empirical basis for foundational validity and recommended courts not admit bitemark evidence
Toolmark/firearm	Limited validation; error rates vary widely between studies	OSAC technical notes require uncertainty statements

Measurement uncertainty in the laboratory

Measurement uncertainty applies across every quantitative forensic discipline: the blood-alcohol concentration from a breathalyser, the elemental composition of a glass fragment, the refractive index of a fibre, the concentration of a drug in a tablet. Any numerical result that comes from an instrument carries an attached band of uncertainty, and that band is not optional. The ISO/IEC 17025 standard, which governs accredited forensic laboratories, requires measurement uncertainty to be evaluated for quantitative results and reported wherever it bears on the validity or interpretation of the result, such as comparison against a legal limit.

Uncertainty arises from multiple sources. The instrument itself has a precision limit: two readings of the same sample will not be identical. The reference material used for calibration has its own certified uncertainty. The analyst's technique introduces variability that proficiency studies can quantify. Environmental factors (temperature, humidity, reagent batch) add further spread. Expanded uncertainty is the final reported figure, typically stated at a coverage level of about 95%, meaning that intervals constructed this way would be expected to contain the true value in roughly 95 cases out of 100.

Sources of measurement uncertainty flowing into a final expanded uncertainty statement.
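One common way to combine such sources, assuming they are independent and already expressed as standard uncertainties in the same units, is to add them in quadrature and multiply by a coverage factor k (k = 2 corresponds to roughly 95% coverage). The component values below are hypothetical, and `expanded_uncertainty` is an illustrative name, not a function from any laboratory system.

```python
import math

# Hypothetical standard uncertainties for one quantitative result, all in the
# same units as the result (assumed values, not taken from any real method).
components = {
    "instrument repeatability": 0.0012,
    "calibration reference": 0.0008,
    "analyst technique": 0.0010,
    "environment / reagent batch": 0.0005,
}


def expanded_uncertainty(standard_uncertainties, k=2.0):
    """Combine independent standard uncertainties in quadrature, then scale by k."""
    combined = math.sqrt(sum(u**2 for u in standard_uncertainties))
    return k * combined


U = expanded_uncertainty(components.values())
print(f"Expanded uncertainty (k = 2): ±{U:.4f}")
```

Because the components add as squares, the largest source dominates the total, which is why an uncertainty budget is useful for deciding where method improvement would pay off.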

The practical significance is this: two samples that look different at face value may be fully consistent once uncertainty is included. Conversely, two values that look almost identical may be genuinely distinguishable if the uncertainty on each is small enough. Without uncertainty estimates, the comparison is meaningless. A blood-alcohol reading of 0.082 and a legal limit of 0.080 cannot be distinguished without knowing whether the instrument's uncertainty is ±0.001 or ±0.005. The number alone is not the answer.
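The blood-alcohol comparison can be written out as a short check. This is a minimal sketch that treats the stated ± figure as the expanded uncertainty and simply asks whether the resulting interval clears the limit; `compare_to_limit` is a hypothetical helper, and real decision rules vary between jurisdictions and laboratories.

```python
def compare_to_limit(reading, limit, expanded_u):
    """Classify a reading against a limit using its uncertainty interval."""
    low, high = reading - expanded_u, reading + expanded_u
    if low > limit:
        return "above limit"
    if high < limit:
        return "below limit"
    # The interval straddles the limit, so the data cannot separate the two.
    return "indistinguishable from limit"


reading, limit = 0.082, 0.080
for expanded_u in (0.001, 0.005):
    verdict = compare_to_limit(reading, limit, expanded_u)
    print(f"{reading} ± {expanded_u} vs limit {limit}: {verdict}")
```

With ±0.001 the whole interval sits above 0.080; with ±0.005 the interval runs from 0.077 to 0.087 and the same reading no longer demonstrates an exceedance.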

The likelihood ratio: weighing evidence without deciding guilt

The likelihood ratio is the central tool of modern probabilistic forensic reporting, and its structure is worth understanding precisely. Given a forensic observation E, the analyst proposes two competing hypotheses: Hp (prosecution: the suspect is the source) and Hd (defence: an unknown unrelated person is the source). The LR is then P(E | Hp) divided by P(E | Hd). If a DNA profile occurs in 1 person in 1 billion, the denominator is 10⁻⁹ and a clean match gives a numerator close to 1, producing an LR of approximately one billion: the evidence is one billion times more probable if the suspect is the source than if a random unrelated person is.

Likelihood-ratio framework: analyst delivers LR; jury applies it to prior odds.
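The arithmetic behind the one-billion figure is a single division. The sketch below uses the values from the example (a numerator of about 1 and a denominator of 10⁻⁹); the function name is an assumption, and real casework values come from validated population data rather than round numbers.

```python
import math


def likelihood_ratio(p_e_given_hp, p_e_given_hd):
    """LR = P(E | Hp) / P(E | Hd)."""
    if p_e_given_hd <= 0:
        raise ValueError("P(E | Hd) must be positive to form a ratio")
    return p_e_given_hp / p_e_given_hd


# Values from the example: a clean match (numerator close to 1) and a profile
# frequency of 1 in 1 billion (denominator 1e-9).
lr = likelihood_ratio(1.0, 1e-9)
print(f"LR = {lr:.3g}, log10(LR) = {math.log10(lr):.1f}")
```

Reporting log10(LR) is a common convenience, since each unit on that scale represents a tenfold change in evidential weight.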

The LR is deliberately limited in scope: it expresses how much more probable the scientific observation is under one proposition than the other, and does not address whether the suspect committed the act. What the jury does with that LR depends on their prior beliefs about the case, all the other evidence, witness credibility, and motive. The analyst is not supposed to cross into that territory. The moment an expert says 'the evidence proves the defendant was there' rather than 'the evidence is one thousand times more probable if the defendant was there', the expert has stepped outside their role.

LR > 1: evidence supports the prosecution hypothesis over the defence hypothesis.
LR = 1: evidence is equally probable under both hypotheses. It provides no discriminating power.
LR < 1: evidence supports the defence hypothesis. This does not happen often in casework but is logically possible and must be reported honestly if it arises.
Verbal scale: many laboratories translate LR magnitudes into verbal labels such as 'moderate support', 'strong support', or 'very strong support'. The scale (e.g. the UK Association of Forensic Science Providers scale) is calibrated so that jurors can interpret the strength without needing to process large numbers.

The prosecutor's fallacy and its mirror

The prosecutor's fallacy is a specific logical error in which P(evidence | innocence) is treated as equivalent to P(innocence | evidence). These two quantities can differ by orders of magnitude, and confusing them can be fatal to a fair trial. The classic example comes from DNA: an analyst reports a random match probability of 1 in 1 million. A prosecutor, or the analyst themselves, then tells the jury that there is only a one-in-a-million chance the defendant is innocent. This is wrong. The one-in-a-million figure is the probability of seeing this profile in a randomly chosen unrelated person. The probability of innocence depends on how many people were plausible suspects to begin with, the population of the city, the strength of other evidence, and every other fact in the case.
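To see how far the two conditional probabilities can diverge, the sketch below applies Bayes' theorem to the one-in-a-million example under deliberately simple assumptions: the true source always matches, every person in the pool of possible sources is equally likely beforehand, and laboratory error and relatives are ignored. The pool sizes and the function name are illustrative assumptions.

```python
def prob_not_source_given_match(rmp, pool_size):
    """P(not the source | match) with a flat prior over pool_size people."""
    prior_source = 1 / pool_size
    # Total probability of a match: the true source always matches (assumed),
    # anyone else matches with probability rmp.
    p_match = 1.0 * prior_source + rmp * (1 - prior_source)
    return rmp * (1 - prior_source) / p_match


rmp = 1e-6  # P(match | not the source): the figure the analyst reports
for pool in (10, 10_000, 1_000_000, 10_000_000):
    p = prob_not_source_given_match(rmp, pool)
    print(f"pool of {pool:>10,}: P(not source | match) = {p:.6f}")
```

The reported figure, P(match | not the source), stays fixed at one in a million throughout, while P(not the source | match) ranges from negligible for a pool of ten to roughly one half for a pool of one million.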

The mirror image is the defence fallacy: arguing that because many people share a characteristic, it proves nothing about any individual. If a fibre type is worn by 10,000 people in the city, the argument goes, the match is worthless. This overstates the case in the other direction. The matching result does raise the probability that the suspect is the source relative to a prior. It narrows the field. The correct response is the LR framework: calculate exactly how much this evidence shifts the odds, not whether it does so at all.

Both fallacies share the same root: pulling a single probability out of context and treating it as a verdict. The LR framework prevents both errors by making both the numerator and denominator explicit, forcing the audience to compare probabilities rather than take one number as the answer.

Base-rate fallacy and the population problem

The base-rate fallacy is the error of focusing on a match probability while ignoring how many people in the relevant population share the matching characteristic. A DNA profile with a random match probability of 1 in 10,000 will match roughly 100 people by chance in a city of one million. If the investigation initially had no strong suspect and ran the profile against a database of 500,000 people, about 50 chance matches would be expected within the database alone, so a 'cold hit' might identify one of those incidental matches rather than the actual source. The cold hit is not worthless, but it is the beginning of an investigation, not the end of one.
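The population arithmetic is simple enough to check directly. This sketch assumes each person matches independently with the stated random match probability, which ignores relatives and population substructure; both function names are hypothetical.

```python
def expected_chance_matches(rmp, population):
    """Expected number of unrelated people who match by coincidence."""
    return rmp * population


def prob_at_least_one_chance_match(rmp, population):
    """Probability that at least one unrelated person matches by coincidence."""
    return 1 - (1 - rmp) ** population


rmp = 1 / 10_000
for label, size in (("city", 1_000_000), ("database", 500_000)):
    expected = expected_chance_matches(rmp, size)
    p_any = prob_at_least_one_chance_match(rmp, size)
    print(f"{label} of {size:,}: about {expected:.0f} chance matches expected, "
          f"P(at least one) = {p_any:.6f}")
```

At this match probability a coincidental hit somewhere in a large database is close to certain, which is why a cold hit needs corroboration before it supports identification.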

Base-rate considerations are not unique to DNA. A shoe-print pattern shared by 5% of footwear sold in a country carries little weight on its own unless the analyst can point to additional features that narrow the relevant population. A particular toolmark that could come from any of 2,000 tools of the same model is much weaker than one that could only come from tools showing a specific wear pattern that appears in roughly 50 examples. The denominator is always part of the calculation.

From 'match' to probabilistic reporting: the shift in practice

The transition from binary match language to probabilistic reporting is the most significant change in forensic communication since the introduction of DNA evidence. Binary reports said things like: 'the samples are consistent' or 'the samples match'. Probabilistic reports say instead: 'the DNA findings are approximately 1 billion times more probable if the sample originated from the suspect than if it originated from an unrelated individual.' Both describe the same analytical result. The second statement is longer but scientifically honest about what the result actually proves.

In practice, a probabilistic report is built in four steps.
Step 1: Define the hypotheses
Before calculating anything, the analyst states a clear prosecution hypothesis (Hp) and defence hypothesis (Hd). These are tied to the specific case. For a DNA trace evaluated at source level, Hp might be 'the suspect is the source of the DNA on the item' and Hd 'an unknown unrelated person is the source'.
Step 2: Calculate or estimate the LR
Using validated population databases for DNA, or experimentally derived transfer/persistence data for trace, or published frequency data for other features, the analyst computes how probable the observation is under each hypothesis. For complex mixed profiles, probabilistic genotyping software such as STRmix or TrueAllele performs this calculation using Markov chain Monte Carlo sampling.
Step 3: Apply the verbal scale
The numerical LR is translated into a verbal statement following a published scale so the report is interpretable by non-statisticians. For example, on the AFSP scale, LRs between 100 and 1,000 might be labelled 'moderately strong support for the prosecution hypothesis'; LRs above 1 million might be labelled 'extremely strong support'.
Step 4: Report and stop at the source level
The analyst reports the LR and stops there unless specifically asked to address activity level (how the DNA got there) or offence level (what the presence implies about what happened). Each level requires different data and carries different assumptions. Crossing from source to offence level without acknowledging it is a common source of misleading testimony.

Not every discipline has reached the same point in this transition. DNA profiling has mature probabilistic tools. Fingerprint comparison still uses categorical conclusions ('identification', 'exclusion', 'inconclusive') in many jurisdictions, though several major laboratories are piloting LR-based reporting. Toolmark and bitemark analysis lack the population-frequency data needed to assign a denominator with any confidence, which is why their probative value is so contested.

Worked example

Interpreting a mixed DNA profile: the prosecutor's fallacy in real-time

A case where the numbers were right but the courtroom testimony was not.

A robbery scene yields a touch-DNA swab from a door handle. The profile is a two-person mixture. The major contributor matches the victim. The minor contributor is a partial profile with alleles consistent with the suspect at 8 of the 15 STR loci tested. The analyst calculates the combined probability of an unrelated person matching the evidence at those 8 loci as 1 in 220,000.

What the analyst should say in court: 'The DNA evidence is approximately 220,000 times more probable if the minor contributor is the suspect than if the minor contributor is an unrelated person drawn from the general population.' This is the LR framing: it places the number correctly as a ratio comparing two hypotheses.
What the prosecutor incorrectly says: 'There is only a 1 in 220,000 chance that the defendant is not the source of the DNA.' This is the prosecutor's fallacy. It treats the random match probability as the probability of innocence, ignoring how many people were in the original suspect pool and what the other evidence says.
The base-rate problem: the city has a population of 1.1 million. At 1 in 220,000, roughly 5 people in that population would be expected to match the partial profile by chance. If the original investigation had no independent evidence pointing to this suspect before the DNA search, the suspect is one of about six people (the true source plus roughly five chance matches) who could account for the profile, not a near-certain identification.
How the defence challenges it: a statistician for the defence points out that the mixture interpretation used a partial profile with 8 of 15 loci. The analyst should have quantified the uncertainty of the LR due to the missing loci data and the mixture deconvolution assumptions, and should have disclosed that probabilistic genotyping software was not used, even though the profile complexity warranted it.
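The numbers in this example can be checked by simple counting. The sketch below assumes, purely for illustration, that before the DNA result every resident of the city was an equally plausible source and that the true source is certain to match; a real case would rarely justify so flat a starting point, and the variable names are assumptions.

```python
population = 1_100_000  # city population from the example
rmp = 1 / 220_000       # reported match probability for the partial profile

# Expected number of residents other than the true source who match by chance.
expected_chance_matches = (population - 1) * rmp

# Counting argument: the true source plus the chance matches all fit the
# profile equally well, so with no other evidence each is equally likely.
prob_suspect_is_source = 1 / (1 + expected_chance_matches)

print(f"Expected chance matches: {expected_chance_matches:.1f}")
print(f"P(suspect is source | match, no other evidence): {prob_suspect_is_source:.3f}")
```

Under these assumptions the match alone puts the probability that the suspect is the source at about one in six, far from the near-certainty implied by the prosecutor's statement. Independent evidence pointing to the suspect would raise that figure by shrinking the pool of plausible alternatives.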

The example shows that analytical accuracy and courtroom accuracy can diverge even when the laboratory work is technically sound. Probabilistic reasoning must carry from the bench to the report to the witness stand, or the gap between what the evidence proves and what the jury hears becomes a source of injustice.

Check your understanding

A DNA analyst reports a random match probability of 1 in 500,000. The prosecution then tells the jury there is a 1 in 500,000 chance the defendant is innocent. What logical error has been made?

Key Takeaways

Every forensic method has error rates for false positives and false negatives; publishing these is required for valid courtroom testimony, and many pattern-comparison disciplines lacked them until recent reform efforts.
Measurement uncertainty quantifies the range around a numerical result and is mandatory under ISO/IEC 17025 accreditation; comparing two numbers without their uncertainty intervals is meaningless.
The likelihood ratio expresses how much more probable the evidence is under the prosecution hypothesis than the defence hypothesis, delivering evidential weight without deciding guilt, which remains the jury's task.
The prosecutor's fallacy transposes the conditional, equating match probability with probability of innocence; the Sally Clark case remains the most cited illustration of how this error causes wrongful convictions.
Base-rate reasoning requires knowing how many people in the relevant population share the matching feature; a rare match is not rare in absolute terms if the population is large enough.
The shift from binary 'match' language to probabilistic reporting using LRs and verbal scales is the most significant reform in forensic communication in two decades, with DNA leading and other disciplines following.
