Bayes Theorem in Evidence Evaluation
Bayes theorem in odds form shows how a likelihood ratio converts prior odds into posterior odds, giving forensic scientists a principled way to express the strength of evidence. This topic covers the mathematics, practical forensic scenarios, and the boundary between expert and fact-finder roles.
Bayes theorem in odds form is the mathematical framework that forensic scientists use to express what a piece of evidence adds to a case. The theorem states that posterior odds equal prior odds multiplied by the likelihood ratio. Prior odds capture what was believed about two competing hypotheses before the evidence arrived. The likelihood ratio measures how much more probable the evidence is under one hypothesis than the other. Multiplying these two quantities together gives posterior odds: the updated state of belief after the evidence is taken into account. Applied to forensic evidence, this framework separates what the scientist can properly say from what belongs to the court.
The odds form of Bayes theorem is preferred over the probability form in forensic reporting because it makes the likelihood ratio a freestanding quantity. A scientist can calculate and report the likelihood ratio without knowing the prior odds. The prior odds depend on every other fact in the case, the credibility of witnesses, and the legal presumption of innocence. Those matters belong to the judge or jury. By delivering only the likelihood ratio, the scientist contributes the evidential value of the scientific findings without usurping the fact-finder's role.
The framework has been adopted in DNA evidence reporting across many jurisdictions, including England and Wales, Australia, the Netherlands, New Zealand, and the United States, and it is progressively being applied to other forensic disciplines such as fingerprint comparison, glass analysis, and firearms identification. Guidance documents from organisations including the UK Forensic Science Regulator, the European Network of Forensic Science Institutes (ENFSI), and the US National Commission on Forensic Science have endorsed likelihood ratio reporting as the principled alternative to categorical statements such as declaring "a match".
By the end of this topic you will be able to:
- State Bayes theorem in odds form and identify each of its three components: prior odds, likelihood ratio, and posterior odds.
- Calculate the likelihood ratio for a simple forensic scenario where the relevant probabilities are given.
- Explain why the prior odds are the fact-finder's responsibility and not a quantity the forensic expert should set or report.
- Interpret a verbal likelihood ratio scale and translate a numerical LR into a qualitative statement about evidential strength.
- Identify the prosecutor's fallacy and explain why confusing the LR with the posterior probability of guilt is a logical error.
- Prior odds: The ratio of the probability of the prosecution hypothesis to the probability of the defence hypothesis before the forensic evidence in question is considered. Set by the fact-finder using all non-forensic case information. The forensic expert does not supply this quantity.
- Likelihood ratio (LR): The probability of the observed evidence given the prosecution hypothesis divided by the probability of the same evidence given the defence hypothesis. The LR is the quantity the forensic scientist reports; it measures the probative value of the evidence without incorporating prior beliefs.
- Posterior odds: The ratio of the probability of the prosecution hypothesis to the probability of the defence hypothesis after the forensic evidence is taken into account. Calculated as prior odds multiplied by the likelihood ratio. This is the updated belief state, which the fact-finder uses alongside all other case evidence.
- Prosecution hypothesis (Hp): The proposition advanced by the prosecution, typically that the defendant is the source of the recovered material or was present at the scene. Paired with the defence hypothesis to define the comparison for the likelihood ratio.
- Defence hypothesis (Hd): The proposition advanced by the defence, typically that the material came from an unknown unrelated individual or that the defendant was not present. The analyst must choose a realistic Hd; an implausible Hd inflates the LR artificially.
- Prosecutor's fallacy: The error of treating the probability of the evidence given innocence as if it were the probability of innocence given the evidence. Formally, confusing P(E | Hd) with P(Hd | E). This transposition of the conditional leads to extreme overstatement of the strength of evidence.
The mathematics of Bayes theorem in odds form
Bayes theorem can be written in probability form or in odds form. The probability form is useful theoretically, but it requires the analyst to handle the full prior probability of each hypothesis, which demands knowledge of all case information. The odds form separates the equation into a part the scientist can supply and a part the scientist cannot.
The odds form is: Posterior Odds = Prior Odds x Likelihood Ratio. In notation, if Hp is the prosecution hypothesis and Hd is the defence hypothesis, and E is the evidence: P(Hp | E) / P(Hd | E) = [P(Hp) / P(Hd)] x [P(E | Hp) / P(E | Hd)]. The first bracketed term is the prior odds. The second bracketed term is the likelihood ratio.
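For readers who want to see where the odds form comes from, the short derivation below is a sketch that uses only the symbols already defined (Hp, Hd, E). It writes the probability form of Bayes theorem once for each hypothesis and divides one by the other; the term P(E) cancels, which is why the scientist never needs it.

```latex
\begin{align*}
P(H_p \mid E) &= \frac{P(E \mid H_p)\, P(H_p)}{P(E)} \\[4pt]
P(H_d \mid E) &= \frac{P(E \mid H_d)\, P(H_d)}{P(E)} \\[4pt]
\underbrace{\frac{P(H_p \mid E)}{P(H_d \mid E)}}_{\text{posterior odds}}
  &= \underbrace{\frac{P(H_p)}{P(H_d)}}_{\text{prior odds}}
     \times
     \underbrace{\frac{P(E \mid H_p)}{P(E \mid H_d)}}_{\text{likelihood ratio}}
\end{align*}
```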
The LR is a ratio, not a probability. It can range from zero to infinity. An LR of 1 means the evidence is equally consistent with both hypotheses and adds nothing. An LR of 1000 means the evidence is 1000 times more probable if Hp is true than if Hd is true. An LR of 0.001 means the evidence is 1000 times more probable under Hd, supporting the defence position. The LR is symmetric in this sense: evidence can support either side, and the direction is determined by which hypothesis makes the observed evidence more probable.
Formulating the hypotheses correctly
The LR is meaningless without a clear pair of competing hypotheses. In practice, the forensic scientist proposes hypotheses at the level appropriate to the findings, typically the source level (who contributed the material) or the activity level (what actions produced the trace). Conflating these levels is a common source of error.
A source-level pair for a glass transfer case might be: Hp, the glass fragments on the suspect's clothing came from the broken window at the scene; Hd, the glass fragments came from some other source. An activity-level pair might be: Hp, the suspect broke the window; Hd, the suspect was in the area but did not break the window. The LR values for these two pairs can differ substantially, and the relevant pair must match the question actually in dispute in the trial.
The choice of Hd requires care. In DNA cases, the realistic defence hypothesis is usually that the DNA came from a randomly selected individual from the relevant population. In a case with a defined suspect pool, a narrower Hd may be more appropriate. A poorly chosen Hd produces an LR that does not actually answer the question before the court, and courts in England, the Netherlands, and Australia have criticised evidence reports where the hypotheses were not clearly defined.
| Level | Prosecution hypothesis (Hp) | Defence hypothesis (Hd) | What the LR measures |
|---|---|---|---|
| Source | Material came from the defendant | Material came from a random person in the reference population | How much more likely is this profile if from defendant than from a random person |
| Activity | Defendant performed action X that deposited the trace | Defendant was present but did not perform action X | How much more likely is this trace quantity/location if action X occurred |
| Offence | Defendant committed the offence | Defendant did not commit the offence | Rarely appropriate for a single piece of forensic evidence alone |
Why the prior odds belong to the fact-finder
The prior odds represent the relative probability of Hp and Hd before the forensic evidence in question is weighed. Setting them requires incorporating every other piece of case information: witness accounts, the defendant's alibi, CCTV footage, motive evidence, and the legal presumption of innocence. A forensic scientist has no privileged access to most of this material.
The presumption of innocence, embedded in criminal procedure across most legal systems including under Article 6 of the European Convention on Human Rights, implies that, at the start of a trial, the prior odds should be set very low for Hp. But the exact numerical value is a legal and policy question, not a scientific one. Courts in England and Wales, drawing on the guidance in R v Adams [1996] 2 Cr App R 467, have held that it is not the scientist's role to assign numerical priors, precisely because doing so requires the scientist to act as a fact-finder.
The practical consequence is that the forensic scientist reports the LR and explains what it means. The fact-finder then applies the LR to whatever prior odds they have arrived at from all the other evidence. This division keeps the expert within the domain of their competence and preserves the fact-finder's role. It is the same principle applied under the Bharatiya Sakshya Adhiniyam 2023 in India, the Federal Rules of Evidence in the United States, and the Evidence Act 2006 in New Zealand: expert opinion assists the fact-finder but does not replace them.
Interpreting and communicating the likelihood ratio
Numerical LR values span many orders of magnitude. DNA LRs for a full 20-locus STR match in a non-relative case commonly exceed 10 billion. Glass refractive index LRs for a three-fragment transfer might be in the hundreds. Firearms toolmark LRs from some current models range from 10 to 10,000 depending on agreement quality. Communicating these values to a non-specialist jury requires both the number and a qualitative gloss.
The ENFSI verbal scale for the LR provides a consistent vocabulary across European forensic services. LR values between 1 and 10 are described as limited support for Hp. Values from 10 to 100 give moderate support. Values from 100 to 1000 give strong support. Values from 1000 to 10,000 give very strong support. Values above 10,000 give extremely strong support. Values below 1 are described in the corresponding terms supporting Hd. The scale is logarithmic in its boundaries, recognising that the perceptual difference between LR = 1,000 and LR = 1,100 is negligible, while the difference between LR = 10 and LR = 1,000 is substantial.
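The mapping from a number to a verbal label can be sketched in a few lines of Python. This is an illustration only: it uses the band boundaries exactly as quoted in the paragraph above, and those should be checked against the current published guideline before any real use. The function name `verbal_equivalent`, the choice to put a boundary value (for example exactly 10) in the higher band, and the wording returned for an LR of exactly 1 are assumptions made for the example.

```python
import math


def verbal_equivalent(lr: float) -> str:
    """Translate a numerical LR into the verbal bands quoted in the text.

    Assumption: a value on a boundary (e.g. exactly 10) falls in the higher band.
    """
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    if lr == 1:
        # Assumed wording: the text only says an LR of 1 "adds nothing".
        return "no support for either hypothesis"

    favoured = "Hp" if lr > 1 else "Hd"
    # For LR < 1, invert so the same bands describe support for Hd.
    if lr > 1:
        magnitude = lr
    else:
        magnitude = math.inf if lr == 0 else 1.0 / lr

    bands = [
        (10, "limited"),
        (100, "moderate"),
        (1_000, "strong"),
        (10_000, "very strong"),
    ]
    for upper_bound, label in bands:
        if magnitude < upper_bound:
            return f"{label} support for {favoured}"
    return f"extremely strong support for {favoured}"


for value in (5, 50, 500, 5_000, 5_000_000, 0.002):
    print(value, "->", verbal_equivalent(value))
```

Because values below 1 are inverted before the lookup, an LR of 0.002 is described with the same vocabulary as an LR of 500, but in favour of Hd.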
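Two errors of interpretation recur when LRs are presented in court. The prosecutor's fallacy transposes the conditional, reading the probability of the evidence given Hd as if it were the probability of Hd given the evidence. A related error is the defence fallacy: arguing that because many other people in the population would also be expected to match, the evidence has little or no value, which ignores how far the evidence shifts the odds. Both errors misread what the LR says. The correct statement is always comparative: the evidence is X times more probable under Hp than under Hd.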
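A small numerical sketch can make the gap between P(E | Hd) and P(Hd | E) concrete. All figures below are hypothetical and chosen only to make the arithmetic easy to follow: a match probability of 1 in a million, P(E | Hp) taken as 1, and prior odds of 1 to 1,000,000.

```python
# Hypothetical inputs, for illustration only.
p_e_given_hp = 1.0                 # assumed: evidence is certain if Hp is true
p_e_given_hd = 1 / 1_000_000       # assumed match probability under Hd
prior_odds = 1 / 1_000_000         # assumed prior odds in favour of Hp

lr = p_e_given_hp / p_e_given_hd   # likelihood ratio
posterior_odds = prior_odds * lr   # Bayes theorem in odds form

# Probability of Hd given the evidence, derived from the posterior odds.
p_hd_given_e = 1 / (1 + posterior_odds)

print(f"P(E | Hd) = {p_e_given_hd:.6f}")
print(f"P(Hd | E) = {p_hd_given_e:.6f}")
```

With these inputs the posterior odds come out at about 1, so P(Hd | E) is about 0.5, not one in a million. Reading the match probability as the probability of innocence would overstate the evidence by several orders of magnitude in this hypothetical case.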
Bayes theorem applied to common forensic scenarios
Consider a DNA single-source profile recovered from a crime scene. The prosecution hypothesis is that the defendant is the contributor. The defence hypothesis is that an unknown unrelated individual from the same population is the contributor. The population frequency of the profile is 1 in 5 million, so P(E | Hd) = 1/5,000,000. Assuming the profile would be certain to match if the defendant were the contributor, so that P(E | Hp) = 1, the LR is 5 million: the evidence is 5 million times more probable if the defendant is the source than if a random person from the population is the source.
Now consider that the prior odds before this DNA evidence are 1:1000 (the police investigation placed the defendant at the scene but with some uncertainty, and the fact-finder has assessed this preliminary evidence as making Hp 1000 times less likely than Hd). Posterior odds = (1/1000) x 5,000,000 = 5,000. Posterior odds of 5,000:1 favour Hp strongly. These posterior odds translate to a posterior probability of Hp of approximately 5000/5001, or about 99.98%.
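The arithmetic in this DNA example can be reproduced with a minimal Python sketch. The function names `update_odds` and `odds_to_probability` are invented for the illustration, and the second prior (1 to 1,000,000) is a hypothetical alternative added only to show how strongly the result depends on the fact-finder's starting point.

```python
def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayes theorem in odds form: posterior odds = prior odds x LR."""
    if prior_odds < 0 or likelihood_ratio < 0:
        raise ValueError("odds and likelihood ratios cannot be negative")
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of Hp into a probability of Hp."""
    return odds / (1.0 + odds)


lr = 5_000_000       # 1 / (1 in 5 million), taking P(E | Hp) = 1 (assumption)
prior = 1 / 1000     # prior odds of 1:1000 from the example in the text

posterior = update_odds(prior, lr)
print(f"posterior odds ~ {posterior:,.0f} to 1")
print(f"posterior probability of Hp ~ {odds_to_probability(posterior):.2%}")

# Hypothetical alternative prior: the same LR leads to a very different posterior.
sceptical_posterior = update_odds(1 / 1_000_000, lr)
print(f"with prior 1:1,000,000 -> {odds_to_probability(sceptical_posterior):.2%}")
```

The same LR of 5 million yields posterior odds of about 5,000 to 1 under the first prior but only about 5 to 1 under the second, which is why the scientist reports the LR and leaves the prior to the fact-finder.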
The same calculation structure applies to other evidence types. For glass, suppose the recovered fragments would be certain to match the scene window if they came from it, so P(E | Hp) = 1, and that 2% of the population carry glass with a matching refractive index from the same type of source, so P(E | Hd) = 0.02. The LR is 1/0.02 = 50. For a fibre comparison where 5% of garments in the relevant population share the same colour and polymer type, P(E | Hd) = 0.05 and, again taking P(E | Hp) = 1, the LR is 1/0.05 = 20. In a case with independent glass and fibre evidence, the combined LR is the product of the individual LRs, giving 50 x 20 = 1000, provided the two evidence types are conditionally independent given the hypotheses.
The independence assumption deserves scrutiny. Glass and fibre from the same incident may not be independent if both are explained by a single activity. The analyst must consider whether conditioning on the hypotheses makes the two observations statistically independent. When dependence is suspected, combining LRs multiplicatively overstates the total evidential value, and a joint LR must be calculated instead.
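The multiplication of the glass and fibre LRs can be sketched as follows. The helper name `combine_independent_lrs` is invented for the example, P(E | Hp) is taken as 1 for both evidence types, and the function is only valid under the conditional independence assumption discussed above; it does not attempt the joint LR needed when the findings are dependent.

```python
import math


def combine_independent_lrs(lrs):
    """Multiply LRs for separate findings.

    Only valid if the findings are conditionally independent given Hp
    and given Hd; otherwise a joint LR is needed instead.
    """
    lrs = list(lrs)
    if any(lr <= 0 for lr in lrs):
        raise ValueError("each LR must be positive")
    return math.prod(lrs)


glass_lr = 1 / 0.02   # P(E | Hp) assumed to be 1, P(E | Hd) = 2%
fibre_lr = 1 / 0.05   # P(E | Hp) assumed to be 1, P(E | Hd) = 5%

combined = combine_independent_lrs([glass_lr, fibre_lr])
print(f"combined LR ~ {combined:.0f}")

# On a log10 scale, multiplying LRs becomes adding their logarithms.
log_total = math.log10(glass_lr) + math.log10(fibre_lr)
print(f"log10(combined LR) ~ {log_total:.2f}")
```

Working on the log10 scale is a common convenience because independent findings then add rather than multiply, which matches the logarithmic spacing of verbal scales.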
Limitations, validation, and the empirical LR
An LR is only as good as the data used to compute it. In DNA, the population frequencies are derived from large, well-characterised databases spanning thousands of profiles per locus per ethnic group. For many other forensic disciplines, the databases are smaller, less representative, or less rigorously validated. An LR derived from a database of 50 glass measurements carries far more uncertainty than one derived from 50,000.
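To give a feel for how database size affects uncertainty, the sketch below compares an observed frequency of 2% estimated from 50 measurements with the same 2% estimated from 50,000. It uses a Wilson score interval, which is one standard choice and not a method prescribed by the text; the counts are hypothetical, and the LR is taken as 1 divided by the frequency, which assumes P(E | Hp) = 1.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: both databases show the feature in 2% of items.
small_db = wilson_interval(1, 50)
large_db = wilson_interval(1_000, 50_000)

for name, (low, high) in (("n = 50", small_db), ("n = 50,000", large_db)):
    # LR = 1 / frequency, so the high frequency bound gives the low LR bound.
    print(f"{name}: frequency {low:.4f} to {high:.4f}, "
          f"LR roughly {1 / high:.0f} to {1 / low:.0f}")
```

Under these assumptions the small database is compatible with LRs spanning more than an order of magnitude, while the large one pins the LR close to 50.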
The empirical approach to LR estimation uses a set of known same-source and known different-source comparisons to calibrate a score-based LR model. The analyst generates a comparison score from the evidence, then maps that score to an LR using the calibrated model. This approach has been applied to fingerprints, speaker comparison, handwriting, and shoemarks. The validity of the resulting LR depends on how well the calibration data represent the actual case conditions, including the population of potential alternative contributors.
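The score-to-LR step can be illustrated with a deliberately simple model. In the sketch below the scores are made up, and each score distribution is assumed to be Gaussian so that the standard library's `statistics.NormalDist` can be used; operational systems typically use other calibration methods and far larger, case-relevant data sets.

```python
from statistics import NormalDist

# Made-up comparison scores for illustration (higher means more similar).
same_source_scores = [0.82, 0.91, 0.77, 0.88, 0.95, 0.84, 0.79, 0.90]
diff_source_scores = [0.21, 0.35, 0.18, 0.40, 0.27, 0.31, 0.15, 0.38]

# Assumption: each set of scores is modelled by a normal distribution.
same_source_model = NormalDist.from_samples(same_source_scores)
diff_source_model = NormalDist.from_samples(diff_source_scores)


def score_to_lr(score: float) -> float:
    """LR = density of the score under same-source / density under different-source."""
    return same_source_model.pdf(score) / diff_source_model.pdf(score)


for score in (0.25, 0.60, 0.85):
    print(f"score {score:.2f} -> LR {score_to_lr(score):.3g}")
```

With so few calibration scores the extreme LRs this toy model produces are not trustworthy, which is the point the paragraph above makes about the calibration data needing to represent actual case conditions.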
Validation requires showing that the LR system is well-calibrated: across a large test set of known same-source and known different-source comparisons, a reported value near LR = 100 should arise approximately 100 times more often, as a proportion of each set, when the prosecution hypothesis is true than when the defence hypothesis is true. Poorly calibrated systems can systematically overstate or understate the LR, introducing systematic error into every case that uses them. The ISO/IEC 17025:2017 standard for testing and calibration laboratories, applied by accreditation bodies including UKAS in the United Kingdom and NABL in India, requires forensic laboratories to validate their methods, including LR calculation systems.
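One summary metric often used in the validation literature for LR systems is the log-likelihood-ratio cost, usually written Cllr. The text does not name a specific metric, so the sketch below is offered only as an example of what a validation check might compute; the LR values are invented test data.

```python
import math


def cllr(same_source_lrs, diff_source_lrs):
    """Log-likelihood-ratio cost.

    Penalises small LRs on same-source pairs and large LRs on
    different-source pairs. A system that always reports LR = 1
    scores exactly 1; lower is better.
    """
    ss_cost = sum(math.log2(1 + 1 / lr) for lr in same_source_lrs) / len(same_source_lrs)
    ds_cost = sum(math.log2(1 + lr) for lr in diff_source_lrs) / len(diff_source_lrs)
    return 0.5 * (ss_cost + ds_cost)


# Invented validation results, for illustration only.
good_system = cllr([1000, 500, 200], [0.001, 0.01, 0.005])
uninformative = cllr([1, 1, 1], [1, 1, 1])
misleading = cllr([0.001, 0.01, 0.005], [1000, 500, 200])

print(f"good: {good_system:.3f}, uninformative: {uninformative:.3f}, "
      f"misleading: {misleading:.3f}")
```

A value above 1 indicates a system that does worse than reporting LR = 1 for every comparison, which is one way systematic overstatement or understatement shows up in a validation study.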
See also Role of Statistics in Evidence Evaluation for a broader treatment of how statistical tools fit into forensic reporting practice.
A forensic scientist calculates that a paint transfer is 200 times more probable under the prosecution hypothesis than under the defence hypothesis. What is the likelihood ratio, and what does it mean?
Key Takeaways
- Bayes theorem in odds form: posterior odds equal prior odds multiplied by the likelihood ratio. Each component has a distinct role. The LR is the scientist's contribution; the prior odds are the fact-finder's.
- The likelihood ratio expresses how many times more probable the evidence is under the prosecution hypothesis than under the defence hypothesis. An LR greater than 1 supports the prosecution; an LR less than 1 supports the defence; an LR of 1 adds nothing.
- Prior odds are set by the fact-finder because they depend on all other case evidence, the credibility of witnesses, and the legal presumption of innocence. A scientist who sets prior odds steps outside their proper role and risks usurping the jury's function.
- The prosecutor's fallacy is the error of treating P(E | Hd), the match probability, as P(Hd | E), the probability of innocence. These are different quantities: the LR cannot be interpreted as a posterior probability without the prior odds.
- LR validity depends on the quality of the underlying data. Well-calibrated LR systems, validated against known same-source and different-source populations, produce reliable evidential values. Poorly calibrated systems introduce systematic error. Accreditation standards such as ISO/IEC 17025 require method validation, including LR systems.