Chapter 4
Mathematics & Statistics
Forensic statistics is the discipline that turns observations into defensible probabilistic statements. The court asks "did the suspect leave this trace?"; the analyst answers in the language of likelihood ratios, prior odds, posterior probabilities, and confidence intervals. This chapter covers the small set of statistical tools that appear in casework.
4.1 The Likelihood Ratio
The LR answers: how much more likely is the evidence under Hp than under Hd? Critically, the LR is not a probability of guilt. Even an LR of 1,000,000 doesn't establish guilt by itself — it must be combined with the prior odds (set by the rest of the case-record) to produce posterior odds. The court holds the prior; the forensic scientist supplies the LR.
Verbal scale (RSS / ENFSI)
| LR | Verbal interpretation |
|---|---|
| 1–10 | Limited support |
| 10–100 | Moderate support |
| 100–1,000 | Strong support |
| 1,000–10,000 | Very strong support |
| > 10,000 | Extremely strong support |
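The banded scale above maps naturally onto a small helper function. A sketch in Python; the band edges and the handling of LR ≤ 1 are assumptions about how one might encode the table, not part of the RSS/ENFSI text:

```python
def verbal_scale(lr):
    """Map a likelihood ratio to the RSS/ENFSI-style verbal scale."""
    if lr <= 1:
        return "No support for Hp over Hd"   # LR <= 1 favors Hd (or is neutral)
    bands = [
        (10, "Limited support"),
        (100, "Moderate support"),
        (1_000, "Strong support"),
        (10_000, "Very strong support"),
    ]
    for upper, label in bands:
        if lr <= upper:
            return label
    return "Extremely strong support"
```

For example, `verbal_scale(500)` returns `"Strong support"`.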
4.2 Bayesian Inference
Why the prior matters: with the same LR of 10⁶ but prior odds of 1 in 10⁹ (a suspect chosen from the world population with no other evidence), the posterior odds are 10⁻⁹ × 10⁶ = 10⁻³: the suspect is probably not the source despite the strong forensic evidence.
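The odds-form update is a one-line computation. A minimal sketch of the worked example above; the 10⁻⁹ prior and 10⁶ LR are the hypothetical figures from the text:

```python
def update(prior_odds, lr):
    """Bayes in odds form: posterior odds = prior odds x LR."""
    return prior_odds * lr

def odds_to_prob(odds):
    """Convert odds to a probability."""
    return odds / (1 + odds)

# hypothetical case: suspect drawn from ~10^9 people, forensic LR of 10^6
prior = 1e-9               # prior odds of roughly 1 in a billion
post = update(prior, 1e6)  # posterior odds of 10^-3
print(odds_to_prob(post))  # ~0.001: probably not the source
```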
4.3 The Prosecutor's Fallacy
The most common statistical error in courtroom forensic testimony: confusing P(E | Hd) with P(Hd | E). A "1 in a million random match probability" is not "1 in a million chance the suspect is innocent". The first is a property of the evidence; the second is the posterior, which depends on the prior.
4.4 Descriptive Statistics
| Statistic | Formula | Meaning |
|---|---|---|
| Mean | μ = Σxi / n | Arithmetic average |
| Median | middle value (sorted) | 50th percentile |
| Mode | most frequent value | Peak of distribution |
| Standard deviation | σ = √variance | Spread (data variability) |
| SEM | SEM = σ / √n | Precision of the sample mean |
| Coefficient of variation | CV = (σ / μ) × 100% | Dimensionless precision |
SD vs SEM — frequently confused: SD describes the spread of individual data and doesn't decrease with sample size; SEM describes the precision of the sample mean and decreases as √n. For 100 measurements with SD 5, SEM = 5 / √100 = 0.5.
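All of these summary statistics are one-liners with Python's standard `statistics` module. A sketch; the `describe` helper and the choice of the sample (rather than population) SD are illustrative assumptions:

```python
import statistics as st

def describe(xs):
    """Mean, SD, SEM, and CV% for a list of measurements (sample SD assumed)."""
    mu = st.mean(xs)
    sd = st.stdev(xs)          # spread of individual values; does not shrink with n
    sem = sd / len(xs) ** 0.5  # precision of the sample mean; shrinks as sqrt(n)
    cv = sd / mu * 100         # dimensionless precision, in percent
    return mu, sd, sem, cv

# the SD-vs-SEM example from the text: SD 5 over 100 measurements
print(5 / 100 ** 0.5)          # 0.5
```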
4.5 The Normal Distribution
- 68% of values within 1 SD of the mean (μ ± σ)
- 95% within 2 SD (μ ± 2σ) — the standard 95% CI (more precisely 1.96σ)
- 99.7% within 3 SD (μ ± 3σ)
QC application: a measurement within mean ± 2 SD is accepted; outside that interval (5% by chance) prompts investigation; outside mean ± 3 SD (0.3% by chance) typically triggers root-cause analysis.
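The QC rule above can be encoded directly; a sketch in which the function name and return labels are invented for illustration:

```python
def qc_flag(x, mean, sd):
    """Flag a QC measurement against the empirical-rule control limits."""
    z = abs(x - mean) / sd
    if z <= 2:
        return "accept"              # within mean +/- 2 SD
    if z <= 3:
        return "investigate"         # outside 2 SD occurs ~5% of the time by chance
    return "root-cause analysis"     # outside 3 SD occurs ~0.3% of the time by chance
```

For a control with mean 100 and SD 1, a reading of 102.5 would be flagged for investigation, while 104 would trigger root-cause analysis.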
4.6 Type I and Type II Errors
| Decision \ Reality | H0 true | H0 false |
|---|---|---|
| Reject H0 | Type I error (α, false positive) | Correct rejection |
| Fail to reject H0 | Correct retention | Type II error (β, false negative) |
Statistical power = 1 − β = probability of correctly rejecting a false null. Forensic interpretation in match testing: Type I = declaring a match when scene and suspect are different sources; Type II = missing a match when they are the same source.
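Both error rates can be estimated empirically by simulation. A sketch using a z-test with known σ = 1; the sample size, effect size, and trial count are arbitrary illustrative choices:

```python
import random

random.seed(42)
CRIT = 1.96            # two-sided 5% critical value for a z-test
N, TRIALS = 25, 2000   # illustrative sample size and simulation count

def reject(mu_true):
    """One simulated experiment: z-test of H0: mu = 0 with known sigma = 1."""
    xs = [random.gauss(mu_true, 1) for _ in range(N)]
    z = (sum(xs) / N) * N ** 0.5   # z = sample mean / (sigma / sqrt(N))
    return abs(z) > CRIT

alpha_hat = sum(reject(0.0) for _ in range(TRIALS)) / TRIALS  # H0 true: ~0.05
power_hat = sum(reject(0.6) for _ in range(TRIALS)) / TRIALS  # H0 false: 1 - beta
```

The estimated Type I rate lands near the nominal α = 0.05, and the estimated power is 1 − β for this particular effect size.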
4.7 DNA Random Match Probability
For a multi-locus STR profile, assuming Hardy–Weinberg equilibrium at each locus and linkage equilibrium across loci, the random match probability is the product of the per-locus genotype frequencies:
RMP = ∏ᵢ Pᵢ, where Pᵢ = pᵢ² for a homozygous locus and Pᵢ = 2pᵢqᵢ for a heterozygous locus.
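A short numeric sketch of the product rule; the allele frequencies below are hypothetical:

```python
from math import prod

def genotype_freq(p, q=None):
    """Hardy-Weinberg genotype frequency: p^2 (homozygote) or 2pq (heterozygote)."""
    return p * p if q is None else 2 * p * q

# hypothetical allele frequencies at three STR loci
profile = [
    genotype_freq(0.12, 0.08),  # heterozygote
    genotype_freq(0.20),        # homozygote
    genotype_freq(0.15, 0.10),  # heterozygote
]
rmp = prod(profile)             # product over independent loci
```

With more loci the product shrinks rapidly, which is why a full 13-locus profile can reach the 10⁻¹³ range quoted in this chapter.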
Theta (θ) correction for population substructure
| θ | Population structure | Standard usage |
|---|---|---|
| 0.01 | Homogeneous (single ethnic group) | Default for narrow population |
| 0.02 | Mildly subdivided | Multi-state populations |
| 0.03 | Substantially subdivided | Conservative default; protects against undisclosed substructure |
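A sketch of theta-corrected single-locus match probabilities following the commonly used NRC II (Balding–Nichols) formulas; setting θ = 0 recovers the plain Hardy–Weinberg values:

```python
def match_prob_theta(p, q=None, theta=0.03):
    """Theta-corrected match probability for one locus (NRC II-style sketch)."""
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:  # homozygote
        return ((2 * theta + (1 - theta) * p)
                * (3 * theta + (1 - theta) * p)) / denom
    # heterozygote
    return 2 * ((theta + (1 - theta) * p)
                * (theta + (1 - theta) * q)) / denom
```

Any θ > 0 inflates the match probability relative to the uncorrected value, which is the conservative direction for the defendant.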
Identical (monozygotic) twins share their entire nuclear genome, so STR profiles match completely regardless of locus count. Distinguishing twins requires high-coverage whole-genome sequencing or epigenetic methylation profiling.
4.8 ROC Analysis
The Receiver Operating Characteristic curve evaluates binary classifiers across all possible decision thresholds.
AUC ranges: 1.0 = perfect, 0.9–1.0 = excellent, 0.8–0.9 = good, 0.7–0.8 = fair, 0.5 = random. ML classifiers in forensic settings (signature, voice, biometric) report AUC alongside explicit error rates so the court can weigh the evidence properly.
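The AUC has a useful probabilistic reading: it is the probability that a randomly chosen positive case scores above a randomly chosen negative case. A brute-force sketch of that definition (quadratic in the number of scores, fine for illustration):

```python
def auc(pos_scores, neg_scores):
    """AUC = P(random positive outscores random negative), ties counted as half."""
    wins = ties = 0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))
```

Perfectly separated scores give 1.0; identical score distributions hover around 0.5.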
- LR = P(E | Hp) / P(E | Hd): strength of evidence, not probability of guilt.
- Bayesian update: posterior odds = prior odds × LR (multiply, never divide).
- Prosecutor's fallacy: P(E | Hd) ≠ P(Hd | E).
- Empirical rule: 68 / 95 / 99.7% of values fall within 1 / 2 / 3 SD of the mean.
- SD vs SEM: SD measures data spread and does not shrink with n; SEM = SD / √n.
- RMP = ∏ of per-locus genotype probabilities; 13 loci at ~10⁻¹ each → ~10⁻¹³.
- Theta correction: θ = 0.01–0.03; conservative default 0.03.
- Type I (α) = false positive; Type II (β) = false negative.
- AUC: 1.0 = perfect, 0.5 = random, ≥ 0.9 = excellent.