Base rate problem
Definition
The statistical challenge facing any deception-detection method: if only a minority of statements examined are actually deceptive, even a method that is slightly better than chance can produce large numbers of false positives in practice.
- Core issue
- When deceptive cases are rare, even an accurate test can produce more false positives than true positives.
- Applies to
- Statement analysis, PRNU photo matching, fingerprint databases, and any forensic method applied to large comparison pools.
- Root cause
- False-positive opportunities multiply with pool size. A tiny error rate across a large pool of suspects produces many false leads.
Common questions
Why does a test that's better than chance still give false alarms?
When most cases are actually innocent or truthful (that is, the base rate of deception is low), even a test that is somewhat better than chance generates many false positives. If the vast majority of statements are truthful and the test flags even a modest fraction of them as deceptive, most of its alerts will be wrong.
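The arithmetic behind this answer can be sketched with Bayes' rule. The figures below are hypothetical and chosen only for illustration (a 5% base rate of deception, a test that catches 70% of deceptive statements and wrongly flags 30% of truthful ones). They are not measured rates for SCAN or any other method, and the function name is likewise just an illustrative choice.

```python
def positive_predictive_value(base_rate, sensitivity, false_positive_rate):
    """Fraction of flagged statements that are actually deceptive (Bayes' rule)."""
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, not measured values for any real method.
ppv = positive_predictive_value(
    base_rate=0.05,            # 5% of statements are deceptive
    sensitivity=0.70,          # 70% of deceptive statements are flagged
    false_positive_rate=0.30,  # 30% of truthful statements are flagged
)
print(f"Share of alerts that are correct: {ppv:.3f}")  # about 0.109
```

Under these assumed numbers, out of 1,000 statements the test flags 35 of the 50 deceptive ones and 285 of the 950 truthful ones, so roughly nine alerts in ten are wrong even though the test beats chance.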
How does the base rate problem affect forensic matching?
In photo analysis, PRNU or similar methods might correctly identify the source device of an image most of the time. But when an image is compared against a large pool of candidate devices, even a very low false-positive rate produces many false matches, because the number of false-positive opportunities grows with pool size.
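A rough sketch of how pool size drives false matches follows. It assumes each comparison against a non-source device is independent and shares the same per-comparison false-positive rate. Both are simplifying assumptions, and the rate of 1 in 10,000 is a hypothetical figure, not a published PRNU error rate.

```python
def expected_false_matches(pool_size, false_positive_rate):
    """Expected number of non-source devices that wrongly match."""
    return pool_size * false_positive_rate


def prob_at_least_one_false_match(pool_size, false_positive_rate):
    """Chance of one or more false matches, assuming independent comparisons."""
    return 1.0 - (1.0 - false_positive_rate) ** pool_size


FPR = 1e-4  # hypothetical per-comparison false-positive rate

for pool in (10, 10_000, 1_000_000):
    expected = expected_false_matches(pool, FPR)
    p_any = prob_at_least_one_false_match(pool, FPR)
    print(f"pool={pool:>9,}  expected false matches={expected:g}  P(at least one)={p_any:.3f}")
```

With the same assumed error rate, a pool of 10 devices almost never yields a false match, while a pool of a million is expected to yield about 100.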
What's the key takeaway for reading forensic reports?
A method's lab accuracy is not its real-world accuracy. Always ask about the false-positive rate, the size of the comparison pool, and the base rate of genuine deception or true matches in cases like yours. A reliable test can still mislead you if it is applied to a huge suspect list.
Related terms
- Anti-forensic PRNU suppression
- Deliberate actions to remove or obscure the PRNU fingerprint, such as adding PRNU-erasing noise, applying heavy lossy compression, or scaling and re-cropping...
- Extraneous information
- Information in a statement that the SCAN analyst judges to be beyond the stated scope of the question. SCAN treats such additions...
- Ground truth
- In deception-detection research, independently verified knowledge of whether a statement was truthful or deceptive, established through confession, DNA, or other objective means,...
- Individual vs. device-class attribution
- Individual attribution identifies one specific camera body. Device-class attribution identifies only the make or model. PRNU provides individual attribution; metadata and lens-distortion...
- Lack of conviction
- A SCAN category covering hedging phrases like 'I think', 'I believe', 'I don't remember', which Sapir proposes signal deception because a person...
- PRNU suppression
- Any processing that reduces the usable PRNU signal in an image: heavy JPEG compression, resolution downscaling, in-camera noise reduction, social-media re-encoding, and...
- Pronoun shift
- A SCAN indicator based on the claim that deceptive writers drop first-person pronouns or shift to third-person reference when describing events they...
- SCAN (Scientific Content Analysis)
- A proprietary statement-analysis method developed by Avinoam Sapir that claims to identify deceptive content in written statements through analysis of linguistic features...
- Shared device model
- The scenario in which multiple individuals use the same camera. The PRNU fingerprint links an image to the device, not to any...
- Uncertainty quantification in PRNU
- Expressing a PRNU attribution result not just as a binary yes/no but with stated PCE values, false-positive rate at the threshold used,...
Explained in these topics
- SCAN Statement Analysis: Claims, Methods, and the Scientific Critique
- The statistical challenge facing any deception-detection method: if only a minority of statements examined are actually deceptive, even a method that is slight...
- PRNU in Casework: Limitations and Reporting
- The statistical effect in which a low but nonzero false-positive rate produces many false attributions when the candidate pool is large, because the number of...