Modern Automated Speaker Recognition: NIST SRE and the ENFSI BPM

The discipline that replaced spectrographic voiceprints with statistically defensible methods: the NIST Speaker Recognition Evaluation (SRE) series, which has benchmarked commercial and research systems since 1996; the i-vector model (Dehak et al., 2011) and the x-vector deep-learning model (Snyder et al., 2018) behind current systems; the ENFSI Best Practice Manual for Forensic Comparison of Speech (2015, revised 2022), which codifies likelihood-ratio reporting; the operational systems deployed by the FBI, BKA, Met Police, and CFSL Hyderabad; and the case-law evolution on automated speaker-recognition evidence.

Last updated: 18 Jun 2026

Modern automated forensic speaker recognition quantifies vocal identity evidence as a likelihood ratio (LR) rather than a categorical match, expressing how much more probable the observed acoustic evidence is under a same-speaker hypothesis than a different-speaker hypothesis. The NIST Speaker Recognition Evaluation series, running since 1996, provides the adversarial benchmarking infrastructure that validates every major system, while the ENFSI Best Practice Manual for Forensic Comparison of Speech (2015, revised 2022) codifies LR reporting, validated methodology, and accreditation requirements for operational casework. Current systems use x-vector deep-learning embeddings with probabilistic linear discriminant analysis (PLDA) backends, replacing the earlier i-vector framework while retaining its core probabilistic logic. The result is a discipline that acknowledges and quantifies uncertainty explicitly, in a form courts can interrogate and challenge.

Modern forensic speaker recognition quantifies uncertainty as a likelihood ratio rather than asserting a categorical voiceprint match. The NIST Speaker Recognition Evaluation series (1996 to present) provides the adversarial benchmarking that the voiceprint era lacked, and the ENFSI Best Practice Manual for Forensic Comparison of Speech (2015, rev. 2022) translates that benchmarking discipline into operational court-facing guidance across European and allied laboratories.

Key takeaways

NIST SRE's scientific value comes from blind evaluation: systems are submitted before ground-truth labels are released, preventing post-hoc tuning.
The i-vector framework (Dehak et al., 2011) enabled reliable enrolment from 30 to 60 seconds of speech; the PLDA backend produces a direct log-likelihood ratio output.
x-vectors (Snyder et al., 2018) replaced UBM-based statistics extraction with a TDNN trained on millions of utterances, generalising better across noise, language, and channel mismatch.
The ENFSI BPM requires five domains: casework documentation, methodology validation, representative background database, LR reporting on the ENFSI verbal scale, and ISO 17025 accreditation.
Language mismatch is the primary LR reliability risk in multilingual casework; most validated background databases are English, Mandarin, or Arabic.

The collapse of spectrographic voiceprint analysis in the late 1980s did not close the question of whether machines or trained scientists could reliably distinguish speakers from acoustic recordings. It redirected the question toward a more rigorous methodology: instead of claiming that visual patterns are unique, researchers began asking how much probability mass separates same-speaker from different-speaker comparisons under defined conditions, expressed as a measurable likelihood ratio.

Two institutional developments, running in parallel from the mid-1990s, built the discipline that replaced voiceprint. The first was the NIST Speaker Recognition Evaluation series, launched in 1996 by the National Institute of Standards and Technology in Gaithersburg, Maryland, which created a standardised, competitive benchmarking environment for every major automatic speaker recognition system. The second was the European Network of Forensic Science Institutes' Speaker Identification Working Group, which translated the statistical framework being developed in NIST evaluations into operational guidance for forensic casework, culminating in the ENFSI Best Practice Manual for Forensic Comparison of Speech.

The result is a discipline that acknowledges uncertainty explicitly, quantifies it in a form courts can interrogate, and submits its systems to adversarial benchmarking against large independent datasets.

By the end of this topic you will be able to:

Explain the blind-submission structure of the NIST SRE series and why it produces auditable, non-cherry-picked performance figures.
Distinguish the i-vector (Dehak et al., 2011) and x-vector (Snyder et al., 2018) frameworks in terms of architecture, minimum enrolment duration, and cross-condition generalisation.
Describe the five domains of the ENFSI BPM and the specific evidentiary risk each domain addresses.
Interpret C_llr as a calibration metric and explain why a poorly calibrated LR fails the ENFSI BPM 2022 requirements, regardless of the underlying EER.
Compare the admissibility standards for automated speaker recognition evidence across Germany (BGH 2012), England and Wales (R v. Flynn and St John 2008), the United States (post-Daubert), and India (Bharatiya Sakshya Adhiniyam 2023).

The NIST Speaker Recognition Evaluation Series: 1996 to Present

The NIST Speaker Recognition Evaluation (SRE) programme began as an annual challenge in 1996, administered by NIST's Information Technology Laboratory in collaboration with the Linguistic Data Consortium at the University of Pennsylvania. The shared-task evaluation structure:

NIST releases test segments from known conditions.
Participant teams (universities, national labs, commercial vendors, intelligence agencies) submit recognition decisions without seeing ground-truth labels.
NIST releases ground-truth labels.
Collective results are published; all datasets deposited with LDC for future research.

No team knows the ground-truth labels before submitting; no team can retrospectively adjust its system. This structure created a community with a shared, auditable performance history.

Key evaluation metrics:

Equal Error Rate (EER): The operating point where false acceptance and false rejection rates are equal. Lower EER indicates better discriminative performance.
Detection Cost Function (DCF): A weighted error measure tunable to reflect the relative cost of false acceptance versus false rejection in a specific application context.
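
The two metrics above can be sketched in a few lines of plain Python. This is a minimal illustration, not NIST's scoring tool: the trial scores are invented, the function names are hypothetical, and the cost weights (`p_target`, `c_miss`, `c_fa`) are placeholders, since the actual values are set in each year's evaluation plan.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (false_rejection_rate, false_acceptance_rate) at one threshold.

    A trial is accepted as same-speaker when its score is >= threshold.
    """
    frr = sum(s < threshold for s in target_scores) / len(target_scores)
    far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return frr, far


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold."""
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        frr, far = error_rates(target_scores, nontarget_scores, threshold)
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (frr + far) / 2
    return best_eer


def detection_cost(frr, far, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Weighted error cost; the default weights are illustrative only."""
    return c_miss * p_target * frr + c_fa * (1.0 - p_target) * far


targets = [2.0, 1.5, 1.0, 0.2]       # same-speaker trial scores (made up)
nontargets = [0.5, 0.0, -0.5, -1.0]  # different-speaker trial scores (made up)

eer = equal_error_rate(targets, nontargets)
frr, far = error_rates(targets, nontargets, threshold=0.5)
dcf = detection_cost(frr, far, p_target=0.5)
```

Raising `c_fa` relative to `c_miss` models an application in which a false acceptance is the costlier error, which is the tuning the DCF definition refers to.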

Evolution of the SRE conditions:

Performance degrades with language mismatch, noise, short utterances, and vocal disguise. These condition-specific performance figures are the operational parameters that forensic practitioners must disclose when presenting automated speaker recognition evidence.

i-Vector Models: The Statistical Revolution of 2010

Before approximately 2010, the dominant paradigm was the Gaussian Mixture Model-Universal Background Model (GMM-UBM) framework. A speaker-independent background model was adapted to a specific speaker using their enrolment recordings; comparison involved computing likelihood scores for a test segment against the speaker model and the background model, producing a log-likelihood ratio.

The i-vector framework (Dehak et al., MIT and CRIM Montreal, 2011) changed this substantially:

Rather than maintaining a full adapted GMM per speaker, the i-vector uses a low-dimensional fixed-length factor (typically 400 to 600 dimensions) that summarises a speaker's deviation from the universal background model, regardless of utterance length.
Speaker comparison becomes a computation over fixed-length vectors, not full Gaussian mixtures.
Forensic consequence: Enrolment from 30 to 60 seconds of speech became possible where GMM-UBM required several minutes.
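
The fixed-length summary described above is usually written as a total variability model. The symbols below follow the common notation of the i-vector literature and are not taken from the text above: M is the utterance-dependent GMM mean supervector, m is the UBM mean supervector, T is a low-rank total variability matrix, and w is the i-vector, assumed to have a standard normal prior.

```latex
M = m + T\,w, \qquad w \sim \mathcal{N}(0, I)
```

Because w has the same dimension for every utterance, two recordings of very different lengths can be compared as two vectors of equal size.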

Backend discriminant methods applied to i-vectors include:

WCCN (within-class covariance normalisation)
NAP (nuisance attribute projection)
PLDA (probabilistic linear discriminant analysis): explicitly models between-speaker and within-speaker variability, directly outputting a log-likelihood ratio between the same-speaker and different-speaker hypotheses.

The log-likelihood ratio produced by a PLDA backend operating on i-vectors is the technical underpinning of the numerical LR reported in court. Its validity depends on the PLDA model's assumptions being met: the between-speaker and within-speaker covariance matrices estimated on the background population dataset must be representative of the speakers and conditions in the case. This representativeness requirement is one of the central evidentiary issues that the ENFSI BPM addresses.
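
A heavily simplified sketch of how a PLDA-style backend turns two embeddings into a log-likelihood ratio is given below. It assumes a two-covariance model with diagonal between-speaker and within-speaker variances, so each dimension can be scored independently; operational systems estimate full covariance matrices and typically add length normalisation and dimensionality reduction. The function name and all parameter values are hypothetical.

```python
import math


def plda_llr(x1, x2, mean, between_var, within_var):
    """Natural-log LR for two embeddings under a diagonal two-covariance model.

    Each dimension is modelled as embedding = speaker_factor + noise, with
    speaker_factor ~ N(mean, between_var) and noise ~ N(0, within_var).
    """
    llr = 0.0
    for u, v, m, bv, wv in zip(x1, x2, mean, between_var, within_var):
        u, v = u - m, v - m
        total = bv + wv                      # variance of a single embedding
        det_same = total * total - bv * bv   # determinant of the same-speaker 2x2 covariance
        # Same speaker: both embeddings share one speaker factor (covariance bv).
        q_same = (total * (u * u + v * v) - 2.0 * bv * u * v) / det_same
        log_same = -0.5 * math.log(det_same) - 0.5 * q_same
        # Different speakers: the two embeddings are independent.
        q_diff = (u * u + v * v) / total
        log_diff = -math.log(total) - 0.5 * q_diff
        llr += log_same - log_diff           # the 2*pi terms cancel
    return llr


mean = [0.0, 0.0]
between_var = [1.0, 1.0]   # between-speaker variance per dimension (made up)
within_var = [0.1, 0.1]    # within-speaker variance per dimension (made up)

similar = plda_llr([1.0, -0.5], [1.0, -0.5], mean, between_var, within_var)
dissimilar = plda_llr([1.0, -0.5], [-1.0, 0.5], mean, between_var, within_var)
```

The variances play the role of the background-population statistics: if they are estimated on speakers or channels unlike those in the case, the same pair of embeddings yields a different, and potentially misleading, LR.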

For the voice-comparison context that preceded the i-vector framework, see Spectrographic Voiceprint History and Its Modern Rejection.

i-vector pipeline: acoustic features extracted from speech compress into a fixed-length i-vector; PLDA backend computes a log-likelihood ratio between same-speaker and different-speaker hypotheses.

x-Vector Deep-Learning Models: From 2018 to the Present

In 2018, David Snyder and colleagues at Johns Hopkins University published the x-vector paper at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). The x-vector framework replaces the UBM-based statistics extraction of the i-vector pipeline with a time-delay neural network (TDNN) trained to classify speakers from their acoustic features. The TDNN's penultimate layer produces a fixed-length embedding capturing speaker identity in a deep representational space learned from large labelled corpora.
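
One detail not spelled out above is how a network that processes a variable number of frames produces a fixed-length output. In the x-vector architecture this is done by a statistics pooling layer that aggregates frame-level activations into a mean and a standard deviation per dimension. The sketch below shows only that pooling step, using toy two-dimensional frames in place of real TDNN activations; the function name is hypothetical.

```python
import math


def statistics_pooling(frames):
    """Collapse a variable-length list of frame vectors into one fixed-length vector.

    Returns the per-dimension mean followed by the per-dimension standard
    deviation, so the output length is always 2 * (frame dimension).
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return means + stds


short_utterance = [[1.0, 2.0], [3.0, 4.0]]
long_utterance = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

pooled_short = statistics_pooling(short_utterance)
pooled_long = statistics_pooling(long_utterance)
```

Both utterances yield a four-element vector despite their different lengths; in a full system, further layers after the pooling step produce the embedding that is passed to the backend.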

Advantages over i-vectors:

Better generalisation across language, channel, and noise conditions in SRE16, SRE18, and SRE19 evaluations.
Dominated the SRE21 leaderboard.

Recent architectures evaluated in the VoxSRC challenge series (Visual Geometry Group at Oxford, Johns Hopkins Human Language Technology Center):

ResNet variants
ECAPA-TDNN (channel-attention TDNN variant)
Transformer-based speaker embedding models

The PLDA backend remains standard for computing log-likelihood ratios from x-vector embeddings, though neural network backends and end-to-end training show further gains on in-domain conditions.

Calibration requirement for forensic use:

The LR produced by the backend must be calibrated: the numerical LR must correspond to its intended probabilistic interpretation, not merely rank comparisons as high or low. Calibration is evaluated with C_llr (log-likelihood ratio cost), a proper scoring rule that penalises each LR according to how strongly it points toward the wrong hypothesis, so it reflects both discrimination and calibration. A discriminating, well-calibrated forensic system has C_llr close to zero; a system that always reports LR = 1 scores exactly 1; a poorly calibrated system can score above 1 and may substantially over- or under-state evidential weight.
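
A minimal sketch of the C_llr computation follows, using its standard definition over trials with known ground truth. The LR values are invented for illustration and the function name is hypothetical.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost in bits, from LRs on trials with known ground truth."""
    same_term = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    diff_term = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    return 0.5 * (same_term / len(same_speaker_lrs)
                  + diff_term / len(different_speaker_lrs))


uninformative = cllr([1.0, 1.0], [1.0, 1.0])        # always reports LR = 1
informative = cllr([100.0, 1000.0], [0.01, 0.001])  # strong LRs in the right direction
misleading = cllr([0.01, 0.001], [100.0, 1000.0])   # strong LRs in the wrong direction
```

The uninformative system scores 1, the informative one scores close to 0, and the misleading one scores well above 1. The penalty grows with the magnitude of a wrong LR, which is why a system with a respectable EER can still report a poor C_llr.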

Operational deployments:

CFSL Hyderabad (India): Has piloted deep-learning speaker recognition tools in its acoustic analysis division; operational use subject to Indian court evidentiary standards.
BKA (Germany): Incorporated x-vector-based systems into casework under ENFSI BPM guidelines.
Met Police FATT unit (UK): Uses x-vector-based systems for speaker comparison under UK Crown Prosecution Service guidance and ENFSI BPM.

The ENFSI Best Practice Manual for Forensic Comparison of Speech

The European Network of Forensic Science Institutes published the first edition of its Best Practice Manual for Forensic Comparison of Speech (ENFSI BPM) in 2015, with the work led by the ENFSI Speaker Identification Working Group. A substantially revised second edition appeared in 2022, reflecting developments in deep-learning speaker recognition and updated guidance on LR reporting standards. The BPM has been adopted or referenced by forensic laboratories in the UK, Germany, the Netherlands, Sweden, Spain, France, and other ENFSI member states, and influences practice in Australia, Canada, and, through INTERPOL forensic science working groups, in several other jurisdictions.

The BPM's core requirement is that speaker comparison conclusions be expressed as a likelihood ratio, explicitly quantifying the probability of the observed acoustic evidence under the prosecution hypothesis (same speaker) relative to the defence hypothesis (different speakers). This requirement directly addresses the structural flaw of the voiceprint era, where conclusions were categorical rather than probabilistic, and where no mechanism existed for the court to evaluate what the evidence actually proved.
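
Written out, the quantity being reported is the ratio below. The notation is an assumption for illustration: E is the observed acoustic evidence, H_ss the same-speaker hypothesis, and H_ds the different-speaker hypothesis.

```latex
\mathrm{LR} = \frac{p(E \mid H_{ss})}{p(E \mid H_{ds})},
\qquad
\underbrace{\frac{P(H_{ss} \mid E)}{P(H_{ds} \mid E)}}_{\text{posterior odds}}
= \mathrm{LR} \times
\underbrace{\frac{P(H_{ss})}{P(H_{ds})}}_{\text{prior odds}}
```

The examiner reports only the LR. The prior odds depend on the other evidence in the case and are a matter for the court, which is why an LR is not a probability that the suspect is the speaker.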

The BPM specifies requirements across five domains. The casework domain covers documentation of recording conditions, exhibits handling, and case-relevant acoustic conditions. The methodology domain requires that the comparison method (whether fully automated, semi-automated, or fully auditory-acoustic) has been validated on a dataset representative of the case conditions and that the validation data is documented. The database domain requires a background population database for LR computation that is appropriate to the case (language-matched, channel-matched, and population-matched to the persons of interest). The reporting domain specifies the verbal scale for communicating LR values to courts (ranging from "limited support" for LRs between 1 and 10 to "extremely strong support" for LRs above 1,000,000), following the ENFSI Guideline for Evaluative Reporting in Forensic Science. The quality domain requires independent case review, participation in proficiency testing, and laboratory accreditation.

Domain	BPM requirement	What it guards against
Casework	Document recording conditions, channel, temporal gap, linguistic content of the sample	Exaggerated conclusions from degraded or condition-mismatched evidence
Methodology	Validated on representative data; validation study documented and peer-reviewed	Unreproducible accuracy claims (the voiceprint failure mode)
Database	Background population data must be language- and channel-matched to case	LRs calibrated on English data applied to Hindi or Punjabi case recordings
Reporting	LR expressed on ENFSI verbal scale; uncertainty in LR acknowledged	Categorical identification assertions replacing probabilistic conclusions
Quality	Independent case review; accreditation; regular proficiency testing	Individual examiner error unchecked by systematic oversight

The ENFSI verbal scale aligns with scales used in other forensic disciplines including DNA evidence, fingerprint likelihood ratio reporting, and handwriting comparison. This alignment allows a court to compare evidential weight across disciplines using a consistent conceptual framework, even when the underlying technical methods differ substantially.

ENFSI verbal scale: six LR bands translate numerical likelihood ratios into standardised court language, from 'limited support' (LR 1 to 10) through 'extremely strong support' (LR above 1,000,000); all forensic speech disciplines share this scale.

Operational Deployments: FBI, BKA, Met Police, CFSL Hyderabad

The operational landscape for automated speaker recognition in forensic casework is different from the research landscape. NIST SRE systems are evaluated on large, curated, balanced corpora with defined conditions. Forensic casework presents degraded telephone intercepts, room-acoustic recordings, multi-speaker conversations, unknown languages, vocal disguise, and severely limited reference material. The mapping between benchmark performance and casework performance requires careful condition-specific validation.

The FBI's Investigative Analysis Unit, formerly the unit that had employed spectrographic voice identification examiners before the 1989 withdrawal, shifted to phonetics-trained linguists working with acoustic analysis software (including Praat, developed at the University of Amsterdam, and BATVOX, a commercial speaker recognition system from Agnitio, now Nuance Communications). For the role of phonetic analysis in multi-language casework, see Forensic Phonetics: Multi-Language Casework and Cross-Linguistic Challenges. FBI speaker comparison testimony in federal cases post-Daubert has been cautiously presented as expert phonetic opinion rather than as a specific LR value, reflecting the US courts' incomplete adoption of the Bayesian reporting framework.

The BKA's Forensic Science Institute (Kriminaltechnisches Institut) operates one of Europe's most technically advanced speaker recognition units, using a combination of PLDA-backed i-vector and x-vector systems for automated comparison alongside auditory-phonetic analysis by trained forensic phoneticians. BKA speaker comparison evidence is reported using the ENFSI verbal scale and has been admitted in Landgericht (Regional Court) and Bundesgerichtshof (Federal Court of Justice) proceedings.

The Met Police's FATT unit covers forensic audio, telephone recording, and television enhancement for investigations in England and Wales. FATT experts provide speaker comparison evidence under the UK Crown Prosecution Service guidance, which requires LR reporting for speaker comparison evidence when the comparison is contested. R v. Flynn and St John (2008) in the Court of Appeal established that expert phonetic evidence must be grounded in scientific method and that categorical conclusions without quantified uncertainty are insufficient. Subsequent CPS guidance formalised this, bringing UK practice into alignment with the ENFSI BPM framework.

In India, the CFSL Hyderabad's Acoustics Division handles phonetic analysis and speaker comparison for cases referred from state police, courts, and central agencies. The division uses acoustic analysis software and auditory-phonetic analysis methods, with evidence presented as expert opinion under Section 39 of the Bharatiya Sakshya Adhiniyam 2023. The Indian Supreme Court's admissibility rulings on telephone interception evidence (primarily addressing collection legality under the Indian Telegraph Act 1885 and the interception evidence framework) have not yet produced a Daubert-equivalent methodology gatekeeping standard, leaving LR adoption in Indian courts at an earlier developmental stage than in ENFSI jurisdictions.

Case Law Evolution on Automated Speaker Recognition Evidence

In Germany, the Bundesgerichtshof ruled in 2012 (BGH 1 StR 386/12) that speaker comparison evidence must be reported using a probabilistic framework and that categorical identification conclusions are insufficient under German evidence law. The decision aligned German case law with the ENFSI BPM framework that the BKA was already following in practice. Subsequent regional court decisions have expanded the disclosure requirement to include the size and composition of the background population database, the system's calibration performance (C_llr), and the specific conditions of the case recordings that were used to select the relevant background population.

In England and Wales, the trajectory from R v. Robb (1991) through R v. O'Doherty (2003, Northern Ireland Court of Appeal) to R v. Flynn and St John (2008) established an increasingly rigorous standard. O'Doherty is significant because the court excluded auditory speaker comparison evidence that was not accompanied by acoustic phonetic analysis, holding that acoustic measurement is necessary (though not sufficient) for admissible speaker identification evidence. Flynn and St John extended this to require that the expert's methodology be validated and that the limitations of the comparison be disclosed. Current CPS guidance requires LR reporting, and the Forensic Science Regulator's 2023 Codes of Practice require UK forensic laboratories offering speaker comparison to be accredited under ISO 17025 for that activity.

In the United States, the post-Daubert environment has produced divergent results across federal and state courts. United States v. Angleton (S.D. Tex. 2003) excluded voiceprint and aural spectrographic identification testimony, finding the techniques not widely accepted and error rates unknown. Subsequent federal cases have admitted phonetic expert opinion under Daubert's gatekeeping framework when the expert disclosed methodology, error rates, and the limitations of the comparison. There is no uniform federal rule analogous to the ENFSI BPM or the UK CPS guidance. The National Commission on Forensic Science's 2016 Views Document on Speaker Identification recommended adoption of a probabilistic reporting framework but was not binding.

Casework workflow for an LR-based speaker comparison:

Step 1 (exhibit triage and condition assessment): Assess recording quality (SNR, bandwidth, channel type), duration of usable speech, language, and number of speakers. Document conditions that affect LR reliability before any comparison is attempted.
Step 2 (reference sample collection and matching): Acquire reference recordings from the person of interest under conditions as close as possible to the questioned recording (same channel, same linguistic content where possible). Document the temporal gap between questioned and reference recordings.
Step 3 (system selection and validation check): Select a speaker recognition system validated on conditions representative of the case. Retrieve the validation study for the relevant language, channel, and background population. Note the system's reported EER and C_llr for those conditions.
Step 4 (automated comparison and LR computation): Run the validated system. Obtain the log-likelihood ratio from the PLDA or equivalent backend. Check calibration: is the LR value within the validated operating range of the system for these conditions?
Step 5 (auditory-phonetic review, semi-automatic workflow): A qualified forensic phonetician independently assesses the acoustic evidence for features the automated system may not capture: prosody, voice quality, dialect features, vocal disguise markers. The phonetician's assessment may support, qualify, or conflict with the automated LR.
Step 6 (LR reporting on the ENFSI verbal scale): Report the LR on the ENFSI verbal scale. Disclose the background database, system EER and C_llr, recording conditions, and any factors that increase LR uncertainty beyond the validated conditions. The report undergoes independent case review before disclosure.

In India, the Supreme Court's most significant ruling touching on voice evidence is Peoples Union for Civil Liberties v. Union of India (1997), which addressed the legality of telephone tapping rather than the scientific methodology of voice comparison. Individual High Court decisions on phone-tap evidence have admitted voice identification testimony from CFSL examiners without imposing the ENFSI BPM-style validation requirements that European and, increasingly, Anglo-Commonwealth courts now expect. The gap between Indian and European practice in forensic voice comparison methodology represents a significant area for capacity development.

Key terms

NIST SRE (Speaker Recognition Evaluation): A competitive benchmarking series run by the National Institute of Standards and Technology since 1996, in which research and commercial teams submit speaker recognition decisions on held-out test sets and results are published against ground truth. The primary driver of systematic performance improvement in the field.
i-vector: A fixed-length low-dimensional speaker representation extracted by projecting a speaker's UBM statistics into a total variability space. Introduced by Dehak et al. (2011); the dominant speaker recognition paradigm from approximately 2011 to 2018.
x-vector: A deep-learning speaker embedding produced by the penultimate layer of a time-delay neural network (TDNN) trained for speaker classification. Introduced by Snyder et al. (2018); dominant in NIST SRE from 2018 onwards due to superior cross-channel and cross-language generalisation.
PLDA (Probabilistic Linear Discriminant Analysis): A backend model for speaker comparison that explicitly models between-speaker and within-speaker covariance, producing a log-likelihood ratio between the same-speaker and different-speaker hypotheses from a pair of speaker embeddings.
Equal Error Rate (EER): The operating point where a speaker recognition system's false acceptance rate equals its false rejection rate. A standard NIST SRE performance metric; lower EER indicates better discriminative performance under matched conditions.
C_llr (log-likelihood ratio cost): A proper scoring rule measuring the calibration quality of a speaker recognition system's LR outputs; a well-calibrated system has C_llr close to zero. Required as a disclosure metric under ENFSI BPM 2022 for systems used in forensic casework.
ENFSI BPM: The ENFSI Best Practice Manual for Forensic Comparison of Speech (2015, rev. 2022); the operational standard for speaker comparison in European forensic laboratories, requiring LR reporting, validated methodology, representative background databases, and ISO 17025 accreditation.
ENFSI verbal scale: A standardised scale for communicating LR values to courts in plain language, ranging from 'limited support' (LR 1-10) through 'moderate,' 'moderately strong,' 'strong,' 'very strong,' to 'extremely strong support' (LR above 1,000,000). Shared across ENFSI disciplines for consistent court communication.
Background population database: A corpus of speaker recordings used to calibrate the LR in a forensic comparison. Must be matched to the case in language, channel, and population (sex, age, dialect) to produce a valid LR; mismatch is a major source of LR unreliability in multilingual casework.
BATVOX: A commercial automated speaker recognition system (originally Agnitio, later Nuance Communications) widely used in operational forensic casework in Europe and Latin America. Produces LR outputs from i-vector or x-vector embeddings via PLDA backends.

Worked example

x-Vector LR Report in a Phone-Tap Extortion Case with Channel-Mismatch Conditions

The questioned call was recorded on a 3G mobile network; the reference was a controlled VoIP interview - and the channel difference is the examiner's primary methodological challenge.

Scene: A UK extortion case. The National Crime Agency has recordings of three threatening calls made from a burner phone over a 3G network to a victim. A suspect is identified through cell-site analysis. A reference recording is obtained from a voluntary police interview conducted over a VoIP channel. The forensic phonetician must produce an LR under the ENFSI BPM for Forensic Comparison of Speech.

Step 1 (channel characterisation): The examiner characterises both channel types. The 3G calls use a narrow-band codec (AMR-NB, 3.4 kHz upper bandwidth limit) with variable bit-rate encoding. The VoIP reference uses a wideband codec (G.722, 7 kHz upper bandwidth limit) with fewer compression artefacts. The bandwidth mismatch means that features above 3.4 kHz in the reference recording have no counterpart in the questioned recordings and cannot be used in the LR computation.

Step 2 (x-vector system with channel compensation): The examiner uses an x-vector DNN system trained on the NIST SRE 2016 dataset with PLDA back-end. Channel compensation is applied using the SRE training data's telephone-channel and microphone-channel partition to train a domain-adapted model. The x-vector comparison produces a log-likelihood ratio (LLR). The examiner calibrates the LLR using a calibration dataset of known-same-source and known-different-source speech pairs recorded under similar channel-mismatch conditions.
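
The scenario does not say which calibration method the examiner uses. One widely used option is linear logistic regression, which learns a scale and an offset that map raw scores to calibrated log-likelihood ratios. The sketch below assumes that method, uses invented scores for the known-same-source and known-different-source pairs, and relies on plain gradient descent with a fixed step size that suits only small score ranges like the one shown; all function names are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def calibration_loss(a, b, same_scores, diff_scores):
    """Class-balanced logistic loss of the calibrated LLRs a * score + b."""
    same = sum(softplus(-(a * s + b)) for s in same_scores) / len(same_scores)
    diff = sum(softplus(a * s + b) for s in diff_scores) / len(diff_scores)
    return 0.5 * (same + diff)


def fit_calibration(same_scores, diff_scores, step_size=0.1, steps=2000):
    """Fit scale a and offset b by gradient descent, starting from a=1, b=0."""
    a, b = 1.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s in same_scores:
            g = -sigmoid(-(a * s + b)) / len(same_scores)
            grad_a += 0.5 * g * s
            grad_b += 0.5 * g
        for s in diff_scores:
            g = sigmoid(a * s + b) / len(diff_scores)
            grad_a += 0.5 * g * s
            grad_b += 0.5 * g
        a -= step_size * grad_a
        b -= step_size * grad_b
    return a, b


# Raw scores from pairs recorded under similar channel-mismatch conditions (made up).
same_source = [3.0, 2.5, 2.0, 1.0, 0.5]
diff_source = [-0.5, 0.0, 0.5, 0.8, -1.0]

a, b = fit_calibration(same_source, diff_source)
loss_before = calibration_loss(1.0, 0.0, same_source, diff_source)
loss_after = calibration_loss(a, b, same_source, diff_source)
calibrated_llr = a * 1.8 + b  # natural-log LR for a hypothetical case score of 1.8
```

Because the two classes are weighted equally, the fitted `a * score + b` can be read as a natural-log likelihood ratio. The calibration pairs must resemble the case conditions; otherwise the fitted scale and offset do not transfer to the case score.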

Step 3 (disclosed uncertainty and ENFSI verbal scale): The calibrated LR provides moderate support for the common-source proposition. The examiner's report explicitly states: (1) the channel mismatch introduces additional uncertainty not fully captured by the calibration corpus; (2) the reference recording duration of nine minutes is shorter than optimal for x-vector system performance; (3) an auditory-phonetic analysis of specific features consistent across the channel types (vowel quality, voice onset time, habitual glottalisation) corroborates the LR finding. The overall conclusion: moderate support, with disclosed caveats. The report is submitted under CrimPR Part 19 and admitted.

Conclusion: The scenario demonstrates the ENFSI BPM's core requirement for speaker comparison evidence: explicit disclosure of the conditions under which the LR was computed, including all factors that may inflate or deflate it. Channel mismatch is the most common real-world complication in forensic speaker recognition, and the examiner's documented response (channel-compensated model, calibration on matched conditions, disclosed limitation) is the operationally correct approach.

Can a forensic speaker recognition system reliably identify a speaker from a single word?

Short utterances degrade performance substantially. Both i-vector and x-vector systems require sufficient temporal context to estimate reliable speaker embeddings; performance on utterances below 5-10 seconds is measurably worse than on 30-second segments in NIST SRE benchmarks. In forensic casework, a single word or short phrase may permit auditory-phonetic analysis of specific features (a particular vowel, a prosodic pattern) but cannot support an LR computation with the calibration quality that the ENFSI BPM requires. The examiner must disclose this limitation and may need to express substantially increased uncertainty in the resulting LR.

Does the ENFSI Best Practice Manual for speaker comparison apply outside Europe?

The ENFSI BPM is a European document, but it has become an international reference standard. Australia's forensic science laboratories follow ASQA/NATA accreditation frameworks that align with its principles. Canadian and New Zealand forensic phoneticians reference it in their methodology documentation. India's CFSL laboratories are aware of the BPM framework and some have adopted elements of it in internal standard operating procedures, though formal incorporation into Indian court practice lags behind ENFSI jurisdictions. INTERPOL's forensic science working groups have recommended member states align with BPM principles for speaker comparison evidence in transnational cases.

How does forensic speaker recognition differ from voice biometrics used in banking?

The technical pipeline is similar (speaker embeddings, LR or threshold-based decisions), but the operational context and error tolerance differ fundamentally. In banking voice biometrics, a false acceptance rate of 1 in 1,000 is often acceptable because the cost of a fraudulent transaction is bounded and reversible. In forensic casework, a false acceptance at a high LR value may contribute to a wrongful conviction. This is why forensic speaker recognition requires explicit LR calibration, independent validation on case-representative data, and ENFSI BPM-mandated disclosure of uncertainty, requirements that consumer voice biometrics products do not meet. Forensic examiners must never import commercial biometric system performance figures directly into court testimony.
