Writing Evaluative Statements in Forensic Reports

Evaluative statements in forensic reports express the scientist's opinion about the significance of evidence within a formal logical framework. This topic covers how to draft propositions at the correct level, select verbal equivalents of likelihood ratios, and structure the evaluative section of a report to withstand courtroom scrutiny.

Last updated: 24 Jun 2026

An evaluative statement is the formal expression of a forensic scientist's opinion about the significance of evidence, framed as the ratio of two probabilities: how likely the observed findings are if one proposition is true compared to how likely they are if a competing proposition is true. This ratio is the likelihood ratio (LR). The evaluative statement does not tell the court who is guilty. It tells the court how much more or less probable the evidence is under one explanation than under another, leaving the court to combine that information with all other evidence in the case. This framework, now adopted in official guidance from bodies including the UK Forensic Science Regulator, the European Network of Forensic Science Institutes (ENFSI), and the US President's Council of Advisors on Science and Technology (PCAST), applies regardless of discipline: DNA, fingerprints, fibres, glass, gunshot residue, or documents.
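
In symbols, the ratio and its relationship to the court's task can be sketched as follows. The shorthand E for the observed findings, and H_p and H_d for the prosecution and defence propositions, is notation assumed here for illustration rather than taken from any particular guideline.

```latex
\[
\mathrm{LR} = \frac{P(E \mid H_p)}{P(E \mid H_d)},
\qquad
\underbrace{\frac{P(H_p \mid E)}{P(H_d \mid E)}}_{\text{posterior odds}}
= \mathrm{LR} \times
\underbrace{\frac{P(H_p)}{P(H_d)}}_{\text{prior odds}}
\]
```

The scientist reports only the LR. The prior odds and the posterior odds depend on the other evidence in the case and are left to the court.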

The logical framework requires that the scientist state two competing propositions before expressing any opinion. The propositions must be at the same inferential level (source, activity, or offence), must be mutually exclusive, and must be exhaustive enough to capture the plausible alternatives in the case. The opinion is then the verbal equivalent of the LR computed or estimated for those propositions. Courts in the UK, Australia, the Netherlands, and several other jurisdictions now expect reports to follow this structure. Indian courts under the Bharatiya Sakshya Adhiniyam 2023 apply similar reasoning through provisions governing expert evidence, though a standardised verbal-scale requirement has not yet been formalised at the national level.

Many forensic laboratories issued evaluative statements for decades without making their logical structure explicit, often reporting a conclusion such as 'consistent with' or 'cannot be excluded' that left the court unable to determine the actual weight of the evidence. The move toward explicit LR-based reporting since the 1990s has been driven partly by academic work and partly by miscarriage-of-justice cases in which courts received evidence framed in ways that systematically overstated or understated its significance. Understanding how to draft correct evaluative statements is now a core competency for practising forensic scientists.

By the end of this topic you will be able to:

Identify whether a proposition operates at source, activity, or offence level and explain why that level must match the legal issue in the case.
Draft a correctly paired set of propositions for a given casework scenario, checking that they are mutually exclusive and at the same inferential level.
Select the appropriate verbal equivalent for a given likelihood ratio range using a standard scale such as the ENFSI recommended scale.
Structure an evaluative section of a forensic report in the sequence: propositions, key assumptions, likelihood ratio, verbal equivalent, and limitations.
Identify at least four common drafting errors, including the prosecutor's fallacy and proposition-level mismatch, and correct them in example text.

Key terms

Likelihood ratio (LR): The ratio of the probability of the evidence given the prosecution proposition to the probability of the evidence given the defence proposition. An LR greater than 1 supports the prosecution proposition; an LR less than 1 supports the defence proposition; an LR of 1 means the evidence is equally probable under both propositions and supports neither. The LR is the quantity the evaluative statement expresses.
Source-level proposition: A proposition about the origin of material. Example: 'The DNA profile originated from the suspect' versus 'The DNA profile originated from an unknown person.' Source-level propositions often do not address the most legally relevant question, which is usually how the material came to be where it was found.
Activity-level proposition: A proposition about what happened. Example: 'The suspect handled the knife' versus 'The suspect never handled the knife.' Activity-level propositions require the scientist to incorporate assumptions about transfer, persistence, and background rates, making them more complex but more directly relevant to the legal issue.
Offence-level proposition: A proposition about guilt. Example: 'The suspect committed the assault' versus 'The suspect did not commit the assault.' Forensic scientists must not evaluate evidence at the offence level because doing so requires the scientist to weigh all case evidence, a task reserved for the jury or court.
Verbal equivalent scale: A standardised mapping from LR ranges to descriptive phrases, such as the ENFSI scale: LR 10 to 100 corresponds to 'moderate support', LR 100 to 1000 to 'moderately strong support', and so on. The scale provides a consistent vocabulary across laboratories and disciplines.
Prosecutor's fallacy: The logical error of confusing P(evidence | innocence) with P(innocence | evidence). An expert who states that a one-in-a-million match probability means there is a one-in-a-million chance the suspect is innocent is committing this fallacy. The correct statement reports the LR; the posterior probability belongs to the court.
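
The gap between P(evidence | innocence) and P(innocence | evidence) is easiest to see with numbers. The sketch below applies Bayes' theorem in odds form to an LR of one million. The function name and the pool sizes are hypothetical, the prior of one over the pool size is an assumption made only for illustration, and in real casework the prior is never set by the scientist.

```python
def posterior_probability_hp(lr: float, prior_probability_hp: float) -> float:
    """Bayes' theorem in odds form: posterior odds = LR * prior odds."""
    prior_odds = prior_probability_hp / (1 - prior_probability_hp)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


# LR of one million, e.g. a one-in-a-million match probability with
# P(evidence | prosecution proposition) assumed to be 1.
LR = 1_000_000

# Hypothetical sizes of the pool of possible sources; each person in the
# pool is given the same prior probability (an illustrative assumption).
for pool_size in (2, 1_000, 1_000_000, 10_000_000):
    prior = 1 / pool_size
    posterior = posterior_probability_hp(LR, prior)
    print(f"pool {pool_size:>12,}: P(Hp | E) = {posterior:.6f}, "
          f"P(Hd | E) = {1 - posterior:.6f}")
```

The same LR yields very different posterior probabilities as the prior changes, which is why a statement such as 'one in a million chance the suspect is innocent' cannot be derived from the match probability alone.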

Proposition levels: source, activity, and offence

Forensic propositions operate at three hierarchical levels, and choosing the wrong level is the most consequential structural error a scientist can make in an evaluative report. The level must match the legal question the court is actually deciding.

Level	The question asked	Example prosecution proposition	Example defence proposition
Source	Where did the material come from?	The blood on the shirt originated from the complainant.	The blood on the shirt originated from an unknown person.
Activity	How did the material get there?	The suspect stabbed the complainant.	The suspect was in the same room but did not stab anyone.
Offence	Is the suspect guilty?	The suspect committed the murder.	The suspect did not commit the murder.

In most contested cases, the parties do not dispute that the complainant's DNA was found on the knife: they dispute how it got there. A source-level opinion that says 'the DNA profile is 10,000 times more likely if it came from the complainant than from a random person' does not help the court resolve the actual dispute. An activity-level opinion that says 'the findings are 50 times more likely if the suspect used the knife to stab the complainant than if someone else did while the suspect was never in contact with the knife' is directly responsive to the legal issue.

Activity-level propositions are harder to evaluate because they require the scientist to estimate or model the probability of the evidence under a scenario involving human behaviour, not just a laboratory measurement. The scientist must incorporate assumptions about transfer rates (how likely is it that stabbing would deposit DNA of the observed amount?), persistence (how much DNA would survive the intervening time?), and background (how likely is it that the complainant's DNA would be on the knife for an innocent reason?). These assumptions must be stated explicitly in the report. Omitting them makes the opinion unverifiable.

Constructing correctly paired propositions

A pair of propositions is correctly constructed when it satisfies three conditions: both propositions are at the same inferential level, they are mutually exclusive (both cannot be true simultaneously), and together they cover the realistic alternative explanations in the case context. The scientist does not invent propositions. They are derived from instructions received from the instructing party, informed by the likely defence case as disclosed.

A common error is to pair a source-level prosecution proposition with an activity-level defence proposition. For example: 'H1: The fibres on the seat originated from the suspect's jacket' versus 'H2: The suspect was never in the car.' H1 is a source proposition; H2 is an activity proposition. Evaluating evidence under this mismatched pair produces an LR whose denominator includes outcomes that are irrelevant to the prosecution proposition, making the number uninterpretable.

The defence proposition must also be realistic. A proposition framed as 'the fibres came from any other blue wool jacket in the world' is technically correct but so broad as to make the LR trivially large. The accepted practice, recommended in ENFSI Guideline for Evaluative Reporting (2015) and reflected in guidance from the UK Forensic Science Regulator, is to anchor the defence proposition to the realistic alternative advanced by the defence or, where no defence case has been given, to the most plausible alternative given the case circumstances.
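
The level check described above can be made mechanical once each proposition has been tagged with its level by the scientist. The sketch below is a minimal illustration with hypothetical names (Level, Proposition, check_pair). It cannot judge mutual exclusivity or realism, which remain matters of professional judgement.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    SOURCE = "source"
    ACTIVITY = "activity"
    OFFENCE = "offence"


@dataclass(frozen=True)
class Proposition:
    label: str
    text: str
    level: Level  # assigned by the scientist, not inferred from the text


def check_pair(h1: Proposition, h2: Proposition) -> list[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    if h1.level != h2.level:
        problems.append(
            f"Level mismatch: {h1.label} is {h1.level.value}-level "
            f"but {h2.label} is {h2.level.value}-level."
        )
    if Level.OFFENCE in (h1.level, h2.level):
        problems.append("Offence-level propositions are reserved for the court.")
    return problems


# The mismatched pair discussed above.
fibres_h1 = Proposition(
    "H1", "The fibres on the seat originated from the suspect's jacket.", Level.SOURCE
)
fibres_h2 = Proposition("H2", "The suspect was never in the car.", Level.ACTIVITY)

for problem in check_pair(fibres_h1, fibres_h2):
    print(problem)
```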

Verbal equivalent scales

Likelihood ratios span many orders of magnitude. A DNA LR might be 10 billion; a voice comparison LR might be 8. Presenting raw numbers to a jury or to a lay judge risks either inducing false precision or causing incomprehension. Verbal equivalent scales translate LR ranges into standardised descriptive phrases. The goal is to communicate the direction and approximate magnitude of evidential support in language accessible to the court, without abandoning the logical structure.

LR range (use 1/LR when the LR is below 1)	Verbal equivalent when LR > 1 (supports prosecution proposition)	Verbal equivalent when LR < 1 (supports defence proposition)
1 to 10	Weak support	Weak support for defence proposition
10 to 100	Moderate support	Moderate support for defence proposition
100 to 1,000	Moderately strong support	Moderately strong support for defence proposition
1,000 to 10,000	Strong support	Strong support for defence proposition
10,000 to 1,000,000	Very strong support	Very strong support for defence proposition
> 1,000,000	Extremely strong support	Extremely strong support for defence proposition

The table above follows the ENFSI recommended scale (2015). Other scales exist: the Association of Forensic Science Providers (AFSP) in the UK published a similar six-tier scale; the International Association for Identification (IAI) uses different terminology for friction-ridge evidence. Individual laboratories may use in-house scales, provided those scales are disclosed in the report and validated for the discipline.

The verbal equivalent must correspond to the actual LR range. Selecting a phrase that overstates the LR is a drafting error with direct legal consequences. It effectively converts a moderate finding into a very strong one in the mind of the jury. Courts have cited such overstating as grounds for appeal in several jurisdictions. Where the LR has been estimated rather than calculated, the uncertainty in the estimate must be reflected in the choice of verbal equivalent: use the more conservative phrase when the LR spans more than one tier.
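
The lookup and the conservative rule for estimated ranges can be sketched in a few lines. The tier boundaries are transcribed from the table above. The function names are hypothetical. Two choices are assumptions made for this sketch: a value that falls exactly on a shared boundary (10, 100, 1,000 or 10,000) is assigned to the higher tier, and a range that straddles 1 is reported as supporting neither proposition.

```python
# Upper bounds of the first four tiers, transcribed from the table above.
LOWER_TIERS = [
    (10, "weak support"),
    (100, "moderate support"),
    (1_000, "moderately strong support"),
    (10_000, "strong support"),
]


def verbal_equivalent(lr: float) -> str:
    """Map a single LR value to a phrase; LR < 1 is read via its reciprocal."""
    if lr <= 0:
        raise ValueError("A likelihood ratio must be positive.")
    if lr == 1:
        return "no support for either proposition"
    favoured = "H1" if lr > 1 else "H2"
    magnitude = lr if lr > 1 else 1 / lr
    for upper, phrase in LOWER_TIERS:
        if magnitude < upper:  # assumption: boundary values go to the higher tier
            return f"{phrase} for {favoured}"
    # The table gives "> 1,000,000" for the top tier, so exactly 1,000,000
    # stays in "very strong support".
    phrase = "very strong support" if magnitude <= 1_000_000 else "extremely strong support"
    return f"{phrase} for {favoured}"


def conservative_equivalent(lr_low: float, lr_high: float) -> str:
    """Phrase for an estimated LR range, using the bound closest to 1."""
    if lr_low <= 0 or lr_low > lr_high:
        raise ValueError("Expected 0 < lr_low <= lr_high.")
    if lr_low <= 1 <= lr_high:
        # Assumption: a range straddling 1 cannot favour either proposition.
        return "no support for either proposition"
    nearest_to_one = lr_low if lr_low > 1 else lr_high
    return verbal_equivalent(nearest_to_one)


print(verbal_equivalent(8))
print(verbal_equivalent(0.02))
print(conservative_equivalent(50, 500))
```

A laboratory using an in-house scale would substitute its own boundaries and phrases and disclose them in the report.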

Structuring the evaluative section of a report

A well-structured evaluative section follows a fixed sequence. Deviating from this sequence invites the reader to confuse the proposition with the conclusion, or the LR with the posterior probability. The sequence below reflects the structure recommended in ENFSI Guideline for Evaluative Reporting and is consistent with Forensic Science Regulator guidance in England and Wales, as well as guidance issued by forensic science institutes in Australia and the Netherlands.

Propositions: State both propositions explicitly and label them (e.g., H1 and H2 or Hp and Hd). This tells the reader exactly what question is being answered.
Key assumptions: List the conditioning assumptions the opinion depends on, particularly for activity-level opinions. These include assumed transfer rates, persistence rates, and the relevant reference population for background frequency estimates.
Findings: Describe the observed evidence briefly. The full analytical results belong in an earlier section; here, state only what is relevant to the evaluation.
Likelihood ratio: State the LR value or range, and whether it was calculated, estimated, or derived from a model. If it was estimated, state the basis for the estimate.
Verbal equivalent: State the verbal equivalent and identify the scale being used. This is the sentence the court reads. It must translate the LR accurately.
Limitations: State what the opinion does not cover. If only source-level evidence is available but the legal issue is at activity level, say so explicitly.

The order matters. Stating the verbal equivalent before the propositions allows the reader to attach the conclusion to an implicit question rather than the explicit one, which is exactly the error these guidelines are designed to prevent. Stating the LR before the assumptions invites the reader to treat the number as a raw fact rather than a conditional probability.
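
One way to enforce the sequence in a reporting template is to fix it in the data structure itself. The sketch below is a minimal illustration with a hypothetical class name. It relies on the fact that Python dataclasses preserve the declared field order, and it refuses to render a section with any part left empty. The sample text is condensed from the fibre scenario in the worked example.

```python
from dataclasses import dataclass, fields


@dataclass
class EvaluativeSection:
    # Field order mirrors the sequence described above; render() relies on it.
    propositions: str
    key_assumptions: str
    findings: str
    likelihood_ratio: str
    verbal_equivalent: str
    limitations: str

    def render(self) -> str:
        """Emit every part in the fixed order, failing if any part is empty."""
        parts = []
        for f in fields(self):
            value = getattr(self, f.name).strip()
            if not value:
                raise ValueError(f"Part '{f.name}' is empty.")
            heading = f.name.replace("_", " ").capitalize()
            parts.append(f"{heading}: {value}")
        return "\n".join(parts)


section = EvaluativeSection(
    propositions="H1: The suspect had direct physical contact with the victim. "
                 "H2: The suspect had no direct physical contact with the victim.",
    key_assumptions="Transfer probability 0.3 to 0.7; persistence up to 24 hours; "
                    "background frequency 2 to 5 per cent.",
    findings="Blue acrylic fibres matching the suspect's cardigan were recovered "
             "from the victim's pullover.",
    likelihood_ratio="Estimated at between 10 and 50.",
    verbal_equivalent="Weak to moderate support for H1 compared to H2 "
                      "(ENFSI recommended scale, 2015).",
    limitations="Transfer via an intermediate item has not been considered.",
)
print(section.render())
```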

Common drafting errors and how to correct them

The errors below recur across disciplines and jurisdictions. Each has appeared in published appellate judgments or in peer-reviewed audits of forensic reporting practices.

Proposition-level mismatch: The legal issue is activity-level but the report evaluates source-level propositions. Correction: re-draft propositions to match the court's question; if activity-level evaluation is not possible, state that explicitly.
The prosecutor's fallacy: Reporting 'there is a one in ten million chance this profile came from someone other than the suspect' instead of 'the evidence is ten million times more likely if the profile originated from the suspect than from an unrelated individual.' Correction: always frame the opinion as the LR, never as a posterior probability.
The defence attorney's fallacy: Arguing that because thousands of people share the profile, the evidence has little value. This ignores all other case evidence and misapplies probability. Forensic scientists should be alert to this framing in cross-examination and correct it.
Verbal overstating: Choosing 'very strong support' for an LR that falls in the 'moderate support' range, because the scientist believes the suspect is guilty and wants the jury to reach the same conclusion. Correction: the verbal equivalent must follow from the LR, not from the scientist's private opinion about the case.
Omitting conditioning assumptions: Reporting an activity-level LR without stating the transfer and persistence rates assumed in its calculation. If those assumptions are wrong, the LR is wrong. Correction: list all key assumptions in the evaluative section and, where possible, give the sensitivity of the LR to changes in those assumptions.
Ambiguous language: Using phrases such as 'consistent with', 'cannot be excluded', or 'supports the view'. These phrases carry no defined probabilistic meaning. Juror perception studies in the UK, Australia, and the Netherlands have shown that different jurors interpret them across the full range of probabilities. Correction: replace with a verbal equivalent tied to an explicit LR.

Peer review of evaluative sections before a report is finalised catches most of these errors. Several forensic science regulators, including the UK Forensic Science Regulator and the Netherlands Register of Court Experts (NRGD), now require that evaluative reports be reviewed by a second scientist before submission to a court. This requirement is not yet universal, but the practice has measurably reduced reporting errors in organisations that apply it.
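
Part of that review can be supported by a simple automated check for the ambiguous phrases listed above. The sketch below is a minimal illustration with hypothetical names. The phrase list contains only the three examples given in this topic and is not a complete or authoritative vocabulary. A hit is a prompt for human review, not proof of an error.

```python
import re

# The three ambiguous phrases named in this topic; extend as needed.
AMBIGUOUS_PHRASES = ("consistent with", "cannot be excluded", "supports the view")


def find_ambiguous_phrases(text: str) -> list[tuple[str, int]]:
    """Return (phrase, character offset) for each hit, in order of appearance."""
    hits = []
    for phrase in AMBIGUOUS_PHRASES:
        # Allow any run of whitespace between words so line breaks still match.
        pattern = r"\b" + r"\s+".join(re.escape(word) for word in phrase.split()) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical draft sentence used only to exercise the check.
draft = ("The fibres are consistent with the jacket, "
         "and the suspect cannot be excluded as the wearer.")
for phrase, offset in find_ambiguous_phrases(draft):
    print(f"offset {offset}: '{phrase}' has no defined probabilistic meaning")
```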

Annotated casework examples

Reading examples from real or realistic casework is the most direct way to internalise the difference between correct and flawed evaluative statements. The examples below are illustrative; the analytical details are typical of published casework reports in the UK and Australia.

Example 1: Correct activity-level report. 'H1: The suspect fired the firearm. H2: The suspect was present in the room where the firearm was fired but did not fire it. Key assumptions: I have assumed that firing a weapon deposits residue of this composition on the shooter's hands in approximately 80 to 95% of cases (based on published studies), and that secondary transfer from a bystander is possible but substantially less likely. Findings: Gunshot residue particles consistent with primer composition were recovered from the right hand of the suspect. LR: I estimate the LR at between 100 and 1,000. Verbal equivalent: The findings provide moderately strong support for H1 compared to H2, using the ENFSI recommended scale. Limitation: This evaluation does not address whether the suspect loaded or unloaded the weapon without firing it; additional data would be needed to evaluate that alternative.'

Example 2: Flawed source-level report for an activity-level issue. 'The glass fragments recovered from the suspect's jacket are indistinguishable in refractive index from glass from the broken window at the scene. There is therefore strong evidence that the glass came from the scene window.' The legal issue is whether the suspect broke the window. The report answers a source question (where did the glass come from?) rather than an activity question (did the suspect break the window?). Presence as a bystander near the scene, secondary transfer through clothing contact, or contamination could each explain the same finding without implicating the suspect in the breaking. The report also states no propositions, no LR, and no scale, and it overstates the significance of the evidence for the legal question.

Worked example

Drafting an evaluative section for a fibre comparison case

This worked example takes a fibre casework scenario from raw findings through correctly paired propositions to a final evaluative statement.

Scenario: Blue acrylic fibres were recovered from the victim's pullover. A reference sample from a cardigan seized from the suspect's home has fibres of matching colour, fibre type, and dye profile. The defence says the suspect was never in contact with the victim. The legal issue is activity: did the suspect have physical contact with the victim?

Draft propositions at activity level. H1: The suspect had direct physical contact with the victim in the period before the fibres were collected. H2: The suspect had no direct physical contact with the victim; the fibres arrived by another route. Check: both propositions are at activity level; they are mutually exclusive; they cover the realistic alternatives.
State key assumptions. Transfer: acrylic fibres of this type transfer during brief upper-body contact with a probability of approximately 0.3 to 0.7 based on published studies (e.g., Lowrie and Wiggins 2005). Persistence: fibres of this type persist on a pullover for up to 24 hours at moderate levels. Background: blue acrylic fibres matching this profile appear in approximately 2 to 5 per cent of garments examined in comparable UK urban casework populations, based on laboratory reference data.
Compute or estimate the LR. Using the transfer rate, persistence rate, and background frequency, the LR is estimated at between 10 and 50. The uncertainty in the background frequency and transfer rate is the main source of the range.
Select the verbal equivalent. An LR of 10 to 50 lies within the ENFSI 'moderate' tier (10 to 100), but its lower bound sits exactly on the boundary with the 'weak' tier (1 to 10). Because the estimate is uncertain and the lower bound touches the 'weak' tier, report as: 'The findings provide weak to moderate support for H1 compared to H2, using the ENFSI recommended scale (2015).'
State the limitation. 'This evaluation does not consider the possibility that the fibres were transferred via an intermediate item such as a shared seat or surface. If the defence advances that alternative, the propositions and the LR would need to be revised to incorporate it.'
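
The remark in step 3 that the background frequency and transfer rate drive the range can be explored with a small sensitivity sketch. The ratio used below, transfer probability divided by background frequency, is a deliberately crude stand-in chosen for illustration. It is an assumption, it ignores persistence, and it is not the model behind the 10 to 50 estimate above. The function name toy_lr and the midpoint values are likewise hypothetical.

```python
from itertools import product


def toy_lr(transfer: float, background: float) -> float:
    """Crude illustrative ratio (an assumption, not a validated casework model)."""
    return transfer / background


# Ranges quoted in step 2, with a hypothetical midpoint added to each.
transfer_values = (0.3, 0.5, 0.7)
background_values = (0.02, 0.035, 0.05)

# Evaluate the toy ratio at every combination of the assumed inputs.
grid = {(t, b): toy_lr(t, b) for t, b in product(transfer_values, background_values)}
lowest, highest = min(grid.values()), max(grid.values())
print(f"toy LR spans roughly {lowest:.0f} to {highest:.0f}")

# Vary one input across its range while holding the other at its midpoint.
transfer_spread = toy_lr(0.7, 0.035) / toy_lr(0.3, 0.035)
background_spread = toy_lr(0.5, 0.02) / toy_lr(0.5, 0.05)
print(f"factor due to transfer range:   {transfer_spread:.2f}")
print(f"factor due to background range: {background_spread:.2f}")
```

Reporting how far the LR moves when each assumption is varied lets the reader see which assumption would matter most if it were challenged.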

Check your understanding

A forensic scientist reports: 'The DNA profile is 50,000 times more likely if it originated from the suspect than if it originated from an unrelated individual selected at random from the population.' What level of proposition is this?

Key Takeaways

Every evaluative statement expresses a likelihood ratio: how much more probable the evidence is under one proposition than under a competing proposition. The LR belongs to the scientist; the posterior probability belongs to the court.
Propositions must be at the same inferential level, mutually exclusive, and anchored to the actual legal question. Activity-level propositions are almost always more relevant than source-level ones for contested criminal cases.
Verbal equivalent scales such as the ENFSI six-tier scale translate LR ranges into standardised language. The phrase chosen must correspond to the computed or estimated LR, not to the scientist's general impression of the case.
The evaluative section of a report follows a fixed sequence: propositions, key assumptions, findings, LR, verbal equivalent, and limitations. Deviating from this order creates interpretive ambiguity.
The most common drafting errors are proposition-level mismatch, the prosecutor's fallacy, verbal overstating, omitting conditioning assumptions, and using ambiguous phrases such as 'consistent with' in place of a defined verbal equivalent.

What is the difference between a source-level and an activity-level proposition?

A source-level proposition asks where the material came from, for example whether a DNA profile originated from a named suspect. An activity-level proposition asks how the material arrived in a location, for example whether a suspect handled an object or was merely present in the same room. Activity-level propositions are usually more legally relevant because courts are interested in what happened, not just the origin of a trace.

What is a verbal equivalent scale in forensic reporting?

A verbal equivalent scale maps numerical likelihood ratio ranges to standardised phrases such as 'strong support' or 'moderate support'. It allows expert witnesses to communicate the weight of evidence in accessible language without abandoning the underlying probabilistic reasoning. Different organisations use different scales, but they all share the same structure: phrases on one side favour the prosecution proposition and mirror phrases on the other side favour the defence proposition.

Why must propositions be paired and mutually exclusive?

A likelihood ratio compares the probability of the evidence under two competing propositions. If the propositions are not mutually exclusive, the comparison is logically incoherent. If they are not at the same inferential level, the numerator and denominator answer different questions and the resulting LR cannot be interpreted. Keeping both propositions below offence level also keeps the scientist's role within its proper boundary, because weighing guilt belongs to the jury.

What is the prosecutor's fallacy in the context of evaluative reporting?

The prosecutor's fallacy is the error of treating the probability of the evidence given innocence as if it were the probability of innocence given the evidence. For example, stating that a match probability of one in a million means there is a one-in-a-million chance the suspect is innocent. The correct evaluative statement reports the likelihood ratio and leaves the posterior probability to the jury once they have considered all other evidence.

What common drafting errors undermine the logical integrity of an evaluative opinion?

The most frequent errors are: (1) using source-level language when the legal issue is at activity level; (2) expressing the opinion as a posterior probability rather than a likelihood ratio; (3) failing to state the propositions explicitly before stating the opinion; (4) selecting a verbal equivalent that does not correspond to the computed or estimated likelihood ratio; and (5) omitting the conditioning assumptions, such as the assumed transfer and persistence rates, that underpin an activity-level opinion.
