Reference Standards, Controls and Comparison Logic

The logic of forensic comparison: known versus questioned samples, certified reference materials, positive and negative controls, blanks, and why controls are what validate a result rather than the measurement alone.

Last updated: 19 Jun 2026

In forensic science, every comparison requires three foundations: a known sample with verified provenance to serve as the reference, certified reference materials to anchor measurements to a traceable international scale, and controls run with every analytical batch to confirm the method is working and the results are free from contamination. If any one of the three is missing, a numerical agreement between two samples carries no evidentiary weight, because there is no basis for checking whether the measurement itself is valid. The known-versus-questioned framework, reference materials, and batch controls are not procedural formalities; they are the logical preconditions for any conclusion about common source.

The central act of forensic science is comparison. A fibre found on a victim's clothing is compared to fibres taken from a suspect's jumper. A DNA profile extracted from a blood stain is compared to a reference profile from a named individual. A refractive index measured on a glass fragment is compared to the measured index of the broken pane at the scene. The question in every case is the same: are these the same thing, or merely similar?

Answering that question reliably requires three things: a rigorous distinction between samples whose origin is known and samples whose origin is in question; reference materials that anchor instruments to a shared measurement scale; and controls that run alongside every batch of casework samples to confirm the method is working and the result is free from contamination. If any one of them is missing, the comparison that follows cannot be verified.

This topic explains how forensic scientists build that machinery. It covers the known-versus-questioned sample framework that organises every comparison, the role of certified reference materials in making measurements meaningful, the logic of positive and negative controls, and the reasoning structure that turns a measurement agreement into a defensible conclusion about whether two samples share a common source.

By the end of this topic you will be able to:

Distinguish between known (K) and questioned (Q) samples and explain why the integrity of the K sample is a precondition for any valid comparison.
Define a certified reference material and explain how measurement traceability through a CRM makes results comparable across laboratories and defensible in court.
Describe the role of positive controls, negative controls, and blanks in a casework batch and state the consequence of each type of failure.
Differentiate class characteristics from individual characteristics and apply the independence requirement before combining frequency data.
Explain how controls and calibration standards underpin the validity of a likelihood ratio calculation in a forensic comparison.

Key terms

Known (K) sample: A sample with documented, verified provenance used as the reference in a comparison. The origin of a K sample is not in question; its role is to anchor the measurement.
Questioned (Q) sample: A sample recovered from a scene, a person, or an object whose origin or relationship to a source is the subject of the forensic inquiry. The Q sample is what is being tested.
Certified Reference Material (CRM): A material characterised by a metrologically traceable procedure, with certified property values and stated measurement uncertainties. Used to calibrate instruments and validate methods so results can be compared across laboratories and over time.
Positive control: A sample known to contain the target analyte, processed alongside casework samples. Confirms the method is detecting what it should under current conditions.
Negative control (blank): A sample known to contain no target analyte, processed alongside casework samples. Confirms that background contamination is not producing false signals in the batch.
Measurement traceability: The property of a measurement result whereby it can be related to a national or international measurement standard through an unbroken chain of calibrations, each with stated uncertainties.

The known versus questioned framework

Every forensic comparison begins with a categorical distinction: which sample has a known origin, and which is the sample whose origin is under investigation? The known sample, called K in many traditions, is the anchor. It might be a buccal swab taken from a named individual under caution, a cutting from a specific garment seized from a suspect, a reference paint standard from the manufacturer's database, or soil taken from a GPS-mapped location. The origin of K is documented, and that documentation is part of the exhibit record.

The questioned sample, Q, is recovered from a context whose significance is not yet established. The blood on a door handle, the fibre on a victim's coat, the glass fragment in a suspect's hair: these are Q samples. Comparison answers the question of whether Q and K are consistent with sharing a common source, and if so, how strongly the agreement supports that conclusion.
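The K/Q distinction can be pictured as a small record-keeping rule: a comparison should not start unless the reference sample carries documented provenance. The sketch below is illustrative only; the `Sample` class, its field names, and the exhibit identifiers are hypothetical and do not describe any real case-management system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    exhibit_id: str
    role: str                         # "K" (known) or "Q" (questioned)
    description: str
    provenance: Optional[str] = None  # documented origin; expected for K samples


def ready_for_comparison(known: Sample, questioned: Sample) -> bool:
    """True only if the roles are correct and K has documented provenance."""
    if known.role != "K" or questioned.role != "Q":
        return False
    return bool(known.provenance and known.provenance.strip())


# Hypothetical exhibits, for illustration only.
k = Sample("EX-001", "K", "cutting from seized jumper",
           "seized from named suspect; exhibit record completed")
q = Sample("EX-002", "Q", "fibre from victim's coat")
undocumented_k = Sample("EX-003", "K", "jumper cutting, origin not recorded")

print(ready_for_comparison(k, q))               # True
print(ready_for_comparison(undocumented_k, q))  # False
```

The second call fails because a K sample without verified origin cannot anchor anything, however well it agrees with Q.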

K versus Q comparison structure in forensic science.

Certified reference materials and measurement traceability

Suppose a forensic glass examiner measures the refractive index of a fragment recovered from a suspect's jacket and reports it as 1.5183. That number only becomes evidence when it can be compared to the index measured on the broken window at the scene. For the comparison to be valid, both measurements must be on the same scale: they must be traceable to the same calibration standard.

Certified Reference Materials are the mechanism that provides this traceability. A CRM is a material whose property has been measured by a national metrology institute (such as NIST in the United States or NPL in the United Kingdom) and certified with a stated value and uncertainty. Forensic laboratories use CRMs to calibrate their instruments. When Instrument A and Instrument B in two different cities are both calibrated against the same CRM, a measurement made on one can be meaningfully compared to a measurement made on the other.
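One way to picture the calibration check is as a simple acceptance test: the instrument measures the CRM, and the reading must agree with the certified value within the stated uncertainty. This is a minimal sketch under that assumption; the function name and the certificate numbers below are hypothetical and are not taken from any real certificate.

```python
def crm_check_passes(measured: float, certified: float,
                     expanded_uncertainty: float) -> bool:
    """True if the instrument's reading of the CRM agrees with the certificate."""
    return abs(measured - certified) <= expanded_uncertainty


# Hypothetical certificate values for a glass refractive index standard.
CERTIFIED_RI = 1.51800
EXPANDED_U = 0.00004

print(crm_check_passes(1.51802, CERTIFIED_RI, EXPANDED_U))  # True
print(crm_check_passes(1.51815, CERTIFIED_RI, EXPANDED_U))  # False
```

A real laboratory procedure would normally also account for the instrument's own measurement uncertainty; the sketch leaves that out to keep the logic visible.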

For many forensic applications, purpose-built CRMs exist. The NIST Standard Reference Materials (SRM) programme offers glass standards with certified refractive indices (SRM 1822a), DNA profiling standards with certified allele designations and genotypes (SRM 2391 series), and ethanol solutions for blood alcohol calibration with certified concentrations. The UK Forensic Science Regulator's statutory Code of Practice, which came into force on 2 October 2023 under the Forensic Science Regulator Act 2021, requires that all methods used in regulated UK forensic laboratories be linked to traceable reference standards. Without this linkage, measurements cannot be challenged, corrected, or compared across time.

Positive and negative controls: the batch validation layer

Every batch of casework samples is accompanied by a set of controls that run through the entire analytical process, from sample preparation through to instrument output, under identical conditions. Controls are not procedural add-ons; they are the primary check that the batch result is valid.

Control type	What it contains	What it checks	Pass criterion
Positive control	Material known to contain the target at a concentration the method should detect	The method detects target under current reagent and instrument conditions	Signal appears at expected level
Negative control (blank)	Reagent or matrix with no target analyte	No false signal from background contamination in reagents, equipment, or environment	No signal above the detection threshold
Internal standard	A known quantity of a structurally similar compound added to each sample before extraction	Consistent extraction and instrument response across the batch	Recovery within defined acceptance range
Proficiency test sample	A blind sample from an external provider; the expected result is known to the provider but not to the laboratory	The whole analytical system, including analyst skill and interpretation	Match to the expected result within stated tolerance
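The internal standard row in the table comes down to simple arithmetic: recovery is the amount measured divided by the amount added, expressed as a percentage. The sketch below assumes an 80 to 120 percent acceptance range purely for illustration; real acceptance ranges are set during method validation, and the function names are hypothetical.

```python
def percent_recovery(measured_amount: float, spiked_amount: float) -> float:
    """Recovery of the internal standard as a percentage of the amount added."""
    if spiked_amount <= 0:
        raise ValueError("spiked_amount must be positive")
    return 100.0 * measured_amount / spiked_amount


def recovery_acceptable(recovery: float, low: float = 80.0,
                        high: float = 120.0) -> bool:
    """Check recovery against an acceptance range (assumed values)."""
    return low <= recovery <= high


# Hypothetical amounts, in the same units.
for measured in (9.2, 6.1):
    r = percent_recovery(measured, 10.0)
    print(f"{r:.1f}% acceptable={recovery_acceptable(r)}")
```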

The positive control failure and the negative control failure are the two critical failures in a batch. A failed positive means the method missed the target: perhaps a reagent degraded, an instrument went out of specification, or a preparation step was omitted. Every case result in that batch must be treated as potentially false-negative until the cause is found and re-testing confirms the results. A failed negative means something contaminated the batch: a reagent, a surface, or analyst carry-over from a previous high-concentration sample. Every case result in that batch must be treated as potentially false-positive.
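The two failure modes can be written as a small decision rule applied to the whole batch rather than to individual samples. This is a sketch under simplifying assumptions: a single positive and a single negative control, a numeric signal, and a hypothetical function name and threshold.

```python
from typing import List


def evaluate_batch(positive_detected: bool, negative_signal: float,
                   detection_threshold: float) -> List[str]:
    """Return the list of control failures; an empty list means the batch passes."""
    problems = []
    if not positive_detected:
        problems.append(
            "positive control failed: treat results as potentially false-negative")
    if negative_signal > detection_threshold:
        problems.append(
            "negative control failed: treat results as potentially false-positive")
    return problems


# Hypothetical signals in arbitrary units with an assumed threshold of 50.
print(evaluate_batch(True, 0.0, 50.0))    # []
print(evaluate_batch(False, 0.0, 50.0))   # one failure: positive control
print(evaluate_batch(True, 120.0, 50.0))  # one failure: negative control
```

Note that the function never inspects the case samples themselves: the controls decide whether the case results can be trusted at all.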

Control logic in a forensic analysis batch: positive and negative controls validate the case sample results.

The comparison logic: class versus individual characteristics

When a K and a Q sample agree on a set of measured characteristics, the analyst must reason about what that agreement actually means. The critical variable is discrimination power: how many independent sources would also agree on these characteristics, and how is the relevant population defined?

Most trace evidence carries class characteristics: properties shared by all items made the same way. A blue polyester fibre has a colour and cross-section shared by all fibres from the same production run, potentially millions of garments worldwide. Agreement on class characteristics supports a common source but does not exclude other sources in the same class. Individual characteristics are features that, by theory or empirical study, are treated as unique to one item: the ridge detail of a fingerprint, the random striation pattern on a bullet from a specific barrel, or a full 20-locus STR profile assessed against a sufficiently large reference database.

Class characteristic match: narrows the population of possible sources. Reports should state what class was established and how common that class is, not imply individual identification.
Combination of class characteristics: each additional agreeing characteristic multiplies the discrimination if they are independent. Three independent class characteristics that each occur in 1 in 100 items together occur in roughly 1 in a million, but only if genuinely independent.
Individual characteristic match: supports a very strong probabilistic link to a single source, but even here the question of how large the reference population is and whether the characteristic is truly unique must be addressed.
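The multiplication in the list above can be checked directly. The sketch below uses exact fractions so the arithmetic is not blurred by floating-point rounding. The second calculation is a hypothetical illustration of dependence: it assumes, purely for the example, that the second characteristic appears in 1 in 2 of the items that already show the first.

```python
from fractions import Fraction
from math import prod


def combined_frequency(frequencies):
    """Multiply frequencies; valid only if each is conditional on the ones before it."""
    return prod(frequencies, start=Fraction(1))


# Three genuinely independent characteristics, each 1 in 100.
independent = combined_frequency([Fraction(1, 100)] * 3)
print(independent)  # 1/1000000

# Hypothetical dependence: the second characteristic occurs in 1 in 2
# of the items that already have the first.
dependent = combined_frequency([Fraction(1, 100), Fraction(1, 2), Fraction(1, 100)])
print(dependent)    # 1/20000
```

Treating the dependent characteristics as independent would overstate the rarity of the combination by a factor of 50 in this made-up case, which is why independence has to be established before frequencies are multiplied.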

Blanks and the contamination problem

A blank is a control with no target material, processed through the entire analytical workflow. It probes a specific question: did any target material enter the system from sources other than the case samples? The sources of contamination in a forensic analytical workflow are numerous: reagents that were manufactured using the same cell lines as a reference standard, surfaces and tools that were not fully decontaminated between samples, analyst carry-over from handling a high-concentration sample earlier in the session, and environmental DNA from biological aerosols in the laboratory space.

The blank's passage through the full workflow, including extraction, amplification, and detection, is what makes it informative. A blank that is only subjected to the final detection step tells you nothing about whether contamination was introduced during extraction. A full-process blank mimics the path the case sample takes and can catch problems at any point along it.
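The point about full-process blanks can be stated as a coverage rule: a blank can only reveal contamination introduced at a step it actually passes through. The sketch below models the workflow as the three steps named above; the function name is hypothetical, and the model ignores carry-over effects that a real laboratory would also consider.

```python
WORKFLOW = ("extraction", "amplification", "detection")


def detectable_by_blank(contamination_step: str, blank_entry_step: str) -> bool:
    """A blank sees contamination only at or after the step where it enters."""
    return WORKFLOW.index(contamination_step) >= WORKFLOW.index(blank_entry_step)


# Full-process blank: enters at extraction, so every step is covered.
print(all(detectable_by_blank(step, "extraction") for step in WORKFLOW))  # True

# Detection-only blank: misses contamination introduced during extraction.
print(detectable_by_blank("extraction", "detection"))  # False
```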

In DNA laboratories, contamination management is particularly rigorous because PCR amplification can turn a few cells' worth of contamination into a strong signal. UK forensic providers, for example, hold elimination profiles for laboratory staff, so that if an analyst's DNA profile appears in a case result it can be identified and reported as a contamination event rather than being mistaken for a true case finding. Many laboratories also hold profiles for cleaners, maintenance staff, and anyone who regularly enters the analysis area.
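An elimination check can be sketched as a lookup: are the alleles seen in a result consistent with any staff profile on file? The example below is a deliberate simplification. It treats a profile as a mapping from locus name to a set of alleles and ignores mixtures, stutter, and statistical weighting; the staff identifiers and allele values are invented for illustration.

```python
from typing import Dict, FrozenSet, List

Profile = Dict[str, FrozenSet[str]]


def consistent_staff(observed: Profile, elimination: Dict[str, Profile]) -> List[str]:
    """Staff whose reference profile contains every observed allele at every observed locus."""
    hits = []
    for staff_id, ref in elimination.items():
        if all(locus in ref and alleles <= ref[locus]
               for locus, alleles in observed.items()):
            hits.append(staff_id)
    return hits


# Hypothetical elimination profiles and a weak two-locus signal from a blank.
elimination_db = {
    "analyst-A": {"D3S1358": frozenset({"15", "17"}), "vWA": frozenset({"16", "18"})},
    "analyst-B": {"D3S1358": frozenset({"14", "16"}), "vWA": frozenset({"17", "19"})},
}
blank_signal = {"D3S1358": frozenset({"15"}), "vWA": frozenset({"16", "18"})}

print(consistent_staff(blank_signal, elimination_db))  # ['analyst-A']
```

A hit of this kind is a lead for the contamination investigation, not a conclusion; a partial signal at two loci may be consistent with many people.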

Putting it together: from agreement to conclusion

When a comparison shows agreement between K and Q, the analyst must answer three questions before reaching a conclusion. First: how many characteristics agree, and are they class or individual? Second: how common are those characteristics in the relevant population? Third: what is the probability of the agreement under the prosecution proposition (K and Q are from the same source) versus the defence proposition (K and Q are from different sources but happen to agree by coincidence)?

These three questions are the structure of the likelihood ratio approach. The LR does not produce a finding of guilt or innocence; it quantifies how much the scientific evidence should shift a rational person's belief about which proposition is more probable. An LR of 10,000 for a fibre comparison means the observed agreement is 10,000 times more probable if the fibres came from the same source than if they came from different sources in the relevant population. Communicating that number and its uncertainty accurately is the analyst's job.
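The likelihood ratio is the probability of the observed agreement under the same-source proposition divided by its probability under the different-source proposition, and in the odds form of Bayes' theorem it multiplies whatever prior odds the fact-finder holds. The sketch below shows both steps with exact fractions. The probabilities and the prior odds are hypothetical numbers chosen to reproduce the LR of 10,000 mentioned above; setting prior odds is not the analyst's role.

```python
from fractions import Fraction


def likelihood_ratio(p_e_given_same: Fraction, p_e_given_diff: Fraction) -> Fraction:
    """LR = P(evidence | same source) / P(evidence | different sources)."""
    if p_e_given_diff <= 0:
        raise ValueError("P(evidence | different sources) must be positive")
    return p_e_given_same / p_e_given_diff


def update_odds(prior_odds: Fraction, lr: Fraction) -> Fraction:
    """Posterior odds = prior odds x likelihood ratio."""
    return prior_odds * lr


# Hypothetical inputs: agreement is certain if same source,
# and occurs 1 time in 10,000 if the sources differ.
lr = likelihood_ratio(Fraction(1), Fraction(1, 10_000))
print(lr)  # 10000

# Hypothetical prior odds of 1 to 1000 held by the fact-finder.
print(update_odds(Fraction(1, 1000), lr))  # 10
```

The same LR moves different prior odds to different posterior odds, which is why the LR on its own is not a statement about guilt.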

Controls and reference standards are what give the LR calculation its validity. Without a certified reference material anchoring the refractive index measurement, the numerical agreement between K and Q is meaningless. Without a negative control confirming no contamination, the agreement might be an artefact. Without a positive control confirming the method's sensitivity, a non-agreement cannot be reported as a meaningful exclusion. Every element of the comparison logic rests on the validity of the measurement infrastructure beneath it.

Worked example

Paint transfer comparison from a hit-and-run: controls in action

Two paint samples, a comparison, and the role each type of control plays in making it defensible.

A white paint fragment is recovered from the clothing of a cyclist struck by a vehicle that drove away. A paint sample is taken from a white van recovered three days later that the driver claims was not involved. The laboratory is asked to compare the two.

K and Q established: the paint from the van is the K sample, documented with the vehicle registration, location, and date of sampling. The fragment from the cyclist's jacket is the Q sample.
Method and CRM: the analyst uses FTIR spectroscopy to characterise the organic binder in each sample. Before the casework batch runs, the instrument is calibrated using a NIST-certified polystyrene reference standard with a known spectrum. This anchors the wavenumber scale to an internationally recognised reference, making the resulting spectra comparable to those in manufacturer databases and published literature.
Controls: a positive control is a paint sample of known type processed alongside the case samples to confirm the FTIR method resolves the binder features correctly. A blank background spectrum is run to confirm the instrument baseline is clean and no residue from a previous sample is present.
Comparison: the FTIR spectra of K and Q show agreement across all major absorption bands. The layer sequence (primer, undercoat, colour coat, clearcoat) is examined by light microscopy on cross-sections of both samples. The layer colours, approximate thicknesses, and refractive indices of the clearcoat match.
Conclusion: the analyst reports that the paint is indistinguishable by spectroscopy and layer-sequence examination, and that the findings are consistent with K and Q having come from the same vehicle make, model, and production period. The report also notes that the same paint specification was used on approximately 18,000 vehicles of the same model year. The comparison is class-level; it cannot individualise to a single vehicle.

Every step of the comparison rests on the controls and calibration standards that preceded it. The NIST calibration makes the spectrum comparable with spectra from any other laboratory. The positive control confirms the method resolved the features. The blank confirms no contamination. Remove any of those, and the agreement in the spectra is a measurement without a foundation.

Check your understanding

A forensic analyst runs a DNA extraction batch and finds that the negative control (blank) has produced a weak but detectable STR signal at two loci. What should happen next?

Key Takeaways

Every forensic comparison rests on the known-versus-questioned distinction: the K sample anchors the measurement with verified provenance; the Q sample is what the investigation is testing.
Certified reference materials give measurements a traceable scale that is independent of any single laboratory, making results comparable across institutions and defensible under cross-examination.
Positive and negative controls run with every batch are the operational proof that the method worked and no contamination affected the results; a failed control invalidates the batch, not just the individual sample.
The discrimination power of a comparison depends on whether the agreeing characteristics are class or individual, how common they are in the relevant population, and whether they have been shown to be statistically independent before their frequencies are combined.
Controls and calibration standards are what give a likelihood ratio calculation its validity; the comparison logic is only as sound as the measurement infrastructure it rests on.

What is the difference between a known sample and a questioned sample in forensic science?

A known sample (also called a reference sample or K sample) is one whose origin is established and documented: a buccal swab from a named individual, a paint chip from a specific vehicle, a soil sample from a mapped location. A questioned sample (Q sample) is one whose origin is the subject of the investigation. Comparison means measuring both under identical conditions and assessing how closely they agree.

What is a certified reference material?

A certified reference material (CRM) is a substance whose composition or property value has been determined by a metrology authority and certified with a stated uncertainty. CRMs anchor analytical results to a traceable measurement scale. Without traceability through a CRM, a number produced by an instrument has no agreed meaning outside the laboratory that produced it.

What is the difference between a positive control and a negative control?

A positive control is a sample known to contain the target analyte at a concentration the method should detect. It confirms the method is working as expected. A negative control (blank) is a sample known to contain no target analyte. It confirms no background contamination is producing false signals. Both must be run with every batch of casework samples.

Why does a partial match in forensic comparison not automatically mean two samples came from the same source?

Agreement in some characteristics does not exclude all other possible sources. The strength of a partial match depends on how discriminating the matching characteristics are, how common those characteristics are in the relevant population, and how many characteristics agree versus how many were tested. A partial match raises the probability of common origin but a full statistical framework is needed to quantify the strength of that evidence.

What does 'measurement traceability' mean in a forensic context?

Traceability means that a measurement result can be linked, through an unbroken chain of calibrations, to a national or international measurement standard. It is what allows a refractive index measured in a laboratory in one city to be directly compared with one measured in a laboratory in another city or another country. Without traceability, inter-laboratory comparisons are unreliable.
