Facial image comparison is the forensic discipline of examining still or video images to determine whether two faces belong to the same person, using anatomical landmark analysis and holistic assessment under rigorous methodological standards.
A grainy CCTV still, a passport photograph, and a person sitting in the dock: the question is whether these faces belong to the same individual. No fingerprint is available. No DNA was left behind. The investigating officer has two images and a suspect, and the case may turn on what a facial image examiner can and cannot say about them. This is the territory of forensic facial image comparison: the systematic examination of images to assess whether two facial depictions share a common origin.
The discipline draws on two complementary approaches. Morphological analysis works feature by feature, cataloguing the shape of the orbital region, the nasal profile, the ear, the chin line, and dozens of other anatomical structures, then asking whether the pattern of similarities and differences between two images is consistent with one identity or two. Holistic assessment asks the examiner to take the face as a whole, drawing on the same perceptual processes humans use to recognise one another, but guided by training and anchored to a structured conclusion scale. Neither approach alone is sufficient; skilled examiners combine them.
The field has been shaped by two main institutional voices. In the United Kingdom the technique has been called Facial Mapping, and it reached the courts through cases in the 1990s and 2000s. Internationally, the Facial Identification Scientific Working Group (FISWG), established in the United States in 2008 with broader representation, developed the terminology and guidelines that are now the reference standard. This topic walks through how both morphological and holistic methods work, how the ear and photo-anthropometric overlay fit in, and what the FISWG conclusion scale actually asks an examiner to say.
Two jurisdictions, one problem, and a long road to shared standards.
Forensic facial comparison reached UK courts in the 1980s and 1990s, largely through the work of practitioners who had backgrounds in medical illustration and physical anthropology. The term Facial Mapping entered legal vocabulary to describe the process, and early cases relied on a relatively informal methodology that varied between practitioners. The reliability of the field was contested, and courts struggled with how to weigh evidence that lacked the population-frequency databases underpinning fingerprint and DNA work.
In Australia, similar debates played out, with the Bluey Clark case in the 1990s among the early examples of facial image evidence being tested in court. In the United States, law-enforcement interest was driven partly by the expansion of photographic surveillance and partly by post-9/11 investment in biometric identification. The FISWG was established in 2008, bringing together facial comparison practitioners, cognitive psychologists, and computer scientists. Its guidelines, published in 2012 and subsequently updated, established a methodology built around documentation, sequential unmasking, and a structured conclusion vocabulary.
Every facial feature is a data point. The comparison builds a table, feature by feature.
Morphological facial comparison is systematic in a way that resembles fingerprint ridge analysis: the examiner records the characteristics of a defined set of anatomical regions and then assesses whether the pattern of correspondences and differences between two images is better explained by the images showing the same person or by their showing two different people. The regions examined follow anatomical logic.
The examiner records whether features show correspondence, difference, or are indeterminate (due to pose, resolution, lighting, or age difference between images). A difference in a single feature does not automatically mean exclusion, because the same person photographed years apart, at different angles, or under different lighting conditions will show apparent variation. The comparison matrix as a whole drives the conclusion, not any single outlier.
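The recording step described above can be sketched as a small data structure. This is a minimal illustration, not a FISWG-specified format: the names `Finding`, `FeatureRecord`, and `summarise`, and the example notes, are all hypothetical. The tally is only a documentation aid. It deliberately does not compute a conclusion, because the examiner weighs the matrix as a whole and a single difference is not decisive.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Finding(Enum):
    """The three outcomes an examiner can record for a feature."""
    CORRESPONDENCE = "correspondence"
    DIFFERENCE = "difference"
    INDETERMINATE = "indeterminate"


@dataclass(frozen=True)
class FeatureRecord:
    region: str        # anatomical region being compared
    finding: Finding   # what the examiner recorded for that region
    note: str = ""     # free-text reason, e.g. pose, resolution, lighting, age gap


def summarise(records):
    """Count how many regions fall under each finding.

    This only tallies the matrix; it does not turn counts into a conclusion.
    """
    counts = Counter(record.finding for record in records)
    return {finding: counts.get(finding, 0) for finding in Finding}


# Hypothetical comparison of a questioned image against a reference image.
records = [
    FeatureRecord("orbital region", Finding.CORRESPONDENCE),
    FeatureRecord("nasal profile", Finding.CORRESPONDENCE),
    FeatureRecord("ear", Finding.INDETERMINATE, "not visible at this pose"),
    FeatureRecord("chin line", Finding.DIFFERENCE, "may reflect lighting"),
]

summary = summarise(records)
for finding, count in summary.items():
    print(f"{finding.value}: {count}")
```

Keeping the note alongside each finding means a reviewer can see why a region was marked indeterminate or different, which is the auditability the morphological approach is valued for.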
Humans are remarkably good at face recognition. The question is whether that ability transfers reliably into forensic casework.
Face recognition is one of the most heavily studied areas of human perception. A substantial body of research from cognitive psychology shows that people, including trained examiners, process faces holistically: the brain encodes a face as a unified pattern, not as a list of features assembled at runtime. This is why inverting a face dramatically impairs recognition, and why a feature that looks right in isolation can look wrong once the rest of the face changes around it.
In forensic facial comparison, holistic assessment serves as a check on the morphological feature matrix. After working through the anatomy, an examiner also forms an overall impression: does the general facial gestalt fit with the two images being the same person? This step can catch things the feature-by-feature pass misses, particularly when subtle proportional relationships across the face carry identity signal that no single feature captures alone. But it is also where examiner subjectivity is highest, which is why FISWG guidelines require that the holistic assessment supplements rather than replaces the documented feature analysis.
| Approach | Strength | Weakness |
|---|---|---|
| Morphological (feature-by-feature) | Explicit, documentable, reviewable step by step | May miss identity signal carried by proportional relationships across features |
| Holistic assessment | Captures gestalt identity signal, mirrors natural face processing | Susceptible to examiner bias; hard to audit or reproduce |
| Combined (FISWG best practice) | Systematic documentation plus perceptual synthesis | Requires more training and takes longer per case |
Research by David White, Richard Kemp, and colleagues at UNSW and other institutions has repeatedly shown that even trained facial comparison examiners make errors when working with poor-quality images. Accuracy drops significantly when the two images to be compared differ in pose, lighting, or age. This research has been influential in pushing the field toward formalised methodology and quality controls, and in reinforcing that facial comparison evidence should be presented with explicit statements of its limitations.
The ear is sometimes the only feature clearly visible on CCTV, and its morphology is surprisingly individual.
The external ear, or pinna, has a complex three-dimensional structure made up of cartilaginous ridges and hollows that are largely set by genetics and remain relatively stable through adult life. The helix, antihelix, tragus, antitragus, concha, and lobule all show variation across individuals, and the overall shape and size of the ear can be assessed from a two-dimensional image if the angle of view is favourable.
Ear comparison entered forensic literature in the late 1990s, and Dutch criminologist Cornelis van der Lugt was among the early systematic researchers. In the UK, earprint evidence was used in the Dallagher case (2002), where it was accepted that an individual earprint could be compared against a suspect's ear. The evidentiary value of ear comparison on its own remains debated, and the field acknowledges that the population-frequency data for ear morphology are thin compared with fingerprint or DNA databases. In practice, ear comparison is most useful when the face is fully obscured or when the ear happens to be captured clearly on surveillance footage that shows the face at an unhelpful angle.
When a face cannot be compared feature by feature, metric geometry sometimes fills the gap.
Photo-anthropometric overlay is a technique that places two images in alignment and measures the proportional relationships between anatomical landmarks: the intercanthal distance, the alar width, the mouth width, the distance between pupil centres, and several other fixed-point pairs. The examiner converts these measurements to ratios that are independent of absolute image scale, then compares the ratio profiles between the reference and questioned images.
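The conversion from raw measurements to scale-independent ratios can be sketched as follows. This is an illustrative example under stated assumptions, not a validated forensic procedure: the landmark names, the pixel coordinates, and the choice of the distance between pupil centres as the normalising length are all hypothetical.

```python
import math

# Landmark pairs that define each measurement (names are illustrative).
MEASUREMENTS = {
    "intercanthal": ("inner_canthus_r", "inner_canthus_l"),
    "alar_width": ("alare_r", "alare_l"),
    "mouth_width": ("cheilion_r", "cheilion_l"),
    "interpupillary": ("pupil_r", "pupil_l"),
}


def distance(p, q):
    """Euclidean distance between two (x, y) pixel coordinates."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ratio_profile(landmarks, reference="interpupillary"):
    """Divide every measurement by the reference length so image scale cancels."""
    lengths = {
        name: distance(landmarks[a], landmarks[b])
        for name, (a, b) in MEASUREMENTS.items()
    }
    ref = lengths[reference]
    return {name: value / ref for name, value in lengths.items() if name != reference}


# Hypothetical landmarks placed on one image, in pixels.
small_image = {
    "pupil_r": (100, 100), "pupil_l": (160, 100),
    "inner_canthus_r": (115, 100), "inner_canthus_l": (145, 100),
    "alare_r": (113, 140), "alare_l": (147, 140),
    "cheilion_r": (105, 170), "cheilion_l": (155, 170),
}

# The same landmarks after enlarging the image 2.5 times.
large_image = {name: (x * 2.5, y * 2.5) for name, (x, y) in small_image.items()}

profile_small = ratio_profile(small_image)
profile_large = ratio_profile(large_image)

for name in profile_small:
    print(f"{name}: {profile_small[name]:.3f} vs {profile_large[name]:.3f}")
```

The two profiles agree because a uniform rescaling multiplies every distance by the same factor. That is the only kind of variation ratios remove: differences in pose, or errors in where a landmark is placed, still change the values.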
The approach was developed partly from craniofacial anthropometry, which has a long history in physical anthropology and medical diagnosis. In a forensic context it has appeal because the output is numerical and appears objective. But the technique carries important caveats. Accurate landmark localisation requires adequate image resolution, and even small errors in placing a landmark can shift ratio values substantially. More critically, all metric comparisons are sensitive to camera geometry: a face photographed with a telephoto lens at a distance will have systematically different apparent proportions from the same face photographed with a wide-angle lens at close range. Correcting for this requires knowing or estimating the camera-to-subject distance as well as the camera and its focal length, which is rarely possible from crime-scene footage.
When the geometry can be controlled, for instance when measurements are taken from calibrated photography of a suspect alongside the crime-scene image, anthropometric overlay adds a useful quantitative dimension to the comparison. When the geometry is uncertain, the numerical output can create a false impression of precision, and examiners are expected to state clearly which conditions applied.
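The camera-geometry caveat can be made concrete with an idealised pinhole model, in which a width parallel to the image plane projects to `focal_length * width / depth`. The sketch below is a simplification with hypothetical, round-number head dimensions (not population data), and it ignores lens distortion and head pose. In this model the focal length cancels out of any ratio, so the shift in apparent proportions comes from the camera-to-subject distance; a telephoto lens matters because it is normally used from farther away.

```python
import math

# Illustrative head dimensions in metres (hypothetical values for the sketch).
INTERPUPILLARY_M = 0.063  # distance between pupil centres, at the front of the face
EAR_TO_EAR_M = 0.150      # width across the ears
EAR_SETBACK_M = 0.090     # how far the ears sit behind the plane of the pupils


def projected_width(true_width_m, depth_m, focal_length_px):
    """Pinhole projection of a width that lies parallel to the image plane."""
    return focal_length_px * true_width_m / depth_m


def ear_to_pupil_ratio(camera_distance_m, focal_length_px):
    """Apparent ear-to-ear width divided by apparent interpupillary distance."""
    pupils = projected_width(INTERPUPILLARY_M, camera_distance_m, focal_length_px)
    ears = projected_width(
        EAR_TO_EAR_M, camera_distance_m + EAR_SETBACK_M, focal_length_px
    )
    return ears / pupils


close_range = ear_to_pupil_ratio(0.5, 800.0)        # near camera, short focal length
long_range = ear_to_pupil_ratio(5.0, 800.0)         # far camera, same focal length
long_range_tele = ear_to_pupil_ratio(5.0, 4000.0)   # far camera, longer focal length

print(f"ratio at 0.5 m: {close_range:.3f}")
print(f"ratio at 5.0 m: {long_range:.3f}")
print(f"ratio at 5.0 m, longer focal length: {long_range_tele:.3f}")
```

The same face yields a smaller ear-to-pupil ratio at close range, because the ears sit proportionally farther from the camera than the eyes do. An examiner who compares ratios across images without knowing the capture distances is therefore comparing numbers that may differ for reasons unrelated to identity.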
A seven-point verbal scale that tries to put probability into plain language.
One of FISWG's most practically important contributions was a standardised conclusion vocabulary. Before this, different examiners used phrases like "consistent with", "cannot be excluded", and "highly probable" in ways that courts and lawyers interpreted inconsistently. FISWG proposed a scale anchored at both ends by strong conclusions and with graduated intermediate levels, structured to communicate the weight of the evidence in a way that does not overstate certainty.
The scale is not a probability scale in the statistical sense: there are no likelihood ratios attached to each level, and the verbal labels are not mapped to fixed numerical ranges. What the scale does do is force an examiner to commit to a position rather than hiding behind deliberately vague language, and it gives courts a framework to understand the relative weight of positive and negative conclusions. FISWG guidelines also require that any conclusion be accompanied by an explanation of what features drove it and what limitations affected the analysis.