Single-nucleotide polymorphism panels as the second-generation forensic marker class: identification SNPs (the FORENSeq DNA Signature Prep, the SNPforID consortium), ancestry-informative markers, externally-visible-characteristic panels (HIrisPlex-S for eye and hair colour), and the Parabon Snapshot phenotyping pipeline that drew faces from crime-scene DNA.
In the first decade of forensic DNA typing, the central question was always the same: does this profile match that individual? STR profiling answered it with remarkable statistical power. But there is a class of investigative problem where there is no suspect to match, no reference sample to compare, and no database hit to work from. The body recovered from a river carries an unknown identity. The crime-scene swab from an unsolved murder has been in evidence storage for a decade, every database upload has returned nothing, and the investigation is cold. In these situations, the analyst needs a different question: not "does this profile match?" but "what can the DNA itself tell us about who this person was?"
Single-nucleotide polymorphisms, the most common class of genetic variation in the human genome, have made that second question answerable. SNPs are typically biallelic variants at specific genome positions, where one base is substituted for another in a fraction of the population. The human genome contains over 10 million SNPs with minor allele frequencies above 1%, and their distribution across populations, gene pathways, and trait-associated loci means they can be mined for three distinct forensic purposes: identification (SNPs that, in aggregate, approach the discriminating power of an STR profile); ancestry inference (ancestry-informative markers whose allele frequencies differ sharply between continental populations); and externally visible characteristic (EVC) prediction (variants in pigmentation, morphology and metabolic genes that predict observable traits from DNA alone).
The convergence of massively parallel sequencing (MPS) technology with curated SNP panels has moved all three applications from research curiosities into operational forensic tools over the past decade. In the United States, Parabon NanoLabs' Snapshot phenotyping service has generated investigative leads in hundreds of cases since 2015. The HIrisPlex-S panel, developed at Erasmus University Medical Center Rotterdam and validated across European, US and Asian populations, has been used in German, Dutch, Polish, and UK casework. The FORENSeq DNA Signature Prep kit from Verogen types a combined STR and SNP panel in a single MPS run. And in India, the forensic genetics community has published population frequency data for ancestry-informative SNPs from Indian subcontinent populations that support this class of analysis in Indian casework.
A single-base difference at one genome position carries almost no information on its own, but the same logic that lets 20 STR loci individualise a person lets several hundred carefully chosen SNPs answer questions no STR panel can.
A single-nucleotide polymorphism is a position in the genome where two alternative bases (typically A/G, A/T, C/G, or C/T) are both present in the population above a threshold frequency, conventionally 1% or more for the minor allele. There are approximately 10 million such positions in the human genome. Because SNPs are biallelic (only two alleles per position in most cases), a single SNP contributes very little discriminating information: a locus where 30% of the population carries the minor allele distinguishes far fewer individuals than a hypervariable STR with 15 common alleles. The forensic power of SNP panels comes from multiplicity: 50, 100, or 300 SNPs typed simultaneously, each providing independent information.
Three functionally distinct SNP classes underpin the three forensic applications:
Identity SNPs are selected for maximum discriminating power across all human populations, analogous to the STR loci in CODIS: they have high heterozygosity globally, are minimally correlated with each other, and are not associated with any trait or phenotype. A 50-SNP identity panel approaches the discriminating power of a 10-15 locus STR multiplex. Because SNP amplicons can be designed as short as 50-70 base pairs (flanking a single variable position), identity SNPs are particularly valuable for degraded samples where STR amplicon sizes of 100-400 base pairs fail.
Ancestry-informative markers (AIMs) are SNPs whose allele frequencies differ substantially between continental population groups, typically with an FST value (fixation index, a measure of allele-frequency differentiation) above 0.4 or 0.5. A panel of 100-200 such AIMs can assign an unknown sample to a continental ancestry cluster (European, African, East Asian, South Asian, Native American) with high accuracy, and can resolve finer-scale ancestry structure with larger panels. This is not a determination of race (a social construct) but a probabilistic statement about the population group with which the contributor's genome most closely aligns, which carries investigative value in narrowing a search space.
Externally visible characteristic (EVC) SNPs are variants in genes associated with observable physical traits: pigmentation genes (MC1R, OCA2, HERC2, IRF4, SLC45A2, SLC24A5, TYR and others) for eye, hair, and skin colour; other trait loci for height, age-related facial morphology, and male-pattern baldness. EVC analysis does not identify an individual; it predicts what an unknown contributor probably looked like, supporting facial approximation and investigative leads.
The commercial kit that brought identity SNPs into operational forensic practice combines STR and SNP typing in a single MPS run, and the consortium that selected the underlying SNP set applied rigorous population-genetic criteria that the broader forensic community has since validated.
The SNPforID consortium, a European-funded research collaboration involving forensic genetics laboratories in Denmark, Germany, Spain, Portugal, the United Kingdom, the Netherlands, and Poland, published in 2004 and 2006 a validated set of 52 identity SNPs selected for maximum discriminating power across European, African, and East Asian populations. The SNPforID 52-plex is the foundational identity-SNP panel for forensic use and was the first to be validated for casework at the discriminating power required by SWGDAM and ENFSI standards. The combined discrimination capacity of the 52-plex approaches 1 in 10^17 for a European population sample, comparable to a 16-locus STR multiplex. Critically, because SNP amplicons were designed at 50-60 base pairs, the panel amplifies from templates that are too degraded for conventional STR typing.
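The way discrimination accumulates across a panel can be sketched with a short calculation. The example below is a minimal illustration, not the consortium's method: it assumes Hardy-Weinberg proportions, fully independent loci, no population-substructure correction, and a hypothetical panel in which every one of 52 SNPs has a minor allele frequency of 0.4. The function names are invented for the example, and the idealised result should not be read as the published figure for the 52-plex.

```python
from math import prod

def snp_match_probability(maf: float) -> float:
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    Assumes Hardy-Weinberg proportions: genotype frequencies p^2, 2pq, q^2.
    The match probability is the sum of the squared genotype frequencies.
    """
    p, q = maf, 1.0 - maf
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)

def panel_match_probability(mafs) -> float:
    """Multiply per-locus match probabilities (assumes independent loci)."""
    return prod(snp_match_probability(m) for m in mafs)

# Hypothetical panel: 52 unlinked SNPs, each with MAF 0.4 (illustrative only).
hypothetical_panel = [0.4] * 52

single_locus = snp_match_probability(0.4)
combined = panel_match_probability(hypothetical_panel)

print(f"one SNP:  {single_locus:.4f}")
print(f"52 SNPs:  {combined:.2e}")
```

One locus on its own matches between unrelated people more than a third of the time, while the product across the panel becomes vanishingly small. Real panels contain markers with varied frequencies, and casework statistics use population-specific frequency data, so operational figures differ from this idealised product.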
The FORENSeq DNA Signature Prep kit (Verogen, the forensic genomics company spun out of Illumina in 2017) is the primary commercial platform that operationalises identity-SNP typing. The FORENSeq kit types, in a single MPS library preparation and sequencing run, 27 autosomal STRs, 7 X-STRs, 23 Y-STRs, and 94 identity SNPs, plus ancestry-informative markers (94 AIMs) and 56 phenotype SNPs. The entire panel is sequenced on the Illumina MiSeq FGx instrument, and the Universal Analysis Software (UAS) calls alleles, flags quality thresholds, and generates a results file in the format required for CODIS compatibility.
Validation studies on the FORENSeq kit have been published by laboratories in the US (University of North Texas Health Science Center), Germany (Bundeskriminalamt), the Netherlands (Netherlands Forensic Institute, NFI), and the UK (King's College London). NFI in particular has been a leader in MPS-based forensic genotyping and has presented casework statistics at multiple ISFG conferences. The UK Forensic Science Regulator's guidance on MPS for forensic DNA analysis (published in 2021) provides the quality-assurance framework within which FORENSeq validation studies must be conducted before casework deployment.
In India, the Forensic Science Laboratory at Gandhinagar and the CFSL New Delhi have published pilot studies on MPS-based forensic typing, and the DNA Technology Bill 2019 provisions would, if enacted, require accredited laboratories deploying MPS platforms to maintain validation documentation meeting ISO/IEC 17025 Annex A standards. The Investigator 24plex QS kit (Qiagen), which is standard in many European and Indian operational labs for autosomal STR, does not include SNP typing; FORENSeq or equivalent platforms represent the next-generation upgrade for laboratories that need SNP capability in a single run.
Ancestry prediction from crime-scene DNA is the most contested of the three SNP applications, partly because the forensic community overlearned population-genetics terminology and partly because courts have had to develop new frameworks for how intelligence-use evidence is disclosed.
Ancestry-informative markers are SNPs whose allele frequencies differ substantially between continental or regional population groups. The canonical example is the SLC24A5 variant rs1426654, where the derived A allele reaches near-fixation in European and South Asian populations (frequency approximately 98% in Europeans, 90-95% in South Asians) but has a frequency of approximately 10-15% in West African populations. A single such marker shifts the odds but cannot assign ancestry on its own. A panel of 100-200 AIMs provides probabilistic continental ancestry assignment with accuracies exceeding 95% for separating African, European, and East Asian population clusters.
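The degree of differentiation at a marker like rs1426654 can be put into a number with the fixation index. The sketch below is a simplified two-population calculation using Wright's heterozygosity form, FST = (HT - HS) / HT, with equal weighting of the two populations. The input frequencies are rounded from the figures quoted above (0.98 for Europeans, 0.12 as a midpoint for West Africans), and the function names are invented for the example. Published FST estimates use sample-size-corrected estimators and will differ.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2pq at a biallelic locus."""
    return 2.0 * p * (1.0 - p)

def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP and two equally weighted populations.

    H_T: heterozygosity expected if the two populations were pooled.
    H_S: mean heterozygosity within each population.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    if h_t == 0.0:  # allele fixed (or absent) in both populations
        return 0.0
    return (h_t - h_s) / h_t

# Frequencies rounded from the text above (illustrative, not a reference value).
fst_aim = fst_two_populations(0.98, 0.12)

# A weakly differentiated marker for contrast (hypothetical frequencies).
fst_weak = fst_two_populations(0.45, 0.55)

print(f"strongly differentiated marker: FST = {fst_aim:.3f}")
print(f"weakly differentiated marker:   FST = {fst_weak:.3f}")
```

Under these assumptions the first marker comes out well above the 0.4-0.5 range typically used to select AIMs, while the second, whose frequencies barely differ between the two groups, contributes almost nothing.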
The Precision ID Ancestry Panel from ThermoFisher amplifies 165 AIMs in a single multiplex, and the FORENSeq AIM set (94 markers) provides sufficient resolution for broad continental and some regional ancestry inference. Research panels, such as those used in the Human Genome Diversity Project and published by Rosenberg et al. in Science in 2002, demonstrated the feasibility of ancestry inference from SNPs across the full range of world populations. The forensic AIM panels are adapted from these larger academic resources to a casework-practical size.
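How a panel of AIMs turns genotypes into a probabilistic assignment can be illustrated with a small likelihood calculation. This is a minimal sketch, not the algorithm of any commercial panel or software: it assumes Hardy-Weinberg proportions within each reference population, independent markers, and equal prior weight on each population. The population labels, the three-marker panel, and all allele frequencies are hypothetical.

```python
from math import comb, exp, log

# Hypothetical reference frequencies of the counted allele at three AIMs.
REFERENCE_FREQS = {
    "POP_A": [0.98, 0.10, 0.85],
    "POP_B": [0.12, 0.80, 0.30],
    "POP_C": [0.05, 0.15, 0.90],
}

def genotype_log_likelihood(allele_count: int, freq: float) -> float:
    """Log-probability of carrying 0, 1 or 2 copies of an allele under HWE."""
    # Clamp so that an allele unseen in a reference sample is not impossible.
    freq = min(max(freq, 1e-3), 1.0 - 1e-3)
    return (log(comb(2, allele_count))
            + allele_count * log(freq)
            + (2 - allele_count) * log(1.0 - freq))

def ancestry_posterior(genotypes, reference=REFERENCE_FREQS):
    """Posterior probability of each reference population, uniform prior."""
    log_like = {
        pop: sum(genotype_log_likelihood(g, f) for g, f in zip(genotypes, freqs))
        for pop, freqs in reference.items()
    }
    peak = max(log_like.values())  # subtract the maximum for numerical stability
    weights = {pop: exp(v - peak) for pop, v in log_like.items()}
    total = sum(weights.values())
    return {pop: w / total for pop, w in weights.items()}

# Hypothetical unknown sample: allele counts at the three markers.
posterior = ancestry_posterior([2, 0, 2])
for pop, prob in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{pop}: {prob:.4f}")
```

The output is a probability distribution over the reference groups rather than a single label, which is why results of this kind are reported as probabilistic statements. The result is only as good as the reference frequencies, so a sample from a population absent from the reference set will still be forced into the closest available cluster.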
The forensic-legal framework for ancestry results has developed differently across jurisdictions. In the United States, the NIST-administered OSAC (Organization of Scientific Area Committees for Forensic Science) has issued standards for forensic DNA phenotyping, treating ancestry as an intelligence-use tool subject to the same disclosure obligations as other investigative leads. The NIJ-funded work on forensic genetics and genetic ancestry, including publications by Shriver and colleagues at Penn State, established the scientific foundation. In Germany, the German genetic-profiling law (Gendiagnostikgesetz) was amended in 2019 to permit DNA intelligence analysis including ancestry and phenotype prediction for serious crimes, a significant legislative step that followed years of academic debate among German forensic geneticists. In the United Kingdom, the Forensic Science Regulator's 2021 guidance on DNA intelligence analysis provides a framework distinguishing intelligence-grade and evidence-grade DNA results, with ancestry and phenotype in the intelligence category.
In India, the DNA Technology Bill 2019 does not specifically address ancestry or phenotype analysis, treating forensic DNA primarily in the context of identification and database matching. The CFSL and state FSL laboratories have not published operational protocols for ancestry analysis, though academic research on Indian population structure using AIM panels is well developed. The population genetic structure of South Asian populations, which reflects the subcontinent's complex demographic history of Indo-Aryan, Dravidian, tribal, and Austro-Asiatic ancestries, requires India-specific AIM panels rather than panels calibrated for European/African/East Asian distinction, and this remains an active research gap.
A panel of 41 SNPs built on two decades of pigmentation genetics delivers probabilistic colour predictions validated across European, African, East Asian, and South Asian populations, and the predictions have reached courtrooms in Germany, the Netherlands, Poland, and Australia.
HIrisPlex-S is an externally visible characteristic prediction tool developed by Susan Walsh and Manfred Kayser at Erasmus University Medical Center Rotterdam, expanding on the earlier IrisPlex (eye colour) and HIrisPlex (eye and hair colour) panels. The current HIrisPlex-S panel, published in Forensic Science International Genetics in 2017, uses 41 SNPs to predict eye colour (blue, intermediate, brown), hair colour (blond, brown, red, black), and skin colour (very pale, pale, intermediate, dark, dark to black). The underlying variant set draws on genome-wide association studies of pigmentation traits in populations spanning Europe, Africa, East Asia, and South Asia.
The prediction model is a multinomial logistic regression, producing a probability for each colour category rather than a single deterministic call. For a crime-scene sample, the HIrisPlex-S output might read: brown eyes 0.92 probability, intermediate 0.07, blue 0.01; and brown hair 0.85, blond 0.10, black 0.04, red 0.01. The Rotterdam group maintains a free web tool (hirisplex.erasmusmc.nl) that accepts SNP genotypes and returns probability distributions, available to any accredited laboratory.
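The mechanics of a multinomial logistic regression can be shown in a few lines. The sketch below is a toy model only: the two-SNP input, the intercepts, and the weights are hypothetical values chosen for illustration, not the published HIrisPlex-S coefficients, and the function name is invented. It shows how per-category linear scores are turned into probabilities that sum to one.

```python
from math import exp

# Hypothetical coefficients for a toy two-SNP eye-colour model.
# "brown" is the reference category, so its score is fixed at zero.
TOY_MODEL = {
    "blue":         {"intercept": -1.0, "weights": [1.5, 0.4]},
    "intermediate": {"intercept": -1.5, "weights": [0.6, 0.3]},
    "brown":        {"intercept": 0.0,  "weights": [0.0, 0.0]},
}

def predict_category_probabilities(minor_allele_counts, model=TOY_MODEL):
    """Softmax over per-category linear scores (multinomial logistic regression)."""
    scores = {
        category: params["intercept"]
        + sum(w * g for w, g in zip(params["weights"], minor_allele_counts))
        for category, params in model.items()
    }
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {category: exp(s - peak) for category, s in scores.items()}
    total = sum(exps.values())
    return {category: e / total for category, e in exps.items()}

# Two hypothetical samples: minor-allele counts (0, 1 or 2) at each SNP.
sample_a = predict_category_probabilities([2, 2])
sample_b = predict_category_probabilities([0, 0])

for name, probs in (("sample_a", sample_a), ("sample_b", sample_b)):
    summary = ", ".join(f"{c} {p:.2f}" for c, p in probs.items())
    print(f"{name}: {summary}")
```

Because the output is a full distribution, a report can state how strongly the model favours the leading category instead of presenting a bare colour call.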
Validation of HIrisPlex-S across non-European populations is a critical issue. Published validation studies have tested the model on South Asian samples from Pakistan, Bangladesh, and India (Bhardwaj et al., 2019, International Journal of Legal Medicine), finding acceptable accuracy for eye colour but noting that the South Asian hair colour distribution (predominantly dark brown to black) means the panel's discriminating power for hair colour is limited in South Asian populations. Similar validation limitations apply to the Precision ID Ancestry Panel when applied to finer-scale South Asian ancestry structure. The forensic community's standard is that an EVC model must be validated on a population that includes samples from the relevant ancestry group before it is used in casework for individuals of that ancestry.
In Germany, the Gendiagnostikgesetz amendment (2019) explicitly permits the use of EVC analysis in serious-crime investigations. The Bundeskriminalamt published guidance in 2020 on how HIrisPlex-S results should be presented in investigative reports, as intelligence information that generates a lead rather than identification evidence. In the Netherlands, the NFI has used HIrisPlex-S in casework since 2012 and has published on the results in multiple Dutch criminal cases admitted under the Nederlandse Strafvordering evidentiary framework. In the United Kingdom, the Forensic Science Regulator's guidance permits EVC analysis as an intelligence product in serious unsolved cases where conventional DNA profiling has not generated a lead. In Poland, legislative amendments to the DNA evidence law in 2018 permit EVC analysis and AIMs analysis in serious crimes, making Poland one of the more permissive European jurisdictions. Australia (through state legislation in Queensland, Western Australia, and New South Wales) permits physical characteristics DNA analysis for serious crimes.
The most publicly prominent SNP phenotyping application is an American commercial service that has generated sketches of unknown contributors from cold-case DNA in hundreds of US and international cases, and its use has forced prosecutors, defence attorneys, and forensic regulators to clarify the evidentiary status of trait prediction from DNA.
Parabon NanoLabs, based in Reston, Virginia, launched its Snapshot DNA Phenotyping Service in 2015. The service accepts a DNA extract from a crime-scene sample (minimum 1 nanogram), genotypes several thousand SNPs from the Illumina HumanOmniExpress array or from targeted panels, and applies a machine-learning model trained on the Human Genome Diversity Project and Global Biobank datasets to predict a composite set of facial features including eye colour, hair colour, skin colour, face shape and texture, and freckling. The output is a computer-generated composite face image, a portrait of the most probable appearance of the contributor, accompanied by probability distributions for each predicted trait.
Snapshot has generated investigative leads in several high-profile US cold cases. In the Anne Marie Fahey murder (Delaware, 1996, reopened 2019), DNA from the crime scene was submitted to Parabon; the resulting composite did not identify the perpetrator (who had already been convicted) but demonstrated the technique's operational capability. In the Christy Mirack case (Lancaster, Pennsylvania, 1992, solved in 2018 through genealogy), Parabon-generated composites circulated publicly before the genealogical match was made via GEDmatch, and the composite was described as having been consistent with the suspect's appearance. In a Florida Jane Doe case (Pinellas County, 1988, identified in 2022), Snapshot analysis was combined with isotope analysis and forensic genetic genealogy to generate enough investigative leads for eventual identification.
The scientific validity of Snapshot's full-face composite has been contested in the forensic genetics literature. Kayser and colleagues at Erasmus, who published the HIrisPlex-S validation work, noted in a 2018 commentary in Forensic Science International that face prediction from DNA is at an earlier validation stage than eye or hair colour prediction, and that the published error rates for the composite-face component are not available for independent peer review. Parabon's response has been that Snapshot is an investigative tool, not an identification tool, and that the composites are used to generate leads, not to identify individuals in court.
The evidentiary status of Snapshot composites differs across jurisdictions. In the United States, Snapshot results have been used in investigations but have not been offered as identification evidence in court; the composites are released to the public or used internally as investigative intelligence. In the United Kingdom, the FSR guidance would classify a Snapshot composite as an intelligence product requiring the same disclosure standards as other DNA intelligence products. In Germany, the Gendiagnostikgesetz framework permits facial characteristic DNA analysis in serious crime investigations but requires that the results be presented as probabilistic information. In India, the admissibility framework under the Bharatiya Sakshya Adhiniyam 2023 (BSA, Section 39 on opinion evidence) has not yet been tested on SNP phenotyping; no published Indian casework uses Snapshot or equivalent phenotyping.
SNP phenotyping's operational future depends on three things that the field is still building: validated population-frequency databases for non-European populations, regulatory frameworks that govern how results are disclosed and used, and public trust that the technology is being applied proportionately.
Identification SNPs have their statistical grounding in frequency databases analogous to those used for STRs. The SNPforID consortium published population frequency data for the 52-plex across European, Asian, and African samples. The 1000 Genomes Project and the gnomAD database (Genome Aggregation Database, maintained at the Broad Institute) provide allele frequencies for millions of SNPs across major population groups at the scale needed for reliable random-match probability calculations. The FORENSeq Universal Analysis Software uses gnomAD-derived allele frequencies for its identification-SNP calculations, with the proviso that the user selects the appropriate population group (European, African, East Asian, Latino, South Asian, or other).
For ancestry AIMs and EVC SNPs, the validation challenge is scale and representativeness. Published panels work well for the populations on which they were trained. European-trained EVC models have documented lower accuracy on South Asian and East Asian samples for hair and skin colour. GWAS data from Indian populations (the INDIGEN project, the GenomeAsia 100K initiative) are beginning to provide the dataset foundation for India-specific validation of AIM and EVC panels. The ISFG Forensic Genetics Policy Initiative has called for mandatory population-specific validation before any EVC or AIM panel is used in casework for individuals of underrepresented ancestry.
The ethical constraints on SNP phenotyping are real and documented. A composite portrait released to the public is a probabilistic likeness, not a photograph: if the underlying model has a 15% error rate for skin tone in a given population, then on average about 15% of the composites released for contributors from that population will show the wrong skin tone, with implications for false suspicion and racial profiling. Several US civil-liberties organisations, including the ACLU, have published critiques of Parabon Snapshot's use in communities of colour. The ENFSI DNA Working Group guidance on EVC analysis requires that reports include explicit accuracy statements and that agencies using the results implement safeguards against discriminatory use. In India, the Personal Data Protection-aligned provisions of the DNA Technology Bill 2019 (if enacted) would require informed consent for phenotype analysis in certain non-criminal contexts.
| SNP application | Panel example | Loci | Forensic output | Court status | Key database |
|---|---|---|---|---|---|
| Identification | SNPforID 52-plex; FORENSeq 94-ID | 50-100 | Discrimination capacity ~10^17; aids degraded-sample ID | Evidence-grade (identification) | gnomAD; 1000 Genomes; SNPforID population data |
| Ancestry (AIMs) | Precision ID Ancestry 165; FORENSeq 94-AIM | 100-200 | Continental or regional population assignment; probability distribution | Intelligence use only | HGDP; gnomAD; 1000 Genomes; published AIM papers |
| EVC (phenotype) | HIrisPlex-S 41-SNP; Snapshot array | 41 to several thousand | Predicted eye/hair/skin colour, facial traits; probability per category | Intelligence use only | hirisplex.erasmusmc.nl; Parabon (proprietary) |

A forensic examiner receives a degraded soil-burial sample from which conventional STR multiplex typing (amplicon range 100-400 bp) has failed completely. The laboratory holds a FORENSeq-validated MPS workflow. Which SNP class would provide the strongest identification evidence from this sample?