What an examiner does when three contributors appear in one swab: the classical mixture interpretation rules (number of contributors, allele sharing, peak-height ratios, stochastic threshold) and the probabilistic genotyping software (STRmix, TrueAllele, EuroForMix, LRmix Studio) that now handles two- and three-person mixtures in US, UK, Australian and EU labs.
A single swab from a steering wheel, a ligature, or a sexual assault evidence kit can carry the DNA of two, three, or four contributors. The resulting electropherogram shows a thicket of peaks: some loci carry three or four alleles where a single contributor would show at most two, and peak heights encode the relative proportion of each contributor's DNA in the mixture. The central analytical challenge of modern forensic DNA casework is to untangle that trace and assign it a meaningful evidential weight. That weight is the likelihood ratio, which compares the probability of the mixture if the suspect is a contributor against its probability if an unknown person is the contributor in the suspect's place.
Classical binary mixture interpretation, the approach taught and applied through the 1990s and much of the 2000s, asked the examiner to identify a major contributor profile by selecting the tallest peaks at each locus, then to determine whether a person of interest's profile was included or excluded from the remaining peaks. This approach was workable for two-contributor mixtures with a clear major-minor ratio, but it produced unreliable or uninterpretable results for balanced mixtures, mixtures with three or more contributors, and samples with stochastic effects from low template. Courts in the United States (particularly in New York, Maryland, and Virginia) and the United Kingdom began questioning binary interpretation outcomes from the mid-2000s onwards.
Probabilistic genotyping (PG) software replaced or supplemented binary interpretation in most accredited forensic DNA laboratories across the US, UK, Australia, and the EU through the 2010s. The leading platforms are STRmix (developed by the New Zealand Institute of Environmental Science and Research, ESR, together with Forensic Science SA in Australia), TrueAllele (Cybergenetics, Pittsburgh), EuroForMix (developed by the Norwegian Institute of Public Health), and LRmix Studio (Leiden University Medical Center, Netherlands). They use different statistical engines but share the same goal: to compute a likelihood ratio from a statistical model that accounts for drop-out, drop-in, and allele sharing across all possible contributor combinations simultaneously, rather than requiring the examiner to select peaks by eye. The fully continuous platforms additionally model peak height variability and stutter.
Before software took over, examiners read mixtures the way musicians read a score: by eye, with training, and with rules that held up until the complexity reached a threshold the rules couldn't handle.
Classical mixture interpretation follows a defined sequence of analytical steps. The examiner first estimates the number of contributors (NoC) by counting the maximum number of alleles observed at any single locus. Because two diploid individuals can contribute at most four distinct alleles at a locus, a locus with three or four alleles requires at least two contributors, and a locus with five or six requires at least three. A three-allele locus is consistent with two contributors when one allele is shared or one contributor is homozygous. The highest allele count across all loci, divided by two and rounded up, gives a minimum NoC; consensus requires agreement across the majority of informative loci.
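The counting rule can be sketched in a few lines of Python. Each diploid contributor adds at most two distinct alleles per locus, so the minimum NoC is the largest per-locus allele count divided by two and rounded up. The function name and the example profile are hypothetical, and the result is only a lower bound: allele sharing and drop-out can hide additional contributors.

```python
from math import ceil


def min_contributors(alleles_by_locus):
    """Lower bound on the number of diploid contributors.

    alleles_by_locus maps a locus name to the alleles called at that locus.
    """
    # Largest number of distinct alleles seen at any one locus.
    max_alleles = max(len(set(alleles)) for alleles in alleles_by_locus.values())
    # Each contributor can account for at most two distinct alleles.
    return ceil(max_alleles / 2)


# Hypothetical mixture profile, for illustration only.
profile = {
    "D18S51": ["14", "16", "17", "19"],
    "TH01": ["6", "7", "9.3"],
    "FGA": ["21", "22", "23", "24", "25"],
}
print(min_contributors(profile))  # 3, driven by the five alleles at FGA
```

A profile whose busiest locus shows four alleles returns 2 under the same rule.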
Once NoC is established, the examiner assesses the mixture ratio by comparing peak heights or peak areas at loci where major and minor contributors' alleles do not overlap. A major:minor ratio of 4:1, for example, means the major contributor's alleles are approximately four times taller than the minor contributor's alleles. The examiner can then identify the probable major contributor genotype by selecting the tallest two alleles at each locus (or applying the mixture ratio to assign alleles), and can determine whether a person of interest's profile is consistent with being either the major or minor contributor.
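As a rough illustration of the ratio estimate, the sketch below handles the simplest case: a two-person mixture at a locus with four non-shared alleles, where the two tallest peaks are attributed to the major contributor and the two shortest to the minor. The function name and the peak heights (in RFU) are hypothetical, and real casework averages over several such loci rather than relying on one.

```python
def locus_mixture_ratio(peak_heights):
    """Major:minor ratio at a four-allele locus of a two-person mixture.

    peak_heights maps allele name to peak height in RFU.
    """
    if len(peak_heights) != 4:
        raise ValueError("expects exactly four non-shared alleles")
    ordered = sorted(peak_heights.values(), reverse=True)
    major = sum(ordered[:2])  # two tallest peaks
    minor = sum(ordered[2:])  # two shortest peaks
    return major / minor


# Hypothetical peak heights at D18S51.
d18s51 = {"14": 1620, "16": 410, "17": 1580, "19": 390}
print(locus_mixture_ratio(d18s51))  # 4.0, i.e. roughly a 4:1 mixture
```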
The stochastic threshold (ST) plays a critical role in classical mixture interpretation. If a minor contributor's allele is close to or below the ST, its sister allele or the allele itself may have dropped out in some amplifications, making the mixture appear to have fewer contributors than it does. The examiner must acknowledge potential drop-out at any locus where a peak falls below the ST. Conversely, if drop-in contamination introduces an allele, it may falsely inflate the apparent contributor count or be misassigned to a contributor's genotype.
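A minimal sketch of how a peak might be triaged against the two thresholds follows. The analytical threshold (AT) and ST values of 50 and 200 RFU are placeholder assumptions, since each laboratory sets its own values during validation, and the function name and labels are hypothetical.

```python
def classify_peak(height_rfu, analytical_threshold=50, stochastic_threshold=200):
    """Triage a peak by height; the default thresholds are placeholders."""
    if height_rfu < analytical_threshold:
        return "not called"          # indistinguishable from baseline noise
    if height_rfu < stochastic_threshold:
        return "drop-out possible"   # a sister allele may be missing
    return "above ST"                # drop-out of a sister allele is unlikely


for height in (30, 120, 850):
    print(height, classify_peak(height))
```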
The limitation of classical interpretation is stark in balanced two-contributor mixtures (ratio close to 1:1) and in all three-contributor mixtures. In a balanced mixture, every allele could be assigned to either contributor, and the number of possible deconvolutions grows combinatorially with the number of alleles. This ambiguity drove the development of probabilistic genotyping.
Instead of asking which alleles belong to which contributor, continuous probabilistic genotyping asks how probable it is that the entire trace arose from a given combination of genotypes.
Probabilistic genotyping in its continuous form models the electropherogram as a probabilistic output of a specified number of contributors with specified mixture proportions and specified genotypes. The model parameters include: the number of contributors, the mixture weight (proportion of each contributor's DNA), the mean peak height for a single allele copy (the gamma model parameter), the stutter ratio at each locus, and the probabilities of drop-out and drop-in. Given these parameters and a proposed set of contributor genotypes, the model computes the probability of observing the actual peak heights seen in the electropherogram.
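To make the idea of a peak-height probability concrete, here is a sketch of a gamma density for one allele peak. The parameterisation is an assumption chosen for illustration (shape = contribution / omega², scale = mu × omega², so the mean height is contribution × mu); it is not taken from any specific platform's source code, and all names and numbers are hypothetical.

```python
from math import exp, lgamma, log


def gamma_peak_density(height_rfu, mu, omega, contribution):
    """Gamma density of an observed peak height (illustrative parameterisation).

    mu: expected height for one full allele copy's worth of template (RFU)
    omega: coefficient of variation of peak heights
    contribution: sum over contributors of (allele copies x mixture proportion)
    """
    if height_rfu <= 0:
        raise ValueError("peak height must be positive")
    shape = contribution / omega**2
    scale = mu * omega**2
    # Work in log space to avoid overflow in the gamma function.
    log_density = ((shape - 1) * log(height_rfu) - height_rfu / scale
                   - lgamma(shape) - shape * log(scale))
    return exp(log_density)


# One copy from a contributor holding 80% of the mixture: mean 0.8 * 2000 = 1600 RFU.
near = gamma_peak_density(1600, mu=2000, omega=0.2, contribution=0.8)
far = gamma_peak_density(400, mu=2000, omega=0.2, contribution=0.8)
print(near > far)  # True: a 1600 RFU peak fits this genotype far better than 400 RFU
```

A full model multiplies such terms over every allele and locus, and adds stutter, drop-out, and drop-in terms that this sketch omits.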
To compute a likelihood ratio for a person of interest (POI), the software evaluates: (1) the probability of the observed electropherogram given that the POI is a contributor (with the remaining contributors unknown), and (2) the probability of the observed electropherogram given that the POI is not a contributor (with all contributors unknown). The ratio of these two probabilities is the LR. Because the unknown contributors' genotypes are drawn from allele frequency databases, the LR integrates over all possible genotype combinations weighted by their population frequency.
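Written out, the two-hypothesis comparison takes the following general form. The symbols are introduced here for illustration: $E$ is the observed electropherogram, $H_p$ and $H_d$ are the two propositions, and $S_j$ ranges over the genotype sets that are possible for the contributors under each proposition.

```latex
\[
LR \;=\; \frac{P(E \mid H_p)}{P(E \mid H_d)}
   \;=\; \frac{\sum_{j} P(E \mid S_j)\, P(S_j \mid H_p)}
              {\sum_{k} P(E \mid S_k)\, P(S_k \mid H_d)}
\]
```

Under $H_p$ every genotype set includes the POI's genotype. Under $H_d$ all contributors are unknown, and $P(S_k \mid H_d)$ is computed from population allele frequencies.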
This continuous approach means that no allele is simply "included" or "excluded". An allele that appears to be absent from a locus may have dropped out with a computable probability; a peak at a stutter position may be stutter or a genuine allele, with computable relative probabilities. Every observed peak, including stutter, low-height peaks near the analytical threshold (AT), and off-ladder peaks, contributes to the likelihood calculation. This is what distinguishes continuous PG from the simpler semi-continuous models (such as early LRmix Studio versions) that categorise alleles as present or absent rather than modelling their continuous heights.
Developed in New Zealand and now licensed to over a hundred laboratories on four continents, STRmix is the probabilistic genotyping platform that most examiners in the English-speaking world now encounter first.
STRmix was developed by Dr John Buckleton and colleagues at the New Zealand Institute of Environmental Science and Research (ESR) in Auckland, first published in 2013. It implements a continuous probabilistic genotyping model using Markov chain Monte Carlo (MCMC) sampling to integrate over the space of possible contributor genotype combinations. STRmix has been validated for two, three, and (in more recent versions) four-contributor mixtures, for all major STR multiplex kits including GlobalFiler, PowerPlex Fusion, Investigator 24plex, and Investigator ESSplex SE QS.
As of 2023, STRmix is in operational use in the FBI (which completed its internal validation in 2015), the UK's Forensic Science Regulator-accredited laboratories (LGC Forensics, Eurofins Forensics UK), the Australian Federal Police, all Australian state and territory forensic laboratories, the Royal Canadian Mounted Police (RCMP), and more than seventy US state crime laboratories. This breadth of adoption means that STRmix outputs are regularly presented in courts across these jurisdictions, and a substantial body of case law has now accumulated on challenges to STRmix LR outputs.
Several US cases have tested STRmix admissibility. In United States v. Dorsey (E.D. Pa. 2020), the court held that STRmix met the Daubert standard for scientific reliability. In North Carolina v. Swearing (2022), an appeal challenged STRmix validation data and the court upheld admissibility. In all jurisdictions, courts have consistently required that the laboratory's STRmix validation study (covering the specific kit, instrument, and analyst conditions in use) be made available for defence expert review. The New York State Police Forensic Investigation Center and the FBI have both faced successful defence challenges not to STRmix's admissibility per se, but to the adequacy of the validation data produced in discovery, leading to additional validation reporting requirements.
Three platforms developed independently in Pittsburgh, Oslo, and Leiden represent the diversity of approaches that converge on the same goal: a defensible LR from a complex mixture.
TrueAllele (Cybergenetics, Pittsburgh, PA) is the US-developed alternative continuous probabilistic genotyping platform. Developed by Dr Mark Perlin, TrueAllele uses a continuous model similar in concept to STRmix but with a different parameterisation and a different MCMC implementation. It has been in forensic use since approximately 2009 and has been admitted in evidence in courts across the United States, including a landmark early admissibility hearing in Pennsylvania v. Foley (Allegheny County, 2012), where TrueAllele's model was examined by an independent statistician appointed by the court. TrueAllele is available as a cloud-based service, where raw data are uploaded to Cybergenetics' servers for analysis, a workflow that has raised chain-of-custody and data integrity questions in several cases that Cybergenetics has addressed through an audit trail system.
EuroForMix (developed by Oyvind Bleka and colleagues at the Norwegian Institute of Public Health, with contributions from Hinda Haned) is an open-source continuous probabilistic genotyping platform distributed as an R package. Its open-source nature makes it the preferred platform for academic forensic statistics research and for laboratories that want to run independent validation studies. EuroForMix is in operational use in laboratories in Norway, Sweden, Denmark, the Netherlands, Belgium, and several other EU member states. The ENFSI DNA Working Group has published comparative validation data for EuroForMix, STRmix, and LRmix Studio.
LRmix Studio (Hinda Haned, Leiden University Medical Center) was the first widely-adopted probabilistic genotyping tool used in European forensic laboratories and preceded the fully continuous platforms. Its original model is semi-continuous (discrete allele presence/absence rather than modelled peak heights). LRmix Studio was used by the Netherlands Forensic Institute (NFI) and a number of other European laboratories from approximately 2012 and has been presented in courts in the Netherlands, Belgium, and the UK. Its replacement by fully continuous platforms has been gradual; some laboratories continue to use LRmix Studio for lower-complexity two-contributor mixtures where the semi-continuous model performs comparably.
| Platform | Model type | Developer | Key jurisdictions | Open source |
|---|---|---|---|---|
| STRmix | Continuous (MCMC) | ESR, NZ / Forensic Science SA, AU | US (FBI, 70+ state labs), UK, AU, NZ, CA | No (source code under controlled access since 2021) |
| TrueAllele | Continuous (MCMC) | Cybergenetics, US | United States (state and federal courts) | No |
| EuroForMix | Continuous (maximum likelihood; MCMC optional) | Norwegian Institute of Public Health | Norway, Sweden, Denmark, Netherlands, Belgium | Yes (R package on GitHub) |
| LRmix Studio | Semi-continuous (discrete) | Leiden University Medical Center, NL | Netherlands, Belgium, UK (earlier use) | Yes (Java application) |
| TrueAllele (cloud) | Continuous (MCMC) | Cybergenetics, US | US laboratories using cloud submission | No |
The LR a probabilistic genotyping platform outputs for a four-contributor mixture from a touch DNA swab deserves a different kind of confidence than the LR from a two-contributor sexual assault sample, and courts are beginning to notice that difference.
Probabilistic genotyping platforms are not uniformly reliable across all levels of mixture complexity. Validation studies for STRmix and EuroForMix show that LR performance (measured by the rate of false inclusions and false exclusions, and by the correct LR distribution for known contributors and non-contributors) is strong for two-contributor mixtures and good for three-contributor mixtures in most mixture ratio and template conditions. For four-contributor mixtures, particularly those with balanced proportions and low template, LR performance degrades: confidence intervals on the LR widen substantially, the rate of false inclusions increases, and the LR magnitude may be unreliable.
SWGDAM's 2015 guidelines on probabilistic genotyping recommend that laboratories define their validated mixture complexity limits and state those limits explicitly in case reports. A laboratory that has validated STRmix for two and three contributors only should not use it to compute an LR for a four-contributor mixture and then report that LR without qualification. In England and Wales, the Forensic Science Regulator's Codes of Practice and Conduct require that validation scope is disclosed in the expert statement, a requirement that emerged partly from the Jama case (R v. Jama, 2008, Victoria, Australia, where a DNA result was used outside the laboratory's validated parameters, though that case involved a different kind of error than mixture complexity).
The LR threshold for reporting in court is itself a contested issue. SWGDAM recommends that any LR result below a laboratory-defined threshold (often 10, meaning the evidence is ten times more probable under the prosecution hypothesis than under the defence hypothesis) should be treated as inconclusive rather than weakly inculpatory. Some UK laboratories use a verbal equivalence scale: LR above 1 million is "very strong support", LR between 100 and 1 million is "strong support", LR between 10 and 100 is "moderate support", and LR below 10 is "limited support" or inconclusive. This scale was published by Buckleton et al. and has been adopted in modified form by several European national forensic institutes.
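The verbal scale quoted above maps directly onto a lookup function. The sketch below encodes only the four bands given in the text; which band owns each boundary value (exactly 10, 100, or 1 million) is an assumption, and the function name is hypothetical. LRs below 1, which favour the defence hypothesis, are not distinguished by the quoted scale and fall into the lowest band here.

```python
def verbal_equivalent(lr):
    """Map a likelihood ratio to the verbal bands quoted in the text."""
    if lr <= 0:
        raise ValueError("a likelihood ratio must be positive")
    if lr > 1_000_000:
        return "very strong support"
    if lr >= 100:
        return "strong support"
    if lr >= 10:
        return "moderate support"
    return "limited support or inconclusive"


for lr in (3, 45, 2_500, 8.0e9):
    print(lr, "->", verbal_equivalent(lr))
```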
When a mixture profile from a crime scene in one country is compared against a reference sample collected in another, the probabilistic genotyping platform, the population database, and the validation scope all become questions of international forensic procedure.
Cross-border mixture casework is increasingly common in serious crime and terrorism investigations. The INTERPOL DNA Gateway currently accepts only single-source or deduced single-source profiles for its automated hit/no-hit database search. Mixture profiles are transmitted as case intelligence rather than as database search queries, meaning the probabilistic genotyping analysis is typically performed in the country where the evidence was collected, and the resulting LR is shared as part of the case file.
The 2015-2016 Paris and Brussels terrorist attacks generated complex forensic evidence from multiple crime scenes in France and Belgium, with reference samples taken from suspects and their associates in several EU countries. Belgian and French forensic laboratories ran probabilistic genotyping analyses using STRmix and LRmix Studio respectively, and the resulting LR values formed part of the evidence package for coordinated prosecutions across EU jurisdictions under the European Arrest Warrant framework. The case illustrated both the utility and the procedural complexity of cross-border mixture evidence: each laboratory's LR was computed using its own validated parameters and population database, creating results that were not directly comparable and required expert reconciliation.
In India, the forensic DNA landscape for complex mixture casework is still developing. The CFSL (CBI) and the FSL (Delhi) have published validation data for STR multiplex kits including Investigator 24plex and PowerPlex Fusion, but probabilistic genotyping software adoption is not yet uniformly reported across Indian state FSLs. The DNA Technology Bill's regulatory framework includes provisions for approved analytical methods, which in practice would need to address PG software validation as the field matures.
An electropherogram at locus D18S51 shows four alleles: 14, 16, 17, and 19. The minimum number of contributors to this mixture is: