How the Golden State Killer changed forensic biotech: massively parallel sequencing of STRs and SNPs (Verogen MiSeq FGx, Ion S5), the DeAngelo 2018 investigative-genetic-genealogy workflow, the GEDmatch and FamilyTreeDNA opt-in databases, third-cousin-network triangulation, and the US Department of Justice 2019 IGG interim policy that put guard-rails on the technique.
On 24 April 2018, Sacramento law enforcement arrested Joseph James DeAngelo, a 72-year-old former police officer, for a series of murders and sexual assaults committed across California between 1974 and 1986. The breakthrough came not from a hit in the state's criminal CODIS database, where DeAngelo's profile had never appeared, but from a genealogical profile uploaded by a genetic-genealogy investigator to GEDmatch, an open public database populated by people who had used consumer genetic testing kits. The investigators matched crime-scene DNA to distant relatives of DeAngelo in the GEDmatch pool, reconstructed his family tree to identify candidate individuals born in the right era and geography, and confirmed the identification with DNA collected surreptitiously from DeAngelo's car door handle and from a discarded tissue. The case changed forensic science.
The technical underpinning of that investigation was massively parallel sequencing (MPS), the term used in forensic practice for next-generation sequencing (NGS). MPS allowed the crime-scene DNA to be converted into a genome-wide single-nucleotide polymorphism (SNP) profile resembling the output of a consumer autosomal ancestry kit, rather than the 20-locus STR profile used in criminal databases. Consumer ancestry databases match people on shared SNP segments, not STR allele identities. SNP-based matching can therefore identify distant relatives (third and fourth cousins) who share relatively short identical-by-descent (IBD) chromosomal segments, whereas criminal DNA databases are searched for direct matches and, at most, close first-degree relatives.
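To make the contrast with STR matching concrete, the following is a minimal sketch of how a candidate shared segment might be detected from two SNP genotype vectors. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) at ordered SNPs, each with a genetic-map position in centimorgans. The function name `shared_segments` and the 7 cM threshold are illustrative assumptions, not the algorithm used by GEDmatch or any other database.

```python
from typing import List, Tuple


def shared_segments(
    geno_a: List[int],
    geno_b: List[int],
    cm_pos: List[float],
    min_cm: float = 7.0,  # illustrative threshold (assumption)
) -> List[Tuple[float, float]]:
    """Return (start_cM, end_cM) runs in which the two people could share
    at least one allele at every SNP, i.e. no opposite homozygotes (0 vs 2)."""
    if not (len(geno_a) == len(geno_b) == len(cm_pos)):
        raise ValueError("inputs must have equal length")

    segments: List[Tuple[float, float]] = []
    run_start = None  # cM position where the current compatible run began
    last_ok = None    # cM position of the last compatible SNP in the run

    for a, b, pos in zip(geno_a, geno_b, cm_pos):
        if {a, b} == {0, 2}:
            # Opposite homozygotes: the two people cannot share an allele here,
            # so close the current run and keep it if it is long enough.
            if run_start is not None and last_ok - run_start >= min_cm:
                segments.append((run_start, last_ok))
            run_start = None
        else:
            if run_start is None:
                run_start = pos
            last_ok = pos

    # Close a run that extends to the end of the chromosome.
    if run_start is not None and last_ok - run_start >= min_cm:
        segments.append((run_start, last_ok))
    return segments


# Toy data: six SNPs spaced 3 cM apart; the fifth SNP is an opposite homozygote.
toy_segments = shared_segments(
    [0, 0, 1, 2, 2, 0],
    [0, 1, 1, 2, 0, 0],
    [0.0, 3.0, 6.0, 9.0, 12.0, 15.0],
)
```

A real comparison would run over hundreds of thousands of SNPs per person and would need to tolerate genotyping errors. The toy data only shows the principle that relatedness is inferred from long compatible runs rather than from allele identity at a fixed set of loci.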
The DeAngelo arrest triggered both wide adoption and intense ethical scrutiny. Within three years of the arrest, more than 200 US cold cases had been resolved using investigative genetic genealogy (IGG), the formal name for the technique. The US Department of Justice published an Interim Policy on Forensic Genetic Genealogy DNA Analysis in September 2019 that permits IGG only in specified circumstances and requires FBI approval. The UK, Australia, Canada, and the Netherlands have each taken different regulatory positions; the technique sits in a legal and ethical grey zone in most countries that lack an explicit statutory framework.
*MPS does not replace STR profiling for criminal databases; it supplements it by making crime-scene DNA comparable with the tens of millions of SNP profiles held in consumer ancestry databases.*
Forensic MPS for identification purposes currently focuses on two marker classes: STR amplicons sequenced through their full length (rather than sized by fragment length as in capillary electrophoresis), and genome-wide SNP arrays sufficient to generate an autosomal ancestry profile. These two applications use different library-preparation kits, different sequencing instruments, and different downstream bioinformatics pipelines, though they often run on the same physical sequencer in a shared-reagent workflow.
MPS-based STR typing offers several advantages over capillary electrophoresis (CE). Sequence-level resolution distinguishes isoalleles (STR alleles of the same fragment length but different internal sequence) that CE treats as identical, which improves discrimination in mixtures and can help separate true minor-contributor alleles from stutter artefacts. The ForenSeq DNA Signature Prep Kit (Verogen, originally developed by Illumina), run on the Verogen MiSeq FGx, types 27 autosomal STRs, 7 X-STRs, 24 Y-STRs, plus 94 identity-informative SNPs and 56 ancestry-informative SNPs in a single library-preparation workflow. The Precision ID GlobalFiler NGS STR Panel (Thermo Fisher Scientific) runs on the Ion S5 platform and types the 20 CODIS core STR loci plus additional markers. Both kits produce CODIS-compatible allele designations, so the resulting profiles can be uploaded to NDIS, the UK NDNAD, and most European national databases without modification.
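The isoallele point can be illustrated with a small sketch. The two sequences below are hypothetical alleles invented for this example (they are not taken from any real locus or kit), and `ce_call` is an assumed helper that mimics a length-only designation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Two hypothetical alleles at one tetranucleotide STR locus. Both are ten
# repeat units long, but the second carries a TCTA -> TCTG substitution
# in its sixth repeat unit.
allele_1 = "TCTA" * 10
allele_2 = "TCTA" * 5 + "TCTG" + "TCTA" * 4


def ce_call(seq: str, repeat_len: int = 4) -> int:
    """Length-based designation, as capillary electrophoresis would give:
    the number of repeat units, with no view of the internal sequence."""
    return len(seq) // repeat_len


def group_by_ce_call(seqs: Iterable[str]) -> Dict[int, Set[str]]:
    """Group distinct sequences under the length-based allele number."""
    groups: Dict[int, Set[str]] = defaultdict(set)
    for seq in seqs:
        groups[ce_call(seq)].add(seq)
    return dict(groups)


groups = group_by_ce_call([allele_1, allele_2])
for allele_number, sequences in groups.items():
    print(f"allele {allele_number}: {len(sequences)} distinct sequence(s)")
```

Both sequences receive the same length-based designation, yet a sequence-level comparison keeps them apart. That is the extra resolution described above.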
Genome-wide SNP profiling for genealogy requires substantially more input DNA and more sequencing depth. Investigative genetic genealogy operations typically use low-coverage whole-genome sequencing (lcWGS) at 0.1x to 1x coverage to generate a genome-wide SNP set of 300,000 to 1,000,000 SNPs, then convert this to a file format compatible with consumer ancestry-database upload (the .vcf or .csv formats accepted by GEDmatch and FamilyTreeDNA). Companies providing this service commercially include Parabon NanoLabs (Snapshot Forensic; based in Reston, Virginia) and Othram Inc. (The Woodlands, Texas). Both have been engaged by US law enforcement for cold-case investigations and by the Department of Defense DNA Registry for unidentified military-remains casework.
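As a rough guide to what these coverage figures mean, the sketch below estimates the fraction of SNP sites that receive at least one read. It assumes reads fall uniformly at random, so the depth at each site follows a Poisson distribution. This is a textbook simplification that ignores uneven coverage and degraded template, and the function name is illustrative.

```python
import math


def fraction_sites_covered(mean_coverage: float, min_reads: int = 1) -> float:
    """Probability that a site is covered by at least `min_reads` reads,
    assuming per-site depth is Poisson-distributed with the given mean."""
    if mean_coverage < 0 or min_reads < 0:
        raise ValueError("mean_coverage and min_reads must be non-negative")
    # P(depth < min_reads) = sum of Poisson probabilities for k = 0 .. min_reads - 1
    p_fewer = sum(
        math.exp(-mean_coverage) * mean_coverage ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


for coverage in (0.1, 0.5, 1.0):
    print(f"{coverage:.1f}x: {fraction_sites_covered(coverage):.1%} of sites seen at least once")
```

Under this model, about 10% of sites are seen at 0.1x and about 63% at 1x. The lower end of the coverage range therefore leaves most individual sites unobserved.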
*The genealogist's task was not to identify DeAngelo; it was to identify enough of his cousins that the intersection of their family trees pointed to one man.*
The Golden State Killer investigation illustrates the full IGG workflow. The original STR profile from a crime-scene sample had been in the California CODIS system for years without a hit. In 2017, investigator Paul Holes contracted Parabon NanoLabs to generate a phenotype prediction from the crime-scene DNA (the Snapshot facial-composite service). In early 2018, Holes engaged genetic genealogist Barbara Rae-Venter (who had helped resolve a Jane Doe cold case using GEDmatch in 2017) to run the IGG analysis.
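The workflow Rae-Venter followed has since been formalised: generate a genome-wide SNP profile from the crime-scene DNA, upload it to a genealogy database that permits the comparison, identify distant relatives in the match list, reconstruct their family trees to find candidates born in the right era and geography, and confirm the lead against a DNA sample from the candidate.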
The William Earl Talbott II case (resolved in 2019) followed essentially the same workflow and produced the first IGG-enabled conviction when Talbott was found guilty of murdering a young couple in Washington State in 1987. The Christy Mirack case (December 1992 murder in Lancaster County, Pennsylvania) was resolved in June 2018, weeks after DeAngelo, using GEDmatch and a family tree that traced back through Mirack's killer's maternal line. In the Jane Doe Buckskin Girl case (Ohio, victim killed around 1981, identified in 2018), IGG was used not to identify a perpetrator but to put a name to a long-unidentified victim, demonstrating that the technique works symmetrically for victim identification as well as perpetrator identification.
*The legal and ethical status of an upload to GEDmatch by law enforcement has changed three times in five years.*
GEDmatch was founded in 2010 as a free genealogy utility that allowed people to upload DNA data files from any consumer genetic testing company (23andMe, AncestryDNA, MyHeritage, etc.) and compare them against other users' uploads. By 2018 it had accumulated roughly 1 million profiles, all voluntarily uploaded. The site's terms of service in 2018 said nothing explicit about law-enforcement access; Parabon and Rae-Venter used GEDmatch under an informal policy that the platform permitted police use for violent-crime cases.
The privacy controversy following the DeAngelo arrest eventually forced a policy revision. In May 2019, GEDmatch switched to an opt-in model for law-enforcement matching: all existing profiles became invisible to law-enforcement searches unless the user explicitly opted in. This change is generally estimated to have reduced the law-enforcement-accessible database from roughly 1 million profiles to approximately 280,000, materially reducing the reach of the technique for unknown donors whose relatives had not opted in. Verogen acquired GEDmatch in December 2019; it maintains the opt-in architecture and a separate GEDmatch PRO tier for law-enforcement use.
FamilyTreeDNA (FTDNA), a Houston-based consumer testing company, announced in February 2019 that it had been cooperating with the FBI in uploading law-enforcement profiles to its database. This revelation also prompted a policy revision: FTDNA now operates an opt-in system for law-enforcement matching, and the company publishes a transparency report on the number of law-enforcement profiles active in the database.
The largest consumer databases sit outside the reach of IGG altogether: AncestryDNA (currently the largest at over 20 million profiles) and 23andMe (over 10 million profiles) do not accept law-enforcement uploads and require a valid court order or other legal process before releasing any data about their users. This boundary is maintained by company policy and terms of service; in the US there is no federal statute that compels consumer genetics companies to accept law-enforcement profile uploads.
*The power of IGG is that it does not need to find the suspect's own profile in the database; it needs only to find enough of the suspect's cousins.*
The mathematical logic of third-cousin triangulation is grounded in IBD segment-sharing statistics. Two individuals share an IBD segment when they have both inherited a particular chromosomal region from a common ancestor without an intervening recombination event breaking it. The expected amount of shared IBD halves with each additional meiosis separating two people, so it falls by a factor of about four with each additional degree of cousinship: first cousins share about 12.5% of their autosomal DNA on average, second cousins about 3.1%, and third cousins about 0.8%.
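The halving rule can be made concrete with a short calculation. The sketch below assumes full cousins descended from one ancestral couple, so two common ancestors each contribute a path of 2(n+1) meioses for n-th cousins. The 6,800 cM total is an approximate figure for both copies of the autosomes combined, of the kind consumer sites report, and is an assumption of this example rather than a value from the text.

```python
def expected_share(cousin_degree: int) -> float:
    """Expected fraction of autosomal DNA shared IBD by full n-th cousins.

    The path between n-th cousins through each common ancestor crosses
    2 * (n + 1) meioses, and there are two common ancestors (a couple).
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    meioses = 2 * (cousin_degree + 1)
    return 2 * 0.5 ** meioses


TOTAL_CM = 6800.0  # approximate diploid autosomal map length (assumption)

for degree in range(1, 5):
    share = expected_share(degree)
    print(f"cousin degree {degree}: {share:.3%} expected, ~{share * TOTAL_CM:.0f} cM")
```

These are averages. Actual sharing varies widely around them because recombination is random, and that variation grows in relative terms as the relationship becomes more distant.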
At the third-cousin level, an unknown individual (the suspect) will appear in a consumer database match list if one or more of their third cousins has uploaded their own profile. Three to five strong third-cousin matches, each traceable to a different common great-great-grandparent pair, are usually sufficient to triangulate the common ancestor pool and narrow the candidate list. Because each American with European ancestry has on average 190 third cousins, the probability that at least one of them is in the approximately 1.3-million-profile GEDmatch database is high for profiles with Northern European ancestry, although only the opt-in subset of that database is visible to law-enforcement searches.
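A back-of-the-envelope version of this probability argument is sketched below. It assumes each third cousin is in the database independently, with probability equal to the database's share of a reference population. The 200-million reference population is a purely illustrative assumption (it is not a figure from this article), and the model ignores the fact that relatives tend to test together and that some third cousins share no detectable DNA.

```python
def p_at_least_one_relative(n_relatives: int, db_profiles: int, population: int) -> float:
    """Probability that at least one of `n_relatives` is in the database,
    assuming independent membership with probability db_profiles / population."""
    if population <= 0 or not 0 <= db_profiles <= population:
        raise ValueError("need 0 <= db_profiles <= population and population > 0")
    if n_relatives < 0:
        raise ValueError("n_relatives must be non-negative")
    p_single = db_profiles / population
    return 1.0 - (1.0 - p_single) ** n_relatives


REFERENCE_POPULATION = 200_000_000  # hypothetical, for illustration only
THIRD_COUSINS = 190                 # average quoted in the text

p_full = p_at_least_one_relative(THIRD_COUSINS, 1_300_000, REFERENCE_POPULATION)
p_opt_in = p_at_least_one_relative(THIRD_COUSINS, 300_000, REFERENCE_POPULATION)
print(f"full database: {p_full:.2f}, opt-in subset: {p_opt_in:.2f}")
```

Under these assumptions the chance of at least one third-cousin hit is roughly 0.7 against the full database and roughly 0.25 against a 300,000-profile opt-in subset. That gap illustrates why the opt-in change mattered so much for casework.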
The genealogical reconstruction work is intensive: for the DeAngelo case, Rae-Venter reportedly built family trees covering thousands of descendants to eliminate candidates and isolate DeAngelo. This work is now routinely conducted by a small number of certified genetic genealogists who specialise in IGG, including teams at Parabon NanoLabs, Othram, and independent practitioners trained through the Association of Professional Genealogists.
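The elimination step can be pictured as a set intersection over reconstructed trees. The sketch below is a toy model: the tree, the names, and the birth years are all invented, and `triangulate` is a hypothetical helper, not a description of any practitioner's tooling. It assumes the genealogist has already traced two match clusters back to two different ancestral couples and is looking for people who descend from both.

```python
from typing import Dict, Iterable, List, Set, Tuple


def descendants(children: Dict[str, List[str]], ancestor: str) -> Set[str]:
    """All descendants of `ancestor` in a parent -> children mapping."""
    found: Set[str] = set()
    stack = [ancestor]
    while stack:
        person = stack.pop()
        for child in children.get(person, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def triangulate(
    children: Dict[str, List[str]],
    ancestors: Iterable[str],
    birth_year: Dict[str, int],
    year_range: Tuple[int, int],
) -> Set[str]:
    """People descended from every listed ancestor and born within year_range.
    People with no recorded birth year are excluded."""
    descendant_sets = [descendants(children, a) for a in ancestors]
    if not descendant_sets:
        return set()
    common = set.intersection(*descendant_sets)
    low, high = year_range
    return {p for p in common if p in birth_year and low <= birth_year[p] <= high}


# Invented tree: a1 (from couple_A) married b1 (from couple_B).
toy_children = {
    "couple_A": ["a1", "a2"],
    "couple_B": ["b1", "b2"],
    "a1": ["p", "q"],
    "b1": ["p", "q"],
    "a2": ["a3"],
    "b2": ["b3"],
    "q": ["r"],
}
toy_birth_year = {"p": 1945, "q": 1950, "r": 1975, "a3": 1948, "b3": 1952}

candidates = triangulate(toy_children, ["couple_A", "couple_B"], toy_birth_year, (1940, 1960))
print(sorted(candidates))
```

In this toy tree only the children and grandchild of the a1 and b1 marriage descend from both couples, and the birth-year filter removes the grandchild. Real casework adds further filters such as sex and geography before any candidate is approached for a confirmatory sample.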
The Phoenix Canal murders (2015-2016, resolved in 2020 using IGG) involved a suspect of mixed heritage whose match network in GEDmatch was sparser than in the DeAngelo case, requiring more extensive family-tree reconstruction before a candidate emerged. This case demonstrated that the technique functions below the ideal density of third-cousin matches, but at the cost of substantially more genealogical labour.
*A policy designed specifically for IGG took 17 months from the first high-profile arrest to publication, which itself reflects how fast the technology outran the framework.*
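The US Department of Justice published its Interim Policy on Forensic Genetic Genealogy DNA Analysis on 24 September 2019. The policy applies to all federal law-enforcement agencies and to state and local agencies receiving federal funding. Its key provisions restrict IGG to violent crimes (such as homicide and sexual assault) and unidentified human remains, require that a conventional CODIS search has first failed to produce a match, limit searches to genealogy services that have notified their users of law-enforcement use, and treat a genealogy match as an investigative lead that must be confirmed by direct STR comparison before an arrest.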
The policy is explicitly interim: it is a guidance document, not a statute, and may be amended. Several US states have enacted or proposed state-level legislation that either restricts IGG further (Maryland, Montana) or explicitly authorises it under the DOJ conditions (Virginia).
International divergence is significant. Australia's Attorney-General's Department declined to adopt IGG, citing privacy law incompatibilities with the Privacy Act 1988. The UK Home Office has not authorised IGG use for domestic criminal investigations; the Forensic Science Regulator's office has noted that uploading to GEDmatch from a UK crime scene would likely require a legal basis under the Data Protection Act 2018 and Human Rights Act 1998 that does not currently exist for routine police use. The Netherlands National Police has used Bonaparte for kinship matching in DVI but has not publicly used IGG for criminal suspects. Canada's Privacy Commissioner has raised objections to IGG use that have not been resolved by formal legislation.
India has no specific policy or guidance on IGG. The Bharatiya Sakshya Adhiniyam 2023 (the new Evidence Act, replacing the Indian Evidence Act 1872) governs opinion evidence including DNA evidence, but does not address IGG or genealogy-database searches specifically. Consumer genetic testing uptake in India remains relatively low compared with the US, making large-scale third-cousin matching in existing databases practically limited even if legal questions were resolved.
*Not all cold cases with DNA are amenable to IGG; understanding the technique's limits is as important as understanding its power.*
The Anne Marie Fahey murder (Delaware, 1996): The case involves the murder of Governor Thomas Carper's scheduling secretary by Thomas Capano, a prominent Wilmington attorney. Capano was convicted in 1999 and died in prison in 2011. The case is not an IGG case, but it is routinely cited in the forensic-genealogy literature as an example of a conviction built on circumstantial evidence plus trace DNA that predates the CODIS era, where a genealogy-database search would have been theoretically applicable had the technology existed. Its inclusion in IGG training curricula illustrates the retrospective power of the technique.
The JonBenet Ramsey case (Colorado, 1996): Unidentified DNA from the crime scene has been in the Colorado CODIS system for decades without a hit. In 2024, a Boulder District Attorney's team publicly confirmed that IGG analysis had been applied to the unidentified DNA. Reports indicated that the SNP profile generated from the crime-scene swab had not produced the kind of strong third-cousin match network that the DeAngelo case yielded, likely due to limited genetic representation of the donor population in the opt-in GEDmatch database or due to DNA quality limitations in the 28-year-old sample. The case illustrates the practical ceiling of IGG: even a high-priority, high-resource investigation with DNA quality sufficient for genome-wide SNP profiling may not produce an actionable match if the suspect's relatives are not adequately represented in the accessible opt-in databases.
The JonBenet 2024 reporting also prompted a broader discussion about the reporting threshold for IGG non-results: when should an investigating agency disclose that IGG was attempted but inconclusive, and what obligation does that disclosure create regarding the management of the genealogical data generated from the crime scene?
| Platform | Database size (approx.) | Law-enforcement access model | Jurisdiction | Notable casework |
|---|---|---|---|---|
| GEDmatch PRO (Verogen) | 1.3M+ profiles (opt-in subset ~300K) | Opt-in; violent crimes + unidentified remains only | US policy (DOJ 2019) | DeAngelo 2018, Talbott 2019, Mirack 2018 |
| FamilyTreeDNA | 4M+ profiles (opt-in subset) | Opt-in; formal FBI program | US policy (DOJ 2019) | Multiple unresolved homicide cold cases |
| AncestryDNA | 20M+ profiles | No law-enforcement uploads; requires legal process for data | Company policy; all jurisdictions | Not used for IGG |
| 23andMe | 10M+ profiles | No law-enforcement uploads; requires legal process | Company policy; all jurisdictions | Not used for IGG |
| Parabon Snapshot / Othram FORCE | No public database; generates SNP profile from crime-scene DNA | Direct law-enforcement service | US; international on request | Phoenix Canal 2020, Buckskin Girl 2018 |