A guide to the major reference databases used to identify species, assign geographic origins, and link wildlife trade seizures, covering DNA barcoding libraries, STR databases, trade intelligence systems, and the significant coverage gaps that still exist.
A wildlife forensic identification is only as strong as the reference it is compared against. Sequence a mitochondrial gene from a fragment of bone and you have a string of base pairs. That string means nothing until you compare it against a database of sequences from known species and ask which match is closest. The same logic applies to an STR profile from a rhino horn, a trade permit number on a live bird consignment, or a seizure record from a port. Every forensic question in wildlife crime has a database behind the answer, and knowing which databases exist, what they cover, and where they fail is foundational knowledge.
These databases fall into three broad categories. Genetic reference libraries hold sequences or profiles from specimens of known species and, in some cases, known geographic populations. Trade intelligence databases track legal and illegal movement of wildlife through the permit system and seizure records. Specialised forensic databases for particular taxa, like RhODIS for African rhinos and ElePhant for elephants, combine both: they hold individual-level genetic profiles and link those profiles to seizure histories and origin data.
This topic covers the principal databases in each category, what an analyst can get out of each one, how they connect to casework, and the coverage gaps that represent real limits on what wildlife forensics can currently prove in court. Understanding these limits is not defeatist; it is what allows a forensic expert to qualify their conclusions honestly rather than overstating a match.
The global default for first-pass species ID from a short DNA sequence.
DNA barcoding standardised around the mitochondrial cytochrome c oxidase I (COI) gene for animals, and around rbcL and matK for plants. The Barcode of Life Data System (BOLD) at the University of Guelph holds more than nine million sequences from more than 300,000 species, making it the single largest reference library for barcode-based species identification. An unknown sequence is submitted to the BOLD Identification Engine, and the system returns the closest matches with percentage identity and a statistical confidence measure.
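The comparison step can be illustrated with a minimal sketch. It assumes the query and reference sequences are already aligned to the same length, and it uses three invented 20-base toy references rather than real COI barcodes. The function names are hypothetical, and this is not how the BOLD Identification Engine is implemented; it only shows what "closest match by percentage identity" means.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def rank_matches(query: str, references: dict[str, str]) -> list[tuple[str, float]]:
    """Score the query against every reference and sort best match first."""
    scored = [(species, percent_identity(query, seq)) for species, seq in references.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy references (20 bases each, NOT real COI sequences).
TOY_REFERENCES = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTACGTACGTACGTTTTT",
    "Species C": "TTTTACGTACGAACGTTTTT",
}

query = "ACGTACGTACGTACGTACGA"
ranking = rank_matches(query, TOY_REFERENCES)
for species, identity in ranking:
    print(f"{species}: {identity:.1f}% identity")
```

The ranking always has a top entry, even when every reference is a poor match, which is why the percentage and the database coverage matter more than the rank alone.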
In casework, BOLD is used as a first-pass screen. A result with 99% or better COI identity to a single species typically supports a species-level identification; identity between 95% and 99% may support only a genus-level assignment; and anything below 95% is at best a family-level match. The analyst's report must state which threshold applied and how well the database covers the taxon in question. A high-confidence match to the closest available sequence is not a high-confidence match to the actual source species if that species is absent from the database.
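The thresholds described above can be written down as a small lookup. This is a sketch of the rule of thumb in the text, not a validated casework standard: the cut-offs are the ones quoted here, and the function name is hypothetical.

```python
def supported_rank(identity_pct: float) -> str:
    """Map a COI percentage identity to the taxonomic rank it typically supports.

    Cut-offs follow the rule of thumb in the text (99% and 95%); a laboratory's
    validated thresholds may differ by taxon.
    """
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if identity_pct >= 99.0:
        return "species"
    if identity_pct >= 95.0:
        return "genus"
    return "family at best"


for value in (99.4, 97.0, 90.0):
    print(f"{value:.1f}% -> {supported_rank(value)}")
```

The returned rank is only the ceiling the match can support; it says nothing about whether the true source species is represented in the reference library.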
A broader but less curated repository used alongside BOLD.
GenBank, maintained by the US National Center for Biotechnology Information, is not a forensic database. It is a general-purpose repository for nucleotide sequences and holds far more sequences than BOLD, but with less curation. Any researcher can deposit any sequence with any species annotation, and errors in species identification or geographic attribution can propagate without systematic correction.
Wildlife forensic analysts use GenBank and its partner databases (the European Nucleotide Archive at EMBL-EBI and DDBJ in Japan, the three forming the International Nucleotide Sequence Database Collaboration) as a complement to BOLD when a taxon has poor barcoding coverage but has published sequences from other gene regions in the scientific literature. For multi-gene analyses or whole-mitogenome approaches, GenBank is often the only resource available. The quality caveat remains: any match from GenBank needs the original accession record checked for specimen voucher information and the depositor's identification credentials.
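As a rough illustration of the voucher check, the sketch below scans the text of a GenBank-style flat-file record for the `/specimen_voucher` qualifier, which INSDC records use in the source feature. The two records, their accession labels, and the organism name are invented and heavily simplified; a real workflow would fetch and parse the full record with an established parser rather than a regular expression.

```python
import re

# The /specimen_voucher qualifier appears in the source feature of INSDC records.
VOUCHER_RE = re.compile(r'/specimen_voucher="([^"]+)"')


def specimen_vouchers(flatfile_text: str) -> list[str]:
    """Return every specimen voucher string found in a flat-file record."""
    return VOUCHER_RE.findall(flatfile_text)


# Hypothetical, simplified records for illustration only.
RECORD_WITH_VOUCHER = '''\
LOCUS       XX000001  658 bp  DNA
FEATURES             Location/Qualifiers
     source          1..658
                     /organism="Examplea demonstrata"
                     /specimen_voucher="MUSEUM:Herp:12345"
'''

RECORD_WITHOUT_VOUCHER = '''\
LOCUS       XX000002  658 bp  DNA
FEATURES             Location/Qualifiers
     source          1..658
                     /organism="Examplea demonstrata"
'''

for label, text in (("XX000001", RECORD_WITH_VOUCHER), ("XX000002", RECORD_WITHOUT_VOUCHER)):
    vouchers = specimen_vouchers(text)
    status = ", ".join(vouchers) if vouchers else "no voucher recorded; treat the match with caution"
    print(f"{label}: {status}")
```

A voucher string only shows that a specimen was deposited; whether it was identified by someone qualified still has to be checked by a person.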
Population assignment and individual matching for two flagship species.
For most wildlife species, DNA identification gets to species level and stops. For elephants and African rhinoceroses, the forensic DNA infrastructure has been built out to the population and, in some cases, individual level. These two databases are the model for what species-specific forensic databases can achieve when there is sufficient institutional investment.
ElePhant holds microsatellite (STR) profiles from hundreds of elephants sampled across African and Asian range states, plus profiles from large ivory seizures analysed in research by Samuel Wasser's laboratory at the University of Washington. When a new ivory seizure is profiled, the STR data are compared against the geographic reference populations to assign the seized material to a continental region and, with sufficient data, to a specific range state. This is not a match to a single identified animal; it is a probabilistic assignment to a population based on allele frequency distributions. In the Wasser et al. seizure studies, this approach consistently mapped large-scale ivory flows to central and east African source populations despite ivory having been traded and laundered across multiple countries.
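A probabilistic assignment from allele frequencies can be sketched as follows. This is a deliberately simplified model, not the method used in the published ivory studies: it assumes Hardy-Weinberg proportions, independent loci, equal prior weight for each candidate population, and a small floor frequency for alleles unseen in a reference sample. The loci, alleles, frequencies, and region names are all invented.

```python
import math

Genotype = dict[str, tuple[int, int]]          # locus -> pair of alleles
AlleleFreqs = dict[str, dict[int, float]]      # locus -> allele -> frequency


def genotype_log_likelihood(genotype: Genotype, freqs: AlleleFreqs, floor: float = 0.001) -> float:
    """Log-likelihood of a genotype in one population under Hardy-Weinberg proportions."""
    total = 0.0
    for locus, (a1, a2) in genotype.items():
        p = max(freqs[locus].get(a1, 0.0), floor)   # floor avoids log(0) for unseen alleles
        q = max(freqs[locus].get(a2, 0.0), floor)
        prob = p * p if a1 == a2 else 2 * p * q
        total += math.log(prob)
    return total


def assignment_posteriors(genotype: Genotype, populations: dict[str, AlleleFreqs]) -> dict[str, float]:
    """Relative support for each population, assuming equal priors."""
    scores = {name: genotype_log_likelihood(genotype, f) for name, f in populations.items()}
    best = max(scores.values())
    weights = {name: math.exp(score - best) for name, score in scores.items()}
    norm = sum(weights.values())
    return {name: weight / norm for name, weight in weights.items()}


# Hypothetical reference populations and sample.
POPULATIONS = {
    "Region X": {"L1": {10: 0.7, 12: 0.3}, "L2": {20: 0.6, 22: 0.4}},
    "Region Y": {"L1": {10: 0.2, 12: 0.8}, "L2": {20: 0.1, 22: 0.9}},
}
sample = {"L1": (10, 10), "L2": (20, 22)}

posteriors = assignment_posteriors(sample, POPULATIONS)
for name, value in sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

The output is relative support among the populations that were sampled. If the true source population is missing from the reference set, the sketch will still favour whichever sampled population is most similar.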
RhODIS was developed in South Africa to address a specific enforcement problem: South Africa has the world's largest population of white rhinos and faces the highest volume of horn poaching. A national database of individual STR profiles from live-sampled, registered animals means that a seized horn can be matched to a known individual and, through that animal, linked to a specific reserve, a registered owner, and, where the animal was already dead or reported missing, an existing investigation. Profiles from living animals, from post-mortem sampling after poaching events, and from horn seizures are added continuously.
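Individual-level matching differs from population assignment in that the evidence profile is compared locus by locus against registered animals. The sketch below assumes a hypothetical registry layout (`animal_id`, `reserve`, `profile`) and invented loci and alleles; it does not reflect the actual RhODIS schema or marker panel. It compares only loci typed in both profiles and requires a minimum number of shared loci before declaring a candidate match.

```python
Profile = dict[str, tuple[int, int]]   # locus -> pair of alleles


def normalise(profile: Profile) -> Profile:
    """Sort each allele pair so (10, 8) and (8, 10) compare equal."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in profile.items()}


def find_matches(evidence: Profile, registry: list[dict], min_shared_loci: int = 3) -> list[tuple[str, str, int]]:
    """Return (animal_id, reserve, loci compared) for every registered animal
    whose genotype agrees with the evidence at all shared loci."""
    ev = normalise(evidence)
    hits = []
    for record in registry:
        ref = normalise(record["profile"])
        shared = ev.keys() & ref.keys()
        if len(shared) < min_shared_loci:
            continue  # too few loci in common to say anything useful
        if all(ev[locus] == ref[locus] for locus in shared):
            hits.append((record["animal_id"], record["reserve"], len(shared)))
    return hits


# Hypothetical registry entries and evidence profile.
REGISTRY = [
    {"animal_id": "R-001", "reserve": "Reserve A",
     "profile": {"L1": (8, 10), "L2": (12, 12), "L3": (5, 7)}},
    {"animal_id": "R-002", "reserve": "Reserve B",
     "profile": {"L1": (8, 10), "L2": (12, 14), "L3": (5, 7)}},
]
horn_profile = {"L1": (10, 8), "L2": (12, 12), "L3": (7, 5)}

hits = find_matches(horn_profile, REGISTRY)
print(hits)
```

An exact match on a handful of loci is weak evidence on its own. Casework would also report how likely such a match is by chance, which depends on allele frequencies and the number of loci typed.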
Following the legal supply chain to find what does not belong.
The CITES trade database, managed by UNEP-WCMC on behalf of the CITES Secretariat, holds records of all legal trade in listed species as reported annually by state parties. By 2025 it contained over 23 million trade records going back to the 1970s. These records cover live animals, plants, and their derivatives, classified by taxon, quantity, trade purpose, and the importing and exporting countries.
For enforcement, the database serves several functions. Permit verification: a permit number accompanying a consignment can be cross-checked against known issued permits to detect forgeries or permits reused across multiple shipments. Trade pattern analysis: an unusually high reported export volume from a country with limited habitat for a species may indicate that wild-caught animals are being laundered through captive-breeding facilities. Anomaly detection: a species that shows near-zero reported trade but appears repeatedly in seizure records signals illegal trade that bypasses the permit system altogether.
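Two of these checks reduce to simple set and count operations, sketched below. The record fields (`shipment_id`, `permit_no`), the permit numbers, the taxon labels, and the numeric cut-offs are all hypothetical; they do not reflect the real CITES trade database schema or any agreed enforcement threshold.

```python
from collections import defaultdict


def reused_permits(shipments: list[dict]) -> dict[str, list[str]]:
    """Permit numbers that accompany more than one shipment."""
    by_permit = defaultdict(list)
    for shipment in shipments:
        by_permit[shipment["permit_no"]].append(shipment["shipment_id"])
    return {permit: ids for permit, ids in by_permit.items() if len(ids) > 1}


def unknown_permits(shipments: list[dict], issued: set[str]) -> list[str]:
    """Permit numbers seen on shipments but absent from the issued list."""
    return sorted({s["permit_no"] for s in shipments} - issued)


def seized_but_rarely_traded(reported_trade: dict[str, int], seizures: dict[str, int],
                             max_reported: int = 5, min_seizures: int = 3) -> list[str]:
    """Taxa with repeated seizures but near-zero reported legal trade.

    Both cut-offs are illustrative assumptions, not established thresholds.
    """
    return sorted(
        taxon for taxon, count in seizures.items()
        if count >= min_seizures and reported_trade.get(taxon, 0) <= max_reported
    )


# Hypothetical data.
SHIPMENTS = [
    {"shipment_id": "S1", "permit_no": "P-100"},
    {"shipment_id": "S2", "permit_no": "P-100"},
    {"shipment_id": "S3", "permit_no": "P-999"},
]
ISSUED = {"P-100", "P-200"}
REPORTED_TRADE = {"Taxon A": 1200, "Taxon B": 0}
SEIZURES = {"Taxon A": 4, "Taxon B": 6, "Taxon C": 1}

print(reused_permits(SHIPMENTS))
print(unknown_permits(SHIPMENTS, ISSUED))
print(seized_but_rarely_traded(REPORTED_TRADE, SEIZURES))
```

Each flag is a lead for an investigator, not a finding: a reused number can be a clerical error, and low reported trade can reflect late or missing annual reports.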
Connecting seizure dots across borders.
Genetic databases tell you what a specimen is and where it came from. Seizure and legal databases tell you whether a similar specimen was caught before, by whom, on what route, and what happened in court. WildCAP (Wildlife Contraband Analysis and Profiling) was developed as a shared logging tool for wildlife enforcement agencies. An agency logs details of a seized consignment, and investigators in other jurisdictions can query whether the same route, species, or quantity profile appears in other seizure records.
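The cross-jurisdiction query can be sketched as a grouping problem: collect seizures that share a species and route, then keep the groups that span more than one jurisdiction. The field names and records below are hypothetical and are not taken from any real seizure-logging system.

```python
from collections import defaultdict


def linked_seizures(seizures: list[dict]) -> dict[tuple[str, str], list[str]]:
    """Group seizures by (species, route) and keep groups seen in two or more jurisdictions."""
    groups = defaultdict(list)
    for seizure in seizures:
        groups[(seizure["species"], seizure["route"])].append(seizure)

    links = {}
    for key, records in groups.items():
        jurisdictions = {record["jurisdiction"] for record in records}
        if len(jurisdictions) > 1:
            links[key] = sorted(record["seizure_id"] for record in records)
    return links


# Hypothetical seizure log entries.
SEIZURE_LOG = [
    {"seizure_id": "Z1", "jurisdiction": "Country 1", "species": "Taxon A", "route": "Port P -> Port Q"},
    {"seizure_id": "Z2", "jurisdiction": "Country 2", "species": "Taxon A", "route": "Port P -> Port Q"},
    {"seizure_id": "Z3", "jurisdiction": "Country 1", "species": "Taxon B", "route": "Port P -> Port Q"},
    {"seizure_id": "Z4", "jurisdiction": "Country 1", "species": "Taxon B", "route": "Port P -> Port Q"},
]

links = linked_seizures(SEIZURE_LOG)
print(links)
```

Exact matching on species and route is the simplest possible linking rule. A working system would likely need fuzzier matching on route spellings, quantities, and concealment methods.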
UNODC's SHERLOC platform indexes national legislation, case law, and court outcomes across multiple crime types including wildlife trafficking. The wildlife module gives prosecutors and investigators access to how courts in other jurisdictions have handled similar cases, what legal arguments have succeeded or failed, and what penalties have been applied. This kind of legal intelligence is particularly useful in countries where wildlife crime prosecution is new and courts have little precedent to draw on.
| Database | Type | Primary user | Key output |
|---|---|---|---|
| BOLD Systems | Genetic (barcode) | Laboratory analyst | Species ID by COI sequence similarity |
| GenBank / INSDC | Genetic (multi-gene) | Laboratory analyst | Broader sequence comparison for non-barcoded taxa |
| ElePhant | Genetic (STR) | Wildlife forensic specialist | Ivory population assignment to range state |
| RhODIS | Genetic (STR, individual) | Wildlife forensic specialist | Horn-to-animal match and population assignment |
| CITES trade database | Trade intelligence | Enforcement officer / analyst | Permit verification and trade pattern anomalies |
| WildCAP | Seizure intelligence | Enforcement officer | Cross-border seizure pattern linking |
| SHERLOC wildlife module | Legal intelligence | Prosecutor / investigator | Case law, legislation, court outcomes |
The database answer is only as reliable as the reference behind it.
Wildlife forensics can currently give species-level DNA identification for most commonly traded vertebrates. For less-studied taxa, the picture is much worse. Large proportions of traded invertebrates, tropical timber species, and Neotropical fauna have few or no COI reference sequences in BOLD. If the actual source species is absent from the database, a search will still return its nearest sequenced relative, often at a high percentage identity, and an analyst who does not know the database coverage for that taxon may report a false species identification.
The practical consequence is that a forensic expert must always report not only the match result but the coverage completeness for the taxon queried. Saying a sample matches species X at 99% COI identity is a stronger statement if X is a well-sampled bird than if X is the only described species in a poorly sampled tropical genus. Courts and lawyers who understand this distinction will scrutinise it; those who do not may accept a weaker identification as more certain than it is.
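One way to make the coverage statement routine is to generate it alongside the match itself. The sketch below assumes the analyst already knows how many described species in the genus have reference sequences; the 50% cut-off for flagging low coverage, the function names, and the example genus are all hypothetical.

```python
def coverage_note(genus: str, sequenced: int, described: int, low_coverage_cutoff: float = 0.5) -> str:
    """Describe reference coverage for a genus and flag it when coverage is low.

    The 0.5 cut-off is an illustrative assumption, not an accepted standard.
    """
    if described <= 0 or not 0 <= sequenced <= described:
        raise ValueError("need 0 <= sequenced <= described and described > 0")
    fraction = sequenced / described
    note = f"{sequenced} of {described} described {genus} species have reference sequences ({fraction:.0%})"
    if fraction < low_coverage_cutoff:
        note += "; the source species may be absent from the database"
    return note


def report_line(species: str, identity_pct: float, genus: str, sequenced: int, described: int) -> str:
    """Combine the match result and the coverage statement in one sentence."""
    return (f"Closest match: {species} at {identity_pct:.1f}% COI identity. "
            f"Coverage: {coverage_note(genus, sequenced, described)}.")


# Hypothetical examples: the same identity, very different evidential weight.
poorly_sampled = report_line("Examplea demonstrata", 99.2, "Examplea", 1, 6)
well_sampled = report_line("Examplea demonstrata", 99.2, "Examplea", 9, 10)
print(poorly_sampled)
print(well_sampled)
```

Both lines report the same 99.2% identity, but only the second comes from a genus where most candidate species could have been ruled out.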