Paternity, Complex Kinship and Missing-Persons Casework

Statistical kinship at the bench: the trio paternity index, the duo (motherless) case, sibling and grandparental indices, kinship LR software (Familias, DNA-VIEW, FaSTaR, EasyDNA), and the missing-persons reference-sample design that anchors national missing-persons databases like NamUs (US), the UK Missing Persons Unit, and India's Track Child platform.

Last updated: 18 Jun 2026

A forensic DNA laboratory receives two categories of kinship cases that share the same statistical engine but sit at opposite ends of the emotional and legal spectrum. In paternity testing, the question is whether a named man is the biological father of a named child. In missing-persons casework, the question is whether unidentified human remains, or a living unidentified person, can be linked genetically to a family whose relative disappeared months or years earlier. Both reduce to the same computation: a likelihood ratio comparing the probability of the observed genotype data under a proposed biological relationship against its probability under the alternative of unrelatedness.

Key takeaways

The trio paternity index (PI) is the likelihood ratio at each locus comparing true paternity against a random man; the Combined Paternity Index (CPI) is the product across all typed loci, typically reaching tens of millions for a true father-child pair.
The motherless duo case generally returns a lower CPI than the trio because each locus denominator must integrate over all possible maternal contributions from the population.
AABB accreditation standards (US) require a minimum CPI of 100 before any paternity inclusion can be reported; ENFSI DNA WG guidelines recommend a Probability of Paternity above 99.99%.
Familias (Norway), DNA-VIEW (US), FaSTaR, and EasyDNA are the four dominant kinship LR software platforms; Familias is the ISFG reference platform for kinship workshops.
For consanguineous populations, ISFG guidelines recommend a sensitivity analysis across a range of inbreeding coefficient (F) values rather than a single-point LR estimate.

The trio paternity case, in which mother, child, and alleged father are all genotyped, is the simplest kinship problem and the one that established the field. Its extension to the duo or motherless case, where only child and alleged father are tested, is arithmetically more demanding and more susceptible to error when allele frequencies are imprecise or the population is endogamous. Beyond paternity, sibling identification, grandparental analysis, half-sibling discrimination, and the reconstruction of complex pedigrees from a set of reference samples collected during a disaster or a missing-persons investigation all extend the same LR framework into territory where software is no longer optional.

Large-scale operational missing-persons programs that accept DNA reference samples from families run on three continents: the US National Missing and Unidentified Persons System (NamUs), the UK Missing Persons Unit coordinated through the National Crime Agency, and India's Track Child portal managed by the Ministry of Women and Child Development in partnership with NCRB. Each program has different DNA-intake protocols, different laboratory networks, and different thresholds for reporting a candidate match. The mass-casualty end of this spectrum, where hundreds of families submit reference samples simultaneously, is covered in disaster victim identification: INTERPOL process and casework. Across all three, the quality of the reference-sample design (which relatives are sampled, which loci are typed, and which population database supplies the allele frequencies) determines whether a match is reportable or inconclusive.

The Trio Paternity Index: Mathematics and Interpretation

*The paternity index number on a report is not a probability; it is a likelihood ratio, and the distinction matters enormously in court.*

The paternity index (PI) for a single STR locus is the probability of the child's paternal allele given that the alleged father (AF) is the biological father, divided by the probability of that same allele given that a random unrelated man from the relevant population is the biological father. When the mother is tested, her contribution to the child's genotype is identified first; what remains is the one allele the child must have received from the father (the obligate paternal allele). If the AF carries that allele, the numerator is 1 when he is homozygous for it and 0.5 when he is heterozygous, and the denominator is the allele's frequency in the reference population. If the AF does not carry the obligate paternal allele, the numerator is 0 at that locus and the result points to exclusion, although in practice laboratories allow for the possibility of mutation before declaring an exclusion on a single inconsistent locus.
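The single-locus calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular software package: the function name `trio_pi`, the allele labels, and the frequency table are all hypothetical, and the sketch assumes Hardy-Weinberg proportions with no mutation model.

```python
def trio_pi(child, mother, alleged_father, freqs):
    """Single-locus trio paternity index (illustrative sketch, no mutation model).

    child, mother, alleged_father: 2-tuples of allele labels.
    freqs: dict mapping allele label -> population frequency (hypothetical values).
    """
    child = tuple(sorted(child))

    # Numerator: Mendelian probability of the child's genotype when the
    # mother and the alleged father are the true parents.
    numerator = sum(
        0.25
        for m in mother
        for f in alleged_father
        if tuple(sorted((m, f))) == child
    )

    # Denominator: the mother transmits one allele (0.5 each) and a random
    # man contributes allele a with its population frequency.
    denominator = sum(
        0.5 * p
        for m in mother
        for a, p in freqs.items()
        if tuple(sorted((m, a))) == child
    )
    if denominator == 0:
        raise ValueError("mother and child genotypes are inconsistent at this locus")
    return numerator / denominator


# Hypothetical allele frequencies for one locus.
freqs = {"12": 0.1, "13": 0.2, "14": 0.7}

homozygous_af = trio_pi(("12", "13"), ("13", "14"), ("12", "12"), freqs)
heterozygous_af = trio_pi(("12", "13"), ("13", "14"), ("12", "15"), freqs)
excluded_af = trio_pi(("12", "13"), ("13", "14"), ("14", "15"), freqs)
```

With the obligate paternal allele "12" at frequency 0.1, a homozygous AF gives a PI of 1/0.1, a heterozygous AF gives 0.5/0.1, and an AF lacking the allele gives 0.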

The combined paternity index (CPI) is the product of the individual-locus PIs across all typed loci. With a 20-locus CODIS or ESS panel and typical European or South Asian allele frequencies, a CPI of tens of millions is routine for a true father-child pair. The Probability of Paternity (POP), derived from the CPI using Bayes' theorem with a prior of 0.5 (equal prior for paternity and non-paternity), is what most commercial labs report as a percentage on the final certificate: 99.9999% Probability of Paternity means a CPI of 999,999.
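The CPI-to-probability conversion can be checked with a short sketch. The function names and the per-locus values below are hypothetical; the only logic is the product rule across loci and Bayes' theorem in odds form with a stated prior.

```python
import math


def combined_paternity_index(locus_pis):
    """Product of the per-locus paternity indices (assumes independent loci)."""
    return math.prod(locus_pis)


def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability of paternity from a CPI and a prior probability."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = cpi * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical per-locus PIs for a three-locus toy example.
toy_cpi = combined_paternity_index([2.0, 5.0, 10.0])

# The figure quoted in the text: a CPI of 999,999 with a prior of 0.5.
pop_equal_prior = probability_of_paternity(999_999)

# The same CPI with a sceptical prior of 0.1 gives a lower posterior,
# which is why the prior must be stated alongside the percentage.
pop_low_prior = probability_of_paternity(999_999, prior=0.1)
```

With a prior of 0.5 the prior odds are 1, so the conversion collapses to CPI / (CPI + 1).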

AABB (Association for the Advancement of Blood & Biotherapies, formerly the American Association of Blood Banks, US) accreditation standards require a minimum CPI of 100 before any inclusion can be reported. ENFSI DNA WG recommendations in Europe suggest a probability of paternity above 99.99% as the reporting threshold for civil paternity matters. In India, the DNA Technology (Use and Application) Regulation Bill 2019 contemplates a national standard for paternity testing laboratories, though no final regulation was in force as of the time of writing; many Indian accredited labs currently follow AABB or ENFSI guidance voluntarily.

The choice of allele-frequency population database has a material effect on the PI at each locus and thus on the CPI. For South Asian communities, the widely used databases (NIST STRBase US Caucasian and African-American, FSS UK Caucasian) may not accurately represent allele frequencies; when a case involves an Indian or South Asian family, use of a validated South Asian population database substantially reduces the uncertainty in the reported CPI.

Motherless Duo Paternity, Sibling Indices, and Grandparental Kinship

*Each step away from the trio configuration multiplies the statistical uncertainty by a factor the laboratory must explicitly account for.*

The duo (motherless) case arises when the mother is unavailable for testing, most commonly in post-mortem paternity disputes, immigration cases, or cases where the mother declines to participate. Without knowledge of the maternal allele contribution, the alleged father's alleles at each locus must be evaluated against all possible maternal contributions from the population, which broadens the denominator term and generally reduces the PI per locus relative to the trio. At highly polymorphic loci with many rare alleles, this effect is modest; at loci where the child's alleles are common, the duo PI can be substantially lower than the trio PI for the same alleged father. Courts in Germany, Australia, and the United States have all addressed when a duo result alone is sufficient for a civil paternity declaration; the threshold is typically a CPI above 1,000 in the duo configuration.
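A minimal sketch of the motherless calculation at one locus follows. It assumes Hardy-Weinberg proportions and no mutation; the name `duo_pi` and the frequencies are hypothetical. The unknown maternal allele is handled by weighting each possibility by its population frequency.

```python
def duo_pi(child, alleged_father, freqs):
    """Single-locus motherless (duo) paternity index, illustrative sketch only."""
    c1, c2 = child

    def transmit(parent, allele):
        # Probability that the parent passes on this allele (0, 0.5 or 1).
        return sum(0.5 for x in parent if x == allele)

    if c1 == c2:
        # AF must give c1; the unknown mother gives c1 with probability freqs[c1].
        numerator = transmit(alleged_father, c1) * freqs[c1]
        denominator = freqs[c1] ** 2
    else:
        # Either allele may be paternal; the other comes from the unknown mother.
        numerator = (
            transmit(alleged_father, c1) * freqs[c2]
            + transmit(alleged_father, c2) * freqs[c1]
        )
        denominator = 2 * freqs[c1] * freqs[c2]
    return numerator / denominator


# Hypothetical allele frequencies.
freqs = {"A": 0.1, "B": 0.2}

pi_child_ab_af_aa = duo_pi(("A", "B"), ("A", "A"), freqs)  # 1 / (2 * 0.1)
pi_child_ab_af_ab = duo_pi(("A", "B"), ("A", "B"), freqs)  # (0.1 + 0.2) / (4 * 0.1 * 0.2)
pi_child_aa_af_aa = duo_pi(("A", "A"), ("A", "A"), freqs)  # 1 / 0.1
```

For a heterozygous child and a homozygous AF, the duo value of 1/(2p) is half the 1/p that a trio would give when the mother's profile identifies the paternal allele.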

Sibling indices quantify how much more likely two people are to be full siblings rather than unrelated. A full-sibling likelihood ratio (SLR) is calculated locus by locus using the transmission probabilities for shared parental alleles, relying on the same STR multiplex kits used in criminal-database profiling. Unlike the paternity PI, the SLR does not exclude with certainty when alleles differ: full siblings share on average 50% of alleles, so mismatches at any locus are expected. SLRs for full-sibling pairs typically fall between 100 and 10,000 for a 20-locus profile; an SLR below 1 favours non-sibship. The half-sibling configuration (which shares one parent) produces an SLR roughly midway between the unrelated and full-sibling values, and distinguishing full from half-sibling pairs is a genuinely difficult problem when populations are endogamous and shared alleles are frequent.
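One common way to express pairwise indices of this kind is through the identity-by-descent (IBD) coefficients k0, k1, k2, the probabilities that two relatives share zero, one, or two alleles IBD at a locus. The sketch below uses the standard values for full siblings (0.25, 0.5, 0.25) and half siblings (0.5, 0.5, 0) and assumes Hardy-Weinberg proportions, no mutation, and unlinked loci. Function names and frequencies are hypothetical, and this is not a description of how any named platform implements the calculation.

```python
def genotype_prob(g, freqs):
    """Hardy-Weinberg probability of an unordered genotype."""
    a, b = g
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def prob_given_one_ibd(g2, g1, freqs):
    """P(g2 | g1, exactly one allele shared IBD)."""
    c, d = g2
    total = 0.0
    for shared in g1:  # each allele of g1 is the shared one with probability 0.5
        if c == d:
            total += 0.5 * (freqs[c] if shared == c else 0.0)
        else:
            total += 0.5 * (
                (freqs[d] if shared == c else 0.0)
                + (freqs[c] if shared == d else 0.0)
            )
    return total


def pairwise_lr(g1, g2, freqs, k):
    """Single-locus LR: proposed relationship (IBD coefficients k) vs unrelated."""
    k0, k1, k2 = k
    g1, g2 = tuple(sorted(g1)), tuple(sorted(g2))
    p2 = genotype_prob(g2, freqs)
    lr = k0 + k1 * prob_given_one_ibd(g2, g1, freqs) / p2
    if g1 == g2:
        lr += k2 / p2
    return lr


FULL_SIB = (0.25, 0.5, 0.25)
HALF_SIB = (0.5, 0.5, 0.0)

freqs = {"A": 0.1, "B": 0.2, "C": 0.7}  # hypothetical frequencies

full_same = pairwise_lr(("A", "B"), ("A", "B"), freqs, FULL_SIB)
half_same = pairwise_lr(("A", "B"), ("A", "B"), freqs, HALF_SIB)
full_no_sharing = pairwise_lr(("A", "A"), ("C", "C"), freqs, FULL_SIB)
```

A pair with no alleles in common still returns 0.25 under the full-sibling hypothesis rather than 0, which is the code-level form of the point that a mismatch does not exclude sibship.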

Grandparental kinship (comparing a child to one or both grandparents when the alleged parent is deceased or unavailable) is the configuration most frequently encountered in inheritance disputes, immigration family reunification cases, and post-mortem paternity after a soldier's or accident victim's death. The grandparental LR is lower than the trio PI for the same allele set because only 25% of the child's genome is expected to derive from each grandparent. Typing additional relatives when available (aunts, uncles, the other grandparent) substantially increases the grandparental LR, and the software tools discussed in the next section handle these extended pedigrees. The admissibility framework that governs how these LRs are presented in court across different jurisdictions is covered in admissibility and ethics: Daubert, Frye and R v. Doheny.

Kinship LR Software: Familias, DNA-VIEW, FaSTaR, and EasyDNA

*Manual kinship LR calculation works for a trio; for anything more complex, validated software is the only defensible approach.*

Four software platforms dominate operational kinship LR calculation in forensic and immigration testing contexts worldwide.

Familias (developed by Thore Egeland and colleagues at the Norwegian Institute of Public Health and the Oslo University Hospital, freely available) is the most widely used academic and operational kinship platform outside North America. It accepts arbitrary pedigree structures, handles up to thousands of loci simultaneously, and supports a Monte Carlo simulation module for validating the reported LR distribution under the proposed pedigree. Familias is the reference platform for ISFG (International Society for Forensic Genetics) kinship workshops and has been validated against the ENFSI DNA WG database. Swedish, Norwegian, Dutch, and Polish national forensic labs run it in production.

DNA-VIEW (Charles Brenner, Forensic Mathematics, Oakland) is the long-standing North American reference platform, first deployed in the early 1990s and used extensively for kinship calculations in immigration casework by the US Department of State and USCIS. DNA-VIEW handles complex pedigrees including those with inbreeding loops, uses Brenner's analytical PI formulations rather than Monte Carlo simulation, and outputs interpretable results that courts across the US and Australia have accepted for decades.

FaSTaR Kinship (a more recent platform developed for high-throughput DVI kinship matching, deployed in the INTERPOL sphere through the European STADNAP and IDENTIFYING DVI projects) focuses specifically on large-scale reference-sample databases where thousands of family-victim pairs must be matched simultaneously. It uses a likelihood-ratio matrix computed across the entire pedigree database and flags candidate pairs above a specified threshold for expert review.

EasyDNA (the software tier of the commercial EasyDNA laboratory group, with operations in the UK, Australia, US, Canada, India, and South Africa) is the platform most frequently encountered in civil paternity and immigration testing outside the DVI context. It is accredited under ISO 17025 in its major markets and uses population databases specific to the test population (South Asian, Afro-Caribbean, etc.) for each jurisdiction's casework.

Kinship LR decision tree: the pedigree configuration drives the choice of numerator and denominator in the likelihood ratio, and software platforms handle the algebra across multiple loci.

Reference-Sample Design in Missing-Persons Casework

*Which relatives you choose to swab on day one determines whether a match will be reportable when remains surface two years later.*

A missing-persons DNA case begins with a reference-sample strategy before any laboratory work is done. The optimal reference for identifying unknown remains is a direct reference sample from the missing person themselves (a stored buccal swab, a biological sample from a personal effect). When that is unavailable, the laboratory must rely on kinship references from relatives, and the kinship LR it can ultimately report depends entirely on the combination and degree of relatedness of those relatives.

The hierarchy of kinship reference value, holding the number of relatives constant, is approximately:

Parent-child pair (both the mother and the father of the missing person or, when the missing person is themselves a parent, their child together with the child's other biological parent): provides the highest combined LR per locus pair.
Two biological children of the missing person: also a parent-child configuration, high LR.
Full sibling: moderate LR; distinguishing from half-sibling requires population-specific care.
Grandparent (single): lower LR; combined with one parent substantially improves.
Aunt or uncle (avuncular index): the weakest individual reference, useful mainly as supplementary material to boost a near-threshold kinship LR.

In practice, the reference-collection officer (a family liaison officer, a DVI officer, or a law-enforcement detective) often collects samples from whoever presents at the police station on day one. This is a critical juncture: collecting a buccal swab only from a sibling when the missing person's parent is also available and willing is a significant missed opportunity that may reduce the eventual LR from 1,000,000 to 10,000 on the same remains.

NamUs (National Missing and Unidentified Persons System, US): Established in 2007 and fully nationalised under the Brittany Smith Act 2012, NamUs maintains two matched databases: the Unidentified Persons database (PM profiles from unidentified remains) and the Missing Persons database (AM reference profiles from biological relatives). Labs across the US submit profiles to the FBI via NDIS CODIS and cross-reference with NamUs simultaneously. As of 2023, NamUs had facilitated more than 23,000 identifications. The system accepts kinship reference samples from any first-degree relative and stores pedigree linkages for the automated kinship search.

UK Missing Persons Unit (UKMPU): Coordinated by the National Crime Agency, the UKMPU maintains the UK Missing Persons DNA database, accepting reference samples from police forces across England, Wales, Scotland, and Northern Ireland. Post-mortem profiles of unidentified remains submitted to the database are cross-matched against both direct references and kinship references. The Forensic Science Regulator's Codes of Practice (now statutory under the Forensic Science Regulator Act 2021) require accredited laboratories handling UKMPU casework to validate their kinship calculations against specific population databases (UK Caucasian, UK South Asian, Afro-Caribbean as applicable) and to report LR thresholds and uncertainties explicitly.

India's Track Child: Managed by the Ministry of Women and Child Development (MWCD) and integrated with the National Crime Records Bureau (NCRB), Track Child is a portal for reporting and tracking missing children. DNA intake into Track Child casework is handled through state forensic science laboratories and CFSL, though the integration between the portal's biographic records and DNA laboratory outputs is less automated than NamUs. The DNA Technology (Use and Application) Regulation Bill 2019 proposes a National DNA Data Bank with a Missing Persons index that would formalise this infrastructure. India's Child Welfare Committees and District Child Protection Units (constituted under the Juvenile Justice (Care and Protection of Children) Act 2015) serve as the primary family-contact nodes for reference-sample collection. The proposed national database infrastructure, including a Missing Persons Index, is described in national DNA databases: NDIS, NDNAD and the DNA Technology Bill.

Reference-sample hierarchy for missing-persons DNA casework: relative type, degree of relatedness, and approximate combined LR range. A first-degree pair (both parents) yields a combined LR up to 10 million, while a single aunt or uncle may barely reach the 10,000 reporting threshold.

Complex Pedigrees and the Inbreeding Problem

*Consanguineous families present a specific statistical trap that standard paternity software was not designed to handle.*

Standard kinship LR software assumes that the two parties being compared are either related in the proposed manner or unrelated. In populations with significant consanguinity (common in parts of South Asia, the Middle East, North Africa, and historically isolated communities worldwide), strict unrelatedness is not a realistic alternative hypothesis, because the supposedly unrelated person may in fact share distant kinship with the child or the missing person through the endogamous community structure.

When consanguinity is a realistic possibility, the kinship LR denominator should not use the random-population allele frequency alone; it should incorporate the probability of allele sharing under the actual population inbreeding coefficient (F). All four major software platforms (Familias, DNA-VIEW, FaSTaR, EasyDNA) can incorporate a non-zero F value if it is estimated or provided, but in operational casework this correction is frequently omitted because F is difficult to estimate reliably for a specific family. The ISFG commission on missing-persons DNA identification recommends that labs servicing high-consanguinity populations run sensitivity analyses across a range of F values (typically 0.01 to 0.0625) and report the LR range, not a single point estimate.
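A sensitivity sweep of this kind can be sketched for the simplest trio configuration. The sketch below makes a loud assumption: it treats the text's F as the co-ancestry coefficient theta in the Balding-Nichols sampling formula, which is one widely published way of adjusting the random-man term and is not necessarily what any of the named platforms does internally. It also assumes the obligate paternal allele is unambiguous and that the AF carries it. All names and values are hypothetical.

```python
def trio_pi_with_theta(p_a, af_copies, mother_copies, theta):
    """Trio PI for obligate paternal allele A with a co-ancestry adjustment.

    p_a:           population frequency of allele A (hypothetical value).
    af_copies:     copies of A carried by the alleged father (1 or 2).
    mother_copies: copies of A carried by the mother (0, 1 or 2).
    theta:         co-ancestry coefficient, used here as a stand-in for F.
    """
    if af_copies not in (1, 2):
        raise ValueError("alleged father must carry the obligate allele")
    if mother_copies not in (0, 1, 2):
        raise ValueError("mother_copies must be 0, 1 or 2")

    transmit = af_copies / 2  # numerator: AF passes on allele A

    # Denominator: probability that a random man from the same subpopulation
    # transmits A, given the four alleles already seen in mother and AF.
    observed_a = af_copies + mother_copies
    random_man = (observed_a * theta + (1 - theta) * p_a) / (1 + 3 * theta)
    return transmit / random_man


def f_sensitivity(p_a, af_copies, mother_copies,
                  f_values=(0.0, 0.01, 0.03, 0.0625)):
    """Evaluate the PI over a grid of F values and return the grid and range."""
    grid = {f: trio_pi_with_theta(p_a, af_copies, mother_copies, f)
            for f in f_values}
    return grid, (min(grid.values()), max(grid.values()))


grid, (lowest, highest) = f_sensitivity(p_a=0.1, af_copies=2, mother_copies=2)
```

Reporting the `(lowest, highest)` pair rather than the F = 0 value alone is the code-level equivalent of the LR range described above; a full casework calculation would repeat this per locus and for every genotype configuration.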

A second complexity arises in pedigree reconstruction from a set of survivors when the missing person's entire nuclear family has perished, leaving only cousins and extended relatives as references. This configuration, encountered in large-scale disasters where entire households are missing, requires dedicated software (M-FIsys for DVI; Familias extended-pedigree mode) to compute an LR at all, because the transmission probabilities through multiple meioses are beyond hand calculation.

Collect and map the pedigree
Before sampling any relative, draw the complete known pedigree of the missing person and identify which relatives are available and willing to provide buccal swabs. Prioritise first-degree relatives (parents, children, full siblings) over second-degree or more distant relatives.
Type reference samples on the full locus panel
Type all collected reference samples on the same STR kit (e.g., GlobalFiler, Investigator 24plex, PowerPlex Fusion) used for post-mortem samples. Supplementary Y-STR or X-STR panels may be added for sex-limited lineage confirmation. Ensure all samples are typed at the same laboratory or that inter-laboratory concordance data exist.
Select the population database
Choose the validated population database that best represents the biological population of the missing person and all reference donors. For South Asian families, use a validated South Asian STR frequency table. Document the database name, version, and source in the case file.
Compute kinship LR in validated software
Enter the pedigree structure and typed profiles into Familias, DNA-VIEW, or the applicable platform. Run the LR calculation and save the calculation file as a case record. For complex pedigrees or F > 0, run the sensitivity analysis range.
Apply the reporting threshold
Compare the computed LR (or CPI) to the laboratory's validated reporting threshold. For direct reference matches, most labs use LR > 10,000. For kinship matches, INTERPOL DVI guidance requires LR > 10,000 as a minimum, and most national authorities use higher internal thresholds. If LR is below threshold, document as inconclusive and flag for additional reference-sample collection.
Report with explicit uncertainty
The expert report must state: the pedigree structure assumed, the software and version, the population database and version, the per-locus LR table, the combined LR, the reporting threshold, and any sensitivity analysis results. Courts in the UK, Australia, and the Netherlands require all of these elements in the body of the expert statement.
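The last three steps (combine, compare to threshold, record what was assumed) can be sketched as a small record-keeping routine. Every name here is hypothetical, the per-locus values are invented for illustration, and the default threshold of 10,000 is simply the figure quoted in the steps above; a laboratory would substitute its own validated threshold.

```python
import math
from dataclasses import dataclass


@dataclass
class KinshipResult:
    pedigree: str           # pedigree structure assumed
    software: str           # software name and version used
    database: str           # population database name and version
    log10_combined_lr: float
    log10_threshold: float
    conclusion: str         # "reportable" or "inconclusive"


def evaluate_case(per_locus_lrs, pedigree, software, database, threshold=10_000):
    """Combine per-locus LRs and compare the result to a reporting threshold."""
    if any(lr <= 0 for lr in per_locus_lrs):
        raise ValueError("a zero or negative per-locus LR needs expert review")
    # Work in log10 so that long products neither overflow nor underflow.
    log10_lr = sum(math.log10(lr) for lr in per_locus_lrs)
    log10_threshold = math.log10(threshold)
    conclusion = "reportable" if log10_lr > log10_threshold else "inconclusive"
    return KinshipResult(pedigree, software, database,
                         log10_lr, log10_threshold, conclusion)


# Invented per-locus LRs for two hypothetical cases.
strong = evaluate_case([3.2, 5.0, 8.1, 2.4, 6.6, 4.9, 7.3],
                       "both parents vs remains", "software vX.Y", "database vN")
weak = evaluate_case([1.8, 0.9, 2.2, 1.4, 3.0],
                     "single uncle vs remains", "software vX.Y", "database vN")
```

An "inconclusive" result is the trigger for the follow-up named in the threshold step: collecting additional reference samples rather than reporting a weak match.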

Key terms

Paternity Index (PI): The likelihood ratio at a single STR locus comparing the probability of the child's allele given the alleged father is the biological father versus a random man from the population. Combined across loci to give the Combined Paternity Index (CPI).
Combined Paternity Index (CPI): The product of individual-locus paternity indices across all typed loci. Converted to Probability of Paternity (POP) using Bayes' theorem with an equal prior.
Duo / motherless case: A paternity test in which the mother's profile is unavailable; the PI denominator must integrate over all possible maternal allele contributions from the population, generally reducing the CPI relative to the trio configuration.
Sibling likelihood ratio (SLR): Likelihood ratio comparing the probability of the observed genotype data if two persons are full siblings versus unrelated. Values much greater than 1 support sibship; values less than 1 support non-sibship.
Familias: Freely available kinship LR software developed in Norway, the reference platform for ISFG kinship workshops and used operationally by forensic labs in Sweden, Norway, the Netherlands, Poland, and many other countries.
NamUs: National Missing and Unidentified Persons System (US), a national database cross-matching PM profiles of unidentified remains against AM kinship reference profiles; has facilitated over 23,000 identifications since 2007.
Track Child: India's Missing Persons and Children portal managed by the Ministry of Women and Child Development and NCRB, the domestic framework through which DNA reference samples are collected for missing-persons casework.
Inbreeding coefficient (F): The probability that two alleles at a locus in an individual are identical by descent, reflecting population consanguinity. Non-zero F values affect kinship LR denominators and should be incorporated in sensitivity analyses for consanguineous populations.

Worked example

Disputed Paternity in an Immigration Case, Trio vs Duo Calculation and the Kinship LR

A UK Home Office immigration tribunal requires DNA evidence to confirm a paternity claim. The mother is not available for testing. How does the paternity index change without her profile?

Scene: A Home Office immigration case. A man claiming British citizenship as the biological child of a UK resident petitions for leave to remain. The alleged father is a British citizen; the mother (allegedly deceased) is not available for sampling. DNA testing is ordered by the tribunal.

Step 1 (Duo paternity calculation): An accredited laboratory (UKAS ISO/IEC 17025-accredited, authorised under the Child Support Act 1991 for legal parentage testing) collects buccal swabs from the child and alleged father. The GlobalFiler kit produces profiles at all 20 CODIS-compatible loci plus SE33. Per-locus paternity indices are calculated using Familias software with allele frequencies from a validated West African population database (both parties are of West African ancestry).

Step 2 (Absence of mother): Without the mother's profile, each per-locus PI denominator must integrate over all possible maternal contributions from the West African allele frequency table. Loci where the child and alleged father share common alleles produce lower PIs than if the maternal allele could be directly excluded. The combined CPI is 85,000, giving a Probability of Paternity of 99.999% with a prior of 0.5.

Step 3 (Sensitivity analysis): The ISFG guidelines recommend a sensitivity analysis for West African populations because consanguinity within some communities is possible. Familias is re-run with an inbreeding coefficient F = 0.03 (reflecting possible distant kinship within the community). CPI drops to 31,000, still above the UK ACTT paternity testing guidance threshold of LR = 10,000 for a positive identification.

Step 4 (Report and tribunal outcome): The expert report documents all calculation parameters, the Familias version, the population database, and the sensitivity analysis. The tribunal accepts the CPI of 85,000 (F = 0) with the F = 0.03 sensitivity result of 31,000 as supporting the parentage claim. Leave to remain is granted.
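The probabilities implied by the two CPI values in this example can be checked by hand. The arithmetic below uses Bayes' theorem in odds form; the symbol \pi for the prior probability of paternity is notation introduced here for convenience, and the F = 0.03 percentage is derived from the stated CPI rather than quoted from the report.

```latex
\[
\mathrm{POP} = \frac{\mathrm{CPI}\,\pi}{\mathrm{CPI}\,\pi + (1-\pi)},
\qquad
\pi = 0.5 \;\Rightarrow\; \mathrm{POP} = \frac{\mathrm{CPI}}{\mathrm{CPI}+1}
\]
\[
F = 0:\quad \mathrm{POP} = \frac{85{,}000}{85{,}001} \approx 0.999988 \quad (99.9988\%)
\]
\[
F = 0.03:\quad \mathrm{POP} = \frac{31{,}000}{31{,}001} \approx 0.999968 \quad (99.9968\%)
\]
```

The first value rounds to the 99.999% quoted in Step 2. The consanguinity adjustment lowers the CPI by a factor of about 2.7 yet moves the probability only in the fifth significant figure, which is one reason the CPI is the more informative number to report.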

Conclusion: The case illustrates the real-world difference between trio and duo paternity calculations, the effect of consanguinity adjustment (F), and the reporting standards required by ISFG guidelines and UK ACTT for immigration paternity cases.

Frequently asked questions

What is the difference between a paternity index and a probability of paternity?

The Combined Paternity Index (CPI) is a likelihood ratio: it compares the probability of the child's genotype if the alleged father is the true father against the probability if a random unrelated man from the population is the father. A CPI of 10,000 means the observed data are 10,000 times more likely under true paternity. The Probability of Paternity (POP, expressed as a percentage) is derived from the CPI using Bayes' theorem with a stated prior (usually 0.5). POP depends on the prior; CPI does not. Courts and accreditation bodies such as AABB require the CPI, not just the POP.

How is paternity testing handled in Indian courts and what quality standards apply?

DNA paternity testing is admitted under Bharatiya Sakshya Adhiniyam 2023 § 39 (expert opinion evidence). Courts can order testing in paternity disputes, though they have held that a test cannot be ordered solely to challenge the legitimacy of a child born during a marriage without sufficient cause. Quality is governed by NABL accreditation under ISO/IEC 17025. India has no legislation equivalent to the UK's Child Support Act testing regime; many accredited Indian labs follow AABB or ENFSI guidance voluntarily.

When does a missing-persons kinship LR fail to reach a positive identification threshold?

Common failure scenarios include: reference donors too distant in the pedigree (second or third cousins share too few alleles); post-mortem samples so degraded that only a partial STR profile is available; a population allele-frequency database poorly matched to the family's ancestry; and consanguinity inflating allele sharing in the null hypothesis. The remedies are collecting closer relatives, seeking direct personal-effect reference samples, or using a validated population-specific database.

What is the inbreeding coefficient (F) and why does it matter for kinship LRs?

F is the probability that both alleles at a locus are identical by descent, reflecting consanguineous mating within a population. In outbred populations F is near zero. In communities where cousin marriage is common (parts of South Asia, the Middle East, West Africa), F may be 0.03 to 0.0625. A non-zero F affects the kinship LR denominator because the hypothesis of unrelatedness is no longer strictly true. ISFG guidelines recommend a sensitivity analysis across a realistic F range rather than a single point estimate for casework involving consanguineous families.
