Bayesian Networks for Complex Evidence

A Bayesian network is a directed acyclic graph in which nodes represent propositions or variables and edges encode conditional dependencies, allowing a court or analyst to reason about multiple items of evidence without violating the rules of conditional probability. This topic covers the structure, mathematics, and forensic applications of Bayesian networks, with emphasis on mixed DNA profiles and multi-source trace evidence.

Last updated: 24 Jun 2026

A Bayesian network is a directed acyclic graph in which each node represents a variable or proposition and each directed edge encodes a conditional dependency between two nodes. Given the graph structure and a set of conditional probability tables, the network allows an analyst to compute the posterior probability of any node given observed values at other nodes. In forensic science, this provides a principled way to reason about multiple items of evidence, such as a mixed DNA profile, fibres, and a footwear mark, without treating them as independent when they are not. The network makes the dependency assumptions explicit and auditable, which satisfies the transparency requirement that courts in many jurisdictions impose on expert probabilistic reasoning.

The problem that Bayesian networks solve is well illustrated by multi-source trace evidence. Suppose an analyst computes a likelihood ratio for DNA, a separate likelihood ratio for glass fragments, and a third for fibres, then multiplies the three together to obtain a combined likelihood ratio. That multiplication is valid only if the three items of evidence are conditionally independent given the propositions. If the glass and fibres both came from the same garment, they share a common cause and are not independent. A Bayesian network with a node representing the garment captures that dependency and prevents the analyst from inflating the combined weight of evidence by double-counting a single source.

Bayesian networks in forensic science trace back to work by Dawid, Mortera, and colleagues in the 1990s, and to the influence of Wigmore chart analysis on the formal representation of legal arguments. The technology became practically applicable as software platforms capable of exact and approximate inference on large networks became widely available. Today, tools such as Hugin, Netica, and AgenaRisk are used by forensic laboratories and academic research groups to model mixture interpretation, transfer and persistence of trace evidence, and the evaluation of complex multi-issue cases.

By the end of this topic you will be able to:

Describe the structure of a Bayesian network, including nodes, directed edges, and conditional probability tables, and explain the directed acyclic graph constraint.
Explain why the naive multiplication of likelihood ratios across multiple evidence items can overstate combined evidential weight, and how a Bayesian network corrects for this.
Describe how a Bayesian network is constructed for a two-person mixed DNA profile and identify the key nodes and conditional probability tables required.
Apply the concept of d-separation to determine whether two evidence nodes are conditionally independent in a simple network.
Identify the main criticisms of Bayesian network evidence in court, including sensitivity to prior probabilities and the difficulty of specifying realistic conditional probability tables.

Key terms

Directed acyclic graph (DAG): A graph in which edges have a direction (from parent to child node) and no path can return to a node it has already visited. The acyclic constraint ensures the conditional independence structure is well-defined and that probability inference algorithms can terminate.
Conditional probability table (CPT): A table that specifies the probability distribution of a node given every combination of states of its parent nodes. Every non-root node in a Bayesian network requires a CPT; root nodes require only a marginal probability distribution.
D-separation: A graphical criterion that determines whether two sets of nodes in a Bayesian network are conditionally independent given a third set. If nodes A and B are d-separated by set C, observing C renders A and B independent. Used to identify which evidence items remain informative given intermediate findings.
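Belief propagation: An algorithm for computing marginal and posterior probabilities in a Bayesian network by passing messages between neighbouring nodes. Exact on tree-structured (singly connected) networks. For networks whose underlying undirected graph contains loops, exact inference is still possible with methods such as the junction tree algorithm, and approximate methods such as loopy belief propagation or Monte Carlo sampling are used when exact inference is too costly.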
Mixture likelihood ratio: The ratio of the probability of observing a mixed DNA profile if the person of interest is a contributor to the probability of observing the same profile if they are not. A Bayesian network computes this by marginalising over all unknown contributor genotypes.
Sensitivity analysis: A technique for assessing how much the posterior probabilities in a Bayesian network change when the values in the conditional probability tables are varied. Essential for forensic applications where CPT values are estimated from limited data; sensitivity results should be disclosed alongside the main output so that the court can see how far the conclusion depends on uncertain inputs.

Structure of a Bayesian network

A Bayesian network consists of two components: a directed acyclic graph and a set of conditional probability tables, one for each node. Each node represents a discrete or continuous variable. For forensic applications, variables are typically discrete: a node might have states {present, absent}, {contributor, non-contributor}, or {prosecution proposition true, defence proposition true}. Directed edges run from parent nodes to child nodes and encode the statement that the child's probability distribution depends on the state of the parent.

A node with no parents is a root node. Its probability table is simply a marginal distribution over its states. In a case network, the node representing the ultimate proposition, such as whether the defendant was the source of the trace, is often a root node whose marginal distribution holds the prior probabilities of the competing propositions before any evidence is considered. A node with one or more parents has a conditional probability table specifying its distribution for each combination of parent states.

Once the graph and tables are specified, inference algorithms compute posterior probabilities for any node given observations at other nodes. In the forensic context, the observed nodes are the evidence items: the observed DNA profile, the glass refractive index, the fibre colour. The query node is typically the proposition at issue. The network propagates the observations through the graph and returns updated probabilities.
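The graph-plus-tables structure described above can be sketched in a few lines of Python. This is a minimal illustration only: the three-node chain (proposition, transfer, trace found), the node names, and every probability are assumptions made up for the example, and inference is done by brute-force enumeration rather than by the algorithms used in Hugin, Netica, or AgenaRisk.

```python
from itertools import product

# Hypothetical chain: H (proposition) -> T (transfer) -> E (trace found).
# All numbers are illustrative assumptions, not casework values.
parents = {"H": [], "T": ["H"], "E": ["T"]}
states = {"H": ["Hp", "Hd"], "T": ["yes", "no"], "E": ["found", "not_found"]}

# Each CPT maps a tuple of parent states to a distribution over the node's states.
# The root node H has an empty parent tuple, so its "CPT" is a marginal distribution.
cpt = {
    "H": {(): {"Hp": 0.5, "Hd": 0.5}},  # placeholder equal priors
    "T": {("Hp",): {"yes": 0.8, "no": 0.2},
          ("Hd",): {"yes": 0.05, "no": 0.95}},
    "E": {("yes",): {"found": 0.9, "not_found": 0.1},
          ("no",): {"found": 0.02, "not_found": 0.98}},
}


def joint(assignment):
    """Joint probability of a full assignment: the product of one CPT entry per node."""
    p = 1.0
    for node, pa in parents.items():
        key = tuple(assignment[q] for q in pa)
        p *= cpt[node][key][assignment[node]]
    return p


def posterior(query, evidence):
    """P(query | evidence), summing the joint over every unobserved node."""
    nodes = list(parents)
    totals = {s: 0.0 for s in states[query]}
    for combo in product(*(states[n] for n in nodes)):
        a = dict(zip(nodes, combo))
        if all(a[k] == v for k, v in evidence.items()):
            totals[a[query]] += joint(a)
    z = sum(totals.values())
    return {s: t / z for s, t in totals.items()}


post = posterior("H", {"E": "found"})
print(post)
```

Enumeration visits every combination of states, so it is only practical for very small networks; it is used here because it makes the role of each table entry easy to follow.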

Conditional independence and the problem with naive multiplication

The joint probability of multiple evidence items can be written as a product of conditional probabilities using the chain rule. If the evidence items are conditionally independent given the propositions, that product simplifies to the product of individual likelihoods, and the combined likelihood ratio is the product of the individual likelihood ratios. This is the basis of the common forensic practice of multiplying likelihood ratios across different evidence types.
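Written out for two evidence items, the step looks like this. The symbols E_1, E_2, H_p, and H_d are labels chosen here for illustration.

```latex
\begin{aligned}
\mathrm{LR}_{1,2}
  &= \frac{P(E_1, E_2 \mid H_p)}{P(E_1, E_2 \mid H_d)}
   = \frac{P(E_1 \mid H_p)\, P(E_2 \mid E_1, H_p)}
          {P(E_1 \mid H_d)\, P(E_2 \mid E_1, H_d)} \\[4pt]
  &= \frac{P(E_1 \mid H_p)}{P(E_1 \mid H_d)} \times
     \frac{P(E_2 \mid H_p)}{P(E_2 \mid H_d)}
   = \mathrm{LR}_1 \times \mathrm{LR}_2
   \quad \text{only if } P(E_2 \mid E_1, H) = P(E_2 \mid H)
   \text{ for } H \in \{H_p, H_d\}.
\end{aligned}
```

The first line is always true (chain rule); the second line holds only under the conditional independence condition stated on the right.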

The assumption of conditional independence is often wrong. Consider a case where fibres and glass fragments are both recovered from a suspect. If the prosecution proposition is that the suspect entered the scene and the defence proposition is that they did not, the fibres and glass may both have been transferred from a single garment that the suspect wore. Their presence on the suspect is then not independent: both are consequences of the same event, contact with the scene via the garment. A Bayesian network with a node for garment contact captures this dependency. In a scenario like this one, the network typically yields a combined likelihood ratio that is lower than the naive product of the two individual ratios, because the shared cause is counted only once.
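A small numerical sketch makes the gap visible. The structure below (proposition, then garment contact, then fibres and glass as two children of contact) follows the scenario just described; the probabilities are invented purely for illustration.

```python
# Hypothetical common-cause model: H -> contact -> fibres, contact -> glass.
# All probabilities are illustrative assumptions.
p_contact = {"Hp": 0.9, "Hd": 0.1}   # P(garment contact | proposition)
p_fibres = {True: 0.8, False: 0.05}  # P(fibres found | contact)
p_glass = {True: 0.7, False: 0.02}   # P(glass found | contact)


def p_findings(h, use_fibres, use_glass):
    """P(selected findings | h), marginalising over the unobserved contact node."""
    total = 0.0
    for contact in (True, False):
        p = p_contact[h] if contact else 1 - p_contact[h]
        if use_fibres:
            p *= p_fibres[contact]
        if use_glass:
            p *= p_glass[contact]
        total += p
    return total


def lr(use_fibres, use_glass):
    """Likelihood ratio for the selected findings."""
    return (p_findings("Hp", use_fibres, use_glass)
            / p_findings("Hd", use_fibres, use_glass))


lr_fibres = lr(True, False)
lr_glass = lr(False, True)
naive_lr = lr_fibres * lr_glass  # assumes conditional independence given H
joint_lr = lr(True, True)        # respects the shared garment-contact node

print(f"fibres {lr_fibres:.2f}, glass {lr_glass:.2f}")
print(f"naive product {naive_lr:.2f}, joint {joint_lr:.2f}")
```

With these assumed numbers the joint likelihood ratio comes out well below the naive product, because both findings carry information about the proposition only through the single contact event.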

Approach	Handles dependency?	Risk of over-statement	Transparency of assumptions
Product of LRs (naive)	No	High when evidence shares a common cause	Low: independence assumed implicitly
Bayesian network	Yes	Low if graph structure is correct	High: dependencies encoded and auditable
Probabilistic genotyping software (DNA)	Partial: within the DNA model only	Moderate if non-DNA items are also multiplied in	Varies by platform

The Bayesian network approach does not eliminate the risk of error: it relocates it. Instead of an implicit and unexamined independence assumption, the analyst must now explicitly specify the graph structure and the conditional probability table values. Both choices can be wrong. If the graph does not capture all relevant dependencies, or if the conditional probabilities in the tables are poorly estimated, the network output will be unreliable. The gain is that those choices are visible and can be challenged.

Bayesian networks for mixed DNA profiles

A mixed DNA profile is one that contains more than the two alleles expected from a single contributor at one or more loci. Mixtures arise when biological material from two or more people is combined, as can happen with contact stains on surfaces, sexual assault samples, or touched items passed between individuals. Interpreting a mixture requires reasoning about how many contributors are present and which alleles each contributor might carry.

A Bayesian network for a two-person mixture at a single locus has the following core structure. A root node encodes the proposition: either the person of interest (POI) is contributor 1, or an unknown person is contributor 1. A second node represents the genotype of contributor 1, conditional on the proposition node. A third node represents the genotype of contributor 2, drawn from a relevant population database. An observed evidence node, the mixed profile at that locus, is conditional on both contributor genotypes. The analyst enters the observed mixture as evidence and computes the posterior probability of the proposition node.

In practice, a mixture network is replicated across multiple loci and the results combined. Modern probabilistic genotyping platforms such as STRmix, TrueAllele, and likeLTD implement this logic, some using Bayesian network formalisms explicitly and others using equivalent probabilistic models. The likelihood ratio output of these platforms is in principle the same quantity that a manually specified Bayesian network would compute, though the specific models for peak height variation, stutter, and drop-out differ between platforms.

Multi-source trace evidence networks

Beyond DNA, Bayesian networks have been applied to glass fragments, fibres, soil, footwear marks, and combinations of these evidence types. The network structure for a multi-source case typically includes nodes for the propositions at issue, nodes for intermediate events such as contact and transfer, nodes for individual evidence items, and sometimes nodes representing background frequencies of the trace material in the environment.

A worked structure for a glass-transfer case might include: a proposition node (suspect broke the window vs unknown person broke it), a transfer node (glass transferred from window to suspect), a persistence node (glass retained on clothing until sampling), and an observation node (glass found on clothing with matching refractive index). The conditional probability table for the transfer node is informed by experimental transfer studies. The conditional probability table for the persistence node is informed by persistence experiments. The observation node's table reflects the frequency of glass with that refractive index in the relevant background population.
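The four-node chain described above can be sketched as follows. Every number here is a made-up placeholder standing in for values that would, in practice, come from transfer studies, persistence experiments, and background surveys; the variable names are likewise hypothetical.

```python
from itertools import product

# Hypothetical CPT values for proposition -> transfer -> persistence -> observation.
p_transfer = {"Hp": 0.7, "Hd": 0.01}   # P(glass transferred | proposition)
p_persist = {True: 0.4, False: 0.0}    # P(glass retained | transferred?)
# P(matching glass observed | retained?). The False entry stands for matching
# glass picked up from the background rather than from the broken window.
p_observe = {True: 0.95, False: 0.03}


def p_observation(h):
    """P(matching glass observed | h), summing over transfer and persistence."""
    total = 0.0
    for transferred, retained in product([True, False], repeat=2):
        pt = p_transfer[h] if transferred else 1 - p_transfer[h]
        pr = p_persist[transferred] if retained else 1 - p_persist[transferred]
        total += pt * pr * p_observe[retained]
    return total


lr = p_observation("Hp") / p_observation("Hd")
print(f"P(E|Hp) = {p_observation('Hp'):.4f}")
print(f"P(E|Hd) = {p_observation('Hd'):.5f}")
print(f"LR = {lr:.2f}")
```

Keeping transfer, persistence, and background as separate tables means each one can be traced to its own data source and challenged separately.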

The network makes the transfer-and-persistence reasoning, which is often expressed informally in forensic reports, explicit and quantitative. Research groups in the UK, Netherlands, and Australia have published network structures for glass, fibres, and combined evidence scenarios. The Forensic Science International journal and the Journal of Forensic Sciences have published peer-reviewed network models that practitioners can adapt, though specifying realistic conditional probability tables from empirical data remains the most demanding part of the task.

D-separation and the anatomy of evidence flow

D-separation is the graphical criterion that tells an analyst whether two nodes in a network are conditionally independent, given a set of observed nodes. The criterion operates by examining all paths between the two nodes and checking whether each path is blocked by the observed set. A path is blocked if it passes through a non-collider that is in the observed set, or if it passes through a collider that is neither observed nor has an observed descendant.
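One way to check the criterion mechanically is the ancestral moral graph test, which gives the same answer as the path-blocking definition: keep only the two nodes, the observed set, and their ancestors; connect ("marry") any parents that share a child; drop edge directions; delete the observed nodes; and see whether the two nodes are still connected. The sketch below assumes a network given as a dict of parent lists, and the node names are hypothetical.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, []))
    return seen


def d_separated(parents, x, y, observed):
    """True if x and y are d-separated given the observed set (x, y not observed)."""
    observed = set(observed)
    keep = ancestors(parents, {x, y} | observed)
    # Undirected moral graph restricted to the ancestral set.
    adj = {n: set() for n in keep}
    for child in keep:
        pa = list(parents.get(child, []))
        for p in pa:                      # parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(pa):        # marry co-parents
            for q in pa[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Search from x, never stepping onto an observed node.
    stack, seen = [x], {x}
    while stack:
        n = stack.pop()
        if n == y:
            return False
        for m in adj[n]:
            if m not in observed and m not in seen:
                seen.add(m)
                stack.append(m)
    return True


# Common cause: contact is a parent of both fibres and glass.
common_cause = {"H": [], "contact": ["H"], "fibres": ["contact"], "glass": ["contact"]}
print(d_separated(common_cause, "fibres", "glass", []))           # dependent
print(d_separated(common_cause, "fibres", "glass", ["contact"]))  # independent

# Collider: two causes of one stain.
collider = {"injury": [], "transfer": [], "stain": ["injury", "transfer"]}
print(d_separated(collider, "injury", "transfer", []))
print(d_separated(collider, "injury", "transfer", ["stain"]))
```

The common-cause pair becomes independent once the shared parent is observed, while the collider pair behaves the opposite way: independent until the shared child is observed.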

For forensic interpretation, d-separation answers a practical question: once certain intermediate findings are fixed, does additional evidence about a particular variable still change the probability of the proposition? If the analyst has directly genotyped contributor 1 from the mixture and that genotype node is now observed, the DNA evidence from a separate single-source stain attributable to contributor 1 becomes conditionally independent of the mixture evidence, because all paths between them pass through the now-observed genotype node. Multiplying the mixture's likelihood ratio by that single-source stain's likelihood ratio would then be double-counting.

Collider nodes create a subtler dependency. In a network where two causes independently produce the same effect, the two causes are marginally independent but become dependent once the effect is observed. This is called explaining away. If a blood stain could have come from either an injury to the suspect or from secondary transfer, observing the stain makes the two explanations compete: evidence supporting injury reduces the posterior probability of transfer, and vice versa. A Bayesian network handles explaining away automatically through the inference algorithm.
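Explaining away can be seen numerically in a three-node collider. The sketch below uses the blood stain scenario with two independent causes; the prior and conditional probabilities are invented for illustration only.

```python
# Hypothetical collider: injury -> stain <- transfer.
# All probabilities are illustrative assumptions.
p_injury = 0.1
p_transfer = 0.1
# P(stain present | injury, transfer), keyed by (injury, transfer).
p_stain = {
    (True, True): 0.95,
    (True, False): 0.80,
    (False, True): 0.60,
    (False, False): 0.01,
}


def prior(flag, p):
    """Probability that a binary cause takes the value `flag`."""
    return p if flag else 1 - p


def posterior_transfer(injury=None):
    """P(transfer | stain observed), optionally also conditioning on injury."""
    injuries = [True, False] if injury is None else [injury]
    weight = {True: 0.0, False: 0.0}
    for t in (True, False):
        for i in injuries:
            weight[t] += prior(i, p_injury) * prior(t, p_transfer) * p_stain[(i, t)]
    return weight[True] / (weight[True] + weight[False])


after_stain = posterior_transfer()
after_injury = posterior_transfer(injury=True)
print(f"P(transfer)                     = {p_transfer:.3f}")
print(f"P(transfer | stain)             = {after_stain:.3f}")
print(f"P(transfer | stain, injury=yes) = {after_injury:.3f}")
```

Observing the stain raises the probability of secondary transfer above its prior; learning that an injury did occur then pulls it back down, because the injury already accounts for the stain.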

Court admissibility and practical limitations

Courts in multiple jurisdictions have considered the kind of probabilistic evidence that Bayesian networks formalise. The English Court of Appeal addressed the general use of probabilistic reasoning in criminal cases in R v Adams (1996) and R v T (2010), with the latter ruling creating significant uncertainty about likelihood ratio evidence for non-DNA forensic science. In the Netherlands, the case of Lucia de Berk, a nurse convicted partly on statistical evidence later shown to rely on flawed probability calculations, was reopened and ended in her acquittal in 2010. The case became a landmark for the risks of informal probabilistic reasoning in court and is frequently cited in arguments for explicit network formalisation.

In India, forensic expert evidence is governed by the Bharatiya Sakshya Adhiniyam 2023 (which replaced the Indian Evidence Act 1872). Section 39 of the BSA addresses expert opinion and requires that the basis of the opinion be stated. A Bayesian network output would fall under expert opinion, and the expert would be required to disclose the network structure, the source of the conditional probability table values, and the results of sensitivity analysis. In the United States, Daubert v Merrell Dow Pharmaceuticals (1993) and Federal Rule of Evidence 702 require that expert testimony be based on sufficient facts or data, rest on reliable principles and methods, and reflect a reliable application of those methods to the case facts. Courts applying Daubert have generally required disclosure of the network assumptions and validation data.

Beyond admissibility, the practical limitations of Bayesian networks in forensic practice include: the difficulty of specifying accurate conditional probability tables when experimental data on transfer, persistence, and background rates are sparse; the computational cost of exact inference in large networks with many continuous variables; and the expertise required to build, validate, and communicate a network to a court. Sensitivity analysis, in which the analyst varies the uncertain CPT values and reports how much the output changes, is a partial remedy for parameter uncertainty and should be reported in every forensic network analysis.
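A one-way sensitivity analysis can be sketched very simply: hold every table entry at its baseline except one, sweep that one over a plausible range, and report the spread in the likelihood ratio. The model below is a hypothetical glass-transfer chain collapsed into a single function; the parameter names, baseline values, and the grid are all assumptions chosen for illustration.

```python
def glass_lr(t_hp, t_hd, persist, detect, background):
    """LR for finding matching glass in a simple transfer-persistence model.

    t_hp, t_hd : P(transfer | Hp), P(transfer | Hd)
    persist    : P(glass retained | transferred)
    detect     : P(matching glass observed | retained)
    background : P(matching glass observed | not retained)
    """
    def p_obs(t):
        retained = t * persist
        return retained * detect + (1 - retained) * background
    return p_obs(t_hp) / p_obs(t_hd)


# Hypothetical baseline values.
baseline = dict(t_hp=0.7, t_hd=0.01, persist=0.4, detect=0.95, background=0.03)

# Sweep the background rate, the parameter assumed here to be least certain.
grid = [0.01, 0.02, 0.03, 0.05, 0.10]
results = [(b, glass_lr(**{**baseline, "background": b})) for b in grid]

for b, value in results:
    print(f"background = {b:.2f}  ->  LR = {value:.2f}")

lrs = [value for _, value in results]
print(f"LR range over the sweep: {min(lrs):.2f} to {max(lrs):.2f}")
```

Reporting the range alongside the baseline figure shows how strongly the conclusion depends on the uncertain input. A fuller analysis would vary several parameters, singly and jointly.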

Worked example

A Bayesian network for a two-contributor DNA mixture

Tracing how a simple four-node network evaluates a mixed profile at a single locus, from proposition to likelihood ratio.

A mixed stain recovered from a door handle contains three alleles at locus D3S1358: alleles 15, 16, and 18. The reference profile of the person of interest (POI) shows genotype 15,16 at this locus. The prosecution proposition Hp is that the POI and one unknown contributor deposited the stain. The defence proposition Hd is that two unknown contributors deposited the stain. We build a network with four nodes at this locus.

Proposition node P. States: {Hp, Hd}. This is a root node. The marginal probabilities at this node are not assigned by the analyst (to avoid the prior-probability-of-guilt problem); instead, the network will be queried for the likelihood ratio.
Contributor 1 genotype node G1. Parent: P. Conditional on Hp, G1 = 15,16 with probability 1 (the POI's known genotype). Conditional on Hd, G1 takes values according to the allele frequencies in the relevant population database for locus D3S1358.
Contributor 2 genotype node G2. No parent. Under both Hp and Hd, G2 is an unknown person drawn from the population. Under the Hardy-Weinberg assumption, genotype probabilities are computed from allele frequencies. Under Hp, contributor 1 is 15,16, so for allele 18 to appear in the mixture contributor 2 must carry allele 18: in a simple model without drop-out or drop-in, the relevant G2 states are 15,18; 16,18; and 18,18. Under Hd, allele 18 may come from either unknown contributor.
Evidence node E. Parents: G1 and G2. The observed state is the mixture {15,16,18}. The conditional probability table specifies the probability of observing this mixture given each combination of G1 and G2 states, accounting for the possibility of allele sharing between contributors. For genotype combinations where G1 and G2 together account for exactly the observed alleles, the probability is high; for combinations where extra or missing alleles are predicted, the probability is low or zero under a simple model.
Computing the likelihood ratio. Enter E = {15,16,18} as an observation. The network returns P(E | Hp) and P(E | Hd) by marginalising over the genotype nodes. The likelihood ratio LR = P(E | Hp) / P(E | Hd). If the POI's alleles 15 and 16 are relatively rare in the population, the LR will be substantially greater than 1, supporting the prosecution proposition, because two unknown people are then unlikely to account for both alleles by chance. The frequency of allele 18 enters both numerator and denominator and has much less influence. The analyst reports the LR, not the posterior probability of Hp.
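The calculation can be sketched by brute-force enumeration over the genotype nodes. The allele frequencies below are invented for illustration and are not taken from any population database; alleles other than 15, 16, and 18 are lumped into a single "other" state, and the evidence model is the simplest possible one (no drop-out, drop-in, stutter, or peak heights), so this is not how probabilistic genotyping software models a profile.

```python
from itertools import combinations_with_replacement

# Hypothetical allele frequencies at the locus; "other" lumps every other allele.
freq = {"15": 0.25, "16": 0.25, "18": 0.15, "other": 0.35}
observed = {"15", "16", "18"}   # alleles seen in the mixture
poi = ("15", "16")              # reference genotype of the person of interest


def genotype_probs(freq):
    """Hardy-Weinberg genotype probabilities: p^2 for homozygotes, 2pq otherwise."""
    probs = {}
    for a, b in combinations_with_replacement(sorted(freq), 2):
        probs[(a, b)] = freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
    return probs


def p_evidence(g1, g2):
    """Simplified CPT for node E: 1 if the two genotypes give exactly the observed alleles."""
    return 1.0 if set(g1) | set(g2) == observed else 0.0


geno = genotype_probs(freq)

# Under Hp, G1 is fixed at the POI genotype; marginalise over G2 only.
p_e_hp = sum(p2 * p_evidence(poi, g2) for g2, p2 in geno.items())

# Under Hd, both contributors are unknown; marginalise over G1 and G2.
p_e_hd = sum(p1 * p2 * p_evidence(g1, g2)
             for g1, p1 in geno.items()
             for g2, p2 in geno.items())

lr = p_e_hp / p_e_hd
print(f"P(E|Hp) = {p_e_hp:.4f}, P(E|Hd) = {p_e_hd:.6f}, LR = {lr:.2f}")
```

With these assumed frequencies the POI's alleles are common, so the LR is modest. Lowering the frequencies of alleles 15 and 16 in `freq` (and raising "other" to compensate) increases it, since in this simplified model the LR is driven mainly by how rare the POI's own alleles are.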

Check your understanding

An analyst recovers fibres and glass from a suspect. She computes a likelihood ratio of 500 for the fibres and 200 for the glass, then multiplies them to get a combined LR of 100,000. When is this multiplication valid?

Key Takeaways

A Bayesian network is a directed acyclic graph paired with conditional probability tables; it allows an analyst to compute posterior probabilities for any node given observations elsewhere in the graph, without violating the rules of conditional probability.
Multiplying likelihood ratios across evidence items implicitly assumes conditional independence. When evidence items share a common cause, that assumption is wrong and the product overstates the combined weight of evidence. A Bayesian network corrects this by modelling the shared cause explicitly.
For mixed DNA profiles, a Bayesian network with proposition, contributor genotype, and observed mixture nodes computes the likelihood ratio by marginalising over all possible contributor genotypes, allowing formal evaluation of mixtures from two or more contributors.
D-separation identifies when two evidence nodes are conditionally independent given intermediate findings; this prevents double-counting evidence that flows through the same path in the graph.
In the jurisdictions discussed here (the UK, the US, the Netherlands, and India under the Bharatiya Sakshya Adhiniyam 2023), the expert should expect to disclose the network structure, the sources of the conditional probability table values, and the sensitivity analysis; the expert should report a likelihood ratio at the evidence boundary, not a posterior probability of guilt.

What is a Bayesian network and why is it used in forensic science?

A Bayesian network is a directed acyclic graph where nodes represent variables or propositions and edges represent conditional dependencies. In forensic science it is used to combine multiple items of evidence, such as DNA, fibres, and fingerprints, without incorrectly assuming they are independent. The network encodes which variables influence which others, allowing the analyst to propagate updated probabilities through the graph using Bayes' theorem.

What is the difference between a Bayesian network and a simple likelihood ratio?

A simple likelihood ratio evaluates one item of evidence given one pair of propositions. A Bayesian network extends this to multiple evidence items and intermediate variables by encoding conditional dependencies among all of them. Where multiplying separate likelihood ratios requires the analyst to assume the evidence items are conditionally independent, a Bayesian network allows that assumption to be relaxed and models the dependency structure explicitly.

How are Bayesian networks applied to mixed DNA profiles?

A mixed DNA profile contains alleles from two or more contributors. A Bayesian network for mixture interpretation models the number of contributors, each contributor's genotype, and the observed mixture as dependent variables. The analyst specifies genotype probabilities and conditional probability tables, then enters the observed mixture as evidence to compute the probability of the mixture under each proposition and, from these, the likelihood ratio for the person of interest being a contributor.

What is d-separation in a Bayesian network?

D-separation is a graphical criterion for determining whether two nodes in a Bayesian network are conditionally independent given a set of observed nodes. If two nodes are d-separated by an observed set, knowing the values of those observed nodes renders the two nodes independent. Forensic analysts use d-separation to identify which items of evidence remain informative once certain intermediate propositions are fixed.

Have courts accepted Bayesian network evidence?

Courts in several jurisdictions have engaged with the probabilistic reasoning that Bayesian networks formalise. The English Court of Appeal addressed probabilistic reasoning in R v Adams (1996) and related cases, and the Dutch courts confronted flawed statistical reasoning in the Lucia de Berk case. The English courts have repeatedly emphasised that the underlying assumptions must be disclosed and that experts must not present a network output as if it were a direct probability of guilt.
