Bayesian Networks for Complex Evidence
A Bayesian network is a directed acyclic graph in which nodes represent propositions or variables and edges encode conditional dependencies, allowing a court or analyst to reason about multiple items of evidence without violating the rules of conditional probability. This topic covers the structure, mathematics, and forensic applications of Bayesian networks, with emphasis on mixed DNA profiles and multi-source trace evidence.
A Bayesian network is a directed acyclic graph in which each node represents a variable or proposition and each directed edge encodes a conditional dependency between two nodes. Given the graph structure and a set of conditional probability tables, the network allows an analyst to compute the posterior probability of any node given observed values at other nodes. In forensic science, this provides a principled way to reason about multiple items of evidence, such as a mixed DNA profile, fibres, and a footwear mark, without treating them as independent when they are not. The network makes the dependency assumptions explicit and auditable, which satisfies the transparency requirement that courts in many jurisdictions impose on expert probabilistic reasoning.
The problem that Bayesian networks solve is well illustrated by multi-source trace evidence. Suppose an analyst computes a likelihood ratio for DNA, a separate likelihood ratio for glass fragments, and a third for fibres, then multiplies the three together to obtain a combined likelihood ratio. That multiplication is valid only if the three items of evidence are conditionally independent given the propositions. If the glass and fibres both came from the same garment, they share a common cause and are not independent. A Bayesian network with a node representing the garment captures that dependency and prevents the analyst from inflating the combined weight of evidence by double-counting a single source.
Bayesian networks in forensic science trace back to work by Dawid, Mortera, and colleagues in the 1990s, and to the influence of Wigmore chart analysis on the formal representation of legal arguments. The technology became practically applicable as software platforms capable of exact and approximate inference on large networks became widely available. Today, tools such as Hugin, Netica, and AgenaRisk are used by forensic laboratories and academic research groups to model mixture interpretation, transfer and persistence of trace evidence, and the evaluation of complex multi-issue cases.
By the end of this topic you will be able to:
- Describe the structure of a Bayesian network, including nodes, directed edges, and conditional probability tables, and explain the directed acyclic graph constraint.
- Explain why the naive multiplication of likelihood ratios across multiple evidence items can overstate combined evidential weight, and how a Bayesian network corrects for this.
- Describe how a Bayesian network is constructed for a two-person mixed DNA profile and identify the key nodes and conditional probability tables required.
- Apply the concept of d-separation to determine whether two evidence nodes are conditionally independent in a simple network.
- Identify the main criticisms of Bayesian network evidence in court, including sensitivity to prior probabilities and the difficulty of specifying realistic conditional probability tables.
- Directed acyclic graph (DAG)
- A graph in which edges have a direction (from parent to child node) and no directed path can return to a node it has already visited. The acyclic constraint guarantees that the joint distribution factorises into one conditional distribution per node given its parents, so the conditional independence structure is well-defined.
- Conditional probability table (CPT)
- A table that specifies the probability distribution of a node given every combination of states of its parent nodes. Every non-root node in a Bayesian network requires a CPT; root nodes require only a marginal probability distribution.
- D-separation
- A graphical criterion that determines whether two sets of nodes in a Bayesian network are conditionally independent given a third set. If nodes A and B are d-separated by set C, observing C renders A and B independent. Used to identify which evidence items remain informative given intermediate findings.
- Belief propagation
- An algorithm for computing marginal and posterior probabilities in a Bayesian network by passing messages between neighbouring nodes. Exact on tree-structured (singly connected) networks. Networks with undirected loops, in which two nodes are joined by more than one path, need either an exact method such as the junction tree algorithm or approximate methods such as loopy belief propagation or Monte Carlo sampling.
- Mixture likelihood ratio
- The ratio of the probability of observing a mixed DNA profile if the person of interest is a contributor to the probability of observing the same profile if they are not. A Bayesian network computes this by marginalising over all unknown contributor genotypes.
- Sensitivity analysis
- A technique for assessing how much the posterior probabilities in a Bayesian network change when the values in the conditional probability tables are varied. Mandatory for forensic applications where CPT values are estimated from limited data; courts in several jurisdictions require sensitivity results to be disclosed alongside the main output.
Structure of a Bayesian network
A Bayesian network consists of two components: a directed acyclic graph and a set of conditional probability tables, one for each node. Each node represents a discrete or continuous variable. For forensic applications, variables are typically discrete: a node might have states {present, absent}, {contributor, non-contributor}, or {prosecution proposition true, defence proposition true}. Directed edges run from parent nodes to child nodes and encode the statement that the child's probability distribution depends on the state of the parent.
A node with no parents is a root node. Its probability table is simply a marginal distribution over its states. In a case network, the node representing the ultimate proposition, such as whether the defendant was the source of the trace, is often a root node whose prior probabilities express the odds on the competing propositions before any evidence is considered. Because those priors are a matter for the court rather than the scientist, analysts commonly enter equal priors, so that the posterior odds at that node equal the likelihood ratio. A node with one or more parents has a conditional probability table specifying its distribution for each combination of parent states.
Once the graph and tables are specified, inference algorithms compute posterior probabilities for any node given observations at other nodes. In the forensic context, the observed nodes are the evidence items: the observed DNA profile, the glass refractive index, the fibre colour. The query node is typically the proposition at issue. The network propagates the observations through the graph and returns updated probabilities.
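The propagation step can be sketched with brute-force enumeration, which sums the joint distribution over every unobserved node. The three-node chain below (proposition H, contact C, observed trace E) and all of its probabilities are hypothetical values chosen for illustration, not figures from any case or study. Enumeration is exponential in the number of nodes, so this only illustrates the principle; tools such as Hugin or Netica use more efficient algorithms.

```python
from itertools import product

# Hypothetical chain: H (proposition) -> C (contact) -> E (trace observed).
# Each entry: node -> (parents, {parent states: P(node is True)}).
# All probabilities are invented for illustration.
NETWORK = {
    "H": ((), {(): 0.5}),
    "C": (("H",), {(True,): 0.9, (False,): 0.05}),
    "E": (("C",), {(True,): 0.7, (False,): 0.02}),
}

def joint(assignment):
    """Joint probability of one full assignment: the product of CPT entries."""
    p = 1.0
    for node, (parents, cpt) in NETWORK.items():
        p_true = cpt[tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

def posterior(query, evidence):
    """P(query is True | evidence), summing the joint over unobserved nodes."""
    names = list(NETWORK)
    num = den = 0.0
    for values in product([True, False], repeat=len(names)):
        a = dict(zip(names, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # skip assignments that contradict the observations
        p = joint(a)
        den += p
        if a[query]:
            num += p
    return num / den

p_h = posterior("H", {"E": True})
print(f"P(H | E observed) = {p_h:.3f}")
```

With these invented numbers the prior of 0.5 on H rises to roughly 0.92 once E is observed. Because the prior odds are 1, the posterior odds here equal the likelihood ratio for E.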
Conditional independence and the problem with naive multiplication
The joint probability of multiple evidence items can be written as a product of conditional probabilities using the chain rule. If the evidence items are conditionally independent given the propositions, that product simplifies to the product of individual likelihoods, and the combined likelihood ratio is the product of the individual likelihood ratios. This is the basis of the common forensic practice of multiplying likelihood ratios across different evidence types.
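Written out for two evidence items, the factorisation looks as follows. The symbols $E_1$, $E_2$, $H_p$ and $H_d$ are notation introduced here for the two items and the prosecution and defence propositions; they are not taken from any particular source.

$$
\mathrm{LR}_{1,2}
= \frac{P(E_1, E_2 \mid H_p)}{P(E_1, E_2 \mid H_d)}
= \frac{P(E_1 \mid H_p)}{P(E_1 \mid H_d)} \times \frac{P(E_2 \mid E_1, H_p)}{P(E_2 \mid E_1, H_d)}
$$

The second factor reduces to the stand-alone ratio $\mathrm{LR}_2 = P(E_2 \mid H_p) / P(E_2 \mid H_d)$ only if $P(E_2 \mid E_1, H) = P(E_2 \mid H)$ under both propositions. In that case, and only in that case, $\mathrm{LR}_{1,2} = \mathrm{LR}_1 \times \mathrm{LR}_2$.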
The assumption of conditional independence is often wrong. Consider a case where fibres and glass fragments are both recovered from a suspect. If the prosecution proposition is that the suspect entered the scene and the defence proposition is that they did not, the fibres and glass may both have been transferred from a single garment that the suspect wore. Their presence on the suspect is then not independent: both are consequences of the same event, contact with the scene via the garment. A Bayesian network with a node for garment contact captures this dependency. The network calculates the combined likelihood ratio from the joint probability of both findings, and where the naive product has counted the shared cause twice, that ratio is lower than the product of the two individual ratios.
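A small numerical sketch makes the gap visible. It assumes a hypothetical network in which the proposition influences a single "garment contact" node, and that node in turn influences both the fibre and the glass findings. Every probability is invented for illustration.

```python
# Hypothetical CPT values, chosen only to illustrate the effect.
P_CONTACT = {"Hp": 0.5, "Hd": 0.01}   # P(garment contact | proposition)
P_FIBRES = {True: 0.8, False: 0.01}   # P(fibres found | contact?)
P_GLASS = {True: 0.8, False: 0.01}    # P(glass found | contact?)

def p_single(p_item, h):
    """P(one item found | h), marginalising over the contact node."""
    c = P_CONTACT[h]
    return c * p_item[True] + (1 - c) * p_item[False]

def p_both(h):
    """P(fibres and glass both found | h): they are independent only given contact."""
    c = P_CONTACT[h]
    return (c * P_FIBRES[True] * P_GLASS[True]
            + (1 - c) * P_FIBRES[False] * P_GLASS[False])

lr_fibres = p_single(P_FIBRES, "Hp") / p_single(P_FIBRES, "Hd")
lr_glass = p_single(P_GLASS, "Hp") / p_single(P_GLASS, "Hd")
naive_lr = lr_fibres * lr_glass            # assumes independence given H
network_lr = p_both("Hp") / p_both("Hd")   # respects the shared cause

print(f"naive product: {naive_lr:.1f}")
print(f"network LR:    {network_lr:.1f}")
```

With these values the naive product is about ten times the network result. The size and even the direction of the discrepancy depend on the CPT values, in particular on how probable the shared cause is under the defence proposition, which is one reason the tables need to be justified and stress-tested.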
| Approach | Handles dependency? | Risk of over-statement | Transparency of assumptions |
|---|---|---|---|
| Product of LRs (naive) | No | High when evidence shares a common cause | Low: independence assumed implicitly |
| Bayesian network | Yes | Low if graph structure is correct | High: dependencies encoded and auditable |
| Probabilistic genotyping software (DNA) | Partial: within the DNA model only | Moderate if non-DNA items are also multiplied in | Varies by platform |
The Bayesian network approach does not eliminate the risk of error: it relocates it. Instead of an implicit and unexamined independence assumption, the analyst must now explicitly specify the graph structure and the conditional probability table values. Both choices can be wrong. If the graph does not capture all relevant dependencies, or if the conditional probabilities in the tables are poorly estimated, the network output will be unreliable. The gain is that those choices are visible and can be challenged.
Bayesian networks for mixed DNA profiles
A mixed DNA profile is one that contains more than the two alleles expected from a single contributor at one or more loci. Mixtures arise when biological material from two or more people is combined, as can happen with contact stains on surfaces, sexual assault samples, or touched items passed between individuals. Interpreting a mixture requires reasoning about how many contributors are present and which alleles each contributor might carry.
A Bayesian network for a two-person mixture at a single locus has the following core structure. A root node encodes the proposition: either the person of interest (POI) is contributor 1, or an unknown person is contributor 1. A second node represents the genotype of contributor 1, conditional on the proposition node: it equals the POI's genotype under the first proposition and follows population genotype frequencies under the second. A third node represents the genotype of contributor 2, whose distribution is derived from allele frequencies in a relevant population database. An observed evidence node, the mixed profile at that locus, is conditional on both contributor genotypes. The analyst enters the observed mixture as evidence and computes the posterior probability of the proposition node; dividing the posterior odds by the prior odds gives the likelihood ratio.
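The marginalisation over unknown genotypes can be sketched for one locus under strong simplifying assumptions: alleles are detected exactly when a contributor carries them (no drop-out, drop-in, stutter or peak-height information), genotypes follow Hardy-Weinberg proportions, and contributor 2 is unknown under both propositions. The allele labels and frequencies are hypothetical.

```python
from itertools import combinations_with_replacement, product

# Hypothetical allele frequencies at one locus (they sum to 1).
FREQS = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}

def genotype_prior():
    """Hardy-Weinberg genotype probabilities for an unknown contributor."""
    prior = {}
    for a, b in combinations_with_replacement(sorted(FREQS), 2):
        prior[(a, b)] = FREQS[a] ** 2 if a == b else 2 * FREQS[a] * FREQS[b]
    return prior

PRIOR = genotype_prior()

def p_mixture(observed, contributor1=None):
    """P(observed allele set), summing over every unknown genotype.

    If contributor1 is given, that genotype is fixed (POI is contributor 1);
    otherwise contributor 1 is an unknown person drawn from PRIOR.
    """
    c1 = {contributor1: 1.0} if contributor1 is not None else PRIOR
    total = 0.0
    for (g1, p1), (g2, p2) in product(c1.items(), PRIOR.items()):
        if set(g1) | set(g2) == observed:
            total += p1 * p2
    return total

observed = {"A", "B", "C", "D"}
poi = ("A", "B")
lr = p_mixture(observed, poi) / p_mixture(observed)
print(f"single-locus LR = {lr:.2f}")
```

For a four-allele mixture this reproduces the familiar closed form 1 / (12 × p_A × p_B), which gives a quick check that the enumeration is doing what a mixture network would do at this locus. Real probabilistic genotyping models add peak heights, stutter and drop-out, which this sketch deliberately omits.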
In practice, a mixture network is replicated across multiple loci and the results combined. Modern probabilistic genotyping platforms such as STRmix, TrueAllele, and likeLTD implement this logic, some using Bayesian network formalisms explicitly and others using equivalent probabilistic models. The likelihood ratio output of these platforms is in principle the same quantity that a manually specified Bayesian network would compute, though the specific models for peak height variation, stutter, and drop-out differ between platforms.
Multi-source trace evidence networks
Beyond DNA, Bayesian networks have been applied to glass fragments, fibres, soil, footwear marks, and combinations of these evidence types. The network structure for a multi-source case typically includes nodes for the propositions at issue, nodes for intermediate events such as contact and transfer, nodes for individual evidence items, and sometimes nodes representing background frequencies of the trace material in the environment.
A worked structure for a glass-transfer case might include: a proposition node (suspect broke the window vs unknown person broke it), a transfer node (glass transferred from window to suspect), a persistence node (glass retained on clothing until sampling), and an observation node (glass found on clothing with matching refractive index). The conditional probability table for the transfer node is informed by experimental transfer studies. The conditional probability table for the persistence node is informed by persistence experiments. The observation node's table reflects the frequency of glass with that refractive index in the relevant background population.
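The chain described above can be sketched directly from its conditional probability tables. All numbers are hypothetical placeholders for values that would in practice come from transfer studies, persistence experiments and background surveys. The sketch also assumes that retained window glass is always found and matches, and that under the defence proposition no glass from this window was transferred.

```python
# Hypothetical CPTs for proposition -> transfer -> persistence -> observation.
P_TRANSFER = {"Hp": 0.8, "Hd": 0.0}     # P(glass transferred | proposition)
P_PERSIST = {True: 0.4, False: 0.0}     # P(glass retained | transferred?)
P_OBSERVE = {True: 1.0, False: 0.02}    # P(matching glass found | retained?)
# P_OBSERVE[False] stands for matching glass arising from background alone.

def p_observation(h):
    """P(matching glass found | h), summing over transfer and persistence."""
    total = 0.0
    for transferred in (True, False):
        p_t = P_TRANSFER[h] if transferred else 1 - P_TRANSFER[h]
        for retained in (True, False):
            p_r = P_PERSIST[transferred] if retained else 1 - P_PERSIST[transferred]
            total += p_t * p_r * P_OBSERVE[retained]
    return total

lr = p_observation("Hp") / p_observation("Hd")
print(f"P(E | Hp) = {p_observation('Hp'):.4f}")
print(f"P(E | Hd) = {p_observation('Hd'):.4f}")
print(f"LR = {lr:.2f}")
```

Laying the calculation out this way shows where each experimental input enters: changing the transfer or persistence table changes only the numerator, while the background figure appears in both numerator and denominator.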
The network makes the transfer-and-persistence reasoning, which is often expressed informally in forensic reports, explicit and quantitative. Research groups in the UK, Netherlands, and Australia have published network structures for glass, fibres, and combined evidence scenarios. The Forensic Science International journal and the Journal of Forensic Sciences have published peer-reviewed network models that practitioners can adapt, though specifying realistic conditional probability tables from empirical data remains the most demanding part of the task.
D-separation and the anatomy of evidence flow
D-separation is the graphical criterion that tells an analyst whether two nodes in a network are conditionally independent, given a set of observed nodes. The criterion operates by examining all paths between the two nodes and checking whether each path is blocked by the observed set. A path is blocked if it passes through a non-collider that is in the observed set, or if it passes through a collider that is neither observed nor has an observed descendant.
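One way to test the criterion mechanically is the moralised ancestral graph method, which is equivalent to checking every path by hand. The sketch keeps only the ancestors of the nodes involved, links parents that share a child, drops edge directions, deletes the observed nodes, and then asks whether the two nodes are still connected. The example graph and its node names are hypothetical.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """The given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

def d_separated(parents, x, y, observed):
    """True if x and y are d-separated given the observed set."""
    keep = ancestors(parents, {x, y} | set(observed))
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:                      # undirected parent-child links
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(ps, 2):  # "marry" parents of a common child
            adj[a].add(b)
            adj[b].add(a)
    blocked = set(observed)
    stack, seen = [x], {x}
    while stack:                          # search that never enters observed nodes
        n = stack.pop()
        if n == y:
            return False
        for m in adj[n] - seen - blocked:
            seen.add(m)
            stack.append(m)
    return True

# Hypothetical graph: H -> Contact -> {Fibres, Glass}; Injury -> Stain <- Transfer
PARENTS = {
    "Contact": ["H"],
    "Fibres": ["Contact"],
    "Glass": ["Contact"],
    "Stain": ["Injury", "Transfer"],
}

print(d_separated(PARENTS, "Fibres", "Glass", set()))        # common cause unobserved
print(d_separated(PARENTS, "Fibres", "Glass", {"Contact"}))  # common cause observed
print(d_separated(PARENTS, "Injury", "Transfer", set()))     # collider unobserved
print(d_separated(PARENTS, "Injury", "Transfer", {"Stain"})) # collider observed
```

The four calls cover the two cases that matter most in practice: a common cause, which is blocked by observing it, and a collider, which is opened by observing it.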
For forensic interpretation, d-separation answers a practical question: once certain intermediate findings are fixed, does additional evidence about a particular variable still change the probability of the proposition? If the analyst has directly genotyped contributor 1 from the mixture and that genotype node is now observed, the DNA evidence from a separate single-source stain attributable to contributor 1 becomes conditionally independent of the mixture evidence, because all paths between them pass through the now-observed genotype node. Multiplying the mixture's likelihood ratio by that single-source stain's likelihood ratio would then be double-counting.
Collider nodes create a subtler dependency. In a network where two causes independently produce the same effect, the two causes are marginally independent but become dependent once the effect is observed. This is called explaining away. If a blood stain could have come from either an injury to the suspect or from secondary transfer, observing the stain makes the two explanations compete: evidence supporting injury reduces the posterior probability of transfer, and vice versa. A Bayesian network handles explaining away automatically through the inference algorithm.
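Explaining away can be seen numerically in a three-node collider. The sketch below assumes hypothetical priors for injury and secondary transfer and a hypothetical table for the stain; none of the values come from data.

```python
# Hypothetical collider: Injury -> Stain <- Transfer.
P_INJURY = 0.1
P_TRANSFER = 0.2
# P(stain observed | injury, transfer), invented for illustration.
P_STAIN = {
    (False, False): 0.001,
    (True, False): 0.9,
    (False, True): 0.6,
    (True, True): 0.95,
}

def joint(injury, transfer):
    """P(injury, transfer, stain observed)."""
    p_i = P_INJURY if injury else 1 - P_INJURY
    p_t = P_TRANSFER if transfer else 1 - P_TRANSFER
    return p_i * p_t * P_STAIN[(injury, transfer)]

states = (True, False)
p_stain = sum(joint(i, t) for i in states for t in states)

# Posterior for transfer after seeing the stain only.
p_transfer_given_stain = sum(joint(i, True) for i in states) / p_stain

# Posterior for transfer after seeing the stain and learning of an injury.
p_transfer_given_stain_and_injury = (
    joint(True, True) / sum(joint(True, t) for t in states)
)

print(f"P(transfer)                 = {P_TRANSFER:.2f}")
print(f"P(transfer | stain)         = {p_transfer_given_stain:.2f}")
print(f"P(transfer | stain, injury) = {p_transfer_given_stain_and_injury:.2f}")
```

Observing the stain raises the probability of transfer well above its prior, and then learning that the suspect was injured pulls it back down towards the prior, even though injury and transfer were independent before the stain was observed.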
Court admissibility and practical limitations
Courts in multiple jurisdictions have considered the probabilistic reasoning on which Bayesian network evidence rests. The English Court of Appeal addressed the general use of probabilistic reasoning in criminal cases in R v Adams (1996) and R v T (2010), with the latter ruling creating significant uncertainty about likelihood ratio evidence for non-DNA forensic science. Dutch courts, including the Supreme Court, confronted statistical reasoning in the case of Lucia de Berk, a nurse convicted partly on statistical evidence later shown to rely on flawed probability calculations. The case became a landmark for the risks of informal probabilistic reasoning in court and is frequently cited in arguments for explicit network formalisation.
In India, forensic expert evidence is governed by the Bharatiya Sakshya Adhiniyam 2023 (which replaced the Indian Evidence Act 1872). Section 39 of the BSA addresses expert opinion and requires that the basis of the opinion be stated. A Bayesian network output would fall under expert opinion, and the expert would be required to disclose the network structure, the source of the conditional probability table values, and the results of sensitivity analysis. In the United States, Daubert v Merrell Dow Pharmaceuticals (1993) and Federal Rule of Evidence 702 require that expert testimony be based on sufficient facts or data, rest on reliable principles and methods, and reflect a reliable application of those methods to the facts of the case. Courts applying Daubert have generally required disclosure of the network assumptions and validation data.
Beyond admissibility, the practical limitations of Bayesian networks in forensic practice include: the difficulty of specifying accurate conditional probability tables when experimental data on transfer, persistence, and background rates are sparse; the computational cost of exact inference in large networks with many continuous variables; and the expertise required to build, validate, and communicate a network to a court. Sensitivity analysis, in which the analyst varies the uncertain CPT values and reports how much the output changes, is a partial remedy for parameter uncertainty and should be reported in every forensic network analysis.
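A one-parameter sensitivity analysis can be sketched as follows, assuming a simplified glass-transfer model in which the likelihood ratio depends on a transfer probability, a persistence probability and a background probability. The function name, the default values and the grid are hypothetical. A real analysis would vary several CPT entries, ideally jointly, over ranges justified by the available data.

```python
def glass_lr(p_transfer, p_persist=0.4, p_background=0.02):
    """LR for finding matching glass under a simplified transfer model.

    Assumes retained glass is always found, and that under the defence
    proposition matching glass can only come from background.
    """
    retained = p_transfer * p_persist
    p_e_hp = retained + (1 - retained) * p_background
    p_e_hd = p_background
    return p_e_hp / p_e_hd

# Vary the uncertain transfer probability over a plausible (invented) range.
grid = [0.5, 0.65, 0.8, 0.95]
results = {t: glass_lr(t) for t in grid}

for t, lr in results.items():
    print(f"P(transfer | Hp) = {t:.2f}  ->  LR = {lr:.2f}")

spread = max(results.values()) / min(results.values())
print(f"max/min LR across the range = {spread:.2f}")
```

Reporting the whole range rather than a single figure lets the court see whether the conclusion survives reasonable disagreement about the input. In this invented example the likelihood ratio changes by less than a factor of two across the grid.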
An analyst recovers fibres and glass from a suspect. She computes a likelihood ratio of 500 for the fibres and 200 for the glass, then multiplies them to get a combined LR of 100,000. When is this multiplication valid?
Key Takeaways
- A Bayesian network is a directed acyclic graph paired with conditional probability tables; it allows an analyst to compute posterior probabilities for any node given observations elsewhere in the graph, without violating the rules of conditional probability.
- Multiplying likelihood ratios across evidence items implicitly assumes conditional independence. When evidence items share a common cause, that assumption is wrong and the product overstates the combined weight of evidence. A Bayesian network corrects this by modelling the shared cause explicitly.
- For mixed DNA profiles, a Bayesian network with proposition, contributor genotype, and observed mixture nodes computes the likelihood ratio by marginalising over all possible contributor genotypes, allowing formal evaluation of mixtures from two or more contributors.
- D-separation identifies when two evidence nodes are conditionally independent given intermediate findings; this prevents double-counting evidence that flows through the same path in the graph.
- Expert-evidence rules in the UK, US, Netherlands, and under India's Bharatiya Sakshya Adhiniyam 2023 place weight on transparency, so the expert should expect to disclose the network structure, the sources of the conditional probability tables, and the sensitivity analysis; the expert should report a likelihood ratio at the evidence boundary, not a posterior probability of guilt.
What is a Bayesian network and why is it used in forensic science?
What is the difference between a Bayesian network and a simple likelihood ratio?
How are Bayesian networks applied to mixed DNA profiles?
What is d-separation in a Bayesian network?
Have courts accepted Bayesian network evidence?