Acoustic evidence in forensic casework: sound-wave physics (frequency + amplitude + propagation + reflection + Doppler); the forensic-recording examination workflow under the SWGDE + ENFSI Forensic Speech and Audio guidelines; gunshot acoustic analysis, meaning separation of the muzzle blast (a spherical wave travelling at the speed of sound) from the supersonic bullet shock wave (Mach cone) for distance-to-shooter localisation; the JFK Warren Commission + HSCA acoustic-evidence debate; modern ShotSpotter + SST gunshot-locator systems; and ENF (Electrical Network Frequency) timestamping for recording-authenticity validation.
Sound leaves physical traces. A gunshot produces two distinct pressure waves that travel at different speeds, and a careful analyst can use their time separation to calculate how far the shooter stood from a microphone. A mains-powered audio recording contains an invisible timestamp baked into its electrical interference, accurate enough to place a recording within minutes on the calendar. These are not theoretical possibilities: they are techniques that have influenced verdicts in courts from Washington D.C. to New Delhi, and they sit at the intersection of physics, engineering, and law.
Forensic acoustics covers a broader territory than gunshots alone. Any audio or vibration recording offered as evidence passes through an examination workflow that addresses integrity (has the file been edited or re-encoded?), authenticity (does the embedded metadata match the signal physics?), and content (what exactly was said, and when?). The Forensic Speech and Audio Guidelines published by the ENFSI (European Network of Forensic Science Institutes) and the parallel guidance from the US Scientific Working Group on Digital Evidence (SWGDE) define the examination steps that survive court scrutiny in EU member states, the US federal system, and, increasingly, courts in India applying BSA 2023 § 63 to electronic records.
The Electrical Network Frequency (ENF) technique is among the most elegant tools in audio forensics. The mains electricity supply oscillates at a nominal frequency, 50 Hz in most of the world and 60 Hz in North America, but the actual instantaneous frequency drifts continuously around that nominal value. Any recording made near an active power grid captures that drift as a faint hum embedded in the signal. The recorded ENF trace can be compared against a reference database of measured grid-frequency history to establish when, and sometimes where, the recording was made. The technique has been used to expose fabricated alibi recordings and to authenticate recordings in criminal proceedings in the UK (see R v. Flynn and St John [2008] EWCA Crim 970) and in trials before Indian sessions courts.
The sections below work from first principles outward: sound-wave physics, the gunshot acoustic event and its two-component structure, the geometry of localisation, the landmark JFK debate that put acoustic forensics in the public eye, modern gunshot-detection networks, and finally the ENF timestamping method that links audio evidence to calendar time.
*You cannot analyse what you cannot model. Every acoustic examination begins with the physics of the pressure wave.*
Sound is a mechanical disturbance that propagates through a medium as alternating compressions and rarefactions. In air, the disturbance is a longitudinal pressure wave: the molecules oscillate parallel to the direction of travel, and the wave speed at standard atmospheric conditions (20 degrees Celsius, sea level) is approximately 343 m/s. That speed changes with temperature, humidity, and altitude, a variation that becomes significant when reconstructing outdoor gunshot geometry across hundreds of metres.
Frequency, amplitude and wavelength. The frequency of a sound wave (measured in Hertz, cycles per second) determines pitch; the amplitude (measured in Pascals, or more practically in decibels relative to the threshold of human hearing) determines loudness. The relationship between frequency (f), wavelength (lambda), and propagation speed (c) is c = f x lambda. A 1 kHz tone at 20 degrees Celsius has a wavelength of approximately 0.34 m. This wavelength relationship matters forensically because it controls how a sound diffracts around obstacles and how a microphone at a given position will respond to a source at a different azimuth.
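As a rough illustration of how temperature moves these numbers, the sketch below uses the common dry-air approximation c ≈ 331.3 · sqrt(1 + T/273.15) m/s together with c = f x lambda. The function names are illustrative only, and humidity and altitude corrections are deliberately ignored.

```python
import math

def speed_of_sound(temp_c: float) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 * math.sqrt(1.0 + temp_c / 273.15)

def wavelength(freq_hz: float, temp_c: float = 20.0) -> float:
    """Wavelength in metres, from c = f * lambda."""
    return speed_of_sound(temp_c) / freq_hz

if __name__ == "__main__":
    for t in (0.0, 20.0, 35.0):
        print(f"{t:5.1f} C: c = {speed_of_sound(t):6.1f} m/s, "
              f"1 kHz wavelength = {wavelength(1000.0, t):.3f} m")
```

Under this approximation, the travel time over a 300 m outdoor path differs by roughly 50 ms between 0 and 35 degrees Celsius, which is large compared with the millisecond-scale timing used in gunshot reconstruction.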
Reflection, absorption and the forensic environment. Sound reflects from hard surfaces (concrete, brick, glass, metal) and is absorbed by soft ones (vegetation, carpeting, soil). In a partially enclosed urban environment, a single acoustic event produces a direct wave and multiple reflections. These reflections arrive at a microphone as a sequence of lower-amplitude copies of the original event, with the delay determined by path-length difference divided by the speed of sound. Forensic acoustic reconstruction of a shooting in an urban courtyard must therefore account for this multipath structure when identifying which arrival corresponds to the direct sound and which are reflections.
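The delay rule above (path-length difference divided by the speed of sound) can be sketched with the mirror-image construction for a single flat wall. This is a simplified 2D illustration: the wall is assumed to be a perfect reflector along the line x = wall_x, and the coordinates and function name are hypothetical.

```python
import math

def reflection_delay(source, mic, wall_x, c=343.0):
    """Delay in seconds between the direct arrival and the first reflection
    off a vertical wall located at x = wall_x (2D, single bounce)."""
    sx, sy = source
    mx, my = mic
    direct = math.hypot(mx - sx, my - sy)
    image_x = 2.0 * wall_x - sx          # mirror the source in the wall
    reflected = math.hypot(mx - image_x, my - sy)
    return (reflected - direct) / c

if __name__ == "__main__":
    # Source at the origin, microphone 30 m away, wall 10 m behind the microphone.
    delay = reflection_delay((0.0, 0.0), (30.0, 0.0), wall_x=40.0)
    print(f"echo arrives {delay * 1000:.1f} ms after the direct sound")
```

A real courtyard has several reflecting surfaces, so each candidate wall would be tested in turn and the predicted delays compared with the arrivals seen in the recording.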
The Doppler effect. When a sound source moves toward or away from a listener, the apparent frequency shifts. A moving vehicle horn shifts pitch measurably. For forensic purposes, the Doppler shift of a supersonic projectile is not the primary analysis tool (the Mach-cone model, described in Section 3, is more useful), but Doppler analysis of vehicle sounds in accident reconstruction recordings has been used to corroborate or challenge claimed vehicle speeds.
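For a source passing a stationary microphone, the approach and recede frequencies satisfy f_approach / f_recede = (c + v) / (c - v), which rearranges to v = c (f_approach - f_recede) / (f_approach + f_recede). The sketch below applies that relation. It assumes the vehicle passes close enough to the microphone that its motion is essentially along the line of sight well before and well after the pass; the function name is illustrative.

```python
def source_speed_from_doppler(f_approach: float, f_recede: float,
                              c: float = 343.0) -> float:
    """Estimate source speed (m/s) from the steady pitch heard while the
    source approaches and the steady pitch heard while it recedes."""
    return c * (f_approach - f_recede) / (f_approach + f_recede)

if __name__ == "__main__":
    # Hypothetical measurements read from a spectrogram of a passing horn.
    speed = source_speed_from_doppler(424.8, 378.0)
    print(f"estimated speed: {speed:.1f} m/s ({speed * 3.6:.0f} km/h)")
```

The result does not require knowing the horn's rest frequency, which is convenient when the vehicle itself is unavailable for testing.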
Decibel notation. The decibel scale is logarithmic because human loudness perception is approximately logarithmic. A 20 dB increase corresponds to a ten-fold increase in sound-pressure amplitude and roughly a four-fold increase in perceived loudness. A typical handgun muzzle blast at 1 m measures approximately 157-165 dB SPL (Sound Pressure Level). At 100 m in open air, inverse-square-law spreading alone removes 20 log10(100) = 40 dB, leaving approximately 117-125 dB SPL, and atmospheric absorption lowers the figure slightly further. These figures are relevant for assessing whether a claimed recording device at a stated distance would realistically have captured the event without clipping.
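The spreading-loss arithmetic can be checked with a few lines. This is a minimal sketch assuming free-field spherical spreading; the absorption coefficient is an optional illustrative parameter, not a measured value, and ground effects and barriers are ignored.

```python
import math

def db_from_pressure_ratio(ratio: float) -> float:
    """Level change in dB for a given sound-pressure amplitude ratio."""
    return 20.0 * math.log10(ratio)

def spl_at_distance(spl_ref_db: float, r_ref_m: float, r_m: float,
                    absorption_db_per_m: float = 0.0) -> float:
    """SPL at r_m given the SPL at r_ref_m, using spherical spreading plus an
    optional linear atmospheric-absorption term (assumed, not measured)."""
    spreading = 20.0 * math.log10(r_m / r_ref_m)
    absorption = absorption_db_per_m * (r_m - r_ref_m)
    return spl_ref_db - spreading - absorption

if __name__ == "__main__":
    for spl_1m in (157.0, 165.0):
        print(f"{spl_1m:.0f} dB at 1 m -> {spl_at_distance(spl_1m, 1.0, 100.0):.0f} dB at 100 m")
```

Comparing the predicted level with the recorder's maximum input level indicates whether clipping would be expected at the claimed distance.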
*Before a recording can tell you what happened, it must tell you that it has not been touched.*
The forensic examination of an audio recording follows a documented chain-of-custody and an examination protocol that parallels the digital forensics workflow for any electronic evidence. In the US, SWGDE (Scientific Working Group on Digital Evidence) has published Best Practices for Examining Digital Audio and Video Evidence (current edition 2017, updated 2020). The ENFSI Forensic Speech and Audio Analysis Working Group (FSAAWG) publishes parallel guidelines used across EU jurisdictions. The UK's Forensic Science Regulator (FSR) references the ENFSI guidelines in its accreditation framework. In India, BSA 2023 § 63 and the associated certificate provisions cover the admissibility of electronic records, and the Supreme Court's rulings in Tukaram S. Dighole v. Manikrao Shivaji Kokate (2010) and Arjun Panditrao Khotkar v. Kailash Kushanrao Gorantyal (2020) establish that the certificate of electronic authenticity must be signed by a responsible person familiar with the system that produced the recording.
File integrity. The first step is hash-based integrity verification. An MD5 or SHA-256 hash computed from the audio file's binary data provides a fingerprint that changes if any bit in the file is altered. If the submitting agency provides a hash computed at acquisition, the examiner re-computes it on the working copy and compares the two. A mismatch does not necessarily indicate tampering (a mis-transcribed reference hash or file-system damage produces the same symptom), but it requires explanation. The SWGDE guidelines require hash values to be recorded in the case file before any analytical processing.
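The re-computation step might look like the following sketch, which uses Python's standard hashlib module. The file name in the usage example and the helper names are hypothetical; a laboratory would normally rely on its validated tooling rather than an ad hoc script.

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large recordings do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_acquisition_hash(path: str, acquisition_hex: str) -> bool:
    """Compare the working copy's hash with the value logged at acquisition."""
    return file_sha256(path) == acquisition_hex.strip().lower()

if __name__ == "__main__":
    import sys
    # Usage: python verify.py exhibit_01.wav <hex digest from the case file>
    if len(sys.argv) == 3:
        print("MATCH" if matches_acquisition_hash(sys.argv[1], sys.argv[2]) else "MISMATCH")
```

Normalising case and whitespace before comparing avoids false mismatches caused purely by how the reference value was typed into the paperwork.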
Format and compression analysis. Audio files may be uncompressed (WAV, AIFF), losslessly compressed (FLAC), or lossily compressed (MP3, AAC, Opus). Lossy formats introduce characteristic artefacts. An audio file claimed to be a first-generation recording that shows MP3 compression artefacts has been converted at some point, which is inconsistent with the claim if the original device does not use MP3. Resampling, level normalisation, and noise-reduction processing all leave detectable signatures in the spectral and statistical properties of the file. The examiner uses spectrum analysis (typically with software such as Adobe Audition, iZotope RX, or the open-source Audacity and SoX) to characterise the signal and identify any discontinuities.
Discontinuity and cut detection. A recording that has been spliced typically shows a phase discontinuity at the cut point: the waveform does not flow naturally from one segment to the next. Spectral analysis of a splice often reveals a brief artefact in the spectrogram at the cut timestamp. For recordings with continuous background noise (HVAC hum, traffic, 50/60 Hz mains interference), a cut may be detectable as a sudden change in the noise character or an instantaneous silence that does not match normal recording behaviour.
Voice authentication. Whether a particular voice in a recording belongs to a named individual is addressed by speaker identification, a separate discipline that uses formant analysis, long-term average spectrum, and probabilistic likelihood-ratio comparison. This sits at the boundary between forensic acoustics and forensic speech science, and the authoritative guidelines are the ENFSI FSAAWG speaker comparison guidelines (2016, under revision 2024) and, in Australia, the ANZFSS guidelines. Voice-comparison evidence has a contested history in UK courts (see the Northern Ireland Court of Appeal's decision in R v. O'Doherty (2002) and the subsequent Criminal Practice Directions), and in Canada, the Supreme Court's R v. J.L.J. (2000) applies the Mohan admissibility test that governs acoustic expert evidence. The mechanics of speaker identification are covered in the planned Biometrics and Voice subject; this topic focuses on the acoustic physics of the recording itself.
*Two waves, two different physics, one firing event. Separating them is the core of gunshot acoustic analysis.*
A firearm discharge produces two distinct acoustic events, and understanding both is essential for distance and direction estimation.
The muzzle blast. When the propellant gases escape from the muzzle, they expand rapidly and produce a pressure transient that propagates outward in approximately spherical form. This is an ordinary acoustic wave rather than a shock front carried by the projectile: it propagates at the ambient speed of sound, approximately 343 m/s in standard air. The muzzle blast is the dominant acoustic event heard by a bystander in the same room or at close range, and it is the wave captured by most surveillance microphones. The spectral content of the muzzle blast encodes information about calibre: larger-calibre weapons produce lower-frequency content and longer pulse durations. Muzzle-blast classification algorithms trained on calibre-specific datasets can estimate calibre from a single acoustic recording with meaningful probability, though calibre identification from acoustics alone remains probabilistic rather than definitive.
The supersonic bullet: the Mach cone. Bullets fired from most rifle and many handgun cartridges travel faster than the speed of sound. A supersonic projectile generates a Mach cone: a conical shock wave that propagates at an angle determined by the projectile's Mach number. If the projectile's speed is v and the speed of sound is c, the half-angle of the Mach cone (alpha) satisfies sin(alpha) = c/v. A 5.56 x 45 mm NATO rifle round travelling at approximately 940 m/s at the muzzle has a Mach number of about 2.7 and a half-angle of approximately 21 degrees. The shock wave from the bullet passes a microphone at a different time from the muzzle blast because the shock wave travels with the bullet rather than outward from the muzzle, producing a distinct pressure arrival that precedes the muzzle blast when the microphone is down-range from the shooter.
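The relation sin(alpha) = c/v is easy to evaluate directly. The sketch below is illustrative only; the function name is hypothetical and the default speed of sound assumes air at about 20 degrees Celsius.

```python
import math

def mach_half_angle_deg(bullet_speed: float, c: float = 343.0) -> float:
    """Half-angle of the Mach cone in degrees, from sin(alpha) = c / v.
    Only defined for supersonic projectiles (bullet_speed > c)."""
    if bullet_speed <= c:
        raise ValueError("no Mach cone: projectile is not supersonic")
    return math.degrees(math.asin(c / bullet_speed))

if __name__ == "__main__":
    for v in (940.0, 700.0, 500.0):
        print(f"v = {v:5.0f} m/s  Mach {v / 343.0:.2f}  "
              f"half-angle = {mach_half_angle_deg(v):.1f} deg")
```

The cone widens as the bullet slows: at 500 m/s the half-angle is about 43 degrees, so the geometry seen by a distant microphone differs from the geometry near the muzzle.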
Time-of-arrival analysis for localisation. In a multi-microphone sensor network (a city-wide gunshot-detection network, or a surveillance camera array with audio), each microphone records the muzzle blast at a slightly different time because each microphone is at a different distance from the event. The Time Difference of Arrival (TDOA) between sensor pairs, combined with the known sensor positions, constrains the source location geometrically. With three non-collinear sensors, the intersecting hyperbolas of constant TDOA define a 2D location. With four or more sensors, an overdetermined system yields a least-squares location estimate with quantifiable uncertainty.
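A minimal sketch of the overdetermined case follows. It searches a 2D area for the point whose predicted TDOAs best fit the measured ones in the least-squares sense, using a coarse-to-fine grid rather than the closed-form or iterative solvers a production system would more likely use. The sensor layout, search bounds, and function names are all hypothetical, and a single known speed of sound is assumed.

```python
import math

C = 343.0  # assumed speed of sound, m/s

def tdoa_residual(x, y, sensors, arrival_times, c=C):
    """Sum of squared differences between measured and predicted TDOAs,
    using sensor 0 as the reference for every pair."""
    (x0, y0), t0 = sensors[0], arrival_times[0]
    d0 = math.hypot(x - x0, y - y0)
    total = 0.0
    for (xi, yi), ti in zip(sensors[1:], arrival_times[1:]):
        predicted = (math.hypot(x - xi, y - yi) - d0) / c
        total += ((ti - t0) - predicted) ** 2
    return total

def locate(sensors, arrival_times, bounds, steps=40, rounds=6, c=C):
    """Coarse-to-fine grid search for the (x, y) that minimises the residual.
    bounds = (xmin, xmax, ymin, ymax) in metres."""
    xmin, xmax, ymin, ymax = bounds
    best = (0.5 * (xmin + xmax), 0.5 * (ymin + ymax))
    for _ in range(rounds):
        dx = (xmax - xmin) / steps
        dy = (ymax - ymin) / steps
        candidates = ((xmin + i * dx, ymin + j * dy)
                      for i in range(steps + 1) for j in range(steps + 1))
        best = min(candidates,
                   key=lambda p: tdoa_residual(p[0], p[1], sensors, arrival_times, c))
        # Shrink the search window to a few cells around the current best point.
        xmin, xmax = best[0] - 3 * dx, best[0] + 3 * dx
        ymin, ymax = best[1] - 3 * dy, best[1] + 3 * dy
    return best

if __name__ == "__main__":
    sensors = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0), (0.0, 1000.0)]
    source = (320.0, 610.0)
    # Simulated absolute arrival times; the unknown firing time cancels in the TDOAs.
    times = [12.5 + math.hypot(source[0] - sx, source[1] - sy) / C for sx, sy in sensors]
    print(locate(sensors, times, (0.0, 1000.0, 0.0, 1000.0)))
```

With noisy timestamps the residual at the best point is no longer zero, and its size relative to the timing precision gives a rough handle on the location uncertainty.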
Muzzle-blast-to-shock-wave interval and shooter distance. When both the muzzle blast and the bullet shock wave are captured on a single microphone recording, their time separation encodes the geometry of the event. The muzzle blast travels at the speed of sound from the muzzle to the microphone. The shock wave travels with the bullet (at bullet speed) as a propagating cone. Because the cone is shed continuously along the flight path, and the bullet also slows as it travels, the shock wave's time of arrival at a microphone depends on both the muzzle-to-microphone distance and the angle between the bullet trajectory and the microphone direction. The mathematical framework for this inversion was formalised by Lorenor and Dahm, and their model is implemented in the acoustic analysis software used by ShotSpotter, the leading commercial gunshot-detection network.
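Under strong simplifying assumptions the interval can be written in closed form. Put the muzzle at the origin with the bullet flying along the x-axis at constant speed v, and the microphone at down-range distance x and perpendicular miss distance y. The shock then arrives at t_shock = x/v + y cos(alpha)/c with sin(alpha) = c/v, and the muzzle blast at t_muzzle = sqrt(x^2 + y^2)/c. The sketch below implements only this idealised constant-speed, no-wind model; it is not the model used by any particular commercial system, and the function names are hypothetical.

```python
import math

def shock_lead_time(downrange_m, miss_m, bullet_speed, c=343.0):
    """Seconds by which the bullet shock wave precedes the muzzle blast at a
    microphone, assuming a straight trajectory and constant bullet speed."""
    if bullet_speed <= c:
        raise ValueError("bullet must be supersonic to produce a shock wave")
    alpha = math.asin(c / bullet_speed)
    if downrange_m < miss_m * math.tan(alpha):
        raise ValueError("microphone is not swept by the Mach cone")
    t_shock = downrange_m / bullet_speed + miss_m * math.cos(alpha) / c
    t_muzzle = math.hypot(downrange_m, miss_m) / c
    return t_muzzle - t_shock

def range_on_line_of_fire(lead_time_s, bullet_speed, c=343.0):
    """Invert the model for a microphone on the line of fire (miss distance 0):
    lead = d/c - d/v, so d = lead / (1/c - 1/v)."""
    return lead_time_s / (1.0 / c - 1.0 / bullet_speed)

if __name__ == "__main__":
    # Illustrative figures: 12 ms lead, c = 340 m/s, bullet at 900 m/s.
    d = range_on_line_of_fire(0.012, 900.0, c=340.0)
    print(f"shooter roughly {d:.1f} m from the microphone")
```

With those illustrative figures the estimate is about 6.6 m. For a microphone off the line of fire, the same interval is consistent with a family of (x, y) positions, which is why a single-microphone estimate needs the miss distance or trajectory angle from other evidence.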
*No acoustic evidence in history has been scrutinised more intensely, or reversed more completely, than the dictabelt recording behind the HSCA's 1979 finding.*
The assassination of President John F. Kennedy in Dallas on 22 November 1963 produced the most publicly debated acoustic evidence in forensic history. The acoustic issue centres on a dictabelt recording made from a motorcycle radio microphone left open during the motorcade. Two sets of investigators analysed this recording and reached opposite conclusions.
The HSCA 1979 analysis. The House Select Committee on Assassinations (HSCA) commissioned a study by James Barger (Bolt, Beranek and Newman) and Mark Weiss and Ernest Aschkenasy (Queens College, CUNY). They applied a gunshot-impulse matching technique: test shots were fired from positions consistent with the grassy knoll and the Book Depository, the acoustic signatures recorded from multiple microphone positions on Dealey Plaza, and the resulting waveform templates compared against the dictabelt recording. Barger identified four candidate impulse sequences; Weiss and Aschkenasy focused on the third, matching it to a position on the grassy knoll. They reported a greater-than-95% probability of a shot from the grassy knoll, supporting a conspiracy finding. The HSCA published its report with this conclusion in 1979.
The NRC 1982 committee. The National Research Council convened a panel of acoustic and signal-processing experts, chaired by Norman Ramsey, which re-examined the dictabelt recording and reported in 1982. The panel identified a critical problem: crosstalk from the second police radio channel, audible on the dictabelt alongside the candidate impulses, placed the candidate impulse sequence approximately one minute after the assassination. This timing meant the candidate impulses could not be gunshots from the event. The NRC concluded that the HSCA acoustic evidence for a second shooter was invalid. The Warren Commission's single-shooter conclusion therefore rests on non-acoustic evidence. The acoustic debate has not been settled to the satisfaction of all researchers: D. B. Thomas (2001) challenged the NRC timing argument and Linsker et al. (2005) defended it, but the dominant forensic consensus follows the 1982 NRC report.
Scientific lessons for casework. The JFK debate established several procedural lessons that now inform acoustic forensics practice. First, acoustic matching requires a synchronised timing reference: a recording without an absolute time anchor cannot be placed in event chronology without corroborating evidence. Second, test-recording replication (firing test shots at the original location) is the proper validation for template-matching analysis. Third, the peer-review process exposed methodological weaknesses that the original committee did not catch, underscoring the need for independent reproducibility.
India and UK context. Audio evidence disputes have arisen in India in high-profile cases including telephone-intercept evidence in the 2008 Mumbai attacks trial, where the authenticity of recorded conversations was challenged before the special court. The UK's admissibility framework for audio evidence in terrorism cases is detailed in the Regulation of Investigatory Powers Act 2000 and the associated Codes of Practice; authentication of intercept recordings follows procedures roughly analogous to SWGDE guidelines but with additional statutory constraints.
*More than 100 cities now operate real-time acoustic gunshot detection. The technology depends on the same physics that reconstructed Dealey Plaza.*
ShotSpotter (now trading as SoundThinking) is the dominant commercial gunshot-detection network, deployed in more than 100 cities across the US and in select jurisdictions internationally, including several South African and Latin American cities. The underlying physics is the multi-microphone TDOA framework described in Section 3.
System architecture. A typical ShotSpotter deployment places acoustic sensors (hardened omnidirectional microphones with embedded signal-processing chips) at a density commonly reported as roughly 20 to 25 sensors per square mile in covered areas. Each sensor timestamps detected acoustic events to within approximately 0.1 milliseconds, synchronised via GPS. When a sensor detects an event above a threshold amplitude, it transmits a short audio snippet and its timestamp to a central server. The server applies TDOA multilateration to the arriving reports and produces a location estimate with a reported radius of uncertainty, typically quoted as within 25 metres for a three-sensor cluster.
Classification. Not every impulsive sound is a gunshot. Vehicles backfiring, fireworks, metal dumpster lids, and construction nail-guns all produce impulsive sounds that can trigger sensors. ShotSpotter's classification pipeline uses a combination of spectral features (gunshots have characteristic energy in the 100 Hz to 2 kHz range, a sharp rise and exponential decay) and a trained classifier to distinguish gunshots from non-gunshot impulsive events. The company reports a false-positive rate of approximately 0.5% in published evaluations, though independent assessments by academics (see Asher and colleagues, University of Chicago, 2021) have challenged some of the deployment methodology and questioned whether ShotSpotter evidence influences policing patterns in ways that deserve scrutiny beyond the acoustic accuracy.
Evidentiary status. ShotSpotter location reports and audio recordings have been admitted in US state and federal courts as business records and as expert opinion, though challenges under Daubert have succeeded in some jurisdictions. In People v. Johnson (Cook County, Illinois, 2021), the defence challenged ShotSpotter data as unreliable; the court allowed the evidence but required the company to produce documentation of the classifier's validation dataset. In UK practice, comparable acoustic-location evidence from CCTV audio has been admitted in Crown Court proceedings following forensic acoustic analysis; the CPS guidance on expert evidence requires a statement of the method's accuracy and uncertainty for any quantitative forensic opinion.
SST (ShotSpotter successor platforms) and international variants. Variants include Genetec's AV-900 acoustic sensor used in Canadian municipalities (notably Toronto and Winnipeg), and the Shot Detector platform piloted in parts of Mumbai and Bangalore as part of smart-city surveillance infrastructure. No published court-admissibility ruling on these Indian deployments is available in the open literature as of 2025, but BSA 2023 § 63's certificate framework and the Supreme Court's Arjun Panditrao guidance on electronic records provide the applicable admissibility architecture.
*The power grid is an involuntary witness. Every recording made near a mains socket carries a timestamp hidden in the noise floor.*
The Electrical Network Frequency (ENF) technique exploits a physical property of mains electricity that is invisible to most listeners. The power grid nominally operates at 50 Hz (Europe, Asia, Africa, Australia) or 60 Hz (North America, parts of Latin America and Japan), but the actual instantaneous frequency varies slightly and continuously from that nominal value. Grid operators maintain the long-run average at nominal by adjusting generator speeds in real time as load fluctuates, but at any given second the frequency may be 49.98 Hz or 50.02 Hz rather than exactly 50.00 Hz. This fluctuation is effectively random, continuous, and measurable, so its pattern over any given interval does not repeat, and it is logged in the frequency records maintained by system operators.
How ENF enters a recording. Any recording made with mains-powered equipment, or made near mains-powered equipment, captures the 50 Hz or 60 Hz hum from the power supply as a low-amplitude signal in the recording. Even battery-powered devices may capture ENF through electromagnetic induction from nearby mains equipment or, for recordings made near lighting and electrical circuits, through acoustic coupling. The ENF signal in the recording is typically 40 to 60 dB below the recording's main content but is extractable using narrow-band spectral analysis centred on the nominal frequency.
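The narrow-band extraction step can be sketched in plain Python. The example scans a small band around the nominal frequency and returns the frequency with the most energy in one analysis frame. It is a single-frame illustration under stated assumptions: the test signal is synthetic (a loud 123 Hz tone standing in for programme content, with hum 40 dB below it), the function names are hypothetical, and real casework would track the estimate frame by frame with validated tools.

```python
import math

def estimate_hum_frequency(samples, sample_rate, nominal=50.0, span=0.2, step=0.005):
    """Return the frequency (Hz) within nominal +/- span that carries the most
    energy, found by evaluating the windowed spectrum on a fine frequency grid."""
    n = len(samples)
    # A Hann window suppresses leakage from louder content far from the hum.
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    best_freq, best_power = nominal, -1.0
    for k in range(int(round(2.0 * span / step)) + 1):
        freq = nominal - span + k * step
        w = 2.0 * math.pi * freq / sample_rate
        re = im = 0.0
        for i, s in enumerate(windowed):
            re += s * math.cos(w * i)
            im -= s * math.sin(w * i)
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq

def synthetic_recording(hum_hz, sample_rate=400, seconds=8.0):
    """Hypothetical test signal: a loud 123 Hz tone plus hum 40 dB below it."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * 123.0 * i / sample_rate)
            + 0.01 * math.sin(2.0 * math.pi * hum_hz * i / sample_rate)
            for i in range(n)]

if __name__ == "__main__":
    print(estimate_hum_frequency(synthetic_recording(50.03), 400))
```

Repeating the estimate on successive frames, for example once per second, yields the ENF trace that is later compared with grid records.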
The comparison database. The UK National Grid operates a continuous ENF measurement database, as do grid operators in other EU countries, the US (various regional operators), and India (the National Load Despatch Centre). These databases record the instantaneous grid frequency at 1-second resolution. A recording of unknown creation date is analysed to extract its embedded ENF trace, and this trace is then slid along the database record to find the time window where the correlation peaks. If the recording is authentic and unedited, the ENF trace should match the database closely over one continuous time window, within measurement error, producing a single best-match timestamp.
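The database search can be sketched as a sliding comparison. The example below uses a mean-removed mean-squared error as the match score, which is one reasonable choice among several (correlation coefficients are also common) and tolerates a small constant bias in the extracted trace. The reference data in the usage example is simulated and the function name is hypothetical.

```python
import random

def best_match_offset(query, reference):
    """Slide the query ENF trace along the reference record and return
    (offset, error) for the alignment with the lowest mean-squared error
    after removing each segment's mean."""
    n = len(query)
    if n == 0 or n > len(reference):
        raise ValueError("query must be non-empty and no longer than the reference")
    q_mean = sum(query) / n
    best_offset, best_err = -1, float("inf")
    for offset in range(len(reference) - n + 1):
        window = reference[offset:offset + n]
        w_mean = sum(window) / n
        err = sum(((q - q_mean) - (r - w_mean)) ** 2
                  for q, r in zip(query, window)) / n
        if err < best_err:
            best_offset, best_err = offset, err
    return best_offset, best_err

if __name__ == "__main__":
    random.seed(1)
    # Simulated one-hour reference at 1-second resolution (random walk near 50 Hz).
    reference = [50.0]
    for _ in range(3599):
        reference.append(reference[-1] + random.gauss(0.0, 0.002))
    # Simulated 2-minute query taken from second 1500, with small measurement noise.
    query = [f + random.gauss(0.0, 0.0002) for f in reference[1500:1620]]
    print(best_match_offset(query, reference))
```

An edited recording tends to show up as a query that matches well in two separate pieces at different offsets rather than as one continuous segment.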
Forensic applications. The UK Court of Appeal considered ENF evidence for the first time in R v. Flynn and St John [2008] EWCA Crim 970. The court acknowledged that the ENF database-match approach is scientifically valid but noted that the expert must explain the uncertainty in the match (arising from database measurement error and from the possibility of ENF variation between geographically distant parts of the same synchronous grid zone). The FBI has used ENF analysis in US criminal investigations (published case references in the SWGDE guidance), and European forensic laboratories including the German BKA and the Dutch NFI have published ENF analysis protocols. In India, ENF analysis of disputed audio recordings has been applied in sessions court proceedings, typically through CFSL (Central Forensic Science Laboratory) experts.
Limitations. ENF analysis requires sufficient recording duration (at least 10 seconds of continuous signal for a meaningful match, with longer segments preferred) and a recording with detectable mains-frequency content. Battery-powered portable recorders in outdoor locations with no nearby mains equipment may show no detectable ENF. Recordings that have been re-encoded from one digital format to another may show resampling artefacts that distort the ENF frequency slightly. The technique identifies when a recording was created but cannot confirm who made it or where (unless a regional ENF variation analysis is performed, which is only feasible in some grid architectures).
*Acoustic evidence passes through three gatekeeping questions before it reaches the jury.*
The admissibility of forensic acoustic evidence varies across jurisdictions but follows a broadly consistent set of principles: is the underlying science valid, has it been applied correctly in this case, and has the expert communicated the uncertainty of the opinion appropriately?
United States (Daubert/Frye). The US Supreme Court's Daubert v. Merrell Dow Pharmaceuticals (1993) and the codification in Federal Rule of Evidence 702 require that expert testimony be based on sufficient facts, employ reliable methods, and reliably apply those methods to the case facts. The four Daubert factors (peer review and publication, known error rate, general acceptance, testability) are applied by judges as a pre-trial gatekeeping inquiry. ShotSpotter acoustic localisation and ENF analysis have both been assessed under Daubert. Courts have generally admitted them with appropriate expert qualification, while rejecting expert opinion that overstated the confidence of location estimates or that failed to disclose the classifier's validation dataset.
United Kingdom (CPS and FSR standards). In England and Wales, expert evidence is governed by Part 19 of the Criminal Procedure Rules and the Crown Prosecution Service Expert Evidence guidance updated in 2021. The Forensic Science Regulator's accreditation framework requires acoustic analysis laboratories to demonstrate competence to ISO/IEC 17025. The ENFSI FSAAWG guidelines are the reference methodology. An acoustic expert's report must include a statement of methodology, a statement of uncertainty, the data on which the opinion rests, and a clear separation of scientific finding from interpretive opinion. R v. O'Doherty (2002) established that voice comparison using acoustic evidence requires a statistical foundation rather than impressionistic similarity.
India (BSA 2023 and BNSS 2023). Under the Bharatiya Sakshya Adhiniyam 2023, electronic records (including audio recordings) are admissible when accompanied by a certificate under § 63(4) signed by a responsible official. Expert opinion on audio authenticity would be admitted under § 39 (the successor to § 45 of the Indian Evidence Act 1872), which covers expert opinions on foreign law, science, art, handwriting, and fingerprints. The existing Supreme Court guidance on electronic records (Arjun Panditrao 2020) emphasises that the certificate must be produced by someone with knowledge of the system, not merely a chain-of-custody signature. CFSL, the central body that examines audio evidence in Indian criminal cases, applies a protocol aligned with SWGDE guidelines.
Australia and Canada. Australia's admissibility framework under the Uniform Evidence Acts relies on expert evidence rules similar to those in the UK, with ANZFSS guidelines providing the methodological reference. The Canadian Supreme Court's Mohan framework (R v. Mohan [1994] 2 SCR 9) sets four criteria for expert admissibility (relevance, necessity, absence of exclusionary rule, properly qualified expert); acoustic evidence has been admitted in Canadian superior courts under this framework in gunshot-detection and voice-comparison contexts.
| Technique | Primary question answered | Key limitation | Jurisdictional adoption |
|---|---|---|---|
| Muzzle-blast spectral analysis | Calibre classification of firearm from recorded blast | Probabilistic, not definitive; environment-dependent spectrum | US FBI, ENFSI labs, CFSL India |
| TDOA multi-sensor localisation | Where in 2D/3D space did the gunshot occur? | Requires 3+ calibrated synchronised sensors; accuracy ~25 m | ShotSpotter US 100+ cities; pilots UK, India, South Africa |
| Blast-to-shockwave time separation | How far was the shooter from the recording microphone? | Requires both wave types to be captured; calibre-dependent modelling | Specialist forensic labs; documented in US and EU case reports |
| ENF timestamping | When was this recording created? | Requires detectable mains hum; battery-powered outdoor recordings may have no ENF | UK National Grid / BKA Germany / NFI Netherlands / US FBI |
| Discontinuity / cut detection | Has this recording been edited? | Detection probability depends on edit type; sophisticated editing reduces artefacts | SWGDE US, ENFSI EU, CFSL India standard workflow |
A surveillance microphone records a gunshot event. The shock wave from the bullet arrives 12 milliseconds before the muzzle blast. If the speed of sound is 340 m/s and the bullet was travelling at approximately 900 m/s, which statement best describes how to interpret this time separation?