Compression History Estimation in Video Streams

Compression history estimation determines how many encoding generations a video has undergone and what parameters governed each pass. Traces in macroblock statistics, quantisation parameter distributions, and residual energy patterns reveal whether footage was recorded once or re-encoded from previously compressed material.

Last updated: 24 Jun 2026

Compression history estimation is a forensic technique for determining how many times a video file has been encoded and, where possible, what quantisation parameters governed each generation. Every pass through a lossy codec such as H.264 or H.265 leaves a set of measurable artefacts in the bitstream: periodically spaced intra-coded frames, characteristic shifts in the quantisation parameter distribution, and residual energy patterns that differ markedly from a single-generation recording. Examiners measure these traces from the encoded bitstream and use statistical models or machine-learning regressors to infer the encoding history. The forensic purpose is to distinguish direct camera or screen recordings from footage that has been re-encoded, which is a common step when forged video is assembled from multiple source clips or when a genuine clip is re-exported after editing.

The practical importance of the technique is clearest in cases where the provenance of video evidence is disputed. A recording claimed to be an unbroken capture from a security camera may turn out to have been re-encoded at a different bit-rate, frame rate, or resolution. A deepfake video assembled from separately generated segments will often carry the compression signature of its constituent parts even after a final export pass. Compression history estimation does not establish the content of the video as authentic or fabricated, but it narrows or contradicts the claimed recording history and can direct investigators toward more targeted analysis.

The field builds on earlier work in image double-JPEG detection, which showed that re-saving a JPEG at a different quality setting leaves a distinctive pattern in the DCT coefficient histogram. The extension to video is technically more complex because temporal prediction across frames, variable-bit-rate encoding, and scene-adaptive quantisation all interact with the re-encoding artefacts. H.264 and H.265 are the dominant video codecs in forensic casework, and most published methods target one or both standards. Work on VP9, AV1, and older MPEG-2 sources exists but is less standardised.

By the end of this topic you will be able to:

Explain how intra-coded macroblock and coding-unit mode statistics reveal the GOP structure of a prior encoding generation.
Describe how quantisation parameter traces and DCT coefficient histograms differ between single-generation and multi-generation video.
Identify the H.264 and H.265 bitstream features that machine-learning regressors use as input for encoding-history classification.
Explain the limitations of compression history estimation when the re-encoding quality is high or when transcoding between different codec families.
Apply the correct chain of custody and documentation requirements when presenting compression history evidence in court under US, UK, EU, or Indian evidentiary standards.

Key terms

Group of Pictures (GOP): The periodic structure of I-frames, P-frames, and B-frames in a compressed video stream. Each encoding generation imposes its own GOP structure. When a video is re-encoded, the second encoder applies its own GOP, but its choice of intra-coded blocks in P-frames and B-frames can still reveal the prior generation's GOP period.
Quantisation Parameter (QP): A per-block or per-slice value that controls the coarseness of the DCT coefficient rounding step in H.264 and H.265. Higher QP means more data loss. Re-encoding a video that was already compressed at a given QP leaves a characteristic distribution in the second-generation residuals that differs from compressing raw content.
Macroblock (MB) / Coding Tree Unit (CTU): The basic spatial coding unit: 16x16 pixels in H.264 (macroblock) and up to 64x64 pixels in H.265 (CTU). Each unit is classified as intra-coded (from the current frame only) or inter-coded (predicted from a reference frame). The distribution of coding modes across the video is a forensic feature.
Double compression: The condition in which a video has been encoded twice through a lossy codec. The term is also used loosely for any multi-generation encoding. Two-generation encoding is the most common form of re-encoding encountered in forensic casework and the condition most methods are designed to detect.
DCT coefficient histogram: A histogram of the discrete cosine transform coefficients across macroblocks or coding units in a frame or clip. After a single encode, the histogram has a characteristic shape with peaks at quantisation multiples. After re-encoding, additional peaks or gaps appear that reflect the quantisation grid of the first generation.
Intra-refresh artefact: The elevated count of intra-coded macroblocks or CTUs at frame positions that correspond to the I-frame interval of a previous encoding generation. This artefact persists through re-encoding because the inter-prediction residuals at those positions are larger, causing the second encoder to prefer intra-coding.

GOP structure and intra-coded macroblock statistics

Every H.264 and H.265 encoder organises a video into groups of pictures, each starting with an I-frame that is coded entirely from its own spatial content without reference to other frames. The I-frame interval, commonly 1 or 2 seconds at typical frame rates, is a parameter set by the encoder. In a single-generation recording, I-frames appear at the configured interval, plus any that the encoder inserts at detected scene cuts, and the proportion of intra-coded macroblocks in P-frames and B-frames is low, typically driven only by scene changes or fast motion.

When that video is re-encoded, the second encoder processes content that has already been through one lossy cycle. At frame positions that coincide with the first generation's I-frames, the spatial content changed more abruptly than the inter-frame prediction can account for, because the first encoder refreshed all blocks at those positions. The second encoder finds that intra-coding is more efficient at those positions than inter-prediction from a reference frame that was itself a P or B frame. The result is an elevated count of intra-coded macroblocks at frame indices that are multiples of the first generation's I-frame period. This signal, called the intra-refresh artefact, can be extracted without access to the original recording by parsing the macroblock mode data from the bitstream.

The detection method works by computing, for each frame, the fraction of macroblocks coded in intra mode. The resulting series is then analysed for periodicity using a Fourier transform or autocorrelation. A periodic spike in the intra-MB fraction at interval T indicates that a prior generation used a GOP length of T frames. The signal is clearest when the first-generation QP was high (strong compression) and degrades when re-encoding is at very high quality, because the second encoder has more bit budget to absorb the residuals without reverting to intra-coding.
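The periodicity test described above can be sketched in a few lines. This is a minimal illustration, assuming the per-frame intra-MB fractions have already been extracted into a plain list; the function names and the synthetic series are hypothetical and are not taken from any specific forensic tool.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Normalised (biased) autocorrelation of a series at one lag."""
    n = len(series)
    mu = mean(series)
    centred = [x - mu for x in series]
    denom = sum(c * c for c in centred)
    if denom == 0 or lag >= n:
        return 0.0
    num = sum(centred[i] * centred[i + lag] for i in range(n - lag))
    return num / denom


def dominant_period(intra_fraction, min_lag=2, max_lag=None):
    """Return (lag, score) for the lag with the highest autocorrelation."""
    n = len(intra_fraction)
    max_lag = max_lag or n // 3
    scores = {lag: autocorrelation(intra_fraction, lag)
              for lag in range(min_lag, max_lag + 1)}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic series: a 2% baseline with a 30% spike every 30 frames,
# standing in for a prior generation with a 30-frame GOP.
series = [0.30 if i % 30 == 0 else 0.02 for i in range(600)]
period, score = dominant_period(series)
print(period, round(score, 2))
```

On real footage the series is noisy and also contains the current generation's I-frames, so an examiner would normally exclude or mask those frames before looking for a second period.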

Quantisation parameter traces and DCT coefficient analysis

The quantisation parameter controls the step size applied to DCT coefficients before entropy coding. Large QP values produce small files with visible blocking and ringing artefacts; small QP values produce large files with high perceptual quality. In a single encoding from raw content, the DCT coefficients before quantisation come from the raw pixel differences. After quantisation and dequantisation, the reconstructed coefficients fall at multiples of the quantisation step. The distribution of non-zero coefficients forms a characteristic comb-like histogram with peaks at those multiples.
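To make the link between QP and step size concrete, the sketch below uses the nominal H.264 step sizes commonly tabulated in the literature, where the step doubles for every increase of 6 in QP. The standard itself specifies integer scaling rather than these floating-point values, so treat the function as an approximation for reasoning about histograms; the name `qstep` is chosen here for illustration.

```python
# Nominal H.264 quantiser step sizes for QP 0..5; the pattern repeats
# with a doubling for every increase of 6 in QP.
_BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp):
    """Approximate H.264 quantiser step size for an integer QP in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("H.264 QP must be in the range 0 to 51")
    return _BASE_QSTEP[qp % 6] * 2 ** (qp // 6)


for qp in (20, 26, 28, 32):
    print(qp, qstep(qp))
```

Because of the doubling rule, two generations whose QPs differ by exactly 6 have step sizes in a 2:1 ratio, which is the case where first-generation multiples fall most cleanly on the second-generation grid.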

When the video is re-encoded, the second encoder's DCT operates on content that is already quantised. The input coefficients are concentrated at multiples of the first-generation step size. After the second quantisation, the resulting histogram retains traces of the first step size as secondary peaks or systematic gaps in the distribution. These are the same artefacts exploited in double-JPEG detection, adapted to the video context where both the spatial and temporal prediction residuals carry the signal.
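The gap pattern can be reproduced with a toy simulation. The sketch below is idealised: it quantises synthetic Laplacian-like coefficients twice with no prediction step in between, which is closer to the double-JPEG case than to real inter-coded video. The step sizes 16.0 and 6.5 are illustrative choices, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def quantise(value, step):
    """Round a coefficient to the nearest quantisation level."""
    return math.floor(value / step + 0.5)


def dequantise(level, step):
    """Reconstruct a coefficient from its quantisation level."""
    return level * step


def level_histogram(coeffs, step):
    """Histogram of quantisation levels for a list of coefficients."""
    return Counter(quantise(c, step) for c in coeffs)


def empty_levels(hist, max_level):
    """Positive levels up to max_level that received no coefficients."""
    return [k for k in range(1, max_level + 1) if hist.get(k, 0) == 0]


rng = random.Random(0)
raw = [rng.choice((-1, 1)) * rng.expovariate(1 / 20) for _ in range(20000)]

first_step, second_step = 16.0, 6.5   # coarse first pass, finer second pass
single = level_histogram(raw, second_step)
first_gen = [dequantise(quantise(c, first_step), first_step) for c in raw]
double = level_histogram(first_gen, second_step)

print("single-generation empty levels:", empty_levels(single, 10))
print("double-generation empty levels:", empty_levels(double, 10))
```

In the single-generation histogram every level up to 10 is populated, while the double-generation histogram only populates the levels nearest to multiples of 16, leaving systematic gaps that encode the first step size.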

Feature	Single-generation encode	Double-generation encode
DCT coefficient histogram shape	Regular comb with peaks at multiples of the current quantisation step	Secondary peaks or gaps at multiples of the first-generation quantisation step
QP variance across frames	Varies smoothly with scene complexity	Additional variance from residual structure of prior compression
Intra MB fraction per frame	Low except at scene changes and I-frames	Elevated periodically at prior GOP boundaries
Blocking artefact energy	Consistent with current QP	Higher than expected for stated current QP
Residual magnitude at high frequencies	Decays with increasing frequency	Truncated earlier due to prior high-frequency discard

A practical complication is that most modern encoders use rate control to vary QP across frames and across macroblocks within a frame. This means the QP trace from a single-generation video already shows variation, and the examiner must distinguish the legitimate variation from the distortions introduced by re-encoding. Methods based on the QP trace typically model the expected variance for a given encoder and bit-rate target and then test whether the observed trace is consistent with that model. The double-compression signature is the residual variance that cannot be explained by the expected rate-control behaviour.
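One simple way to express "variance that rate control does not explain" is to compare the observed QP trace against reference traces from known single-generation encodes made with the same encoder and bit-rate target. The sketch below uses an empirical reference set as a stand-in for an encoder model; that substitution, the function name, and the toy numbers are all assumptions made for illustration.

```python
from statistics import mean, pstdev, pvariance


def excess_variance_z(observed_trace, reference_traces):
    """Z-score of the observed QP variance against reference variances."""
    ref_vars = [pvariance(t) for t in reference_traces]
    mu, sigma = mean(ref_vars), pstdev(ref_vars)
    if sigma == 0:
        raise ValueError("reference traces show no spread in variance")
    return (pvariance(observed_trace) - mu) / sigma


# Toy per-frame QP traces; real traces would span thousands of frames.
reference = [
    [26, 27, 26, 27],
    [26, 28, 26, 28],
    [26, 26, 27, 27],
]
observed = [24, 30, 24, 30]
z = excess_variance_z(observed, reference)
print(round(z, 1))
```

A large positive score only says the trace is unusual for that encoder configuration; it does not by itself identify re-encoding, and the reference set must match the evidence file's encoder and settings for the comparison to mean anything.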

Machine-learning regressors for encoding history estimation

Statistical analysis of individual features such as the intra-MB fraction or the DCT histogram can detect double compression but struggles when re-encoding quality is high or when the first and second generation QPs are close together. Machine-learning approaches treat encoding history estimation as a classification or regression task, combining many features to improve discrimination.

The most common feature sets draw from three levels of the bitstream. At the frame level, features include the temporal distribution of intra-MB fractions, the frame-level QP sequence, and the ratio of I-frame to non-I-frame size. At the macroblock level, features include the partition mode histogram (the distribution of 4x4, 4x8, 8x4, 8x8, 8x16, 16x8, and 16x16 prediction partitions) and the distribution of motion vector magnitudes. At the DCT level, features include the coefficient histogram shape, the proportion of zero coefficients by frequency band, and the BDCT (block DCT) artefact measure.
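As a rough illustration of how frame-level and macroblock-level features might be assembled before training, the sketch below assumes each frame has already been parsed into a dictionary. The record layout (`type`, `size`, `intra_mbs`, `total_mbs`, `partitions`) is hypothetical and does not correspond to the output of any particular parser.

```python
from collections import Counter
from statistics import mean, pstdev

PARTITIONS = ("4x4", "4x8", "8x4", "8x8", "8x16", "16x8", "16x16")


def frame_level_features(frames):
    """Intra-fraction statistics and I-frame to non-I-frame size ratio."""
    i_frames = [f for f in frames if f["type"] == "I"]
    non_i = [f for f in frames if f["type"] != "I"]
    intra = [f["intra_mbs"] / f["total_mbs"] for f in non_i]
    size_ratio = mean(f["size"] for f in i_frames) / mean(f["size"] for f in non_i)
    return {
        "intra_mean": mean(intra),
        "intra_std": pstdev(intra),
        "i_to_non_i_size_ratio": size_ratio,
    }


def partition_histogram(frames):
    """Normalised histogram of inter-prediction partition sizes."""
    total = Counter()
    for f in frames:
        total.update(f["partitions"])
    n = sum(total.values())
    return [total.get(p, 0) / n for p in PARTITIONS]


# Three toy frames of a 1080p stream (8160 macroblocks per frame).
frames = [
    {"type": "I", "size": 60000, "intra_mbs": 8160, "total_mbs": 8160,
     "partitions": {}},
    {"type": "P", "size": 10000, "intra_mbs": 816, "total_mbs": 8160,
     "partitions": {"16x16": 6, "8x8": 2}},
    {"type": "P", "size": 20000, "intra_mbs": 0, "total_mbs": 8160,
     "partitions": {"16x16": 1, "4x4": 1}},
]
features = frame_level_features(frames)
histogram = partition_histogram(frames)
print(features, histogram)
```

The two outputs would typically be concatenated with DCT-level features into one vector per clip before being passed to a classifier.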

Random forest classifiers trained on these features can distinguish single-generation from double-generation H.264 video with accuracy above 90% on benchmark datasets when the first-generation QP is 28 or higher. Accuracy falls when the first-generation QP is 18 or below, because the high-quality first encode discards little information and the second encoder sees content close to raw. Convolutional neural networks applied to the residual frames in the DCT domain can capture spatial patterns that the hand-crafted feature sets miss and achieve similar or better accuracy without manual feature engineering. H.265 regressors require separate training because the CTU structure and the larger partition space produce different feature distributions from H.264 macroblocks.

H.264 vs H.265: differences in compression history signatures

H.264 (AVC) and H.265 (HEVC) share the same fundamental architecture: transform coding with DCT, quantisation, and entropy coding, combined with motion-compensated inter-prediction. The differences that matter for compression history estimation are the coding unit structure, the increased partition flexibility in H.265, and the differences in the default encoder settings used in practice.

In H.264, the basic coding unit is the 16x16 macroblock. Macroblocks can be split into smaller prediction units down to 4x4 pixels. The intra-refresh artefact in H.264 manifests at the macroblock level and can be measured by counting intra-coded 16x16 blocks per frame. In H.265, the basic unit is the coding tree unit (CTU), which can be up to 64x64 pixels and is recursively split into coding units, prediction units, and transform units. The greater flexibility means the second encoder can adapt its coding structure more finely to the re-encoded content, which slightly reduces the intra-refresh artefact at any given level of analysis. Methods for H.265 typically operate at the CTU or coding unit level rather than a fixed macroblock size.
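Because H.265 coding units vary in size, a per-frame intra measure is more comparable across frames if it is weighted by area rather than by unit count. The sketch below shows one way to do that, assuming the coding units have already been extracted as (size, is_intra) pairs; that input format and the function name are hypothetical.

```python
def intra_area_fraction(coding_units):
    """Fraction of pixel area covered by intra-coded square coding units.

    coding_units is a list of (size_in_pixels, is_intra) tuples.
    """
    units = list(coding_units)
    total = sum(size * size for size, _ in units)
    intra = sum(size * size for size, is_intra in units if is_intra)
    return intra / total if total else 0.0


# One 64x64 CTU split into three 32x32 units plus four 16x16 units.
ctu = [
    (32, True), (32, False), (32, False),
    (16, True), (16, False), (16, False), (16, False),
]
fraction = intra_area_fraction(ctu)
print(fraction)  # 1280 intra pixels out of 4096
```

Counting units instead would give 2 of 7 for the same CTU, which overstates the contribution of small blocks.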

The nominal QP range is the same in both codecs for 8-bit video: H.264 and H.265 each use QP values 0 to 51 for luma. They differ in how the chroma QP is derived from the luma QP, so the DCT coefficient analysis must account for the chroma QP offset when analysing H.265 content. In both codecs, encoding parameters such as the reference frame count and the sub-pixel motion estimation precision affect how strongly inter-prediction suppresses the intra-refresh artefact. High reference-frame counts in H.264 allow the second encoder to find good reference matches across longer temporal distances, partially masking the refresh boundary.

Property	H.264 (AVC)	H.265 (HEVC)
Basic spatial unit	Macroblock (16x16)	Coding Tree Unit (up to 64x64)
QP range	0 to 51	0 to 51 (with separate chroma scaling)
Intra-refresh artefact level	Macroblock	CTU or coding unit
Partition flexibility	Moderate (4x4 to 16x16)	High (recursive CTU splitting)
Typical forensic tool support	Broad (FFmpeg, MediaInfo, custom parsers)	Growing (HM reference decoder, custom parsers)

Practical forensic workflow and tool use

A compression history examination begins with acquiring a verified copy of the evidence file and confirming its hash value before and after any analysis step. The file is then characterised: container format, codec, resolution, frame rate, bit rate, and encoder tag. The encoder tag recorded in the container metadata may identify the software that produced the final generation. When this tag matches the claimed recording device, it is a consistency check, not a confirmation, because the tag is trivial to alter or remove.
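The hash check before and after analysis can be scripted with the Python standard library. The sketch below is a minimal example; the helper names are illustrative, and the temporary file merely stands in for an evidence copy.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_unchanged(path, recorded_hash):
    """True if the file still matches the hash recorded at acquisition."""
    return sha256_of_file(path) == recorded_hash.lower()


# Demonstration on a temporary file standing in for the evidence copy.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name

recorded = sha256_of_file(demo_path)      # hash taken before analysis
still_intact = verify_unchanged(demo_path, recorded)  # re-check afterwards
os.remove(demo_path)
print(recorded, still_intact)
```

Recording the digest in the case notes alongside the tool versions lets a second examiner confirm that the same bytes were analysed.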

Bitstream-level analysis requires a tool capable of reading the raw encoded data without relying on decoded pixel values alone. FFprobe (part of FFmpeg) can output per-frame and per-packet statistics including frame type (I, P, B) and frame size; QP values for H.264 content are not part of its default per-frame output and usually have to be obtained through decoder-side options or a dedicated parser. Full macroblock-level analysis requires a specialised parser; open-source options include the H.264 bitstream analyser in the JM reference implementation and custom scripts built on the openh264 or libavcodec APIs. For H.265, the HM reference decoder can output CTU-level coding decision data.
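A first pass at frame types and sizes can be scripted around FFprobe's JSON output. The sketch below builds the command and parses the result; the helper names are hypothetical, the sample JSON is hand-written for illustration rather than captured from a real file, and `probe` is defined but not called because it needs FFprobe installed and a real input path.

```python
import json
import subprocess


def ffprobe_frames_command(path):
    """Command line asking ffprobe for per-frame type and packet size."""
    return [
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-show_entries", "frame=pict_type,pkt_size",
        "-of", "json", path,
    ]


def parse_frames(ffprobe_json):
    """Turn ffprobe JSON text into a list of (pict_type, size) tuples."""
    frames = json.loads(ffprobe_json).get("frames", [])
    return [(f.get("pict_type", "?"), int(f.get("pkt_size", 0))) for f in frames]


def keyframe_intervals(frames):
    """Distances, in frames, between successive I-frames."""
    positions = [i for i, (kind, _) in enumerate(frames) if kind == "I"]
    return [b - a for a, b in zip(positions, positions[1:])]


def probe(path):
    """Run ffprobe on a file and return the parsed frame list."""
    result = subprocess.run(ffprobe_frames_command(path),
                            capture_output=True, text=True, check=True)
    return parse_frames(result.stdout)


# Hand-written sample in the shape ffprobe emits (sizes are strings).
sample = ('{"frames": [{"pkt_size": "52000", "pict_type": "I"},'
          ' {"pkt_size": "4000", "pict_type": "P"},'
          ' {"pkt_size": "1500", "pict_type": "B"},'
          ' {"pkt_size": "50000", "pict_type": "I"}]}')
parsed = parse_frames(sample)
print(parsed, keyframe_intervals(parsed))
```

This covers only the current generation's GOP structure; macroblock modes and coefficients still need one of the specialised parsers mentioned above.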

After extracting the per-frame and per-macroblock features, the examiner applies the statistical or ML-based detection method. Any detected periodicity in the intra-MB fraction is compared against the claimed recording frame rate and GOP settings. QP histogram analysis is performed on a representative sample of frames, avoiding the first few frames of each scene cut where the encoder behaviour is atypical. The findings are documented as a written report with the tool names, version numbers, input hash, output data tables, and the statistical basis for any conclusion.
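Selecting a representative sample while skipping the frames right after each scene cut is easy to automate. The sketch below assumes scene-cut positions are already known as frame indices; the guard length of 5 frames and the stride of 10 are arbitrary illustrative values, not recommended settings.

```python
def sample_frames(n_frames, scene_cuts, guard=5, stride=10):
    """Pick every stride-th frame, skipping guard frames after each cut."""
    excluded = set()
    for cut in scene_cuts:
        excluded.update(range(cut, min(cut + guard, n_frames)))
    return [i for i in range(0, n_frames, stride) if i not in excluded]


selected = sample_frames(60, scene_cuts=[0, 30])
print(selected)
```

Logging the selected indices in the report makes the histogram analysis reproducible by another examiner.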

A key link to the broader authentication workflow is that compression history findings should be integrated with other analysis streams: metadata and container integrity checks, noise consistency analysis, and frame deletion or insertion detection. No single method is definitive on its own. Compression history estimation is one element of a multi-method assessment.

Evidentiary standards and court presentation

Compression history evidence has been presented in criminal and civil proceedings in multiple jurisdictions. The analysis is based on published peer-reviewed methods, which addresses the peer-review and publication factor considered under US Federal Rule of Evidence 702 (as interpreted after Daubert v. Merrell Dow Pharmaceuticals, 1993). UK courts apply the Criminal Practice Directions and consider the Law Commission 2011 guidance on the admissibility of expert evidence, which requires the method to be sufficiently established to pass scrutiny by an independent expert. EU member states vary in their civil-law treatment of expert evidence, but most require the expert to demonstrate formal qualifications and a documented methodology.

In India, expert evidence in digital matters is assessed under the Bharatiya Sakshya Adhiniyam 2023 (BSA), which replaced the Indian Evidence Act 1872. Section 79A of the Information Technology Act 2000 provides for the notification of an Examiner of Electronic Evidence, and the expert-opinion provisions of the Evidence Act (and their successor provisions in the BSA) make that examiner's opinion relevant to digital forensic findings. The Supreme Court of India has addressed the reliability of digital evidence in several cases including Arjun Panditrao Khotkar v. Kailash Kushanrao Gorantyal (2020), which emphasised that electronic records must be accompanied by a certificate under section 65B of the Indian Evidence Act, a requirement the BSA carries forward. Compression history analysis should be accompanied by documentation that identifies the tools, methods, and validation basis, alongside the certificate for the underlying electronic record.

Across all jurisdictions, the expert report should state clearly what the analysis can and cannot conclude. Compression history estimation can establish that a video has been re-encoded, that the prior encoding had a specific QP range or GOP period, and that the encoding history is inconsistent with the claimed recording provenance. It cannot determine the content of any prior generation, reconstruct deleted frames, or establish the identity of who performed the re-encoding. Overstating conclusions is the most common ground for challenge in cross-examination.

Worked example

Estimating encoding history from a disputed security camera recording

A security camera recording submitted as evidence in a property crime investigation is claimed to be an unbroken export from the camera's DVR. Compression history analysis is used to assess whether the file is consistent with that claim.

The file is an MP4 container, H.264 video, 1080p at 25 frames per second, 2 hours duration. The encoder tag in the container metadata reads 'libx264 core 164'. The DVR manufacturer is confirmed to use a hardware H.264 encoder, not libx264. The metadata inconsistency is noted but not treated as conclusive, because the file may have been legitimately exported through conversion software.

Hash and characterise. The file SHA-256 hash is recorded. FFprobe confirms: H.264 High Profile, QP range 18 to 42, average bit rate 2.1 Mbps, keyframe interval 50 frames (2 seconds at 25 fps).
Extract per-frame intra-MB fractions. A custom parser extracts the macroblock mode data from the bitstream. The intra-coded macroblock fraction is computed for each frame and plotted over time.
Detect GOP periodicity. The intra-MB fraction series shows elevated values not only at the current I-frames (every 50 frames) but also at every 75 frames, an interval that does not correspond to any parameter of the current encode. Autocorrelation of the series confirms the 75-frame period with a coefficient of 0.61.
Interpret the 75-frame period. At 25 fps, 75 frames corresponds to 3 seconds, a common DVR GOP setting. The DVR manufacturer's technical specifications confirm that the camera model in question uses a 3-second GOP (75 frames at 25 fps) in its default recording mode.
DCT coefficient analysis. The DCT coefficient histogram for non-I-frames shows secondary peaks at quantisation multiples consistent with a prior QP of approximately 32, which is within the range of the DVR's typical quality setting. The current re-encode QP averages 26, which is lower. This is consistent with a re-export at a higher-quality setting, although a lower QP cannot restore detail discarded by the first generation.
Conclusion and report. The analysis finds that the submitted file shows clear signatures of double compression. The prior generation had a GOP of 75 frames (3 seconds at 25 fps), consistent with the DVR's default setting. The encoder tag indicates that the final encoding was performed with libx264, and its average QP is lower than the estimated QP of the first generation. The file is therefore a re-encoded copy of a DVR recording, not a direct DVR export. This is not consistent with the claim that the file is an unmodified direct export from the DVR. The analysis does not by itself show whether the content was altered or who performed the re-encoding.
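The arithmetic behind the worked example can be checked in a few lines. The sketch below uses only the figures given in the steps above; the rule that the quantiser step doubles for every 6 QP is the nominal H.264 behaviour, and the variable names are illustrative.

```python
from math import gcd

fps = 25
current_gop = 50        # keyframe interval of the submitted file
prior_gop = 75          # period recovered from the intra-MB series

prior_gop_seconds = prior_gop / fps
# Both generations' boundaries coincide at the least common multiple.
coincide_every = current_gop * prior_gop // gcd(current_gop, prior_gop)

# Prior-GOP positions in the first 10 seconds that do not fall on a
# current I-frame, where the residual spike is observable in P/B frames.
unmasked = [i for i in range(0, 10 * fps, prior_gop) if i % current_gop != 0]

# Ratio of nominal step sizes between the estimated prior QP and the
# current average QP (step doubles for every 6 QP).
prior_qp, current_qp = 32, 26
step_ratio = 2 ** ((prior_qp - current_qp) / 6)

print(prior_gop_seconds, coincide_every, unmasked, step_ratio)
```

Every second prior-GOP boundary lands on a current I-frame, so only the boundaries at odd multiples of 75 contribute spikes in predicted frames, and the 2:1 step ratio is a favourable case for seeing first-generation multiples in the histogram.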

Check your understanding


A forensic examiner finds that the intra-coded macroblock fraction in the P-frames of a video shows periodic spikes every 60 frames. The video is encoded at 30 fps. What does this finding most likely indicate?

Key Takeaways

Re-encoding a video creates an intra-refresh artefact: an elevated fraction of intra-coded macroblocks at frame positions corresponding to the prior generation's GOP period, detectable by autocorrelation of the per-frame intra-MB fraction series.
The DCT coefficient histogram of a re-encoded video shows secondary peaks or gaps at multiples of the first-generation quantisation step size, a pattern absent in single-generation recordings of raw content.
Machine-learning regressors combining frame-level, macroblock-level, and DCT-level features can classify single versus double compression with high accuracy when the first-generation QP is 28 or above, but accuracy degrades at high first-generation quality settings.
H.265 requires separate methods from H.264 because the CTU structure and partition space produce different feature distributions; models must be validated against the specific encoder type present in the evidence.
Court presentation in any jurisdiction requires a documented methodology, stated tool versions, validated error rates, and conclusions limited to what the analysis can actually establish: re-encoding history, not content authenticity or the identity of who performed the re-encoding.

What is compression history estimation in video forensics?

Compression history estimation is the process of inferring how many times a video has been encoded and what parameters were used in each encoding pass. Each lossy compression cycle leaves distinctive artefacts in the bitstream: periodically spaced I-frames, shifts in the quantisation parameter distribution, and residual energy patterns that differ from a single-generation recording. Forensic examiners analyse these traces to determine whether footage is a direct camera recording or a re-encoded copy.

What are macroblock mode statistics and why do they matter forensically?

In H.264 and H.265 encoding, each frame is divided into macroblocks or coding tree units that can be coded in different prediction modes: intra (coded from the current frame only, without reference to other frames) or inter (coded relative to a reference frame). When a video is re-encoded, the encoder re-optimises mode choices, creating a distribution of intra-coded blocks that is statistically different from a single-generation encode. The unexpected elevation of intra-coded macroblocks at positions that match a prior GOP structure is one of the primary signatures of double or multiple compression.

How does quantisation parameter analysis detect re-encoding?

Each encoding generation applies a quantisation step that irreversibly discards high-frequency detail. When a video is re-encoded, the second encoder operates on data that has already been quantised once. The resulting coefficient distribution shows artefacts: DCT coefficients tend to cluster at multiples of the first-generation quantisation step size, leaving secondary peaks or gaps in the second-generation histogram, and high-frequency residual energy is lower than would be expected from a single encode of raw content. The QP trace may also show variance that rate control alone does not explain. These signatures can be measured from the bitstream without access to the original material.

Can machine-learning models reliably estimate the number of encoding generations?

Machine-learning regressors, particularly convolutional neural networks trained on artefacts in the DCT domain and random forest models trained on bitstream-level features, can reliably distinguish single-generation from double-generation video and estimate quantisation step sizes from prior generations. Accuracy degrades as the number of generations increases beyond two and as the re-encoding quality is raised. Models trained on H.264 do not always transfer to H.265 without retraining because the coding unit structure differs.

Is compression history evidence admissible in court?

Admissibility depends on jurisdiction and the standards applied to expert evidence. In the US, Federal Rule of Evidence 702 and Daubert direct courts to consider whether the method has been tested, peer-reviewed, and accepted in the relevant scientific community, and whether it has a known error rate. UK courts apply the Criminal Practice Directions on expert evidence and the Law Commission guidance on reliability. Courts in India assess expert evidence under the Bharatiya Sakshya Adhiniyam 2023 (successor to the Indian Evidence Act). Compression history analysis is published in peer-reviewed literature, but examiners must document their methodology, software, and validation data carefully to survive challenge.
