
Compression History Estimation in Video Streams

Compression history estimation determines how many encoding generations a video has undergone and what parameters governed each pass. Each pass leaves traces in macroblock statistics, quantisation parameter distributions, and residual energy patterns, and these traces reveal whether footage was recorded once or re-encoded from previously compressed material.

Compression history estimation is a forensic technique for determining how many times a video file has been encoded and, where possible, what quantisation parameters governed each generation. Every pass through a lossy codec such as H.264 or H.265 leaves a set of measurable artefacts in the bitstream: periodically spaced intra-coded frames, characteristic shifts in the quantisation parameter distribution, and residual energy patterns that differ markedly from a single-generation recording. Examiners measure these traces from the encoded bitstream and use statistical models or machine-learning regressors to infer the encoding history. The forensic purpose is to distinguish footage recorded directly from a camera or screen from footage that has been re-encoded, which is a common step when forged video is assembled from multiple source clips or when a genuine clip is re-exported after editing.

The practical importance of the technique is clearest in cases where the provenance of video evidence is disputed. A recording claimed to be an unbroken capture from a security camera may turn out to have been re-encoded at a different bit-rate, frame rate, or resolution. A deepfake video assembled from separately generated segments will often carry the compression signature of its constituent parts even after a final export pass. Compression history estimation does not establish the content of the video as authentic or fabricated, but it narrows or contradicts the claimed recording history and can direct investigators toward more targeted analysis.

The field builds on earlier work in image double-JPEG detection, which showed that re-saving a JPEG at a different quality setting leaves a distinctive pattern in the DCT coefficient histogram. The extension to video is technically more complex because temporal prediction across frames, variable-bit-rate encoding, and scene-adaptive quantisation all interact with the re-encoding artefacts. H.264 and H.265 are the dominant video codecs in forensic casework, and most published methods target one or both standards. Work on VP9, AV1, and older MPEG-2 sources exists but is less standardised.

By the end of this topic you will be able to:

  • Explain how intra-coded macroblock and coding-unit mode statistics reveal the GOP structure of a prior encoding generation.
  • Describe how quantisation parameter traces and DCT coefficient histograms differ between single-generation and multi-generation video.
  • Identify the H.264 and H.265 bitstream features that machine-learning regressors use as input for encoding-history classification.
  • Explain the limitations of compression history estimation when the re-encoding quality is high or when transcoding between different codec families.
  • Apply the correct chain of custody and documentation requirements when presenting compression history evidence in court under US, UK, EU, or Indian evidentiary standards.

Key terms

Group of Pictures (GOP)
The periodic structure of I-frames, P-frames, and B-frames in a compressed video stream. Each encoding generation imposes its own GOP structure. When a video is re-encoded, the second encoder's I-frames may fall at positions that reveal the prior generation's GOP period.
Quantisation Parameter (QP)
A per-block or per-slice value that controls the coarseness of the DCT coefficient rounding step in H.264 and H.265. Higher QP means more data loss. Re-encoding a video that was already compressed at a given QP leaves a characteristic distribution in the second-generation residuals that differs from compressing raw content.
Macroblock (MB) / Coding Tree Unit (CTU)
The basic spatial coding unit: 16x16 pixels in H.264 (macroblock) and up to 64x64 pixels in H.265 (CTU). Each unit is classified as intra-coded (from the current frame only) or inter-coded (predicted from a reference frame). The distribution of coding modes across the video is a forensic feature.
Double compression
The condition in which a video has been encoded twice through a lossy codec; the term is also used loosely for any multi-generation encoding. Two-generation encoding is the most common form of re-encoding encountered in forensic casework and the condition most methods are designed to detect.
DCT coefficient histogram
A histogram of the discrete cosine transform coefficients across macroblocks or coding units in a frame or clip. After a single encode, the histogram has a characteristic shape with peaks at quantisation multiples. After re-encoding, additional peaks or gaps appear that reflect the quantisation grid of the first generation.
Intra-refresh artefact
The elevated count of intra-coded macroblocks or CTUs at frame positions that correspond to the I-frame interval of a previous encoding generation. This artefact persists through re-encoding because the inter-prediction residuals at those positions are larger, causing the second encoder to prefer intra-coding.

GOP structure and intra-coded macroblock statistics

Every H.264 and H.265 encoder organises a video into groups of pictures, each starting with an I-frame that is coded entirely from its own spatial content without reference to other frames. The I-frame interval, commonly 1 or 2 seconds at typical frame rates, is a parameter set by the encoder. In a single-generation recording, I-frames appear at the configured interval and nowhere else, and the proportion of intra-coded macroblocks in P-frames and B-frames is low, typically driven only by scene changes or fast motion.

When that video is re-encoded, the second encoder processes content that has already been through one lossy cycle. At frame positions that coincide with the first generation's I-frames, the spatial content changed more abruptly than the inter-frame prediction can account for, because the first encoder refreshed all blocks at those positions. The second encoder finds that intra-coding is more efficient at those positions than inter-prediction from a reference frame that was itself a P or B frame. The result is an elevated count of intra-coded macroblocks at frame indices that are multiples of the first generation's I-frame period. This signal, called the intra-refresh artefact, can be extracted without access to the original recording by parsing the macroblock mode data from the bitstream.

The detection method works by computing, for each frame, the fraction of macroblocks coded in intra mode. The resulting series is then analysed for periodicity using a Fourier transform or autocorrelation. A periodic spike in the intra-MB fraction at interval T indicates that a prior generation used a GOP length of T frames. The signal is clearest when the first-generation QP was high (strong compression) and degrades when re-encoding is at very high quality, because the second encoder has more bit budget to absorb the residuals without reverting to intra-coding.
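The periodicity test described above can be sketched in a few lines. This is a minimal illustration using autocorrelation on a synthetic series, not a validated forensic tool: the function names `autocorrelation` and `estimate_prior_gop` are hypothetical, and the sketch assumes the per-frame intra-MB fractions have already been extracted and that the current generation's own I-frames have been excluded or handled separately.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Normalised (biased) autocorrelation of a series at a positive lag."""
    mu = mean(series)
    centred = [x - mu for x in series]
    denom = sum(c * c for c in centred)
    if denom == 0:
        return 0.0  # a constant series carries no periodicity
    num = sum(centred[i] * centred[i + lag] for i in range(len(centred) - lag))
    return num / denom


def estimate_prior_gop(intra_fraction, min_lag=2, max_lag=None):
    """Return (lag, score) for the lag with the strongest autocorrelation."""
    if max_lag is None:
        max_lag = len(intra_fraction) // 2
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = autocorrelation(intra_fraction, lag)
        if score > best_score:  # strict '>' keeps the shortest lag on ties
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic series: 2% intra blocks normally, 30% at every 12th frame.
synthetic = [0.30 if i % 12 == 0 else 0.02 for i in range(120)]
lag, score = estimate_prior_gop(synthetic)
print(lag, round(score, 2))
```

On this synthetic input the strongest lag is 12 frames, the spacing of the injected spikes. Real series are noisier, so the score would normally be compared against a threshold established on known single-generation material before any period is reported.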

Quantisation parameter traces and DCT coefficient analysis

The quantisation parameter controls the step size applied to DCT coefficients before entropy coding. Large QP values produce small files with visible blocking and ringing artefacts; small QP values produce large files with high perceptual quality. In a single encoding from raw content, the DCT coefficients before quantisation come from the raw pixel differences. After quantisation and dequantisation, the reconstructed coefficients fall at multiples of the quantisation step. The distribution of non-zero coefficients forms a characteristic comb-like histogram with peaks at those multiples.
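The relationship between QP and step size is not spelled out above, so the following sketch adds it as background: in H.264 the quantiser step size roughly doubles for every increase of 6 in QP, with six base values covering QP 0 to 5. The function name `h264_qstep` is hypothetical, and H.265 behaviour is only assumed to be similar.

```python
# Base step sizes for QP 0..5 as commonly tabulated for H.264;
# every further +6 in QP doubles the step size.
_BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def h264_qstep(qp):
    """Approximate H.264 quantiser step size for an integer QP in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0 to 51")
    return _BASE_QSTEP[qp % 6] * 2 ** (qp // 6)


for qp in (18, 28, 34, 51):
    print(qp, h264_qstep(qp))
```

Under this mapping QP 18 corresponds to a step of 5 and QP 28 to a step of 16, which gives a sense of how much coarser the quantisation grid becomes across a ten-point QP difference.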

When the video is re-encoded, the second encoder's DCT operates on content that is already quantised. The input coefficients are concentrated at multiples of the first-generation step size. After the second quantisation, the resulting histogram retains traces of the first step size as secondary peaks or systematic gaps in the distribution. These are the same artefacts exploited in double-JPEG detection, adapted to the video context where both the spatial and temporal prediction residuals carry the signal.
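The gaps left by a first-generation grid can be reproduced with a toy simulation. The sketch below is an idealised model, not a codec: it quantises synthetic Laplacian-like coefficients directly, ignores prediction and in-loop filtering, and uses hypothetical step sizes `Q1` and `Q2` chosen only to make the effect visible.

```python
import math
import random
from collections import Counter


def quantise(value, step):
    """Map a coefficient to its integer quantisation level (round half up)."""
    return math.floor(value / step + 0.5)


def level_histogram(coefficients, step):
    """Count how many coefficients land on each quantisation level."""
    return Counter(quantise(c, step) for c in coefficients)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Laplacian-like stand-in for raw transform coefficients (scale is arbitrary).
raw = [rng.choice((-1, 1)) * rng.expovariate(1 / 20) for _ in range(5000)]

Q1, Q2 = 10, 4  # hypothetical first- and second-generation step sizes

single = level_histogram(raw, Q2)
# First generation: quantise with Q1, then dequantise back to coefficient values.
first_gen = [quantise(c, Q1) * Q1 for c in raw]
double = level_histogram(first_gen, Q2)

# Columns: level, single-generation count, double-generation count.
for level in range(0, 9):
    print(level, single[level], double[level])
```

In the double-generation column, levels 1, 2, and 4 are empty because no multiple of 10 falls into the corresponding second-generation bins, while the single-generation column is populated at every level. In real video the pattern is blurred by prediction and rate control, so it appears as depressed and elevated bins and not as clean zeros.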

| Feature | Single-generation encode | Double-generation encode |
| --- | --- | --- |
| DCT coefficient histogram shape | Smooth, peaks at multiples of the current quantisation step | Secondary peaks or gaps at multiples of first-generation quantisation step |
| QP variance across frames | Varies smoothly with scene complexity | Additional variance from residual structure of prior compression |
| Intra MB fraction per frame | Low except at scene changes and I-frames | Elevated periodically at prior GOP boundaries |
| Blocking artefact energy | Consistent with current QP | Higher than expected for stated current QP |
| Residual magnitude at high frequencies | Decays with increasing frequency | Truncated earlier due to prior high-frequency discard |

A practical complication is that most modern encoders use rate control to vary QP across frames and across macroblocks within a frame. This means the QP trace from a single-generation video already shows variation, and the examiner must distinguish the legitimate variation from the distortions introduced by re-encoding. Methods based on the QP trace typically model the expected variance for a given encoder and bit-rate target and then test whether the observed trace is consistent with that model. The double-compression signature is the residual variance that cannot be explained by the expected rate-control behaviour.
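One crude way to express the consistency test is a z-score of the observed QP variance against a reference set. The sketch below assumes the examiner has QP traces from clips known to be single-generation output of the same encoder at a comparable bit-rate target; the function name `excess_variance_z` and the sample values are illustrative only, and a real model of rate-control behaviour would be considerably richer.

```python
from statistics import mean, pstdev, pvariance


def excess_variance_z(observed_qp, reference_traces):
    """Z-score of the observed QP variance against reference-clip variances."""
    ref_vars = [pvariance(trace) for trace in reference_traces]
    mu, sigma = mean(ref_vars), pstdev(ref_vars)
    if sigma == 0:
        raise ValueError("reference variances are identical; cannot scale")
    return (pvariance(observed_qp) - mu) / sigma


# Illustrative per-frame QP traces (made-up values, not measured data).
reference = [
    [26, 27, 26, 28, 27, 26],
    [27, 27, 28, 26, 27, 28],
    [26, 28, 27, 27, 26, 27],
]
questioned = [24, 30, 25, 31, 24, 30]

z = excess_variance_z(questioned, reference)
print(round(z, 1))
```

A large positive score only says that the trace varies more than the reference set does. Attributing that excess to re-encoding still requires ruling out scene complexity and encoder settings as explanations.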

Machine-learning regressors for encoding history estimation

Statistical analysis of individual features such as the intra-MB fraction or the DCT histogram can detect double compression but struggles when re-encoding quality is high or when the first and second generation QPs are close together. Machine-learning approaches treat encoding history estimation as a classification or regression task, combining many features to improve discrimination.

The most common feature sets draw from three levels of the bitstream. At the frame level, features include the temporal distribution of intra-MB fractions, the frame-level QP sequence, and the ratio of I-frame to non-I-frame size. At the macroblock level, features include the partition mode histogram (the distribution of 4x4, 4x8, 8x4, 8x8, 8x16, 16x8, and 16x16 prediction partitions) and the distribution of motion vector magnitudes. At the DCT level, features include the coefficient histogram shape, the proportion of zero coefficients by frequency band, and the BDCT (block DCT) artefact measure.
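To make the feature levels concrete, the sketch below assembles a small vector from frame-level and macroblock-level inputs. It assumes the per-frame intra-MB fractions, frame sizes, frame types, and partition labels have already been extracted by a bitstream parser; the names `partition_histogram` and `clip_features` and the choice of ten features are hypothetical, and DCT-level features are omitted for brevity.

```python
from collections import Counter
from statistics import mean, pstdev

# The seven H.264 inter-prediction partition sizes, in a fixed order.
PARTITIONS = ("16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4")


def partition_histogram(modes):
    """Normalised histogram of partition labels in PARTITIONS order."""
    counts = Counter(modes)
    total = sum(counts[p] for p in PARTITIONS)
    if total == 0:
        return [0.0] * len(PARTITIONS)
    return [counts[p] / total for p in PARTITIONS]


def clip_features(intra_fraction, frame_sizes, frame_types, modes):
    """Build a 10-element feature vector for one clip."""
    i_sizes = [s for s, t in zip(frame_sizes, frame_types) if t == "I"]
    other = [s for s, t in zip(frame_sizes, frame_types) if t != "I"]
    # Ratio of mean I-frame size to mean non-I-frame size (0.0 if undefined).
    size_ratio = mean(i_sizes) / mean(other) if i_sizes and other else 0.0
    return [mean(intra_fraction), pstdev(intra_fraction), size_ratio] \
        + partition_histogram(modes)


features = clip_features(
    intra_fraction=[0.02, 0.30, 0.02, 0.02],
    frame_sizes=[9000, 3000, 3000, 3000],
    frame_types=["I", "P", "P", "P"],
    modes=["16x16"] * 6 + ["8x8"] * 2,
)
print(features)
```

Keeping the partition order fixed matters because a classifier trained on these vectors relies on each position always meaning the same thing.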

Random forest classifiers trained on these features can distinguish single-generation from double-generation H.264 video with accuracy above 90% on benchmark datasets when the first-generation QP is 28 or higher. Accuracy falls when the first-generation QP is 18 or below, because the high-quality first encode discards little information and the second encoder sees content close to raw. Convolutional neural networks applied to the residual frames in the DCT domain can capture spatial patterns that the hand-crafted feature sets miss and achieve similar or better accuracy without manual feature engineering. H.265 regressors require separate training because the CTU structure and the larger partition space produce different feature distributions from H.264 macroblocks.

H.264 vs H.265: differences in compression history signatures

H.264 (AVC) and H.265 (HEVC) share the same fundamental architecture: transform coding with DCT, quantisation, and entropy coding, combined with motion-compensated inter-prediction. The differences that matter for compression history estimation are the coding unit structure, the increased partition flexibility in H.265, and the differences in the default encoder settings used in practice.

In H.264, the basic coding unit is the 16x16 macroblock. Macroblocks can be split into smaller prediction units down to 4x4 pixels. The intra-refresh artefact in H.264 manifests at the macroblock level and can be measured by counting intra-coded 16x16 blocks per frame. In H.265, the basic unit is the coding tree unit (CTU), which can be up to 64x64 pixels and is recursively split into coding units, prediction units, and transform units. The greater flexibility means the second encoder can adapt its coding structure more finely to the re-encoded content, which slightly reduces the intra-refresh artefact at any given level of analysis. Methods for H.265 typically operate at the CTU or coding unit level rather than a fixed macroblock size.

The nominal QP range is the same in both codecs for 8-bit video, 0 to 51, but the two standards derive the chroma QP from the luma QP through different mappings and offsets. The DCT coefficient analysis must account for the chroma QP offset when analysing H.265 content. In both codecs, encoding parameters such as the reference frame count and the sub-pixel motion estimation precision affect how strongly inter-prediction suppresses the intra-refresh artefact. High reference-frame counts in H.264 allow the second encoder to find good reference matches across longer temporal distances, partially masking the refresh boundary.

| Property | H.264 (AVC) | H.265 (HEVC) |
| --- | --- | --- |
| Basic spatial unit | Macroblock (16x16) | Coding Tree Unit (up to 64x64) |
| QP range | 0 to 51 | 0 to 51 (with separate chroma scaling) |
| Intra-refresh artefact level | Macroblock | CTU or coding unit |
| Partition flexibility | Moderate (4x4 to 16x16) | High (recursive CTU splitting) |
| Typical forensic tool support | Broad (FFmpeg, MediaInfo, custom parsers) | Growing (libx265 bitstream analysis) |

Practical forensic workflow and tool use

A compression history examination begins with acquiring a verified copy of the evidence file and confirming its hash value before and after any analysis step. The file is then characterised: container format, codec, resolution, frame rate, bit rate, and encoder tag. The encoder tag recorded in the container metadata may identify the software that produced the final generation. When this tag matches the claimed recording device, it is a consistency check, not a confirmation, because the tag is trivial to alter or remove.

Bitstream-level analysis requires a tool capable of reading the raw encoded data without decoding to pixel values. FFprobe (part of FFmpeg) can output per-frame and per-packet statistics including frame type (I, P, B), frame size, and QP values for H.264 content. Full macroblock-level analysis requires a specialised parser; open-source options include the H.264 bitstream analyser in the JM reference implementation and custom scripts built on the openh264 or libavcodec APIs. For H.265, the HM reference decoder can output CTU-level coding decision data.
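As a sketch of the frame-level step only, the code below asks ffprobe for each frame's picture type and packet size in JSON form and derives the spacing between I-frames. It assumes ffprobe is installed and on the PATH; the helper names are hypothetical, and this does not reach macroblock-level data, which still needs a specialised parser.

```python
import json
import subprocess


def ffprobe_frames_json(path):
    """Run ffprobe on the first video stream and return its JSON output."""
    cmd = [
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-show_entries", "frame=pict_type,pkt_size", "-of", "json", path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def parse_frames(json_text):
    """Return a list of (picture_type, size_in_bytes) tuples."""
    frames = json.loads(json_text).get("frames", [])
    return [(f.get("pict_type", "?"), int(f.get("pkt_size", 0))) for f in frames]


def i_frame_intervals(frames):
    """Distances, in frames, between consecutive I-frames."""
    positions = [i for i, (ptype, _) in enumerate(frames) if ptype == "I"]
    return [b - a for a, b in zip(positions, positions[1:])]
```

Calling `i_frame_intervals(parse_frames(ffprobe_frames_json("evidence.mp4")))` on a working copy (the file name is a placeholder) gives the GOP lengths of the final generation, which is the baseline any prior-generation periodicity is compared against.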

After extracting the per-frame and per-macroblock features, the examiner applies the statistical or ML-based detection method. Any detected periodicity in the intra-MB fraction is compared against the claimed recording frame rate and GOP settings. QP histogram analysis is performed on a representative sample of frames, avoiding the first few frames of each scene cut where the encoder behaviour is atypical. The findings are documented as a written report with the tool names, version numbers, input hash, output data tables, and the statistical basis for any conclusion.
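The comparison of a detected period against the claimed settings is simple arithmetic, sketched below. The function names and the exact-match rule are assumptions made for illustration; an examiner may reasonably allow a tolerance or treat integer multiples differently.

```python
def period_seconds(period_frames, fps):
    """Convert a period measured in frames to seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return period_frames / fps


def matches_claimed_gop(detected_period, claimed_gop_frames):
    """True if the detected period equals the claimed GOP length in frames."""
    return detected_period == claimed_gop_frames


# Example: spikes every 60 frames in footage encoded at 30 fps.
print(period_seconds(60, 30))        # 2.0 seconds
print(matches_claimed_gop(60, 250))  # False: period not explained by the claim
```

A detected period that the claimed GOP setting does not explain is a lead for further analysis and is documented as an inconsistency, not as proof of manipulation.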

A key link to the broader authentication workflow is that compression history findings should be integrated with other analysis streams: metadata and container integrity checks, noise consistency analysis, and frame deletion or insertion detection. No single method is definitive on its own. Compression history estimation is one element of a multi-method assessment.

Evidentiary standards and court presentation

Compression history evidence has been presented in criminal and civil proceedings in multiple jurisdictions. The analysis is based on published peer-reviewed methods, which addresses the peer-review and publication factor considered under US Federal Rule of Evidence 702 (as interpreted after Daubert v. Merrell Dow Pharmaceuticals, 1993); the remaining factors, including testing and a known error rate, must be shown separately. UK courts apply the Criminal Practice Directions and consider the Law Commission 2011 guidance on the admissibility of expert evidence, which requires the method to be sufficiently established to pass scrutiny by an independent expert. EU member states vary in their civil-law treatment of expert evidence, but most require the expert to demonstrate formal qualifications and a documented methodology.

In India, expert evidence in digital matters is assessed under the Bharatiya Sakshya Adhiniyam 2023 (BSA), which replaced the Indian Evidence Act 1872. Section 79A of the Information Technology Act 2000 provides for the notification of an Examiner of Electronic Evidence, and the Evidence Act and its successor provisions in the BSA treat that examiner's opinion as relevant expert opinion. The Supreme Court of India has addressed the reliability of digital evidence in several cases including Arjun Panditrao Khotkar v. Kailash Kushanrao Gorantyal (2020), which emphasised that electronic records must be accompanied by a certificate under Section 65B of the Indian Evidence Act (now Section 63 of the BSA). Compression history analysis should be accompanied by a certificate that identifies the tools, methods, and validation basis.

Across all jurisdictions, the expert report should state clearly what the analysis can and cannot conclude. Compression history estimation can establish that a video has been re-encoded, that the prior encoding had a specific QP range or GOP period, and that the encoding history is inconsistent with the claimed recording provenance. It cannot determine the content of any prior generation, reconstruct deleted frames, or establish the identity of who performed the re-encoding. Overstating conclusions is the most common ground for challenge in cross-examination.

Check your understanding

A forensic examiner finds that the intra-coded macroblock fraction in a P-frame video shows periodic spikes at every 60 frames. The video is encoded at 30 fps. What does this finding most likely indicate?

Key Takeaways

  • Re-encoding a video creates an intra-refresh artefact: an elevated fraction of intra-coded macroblocks at frame positions corresponding to the prior generation's GOP period, detectable by autocorrelation of the per-frame intra-MB fraction series.
  • The DCT coefficient histogram of a re-encoded video shows secondary peaks or gaps at multiples of the first-generation quantisation step size, a pattern absent in single-generation recordings of raw content.
  • Machine-learning regressors combining frame-level, macroblock-level, and DCT-level features can classify single versus double compression with high accuracy when the first-generation QP is 28 or above, but accuracy degrades at high first-generation quality settings.
  • H.265 requires separate methods from H.264 because the CTU structure and partition space produce different feature distributions; models must be validated against the specific encoder type present in the evidence.
  • Court presentation in any jurisdiction requires a documented methodology, stated tool versions, validated error rates, and conclusions limited to what the analysis can actually establish: re-encoding history, not content authenticity or the identity of who performed the re-encoding.

What is compression history estimation in video forensics?
Compression history estimation is the process of inferring how many times a video has been encoded and what parameters were used in each encoding pass. Each lossy compression cycle leaves distinctive artefacts in the bitstream: periodically spaced I-frames, shifts in the quantisation parameter distribution, and residual energy patterns that differ from a single-generation recording. Forensic examiners analyse these traces to determine whether footage is a direct camera recording or a re-encoded copy.
What are macroblock mode statistics and why do they matter forensically?
In H.264 and H.265 encoding, each frame is divided into macroblocks or coding tree units that can be coded in different prediction modes: intra (coded from the current frame only) or inter (predicted from a reference frame). When a video is re-encoded, the encoder re-optimises mode choices, creating a distribution of intra-coded blocks that is statistically different from a single-generation encode. The unexpected elevation of intra-coded macroblocks at positions that match a prior GOP structure is one of the primary signatures of double or multiple compression.
How does quantisation parameter analysis detect re-encoding?
Each encoding generation applies a quantisation step that irreversibly discards high-frequency detail. When a video is re-encoded, the second encoder operates on data that has already been quantised once, so its input coefficients are concentrated at multiples of the first-generation quantisation step size. The resulting DCT coefficient histogram shows secondary peaks or systematic gaps at those multiples, and the QP trace shows variance that the encoder's expected rate-control behaviour cannot explain. These signatures can be measured from the bitstream without access to the original material.
Can machine-learning models reliably estimate the number of encoding generations?
Machine-learning regressors, particularly convolutional neural networks trained on artefacts in the DCT domain and random forest models trained on bitstream-level features, can reliably distinguish single-generation from double-generation video and estimate quantisation step sizes from prior generations. Accuracy degrades as the number of generations increases beyond two and as the re-encoding quality is raised. Models trained on H.264 do not always transfer to H.265 without retraining because the coding unit structure differs.
Is compression history evidence admissible in court?
Admissibility depends on jurisdiction and the standards applied to expert evidence. In the US, Daubert and Federal Rule of Evidence 702 require the method to be tested, peer-reviewed, and accepted in the relevant scientific community, with a known error rate. UK courts apply the Criminal Practice Directions on expert evidence and the Law Commission guidance on reliability. Courts in India assess expert evidence under the Bharatiya Sakshya Adhiniyam 2023 (successor to the Indian Evidence Act). Compression history analysis is published in peer-reviewed literature but examiners must document their methodology, software, and validation data carefully to survive challenge.
