JPEG is the dominant image format in forensic casework, and its compression process leaves a distinctive statistical fingerprint that manipulation disturbs. This topic covers DCT block structure, quantisation tables as a camera fingerprint, double-JPEG detection, and the JPEG ghost method for localising pasted regions.
The overwhelming majority of digital photographs in forensic casework are JPEG files. This is not because JPEG is particularly secure. It is because JPEG is the default format of virtually every consumer camera, smartphone, and social media platform. That ubiquity is forensically useful: the JPEG compression process is well-understood, deterministic, and leaves a statistical fingerprint at every step. When someone edits a JPEG and saves it again, those fingerprints are disturbed in predictable ways that trained analysis can detect.
The mechanism starts with the discrete cosine transform (DCT). JPEG tiles the image into 8x8 pixel blocks, converts each block from pixel values to frequency coefficients, and then quantises those coefficients, rounding them to coarser values according to a table that varies with the chosen quality level. The quantisation step is what throws information away and what makes JPEG lossy. It is also the step whose table is written into the file, so every JPEG carries a record of how coarsely it was compressed. Those tables vary between camera manufacturers, software packages such as Photoshop and GIMP, and platform pipelines such as Instagram's.
When a JPEG is edited and re-saved, the image passes through quantisation twice, and the second quantisation acts on values that are already constrained by the first. The resulting coefficient histograms have a different shape than those of a singly-compressed image. Methods published by Farid in 2009 and refined by Bianchi and Piva in 2012 formalised this observation into workable detection algorithms. This topic builds that machinery step by step, and then adds the JPEG ghost method, which localises pasted regions by their inconsistent compression history.
Understanding the compression pipeline is the prerequisite for understanding what breaks when manipulation occurs.
JPEG compression follows a fixed pipeline. First, the image is optionally converted from RGB to YCbCr colour space, separating luminance (Y) from two chrominance channels (Cb, Cr). The human eye is more sensitive to luminance detail than colour detail, so the chrominance channels are often downsampled. Then the luminance and chrominance channels are each divided into non-overlapping 8x8 pixel blocks.
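The colour step can be sketched in a few lines. The coefficients below are the full-range conversion used by JFIF files; the 2x2 averaging is an assumption made for illustration, since encoders are free to use other downsampling filters, and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """JFIF-style full-range conversion for one 8-bit RGB pixel."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(channel):
    """Average each 2x2 neighbourhood (4:2:0-style).

    Assumes a list of rows with even width and height; real encoders
    may use a different filter.
    """
    out = []
    for row in range(0, len(channel), 2):
        out_row = []
        for col in range(0, len(channel[0]), 2):
            total = (channel[row][col] + channel[row][col + 1]
                     + channel[row + 1][col] + channel[row + 1][col + 1])
            out_row.append(total / 4)
        out.append(out_row)
    return out
```

A neutral grey or white pixel maps to Cb = Cr = 128, which is why the chrominance planes of a largely colourless image are nearly flat and tolerate downsampling well.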
The DCT converts each 8x8 block into 64 frequency coefficients. The top-left coefficient (called DC) represents the block's average brightness. The remaining 63 coefficients (called AC) represent progressively finer spatial detail. These coefficients are then divided by the values in the quantisation table and rounded to integers. The table values for high-frequency entries are large, so fine detail is rounded aggressively, which is where the image quality loss occurs. The rounded coefficients are finally passed through entropy coding (Huffman or arithmetic), which is lossless.
When the file is decoded, the process reverses: entropy decoding, inverse quantisation (multiply back by the table values), and inverse DCT. But the original real-valued coefficients are gone; only rounded integers survive. That irreversible rounding is what makes the process lossy and what gives investigators a record of the compression quality.
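The encode and decode steps for a single block can be sketched as follows. This is a deliberately slow, direct implementation of the orthonormal 8x8 DCT-II, which is the transform JPEG uses. It assumes the block has already been level-shifted as a real encoder would do, and it omits entropy coding entirely. The function names are hypothetical.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of 8 rows."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantise(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[math.floor(c / q + 0.5) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantise(levels, table):
    """Decoder side: multiply the stored integers back by the table entries."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]
```

For a flat block of value 10 the DC coefficient is 80 (eight times the mean under this normalisation) and every AC coefficient is effectively zero. A coefficient of 37.4 quantised with a table entry of 16 is stored as 2 and decoded as 32: the fractional part is gone for good, and the decoded value is always a multiple of the table entry.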
Every manufacturer has a slightly different table. That difference is a forensic asset.
The JPEG standard specifies the DCT and the entropy coding but does not mandate particular quantisation tables; it offers example tables only. Manufacturers are free to use their own tables, and they do. Canon's firmware uses different table values from Nikon's; Photoshop's default tables differ from those produced by the image signal processors (ISPs) in smartphones from Apple, Samsung, or Google. The tables are stored in the file's DQT (Define Quantization Table) segments, separate from the EXIF metadata, and can be extracted in seconds with a JPEG structure viewer or any metadata tool that reports them.
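Reading the tables does not require decoding the image. The sketch below walks the marker segments of a JPEG byte stream and collects every table found in a DQT segment (marker 0xFFDB). It is a minimal parser, not a validator: it assumes a well-formed header, stops at the start of scan, and returns values in the order they are stored in the file (zigzag order). The function name is hypothetical.

```python
import struct


def read_quantisation_tables(data):
    """Return {table_id: [64 values in stored (zigzag) order]} from JPEG bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    tables = {}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:            # fill byte before a marker
            pos += 1
            continue
        if marker in (0xD9, 0xDA):    # end of image or start of scan
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xDB:            # DQT: one or more tables
            i = 0
            while i < len(segment):
                precision = segment[i] >> 4      # 0 = 8-bit, 1 = 16-bit
                table_id = segment[i] & 0x0F
                i += 1
                if precision == 0:
                    values = list(segment[i:i + 64])
                    i += 64
                else:
                    values = list(struct.unpack(">64H", segment[i:i + 128]))
                    i += 128
                tables[table_id] = values
        pos += 2 + length
    return tables
```

In use, the bytes would come from `open(path, "rb").read()`, where `path` is whatever file is under examination. Table 0 is conventionally the luminance table and table 1 the chrominance table, though the frame header is what formally assigns tables to components.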
The forensic implication runs in two directions. First, the table narrows the field of possible creation tools. A file claiming to come from a specific camera model should carry that model's characteristic tables. A mismatch between the claimed source and the embedded table is an immediate red flag. Second, if a region of the image was pasted in from a source that used different tables, the pasted region's coefficient distribution will carry the fingerprint of the source table, not the host table. Segmenting the image and comparing per-region table estimates is a form of localisation.
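One simple source check is to ask whether an extracted table matches the scaled example table that libjpeg-based software produces. The sketch below assumes the IJG quality-scaling rule and the example luminance table from Annex K of the JPEG standard, written in row-major order. It says nothing about cameras or editors that ship their own tables, and a table read from a DQT segment would first have to be reordered from zigzag to row-major order. The function names are hypothetical.

```python
# Example luminance table from Annex K of the JPEG standard, row-major order.
ANNEX_K_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def ijg_scaled_table(quality, base=ANNEX_K_LUMA):
    """Scale a base table the way libjpeg does for a 1-100 quality setting."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [max(1, min(255, (value * scale + 50) // 100)) for value in base]


def closest_ijg_quality(table):
    """Return (quality, distance) for the best-matching IJG luminance table."""
    def distance(quality):
        candidate = ijg_scaled_table(quality)
        return sum(abs(a - b) for a, b in zip(table, candidate))

    best = min(range(1, 101), key=distance)
    return best, distance(best)
```

A distance of zero means the table is exactly what an IJG-style encoder would write at that quality. A large distance suggests a custom table, which is itself informative when the file is claimed to come straight from software known to use the IJG tables.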
Save once and the histogram is smooth. Save twice and it develops a periodic pattern.
In a singly-compressed JPEG, the DCT coefficient histograms for any given frequency tend to follow a smooth, roughly Laplacian distribution peaking at zero. When the image is decoded and re-saved at a different quality setting, the second quantisation step divides already-rounded values again. The interaction between two different quantisation steps produces a periodic pattern of peaks and valleys in the coefficient histograms. This structure is absent in genuine single-save images and is the signature Farid (2009) used to build his detector.
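The effect is easy to reproduce on synthetic data. The sketch below draws Laplacian-distributed values to stand in for one DCT frequency, then builds one histogram after a single quantisation and another after two. It works purely in the coefficient domain and ignores the pixel rounding and clipping that happen between real saves. The step sizes, sample count, and function names are illustrative assumptions.

```python
import math
import random
from collections import Counter


def quantise(value, step):
    """Round value / step to the nearest integer bin index."""
    return math.floor(value / step + 0.5)


def simulate_histograms(step1, step2, samples=20000, scale=10.0, seed=0):
    """Histograms of bin indices after one pass (step2) and two (step1, step2)."""
    rng = random.Random(seed)
    single, double = Counter(), Counter()
    for _ in range(samples):
        # Two-sided exponential, i.e. a Laplacian centred on zero.
        coeff = rng.choice((-1, 1)) * rng.expovariate(1 / scale)
        single[quantise(coeff, step2)] += 1
        once = quantise(coeff, step1) * step1      # first save, then decode
        double[quantise(once, step2)] += 1         # second save
    return single, double


single, double = simulate_histograms(step1=5, step2=3)
for bin_index in range(10):
    print(bin_index, single[bin_index], double[bin_index])
```

With a first step of 5 and a second step of 3, the doubly quantised histogram has no entries at all in bins 1, 4, 6, and 9, because no multiple of 5 rounds into them, while the singly quantised histogram decays smoothly through the same bins. Those regularly spaced gaps are the periodic pattern described above.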
Bianchi and Piva (2012) extended this approach with a statistical framework for estimating the first (primary) quality factor when only the final (secondary) quality factor can be read from the file. This is useful because an attacker who saves a manipulated image at a higher quality than the original may hope to wash out evidence of double compression. The Bianchi-Piva estimator can still recover traces of the primary quantisation in many cases.
A complication arises when the primary and secondary quality settings are the same. If an image is decoded and re-saved at the identical quality factor, the periodic histogram structure is largely suppressed because the second quantisation steps are aligned with the first. Same-quality double compression is the hardest case for histogram-based detectors, and practitioners should be aware that a clean histogram does not rule it out.
| Scenario | Histogram shape | Primary quality recoverable? |
|---|---|---|
| Single JPEG at quality Q | Smooth Laplacian | N/A |
| Double JPEG: Q1 then Q2, Q1 < Q2 | Periodic peaks at Q1 step intervals | Usually yes (Bianchi-Piva) |
| Double JPEG: Q1 then Q2, Q1 > Q2 | Periodic structure may be suppressed by coarser Q2 | Partially |
| Double JPEG: Q1 = Q2 | Largely smooth; signal suppressed | Difficult to detect |
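The three double-compression rows of the table can be illustrated by counting how many first-pass quantisation levels land in each second-pass bin. This is an idealised, coefficient-domain sketch: a lower quality corresponds to a larger step, so Q1 < Q2 means the first step is larger than the second. The specific steps and the function name are assumptions chosen for illustration.

```python
def levels_per_bin(step1, step2, n_levels):
    """Count first-pass levels (0, step1, 2*step1, ...) per second-pass bin."""
    counts = {}
    for k in range(n_levels):
        value = k * step1
        # Round value / step2 to the nearest integer using integers only.
        bin_index = (2 * value + step2) // (2 * step2)
        counts[bin_index] = counts.get(bin_index, 0) + 1
    return [counts.get(b, 0) for b in range(max(counts) + 1)]


print(levels_per_bin(5, 3, 6))   # Q1 < Q2: some bins receive nothing
print(levels_per_bin(3, 5, 6))   # Q1 > Q2: bins receive uneven shares
print(levels_per_bin(4, 4, 6))   # Q1 = Q2: exactly one level per bin
```

When the first step is larger, some bins are never reached, which gives the strong periodic gaps. When the first step is smaller, every bin is reached but some collect more levels than others, a weaker periodic modulation. When the steps are equal, each level maps back to its own bin and the histogram is indistinguishable from a single save in this idealised model.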
If a pasted region had a different compression history, it shows its minimum error at a different quality level.
The double-compression histogram methods work at the whole-image level. For localising which region of an image was tampered with, the JPEG ghost method (Farid 2009) is more directly useful. The logic is straightforward. Take the image under examination, re-save it at quality Q, then compute the per-pixel absolute difference between the re-saved version and the image as received. Do this for many quality values, say Q = 50, 55, 60, ..., 95.
A region that was originally captured at quality Q_src will have minimum re-compression error when the re-save quality is close to Q_src. If the whole image was captured at quality 85, the minimum-error map will be near-uniform across the image at Q = 85. But if a region was pasted in from a source captured at quality 70, that region's minimum error will land near Q = 70, producing a visible patch at a different quality level in the error map. That patch is the ghost.
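The reasoning above can be checked on a one-dimensional model before touching real files. The sketch below is not a ghost detector for images: it simulates one DCT frequency in the coefficient domain, with a host region quantised once and a pasted region quantised first at a coarser source step and then at the host step. The steps (4 and 13), the Laplacian scale, and the function names are illustrative assumptions; a larger step corresponds to a lower quality.

```python
import math
import random


def requantise(value, step):
    """Quantise and immediately dequantise: snap value to a multiple of step."""
    return step * math.floor(value / step + 0.5)


def ghost_curve(coeffs, steps):
    """Mean squared change when coeffs are re-quantised at each candidate step."""
    return {
        step: sum((c - requantise(c, step)) ** 2 for c in coeffs) / len(coeffs)
        for step in steps
    }


def simulate_regions(host_step=4, source_step=13, samples=5000,
                     scale=30.0, seed=1):
    """Return (host, pasted) coefficient lists as they sit in the final file."""
    rng = random.Random(seed)
    raw = [rng.choice((-1, 1)) * rng.expovariate(1 / scale)
           for _ in range(2 * samples)]
    host = [requantise(c, host_step) for c in raw[:samples]]
    pasted = [requantise(requantise(c, source_step), host_step)
              for c in raw[samples:]]
    return host, pasted


host, pasted = simulate_regions()
steps = range(2, 21)
host_curve = ghost_curve(host, steps)
pasted_curve = ghost_curve(pasted, steps)
for step in steps:
    print(step, round(host_curve[step], 2), round(pasted_curve[step], 2))
```

Both curves fall to zero at the host step, as expected for the final save. Only the pasted curve shows a second dip at the source step, which is the one-dimensional counterpart of the ghost. On real images the same sweep is done by re-saving the file at each quality and averaging the difference over small windows.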
Limitations are real. JPEG ghost relies on the pasted region having come from a source with a meaningfully different quality setting. If the source quality was close to the host quality, the ghost is weak and may not be distinguishable from compression noise. Progressive JPEG encoding, subsampling differences, and some modern adaptive-quality schemes can also affect the result. As with all single-method results, JPEG ghost findings should be corroborated by at least one other method before being presented as the primary evidence of manipulation.
An 8x8 grid imported from a different image rarely aligns with the grid of the host.
Every JPEG image has a fixed 8x8 block grid anchored to the image's top-left corner. When content is pasted from another image, the pasted region brings its own block grid with it. Unless the attacker is careful to crop and paste at 8-pixel-aligned boundaries, the pasted region's block edges will be offset from the host image's grid. When the composite is saved as JPEG, the encoder tiles the whole image with a new 8x8 grid, and the previously-aligned blocks in the pasted region are now split across new block boundaries. The result is a local increase in blockiness at a phase that differs from the surrounding image.
Grid-alignment analysis exploits this. By computing a blockiness measure at different spatial phases across the image (i.e., at different offsets of the 8x8 tiling), investigators can identify regions where the dominant phase shifts. A step-change in dominant block phase at a sharp boundary is consistent with content from a different source. This method works even when the pasted region and the host share similar quality settings that defeat JPEG ghost analysis.
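A toy version of the phase measurement is sketched below. It builds synthetic images out of flat 8x8 tiles so that the only intensity jumps are at block boundaries, pastes a strip from a donor whose grid is shifted by 3 pixels, and reports which column phase carries the largest mean jump in each region. Real images need a more robust blockiness measure, a vertical pass as well, and would be recompressed after pasting. All names and sizes here are hypothetical.

```python
import random

BLOCK = 8


def blocky_image(width, height, x_offset, seed):
    """Synthetic image of flat 8x8 tiles whose grid starts at column x_offset."""
    rng = random.Random(seed)
    tiles = {}
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            key = ((x - x_offset) // BLOCK, y // BLOCK)
            if key not in tiles:
                tiles[key] = rng.randrange(256)
            row.append(tiles[key])
        image.append(row)
    return image


def paste_columns(host, donor, x_start, x_stop):
    """Replace columns [x_start, x_stop) of host with the same columns of donor."""
    return [h[:x_start] + d[x_start:x_stop] + h[x_stop:]
            for h, d in zip(host, donor)]


def dominant_column_phase(image, x_start, x_stop):
    """Phase (0-7) with the largest mean horizontal jump in [x_start, x_stop)."""
    totals = [0.0] * BLOCK
    counts = [0] * BLOCK
    for row in image:
        for x in range(x_start + 1, x_stop):
            totals[x % BLOCK] += abs(row[x] - row[x - 1])
            counts[x % BLOCK] += 1
    means = [t / c if c else 0.0 for t, c in zip(totals, counts)]
    return max(range(BLOCK), key=means.__getitem__)


host = blocky_image(64, 32, x_offset=0, seed=1)
donor = blocky_image(64, 32, x_offset=3, seed=2)
composite = paste_columns(host, donor, 24, 56)
print(dominant_column_phase(composite, 0, 24))    # host region
print(dominant_column_phase(composite, 24, 56))   # pasted strip
```

The host region reports phase 0 and the pasted strip reports phase 3, the donor's grid offset. Sliding the measurement window across an image and mapping where the dominant phase changes is the localisation step.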