Digital Image Fundamentals: Pixels, Sensors, and Formats

Every digital image is a grid of numbers shaped by the sensor that captured it, the demosaicing algorithm that filled in the gaps, and the format that compressed and stored the result. Those choices leave forensically exploitable signatures in every file.

Last updated: 19 Jun 2026

A digital image is a rectangular grid of integers. A semiconductor sensor converts incident light into charge, and a Bayer colour filter array over the sensor ensures that each photosite records only one colour channel. Demosaicing software estimates the missing two channels at every site, and the resulting data is then encoded in a format such as RAW, JPEG, or PNG, each of which preserves or discards different parts of the original signal; a lossy codec such as JPEG deliberately discards information deemed imperceptible to the human eye. Each of these pipeline stages, from sensor physics through compression, leaves a distinct statistical trace in the file. Those traces are the raw material of digital image forensics, which depends on recognising them and detecting where they have been disrupted.

For an analyst, a manipulated image is not just one that looks suspicious. It is one whose internal statistics break the pattern that honest capture would have produced. A patch pasted from another photograph disrupts the local noise texture. Re-encoding at a different quality setting adds a second layer of JPEG artefacts on top of the first. Geometric resampling changes the correlation structure between neighbouring pixels. Every operation that touched the file left a mark, and understanding the normal marks of the capture pipeline is what makes the abnormal ones visible.

This topic builds the foundation for all of that. It covers how sensor physics and the Bayer mosaic create the raw data, how demosaicing reconstructs a full-colour image from it, how RAW, JPEG, PNG, and TIFF differ in what they preserve and what they discard, and how bit depth, colour spaces, and the sensor noise model feed into forensic methods. The analytical techniques downstream depend on a firm grasp of this pipeline.

By the end of this topic you will be able to:

Describe how the Bayer CFA and demosaicing together produce the first camera-specific forensic signature in a captured image.
Compare RAW, JPEG, PNG, and TIFF formats in terms of what each preserves and discards for forensic purposes.
Explain the four sensor noise components and identify which one forms the basis of PRNU camera fingerprinting.
Distinguish pixel dimensions from PPI metadata and explain how upsampling artefacts are detectable in scaled images.
Interpret a convergent multi-signal finding (PRNU absence, CFA disruption, double-compression) in a manipulation case.

Key terms

Bayer CFA: A colour filter array arranged in a repeating 2×2 mosaic of red, green (×2), and blue filters over the sensor. Each photosite records only one colour; demosaicing estimates the other two, producing the first forensic signature of the camera.
Demosaicing: The interpolation step that reconstructs all three colour channels at every pixel from the single-channel Bayer mosaic. The algorithm used is camera-specific and leaves a spatial correlation pattern that can be used to detect local regions of tampering.
Fixed-pattern noise (FPN): Spatially consistent noise arising from photosite manufacturing variations. FPN is unique to each sensor and stable across exposures; its multiplicative component, photo response non-uniformity (PRNU), is the basis for camera fingerprinting.
RAW format: A proprietary or standardised container (DNG, CR2, NEF, ARW) that stores the unprocessed or minimally processed photosite values from the sensor. RAW preserves the most forensic information of any capture format.
JPEG compression: A lossy codec that divides an image into 8×8 pixel blocks, applies the discrete cosine transform, quantises the coefficients, and entropy-codes the result. Every save round at a given quality setting introduces a new quantisation layer whose artefacts are detectable.
Colour space: A defined mapping between numerical RGB values and real-world colours. sRGB is the web default with a smaller gamut; Adobe RGB covers a wider gamut; ProPhoto RGB is used in high-end workflows. Conversion between spaces alters pixel values and can mask noise patterns.

How a sensor turns light into numbers

A digital camera sensor is a rectangular grid of photosites, each a small photodiode that accumulates charge in proportion to the light it receives during an exposure. When the shutter closes, the camera reads out those charge values and converts them to digital integers. The resulting grid, one integer per photosite, is the rawest form of the image data.

Sensors are inherently monochrome. To record colour, most consumer and professional cameras place a Bayer Colour Filter Array (CFA) over the sensor. The most common pattern is a 2×2 tile with one red, one blue, and two green filters. Green is doubled because the human visual system is most sensitive to green wavelengths, and the extra green sample reduces noise in the luminance channel.

Bayer CFA 2×2 tile: R, G, G, B arrangement.

The Bayer CFA means each photosite records only one of the three colour channels. Demosaicing is the step that estimates the missing two channels at every site using neighbouring values. The quality and specific algorithm of demosaicing vary between camera manufacturers. Canon, Nikon, and Sony each ship different interpolation kernels, and those kernels leave characteristic spatial correlations in the pixel array that have been used since the mid-2000s as camera-identification signals.
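To make the demosaicing correlation concrete, the following is a minimal sketch rather than any manufacturer's algorithm. It assumes an RGGB layout and plain bilinear interpolation of the green channel only; the function names are illustrative. It shows that every interpolated green value is exactly the mean of its four measured neighbours, while measured green sites are not predicted by their neighbours in the same way.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample an H x W x 3 image through an RGGB filter: one value per photosite."""
    mosaic = np.zeros(rgb.shape[:2], dtype=float)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites on red rows
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites on blue rows
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return mosaic

def neighbour_mean(plane):
    """Mean of the four horizontal and vertical neighbours of every site."""
    p = np.pad(plane, 1, mode="reflect")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0

def demosaic_green(mosaic):
    """Bilinear green channel: keep measured greens, interpolate the rest."""
    h, w = mosaic.shape
    yy, xx = np.mgrid[0:h, 0:w]
    missing = (yy + xx) % 2 == 0  # red and blue sites in an RGGB layout
    green = mosaic.copy()
    green[missing] = neighbour_mean(mosaic)[missing]
    return green, missing

rng = np.random.default_rng(0)
rgb = rng.uniform(0, 255, size=(8, 8, 3))  # synthetic test image, not real data
mosaic = bayer_mosaic(rgb)
green, missing = demosaic_green(mosaic)

# Prediction error of each green value from its four neighbours.
residual = np.abs(green - neighbour_mean(green))
```

In this toy case `residual` is zero at every interpolated site and non-zero at measured sites, which is the 2×2 periodic pattern a CFA detector looks for. Real cameras use more elaborate, edge-aware kernels, so the correlation is weaker but still periodic.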

RAW, JPEG, PNG, and TIFF: what each format preserves

Not every image file carries the same forensic richness. The format a camera writes, or that a user later converts to, determines how much of the original sensor signal is available to an examiner. Each format makes a different set of trade-offs between file size, image quality, and information preservation.

Format	Compression	Bit depth (typical)	Forensic note
RAW (DNG, CR2, NEF)	Lossless or none	12–16 bits per channel	Maximum noise, demosaic, and metadata information retained
TIFF	Lossless (LZW/ZIP) or none	8–16 bits per channel	No lossy degradation; large files; metadata often present
PNG	Lossless (DEFLATE)	8 or 16 bits per channel	No compression artefacts; common in screenshots and edited images
JPEG	Lossy (DCT)	8 bits per channel	Block artefacts, chroma subsampling; multiple saves compound the loss

JPEG is the dominant format in investigations because it is the default for most phone cameras and social media platforms. Each time a JPEG is opened and saved at a given quality level, the DCT quantisation step reintroduces quantisation error on top of whatever was already there. This double-compression signature is one of the oldest and most reliable signals in JPEG forensics.
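The double-quantisation effect can be illustrated with a toy model of a single DCT frequency. This sketch assumes synthetic Gaussian coefficients and arbitrary quantisation steps of 5 (first save) and 3 (second save); it does not decode a real JPEG, and the helper names are hypothetical.

```python
import math
import random
from collections import Counter

def quantise(value, step):
    """Round a coefficient to the nearest multiple of step and return the bin index."""
    return math.floor(value / step + 0.5)

def double_quantised_bins(coeffs, first_step, second_step):
    # First save: quantise, then dequantise back to coefficient values.
    once = [quantise(c, first_step) * first_step for c in coeffs]
    # Second save: quantise the already-quantised values with a new step.
    return [quantise(c, second_step) for c in once]

rng = random.Random(0)
coeffs = [rng.gauss(0, 20) for _ in range(20000)]  # synthetic coefficients

single = Counter(quantise(c, 3) for c in coeffs)
double = Counter(double_quantised_bins(coeffs, 5, 3))

# Bins that a single compression fills but double compression leaves empty.
empty_bins = [b for b in range(-10, 11) if double[b] == 0]
print(empty_bins)  # [-9, -6, -4, -1, 1, 4, 6, 9]
```

The singly compressed histogram fills every bin, whereas the doubly compressed one has periodically empty bins. That periodicity is the kind of pattern a double-compression detector searches for in each frequency's histogram.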

PNG is the standard for screenshots, graphics, and images that have been edited and explicitly saved without lossy compression. Because PNG is lossless, the pixel values are exact. This means noise analysis and clone-detection methods retain their full power. The absence of JPEG block artefacts in a PNG claimed to be a camera-original should itself prompt scrutiny: most cameras do not output PNG natively.

Bit depth, colour spaces, and gamma encoding

Bit depth sets how many discrete tonal steps each colour channel can represent. An 8-bit channel has 256 steps; a 12-bit RAW channel has 4,096; a 16-bit channel has 65,536. This matters forensically because editing operations that spread or compress the tonal range leave characteristic gaps or spikes in the histogram.
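A short sketch shows why a tonal stretch leaves gaps in an 8-bit histogram. It assumes a simple linear levels adjustment that maps the input range 50 to 200 onto 0 to 255; the function name and the range are illustrative.

```python
def stretch_levels(values, low, high, max_code=255):
    """Linearly map [low, high] onto [0, max_code] and round to integer codes."""
    stretched = []
    for v in values:
        clipped = min(max(v, low), high)
        stretched.append(int((clipped - low) * max_code / (high - low) + 0.5))
    return stretched

original = list(range(50, 201))           # 151 input levels, all occupied
stretched = stretch_levels(original, 50, 200)

occupied = set(stretched)
gaps = [code for code in range(256) if code not in occupied]
print(len(occupied), len(gaps))           # 151 105
```

Only 151 distinct input levels exist, so they cannot fill 256 output codes: 105 codes stay empty at regular intervals. A compression of the tonal range does the opposite and merges several input levels into one output code, producing spikes.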

Colour spaces define how numerical RGB triples map to actual colours. sRGB, the default for the web and most consumer cameras, covers roughly 35% of all visible colours. Adobe RGB covers a wider gamut and is common in professional photography. ProPhoto RGB covers nearly all of the Pointer gamut. Converting between these spaces is not lossless: out-of-gamut values are clipped, and in-gamut values are numerically remapped, altering the statistical fingerprints that camera-identification methods rely on.

Gamma encoding is a non-linear mapping applied before storage. Human vision is more sensitive to differences in dark tones than in bright ones, so images are stored with a curve (gamma ≈ 2.2 for sRGB) that allocates more code values to shadows and fewer to highlights. Raw sensor output is linear, so an analyst who applies a sensor noise model to gamma-encoded pixel values must first undo this encoding; otherwise the curve stretches shadow noise and compresses highlight noise, and the estimated noise-versus-brightness relationship will be distorted.
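As a sketch of what that encoding looks like in practice, the functions below implement the standard piecewise sRGB transfer curve on values normalised to the range 0 to 1. The function names are illustrative, and other colour spaces use different curves.

```python
def srgb_encode(linear):
    """Linear light (0..1) to sRGB-encoded value (0..1)."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055

def srgb_decode(encoded):
    """sRGB-encoded value (0..1) back to linear light (0..1)."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4

# 8-bit code assigned to 10% linear intensity.
code_at_ten_percent = round(srgb_encode(0.1) * 255)

# Linearise an 8-bit pixel before applying a linear-domain noise model.
linear_mid_grey = srgb_decode(128 / 255)
```

Roughly a third of the 8-bit codes are spent on the darkest tenth of linear light, which is why noise measured on encoded values looks inflated in shadows relative to a linear sensor model.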

Sensor noise model and PRNU

A sensor that perfectly measured every photon it received would produce identical output from identical scenes. Real sensors do not. Several independent noise processes add uncertainty to each photosite reading, and understanding them is the basis for both noise-consistency analysis and camera fingerprinting.

Photon shot noise: random arrival of photons follows Poisson statistics. Variance equals the mean signal level, so bright regions are noisier in absolute terms but less noisy relative to their brightness.
Read noise: electronic noise introduced during charge readout and analogue-to-digital conversion. Roughly Gaussian and signal-independent. Dominant in dark regions, where shot noise is small.
Dark current noise: charge that accumulates even without light, driven by thermal electrons. Grows with exposure time and sensor temperature. Subtracted by dark-frame calibration in astrophotography, but rarely corrected in forensic materials.
Fixed-pattern noise (FPN) / PRNU: deterministic per-photosite offsets caused by manufacturing variation. Photo response non-uniformity is the multiplicative component of FPN. It is stable, unique to each sensor, and the basis for camera identification.

Sensor noise model showing four independent noise components.
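The way these components combine can be checked with a small simulation. This is a sketch under stated assumptions: signal is expressed in electrons, dark current is omitted, PRNU is modelled as a 1% Gaussian gain variation, and all parameter values are illustrative rather than taken from a real sensor.

```python
import numpy as np

def simulate_patch(mean_e, read_sigma_e, prnu_sigma, shape, rng):
    """Uniformly lit patch: PRNU gain, Poisson shot noise, Gaussian read noise."""
    prnu = rng.normal(0.0, prnu_sigma, size=shape)   # fixed per-photosite gain error
    expected = mean_e * (1.0 + prnu)                 # PRNU scales with the signal
    shot = rng.poisson(expected)                     # photon shot noise
    read = rng.normal(0.0, read_sigma_e, size=shape) # signal-independent read noise
    return shot + read

def predicted_variance(mean_e, read_sigma_e, prnu_sigma):
    """Spatial variance expected across the patch for independent components."""
    return mean_e + read_sigma_e ** 2 + (mean_e * prnu_sigma) ** 2

rng = np.random.default_rng(1)
results = {}
for mean_e in (100, 1000, 10000):
    patch = simulate_patch(mean_e, 3.0, 0.01, (200, 200), rng)
    results[mean_e] = (float(patch.var()), predicted_variance(mean_e, 3.0, 0.01))
```

At low signal the read-noise term matters, in the mid-range shot noise dominates, and at high signal the PRNU term, which grows with the square of the signal, becomes comparable to shot noise. That signal dependence is why PRNU is easiest to measure in bright, unsaturated regions.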

Camera identification via PRNU, developed by Jan Lukáš, Jessica Fridrich, and Miroslav Goljan at Binghamton University in 2006, works by extracting the high-frequency noise residual from an image after suppressing the scene content with a denoising filter. Averaging this residual across many images amplifies the stable PRNU component and cancels random noise. The resulting fingerprint can be correlated against the residual from a questioned image. The method has been validated in court proceedings in multiple jurisdictions and is now routinely applied to images from phone cameras recovered in criminal investigations.
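The extract, average, and correlate workflow can be sketched on synthetic data. Note the substitutions: this example uses a 3×3 box blur in place of the wavelet denoiser used in published PRNU work, flat synthetic scenes instead of photographs, and plain averaging of residuals. The camera models and all names are hypothetical.

```python
import numpy as np

def box_blur(img):
    """3x3 mean filter, a crude stand-in for a proper denoising filter."""
    h, w = img.shape
    p = np.pad(img, 1, mode="reflect")
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)) / 9.0

def noise_residual(img):
    """High-frequency residual left after suppressing scene content."""
    return img - box_blur(img)

def estimate_fingerprint(images):
    """Average residuals so random noise cancels and the stable pattern remains."""
    return np.mean([noise_residual(img) for img in images], axis=0)

def ncc(a, b):
    """Normalised cross-correlation between two arrays of equal shape."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def shoot(prnu, rng):
    """Synthetic flat-field exposure from a camera with the given PRNU map."""
    brightness = rng.uniform(100, 200)
    return brightness * (1.0 + prnu) + rng.normal(0.0, 2.0, size=prnu.shape)

rng = np.random.default_rng(2)
prnu_a = rng.normal(0.0, 0.02, size=(64, 64))  # hypothetical camera A
prnu_b = rng.normal(0.0, 0.02, size=(64, 64))  # hypothetical camera B

fingerprint = estimate_fingerprint([shoot(prnu_a, rng) for _ in range(30)])
same = ncc(fingerprint, noise_residual(shoot(prnu_a, rng)))
other = ncc(fingerprint, noise_residual(shoot(prnu_b, rng)))
```

Here `same` comes out strongly positive and `other` close to zero. On real photographs the correlations are far smaller because scene detail leaks into the residual, so practical systems rely on better denoisers and statistical detection thresholds rather than raw correlation values.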

Resolution, PPI, and print dimensions

Resolution has two distinct meanings that are often conflated. Pixel dimensions (e.g., 4000×3000 pixels) describe the actual data in the file. PPI (pixels per inch) is a metadata tag that tells printing software at what density to render those pixels on paper. PPI has no effect on the pixel data: the same 4000×3000 file looks identical on screen whether its embedded PPI is 72 or 300. Only the printed output size changes.
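The relationship is simple arithmetic, sketched here with an illustrative helper: print size in inches is pixel dimensions divided by PPI.

```python
def print_size_inches(width_px, height_px, ppi):
    """Physical print size implied by a PPI tag; the pixel data is unchanged."""
    return width_px / ppi, height_px / ppi

at_300 = print_size_inches(4000, 3000, 300)  # about 13.3 x 10 inches
at_72 = print_size_inches(4000, 3000, 72)    # about 55.6 x 41.7 inches
```

Both calls describe the same 12-megapixel data; only the intended print size differs.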

This matters for document fraud. A forger who scales a small image up to produce a larger one introduces resampling artefacts: the interpolation algorithm creates smooth gradients where the original had sharp transitions, and periodic patterns appear in the Fourier spectrum of the image. A document examiner comparing pixel-level sharpness across different regions of a scanned form can use these artefacts to detect that one region was captured at a different resolution and then resampled to match.
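The periodic trace left by interpolation can be shown in one dimension. This sketch assumes 2× linear upsampling and a fixed neighbour-average predictor; published resampling detectors estimate the predictor from the data, so this is a simplified stand-in with illustrative names.

```python
import numpy as np

def upsample_2x_linear(signal):
    """Insert the average of each neighbouring pair between original samples."""
    out = np.empty(2 * len(signal) - 1)
    out[0::2] = signal
    out[1::2] = 0.5 * (signal[:-1] + signal[1:])
    return out

def predictor_residual(signal):
    """Absolute error when each sample is predicted from its two neighbours."""
    return np.abs(signal[1:-1] - 0.5 * (signal[:-2] + signal[2:]))

def peak_frequency(residual):
    """Frequency (cycles per sample) of the strongest spectral component."""
    residual = residual[: len(residual) // 2 * 2]  # even length keeps a Nyquist bin
    spectrum = np.abs(np.fft.rfft(residual - residual.mean()))
    freqs = np.fft.rfftfreq(len(residual))
    return float(freqs[np.argmax(spectrum)])

rng = np.random.default_rng(4)
original = rng.normal(0.0, 1.0, size=256)  # synthetic row of pixel values
upsampled = upsample_2x_linear(original)

residual = predictor_residual(upsampled)
peak = peak_frequency(residual)
```

Every second entry of `residual` is zero because interpolated samples are perfectly predicted by their neighbours, so the spectrum of the residual peaks at 0.5 cycles per sample. Other scaling factors and rotations shift the peak to other frequencies, which is how the resampling rate can sometimes be estimated.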

Forensically exploitable signatures: summary

The imaging pipeline is a chain of decisions: sensor design, Bayer pattern, demosaicing algorithm, in-camera processing, colour-space assignment, format, and compression. Each link in that chain writes something into the file. Forgeries that alter the pixel content break one or more of those consistent writes. The analyst's job is to know what each link normally writes and to recognise the breakage.

CFA / demosaicing pattern: local disruption of the demosaicing correlation indicates pixel-level manipulation. Detected by measuring the periodicity of colour-channel correlations on a 2×2 grid.
PRNU / sensor fingerprint: local absence of the camera's fixed-pattern noise indicates a region that did not originate on that sensor. Works even after mild JPEG compression.
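JPEG block artefacts and double-compression: in a singly compressed image, each DCT coefficient histogram has a smooth envelope over multiples of a single quantisation step. Re-encoding with a different quantisation table superimposes a second step, producing periodic peaks and gaps in those histograms.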
Noise inconsistency: genuine images have a noise level that follows a predictable function of brightness (Poisson + Gaussian). Regions with anomalously low or high noise relative to that function are candidates for spliced or generated content.
Resampling and interpolation artefacts: scaling or rotating a region introduces periodic correlations in the pixel grid that do not appear in unmodified images from that sensor.
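The noise-inconsistency check in the list above can be sketched as a line fit of block variance against block mean. The assumptions are loud ones: blocks are synthetic and uniformly lit, the noise follows the Poisson plus Gaussian model, an ordinary least-squares fit is used even though it is not robust to many outliers, and the 50% tolerance is arbitrary.

```python
import numpy as np

def block_stats(blocks):
    """Mean and variance of each image block."""
    means = np.array([b.mean() for b in blocks])
    variances = np.array([b.var() for b in blocks])
    return means, variances

def flag_inconsistent(means, variances, tolerance=0.5):
    """Indices of blocks whose variance deviates from the fitted noise line."""
    slope, intercept = np.polyfit(means, variances, 1)  # variance ~ slope*mean + intercept
    predicted = slope * means + intercept
    deviation = np.abs(variances - predicted) / predicted
    return [int(i) for i in np.flatnonzero(deviation > tolerance)]

rng = np.random.default_rng(3)
levels = np.linspace(200, 2000, 40)
blocks = [rng.poisson(m, size=(32, 32)) + rng.normal(0, 5, size=(32, 32))
          for m in levels]

# Hypothetical spliced block: correct brightness, but far too little noise.
blocks[20] = levels[20] + rng.normal(0, 5, size=(32, 32))

means, variances = block_stats(blocks)
flagged = flag_inconsistent(means, variances)
```

Only the substituted block is flagged in this toy setup. On real images the blocks contain texture, so variance must be measured on a noise residual rather than on raw pixel values.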

Worked example

Detecting a spliced region in a document photograph

A property deed photograph in a fraud case: the Bayer inconsistency gives it away.

Investigators receive a JPEG photograph of a property deed, allegedly taken with a Samsung Galaxy phone. The accused claims the document proves ownership. The photograph is sharp and visually convincing. The task is to determine whether it is authentic.

EXIF check. Metadata claims Samsung SM-G991B (Galaxy S21), focal length 4.3 mm, exposure 1/60 s, ISO 250. Nothing obviously wrong, but metadata is trivially editable.
PRNU extraction. The analyst obtains five additional photographs confirmed to have been taken on the same phone. PRNU fingerprint is estimated. The questioned photograph's noise residual is correlated with this fingerprint. The background text region shows the expected correlation; a rectangular patch containing the owner's name shows a correlation near zero.
CFA analysis. The CFA periodicity map, computed by measuring the strength of the expected demosaicing correlation on a sliding window, shows normal values everywhere except in the same rectangular patch. The patch has lower periodicity, consistent with a region that was upsampled from a lower-resolution source or generated independently.
Double-compression test. The DCT coefficient histogram of the patch shows a different quantisation step from the surrounding document. The background was compressed at one JPEG quality setting; the patch was compressed at a higher quality and then the whole image was saved again at a lower quality. The double-compression signature is clearest in the low-frequency DCT coefficients, because most high-frequency coefficients are quantised to zero.
Conclusion. Three independent signals (PRNU absence, CFA disruption, double compression) converge on the same region. The analyst reports that this region shows characteristics inconsistent with the claimed single-capture origin, and that the convergent evidence supports the hypothesis of local digital manipulation.

No single test is conclusive. A region with unusual noise could have been photographed through a dirty lens. A double-compression signature could arise from an innocent edit-and-resave workflow. The convergence of three independent methods on the same spatial region, combined with the semantic significance of the altered content (the owner's name), is what makes the case reportable.
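The convergence step can be sketched as an intersection of per-block anomaly maps. The three boolean maps below are hypothetical outputs of the PRNU, CFA, and double-compression tests on a 6×8 grid of blocks; in practice each would come from thresholding a detector's score map.

```python
import numpy as np

def convergent_region(*anomaly_maps):
    """Blocks flagged by every test."""
    return np.logical_and.reduce(anomaly_maps)

def bounding_box(mask):
    """(top, left, bottom, right) of the flagged blocks, or None if empty."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

shape = (6, 8)
prnu_absent = np.zeros(shape, dtype=bool)
cfa_disrupted = np.zeros(shape, dtype=bool)
double_compressed = np.zeros(shape, dtype=bool)

# All three tests flag the same patch (the owner's name in the example).
for anomaly_map in (prnu_absent, cfa_disrupted, double_compressed):
    anomaly_map[2:4, 3:6] = True

# Each test also has one unrelated false positive.
prnu_absent[0, 0] = True
cfa_disrupted[5, 7] = True
double_compressed[0, 7] = True

agreed = convergent_region(prnu_absent, cfa_disrupted, double_compressed)
box = bounding_box(agreed)
```

The isolated false positives drop out because no other test supports them, and only the shared patch survives. A strict intersection is the most conservative rule; a real report would still describe each signal separately and state its limitations.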

Check your understanding

Why does the Bayer CFA use two green photosites in each 2×2 tile instead of one?

Key Takeaways

The Bayer CFA means every photosite records only one colour channel; demosaicing fills in the missing two and leaves a spatial correlation pattern that is camera-specific and disrupted by pixel-level manipulation.
RAW files preserve the most forensic information; JPEG introduces lossy block artefacts whose double-compression signature exposes re-encoding; PNG is lossless and useful for noise analysis but not a native camera format.
Sensor noise has four components (shot, read, dark current, and fixed-pattern), and the stable fixed-pattern component, PRNU, is the basis for camera fingerprinting that can associate questioned images with specific devices.
Bit depth, colour spaces, and gamma encoding all alter the numerical representation of pixel values; colour-space conversion can clip out-of-gamut values and alter noise statistics relied upon by camera-identification methods.
Forensic signatures including PRNU absence, CFA disruption, double-compression patterns, noise inconsistency, and resampling artefacts each probe a different stage of the imaging pipeline; convergence of multiple signals strengthens a finding.

What is a Bayer CFA and why does it matter for image forensics?

A Bayer Colour Filter Array is a mosaic of red, green, and blue filters placed over a sensor so each photosite records only one colour channel. Demosaicing software estimates the missing two channels at each site. The interpolation pattern it leaves is camera-specific, and its disruption is a reliable indicator of local tampering.

Why do forensic examiners prefer RAW files over JPEGs?

RAW files store the sensor's photosite values with minimal in-camera processing. JPEG applies lossy compression and discards fine detail. RAW preserves the full noise signature, the original white-balance metadata, and the CFA mosaic itself, whereas JPEG masks or destroys noise and demosaicing traces through successive compression rounds.

What is fixed-pattern noise and how is it used to identify cameras?

Fixed-pattern noise (FPN) arises from tiny manufacturing variations that make each photosite respond slightly differently to identical light. It is stable across images from the same sensor, so averaging the noise residuals of many images, ideally flat-field exposures, reveals the camera's unique fingerprint. That fingerprint can be matched to questioned images even after mild JPEG compression; resampled or cropped images must first be re-aligned with the fingerprint's pixel grid.

How does bit depth affect the forensic analysis of an image?

Bit depth sets the number of discrete tonal levels per channel. An 8-bit JPEG has 256 levels per channel; a 16-bit RAW has 65,536. Low bit depth means fine tonal gradients are quantised into visible steps. Histogram gaps produced by levels or curves adjustments are more easily detected in high-bit-depth files because the original distribution is more continuous.

Can colour-space conversion destroy forensic evidence in an image?

Yes. Converting from a wide-gamut space such as Adobe RGB to sRGB clips out-of-gamut values and alters pixel values globally. The conversion can mask noise patterns, shift colour ratios used in source-camera analysis, and produce new compression artefacts if the file is re-saved as JPEG afterward. Examiners prefer working with the camera's native colour space whenever possible.
