AI-Generated Image Detection and Provenance

How forensic scientists and provenance systems detect whether an image was captured by a camera or generated by an AI, covering GAN fingerprints, diffusion model spectral signatures, C2PA cryptographic manifests, and the limitations of binary classifiers on unseen generators.

Last updated: 19 Jun 2026

AI-generated image detection uses two complementary approaches: passive forensic analysis of artefact signatures left by generative models (GAN periodic fingerprints, diffusion model spectral anomalies, absence of camera-native sensor noise) and active provenance infrastructure (C2PA cryptographic manifests, invisible watermarking) that authenticates creation history rather than detecting synthesis. No single method is sufficient: binary classifiers fail on generator families not seen during training, provenance chains require cooperative adoption, and watermarks are vulnerable to adversarial removal. In court or investigative settings, findings from multiple methods must be combined and each method's validated scope must be stated explicitly.

Every photograph ever taken carries invisible marks of how it was made: the noise pattern of the sensor, the optical imperfections of the lens, the compression fingerprint of the in-camera processor. Every image generated by an AI carries a different set of marks: the periodic patterns of transposed-convolution upsampling, the spectral distribution of a diffusion model's denoising residual, the absence of any camera-native statistics at all. The forensic task is to read those marks and reach a reliable conclusion about the image's origin.

This matters at two levels. At the individual-image level, a court or investigative body may need to know whether a specific image showing an alleged event was taken by a camera or generated by a model. At the systemic level, the ability to make any photographic truth claim is eroding as synthetic images become indistinguishable to human viewers. The provenance infrastructure response to that erosion, led by standards like C2PA, attempts to solve the problem not by detecting synthesis artefacts but by establishing a verifiable chain of custody for the image from capture to publication.

This topic covers the forensic detection methods for AI-generated images, from the GAN fingerprint as a device-fingerprint analogue to the spectral signatures specific to diffusion models, and then the provenance infrastructure layer: C2PA and invisible watermarking proposals. It also sets out the limitations: binary classifiers break on out-of-distribution generators, provenance chains only work if cooperating parties adopt them, and watermarks can be removed. The sections that follow give a realistic assessment of what the current toolkit can and cannot support in court or investigative settings.

By the end of this topic you will be able to:

Explain the GAN fingerprint analogue to PRNU and describe the extraction and correlation workflow used for model attribution.
Identify the spectral and noise properties that distinguish diffusion model output from genuine photographs and explain why GAN-trained detectors fail on diffusion images.
Describe how a C2PA provenance manifest is constructed, what a verifier can and cannot confirm, and what constitutes a circumvention attack.
Explain out-of-distribution (OOD) failure in binary AI-vs-real classifiers and evaluate the mitigations proposed in the published literature.
Apply a multi-method assessment workflow to an unauthenticated image, combining metadata inspection, frequency-domain analysis, noise residual analysis, and OSINT corroboration.

Key terms

GAN fingerprint: A model-specific periodic pattern embedded in GAN-generated images, detectable by averaging the high-pass residuals of many outputs so that image content cancels and the shared pattern remains. Analogous to photo-response non-uniformity (PRNU) in cameras. First documented systematically by Marra et al. (2019).
PRNU: Photo-Response Non-Uniformity. The per-pixel variation in light sensitivity across a camera sensor, which produces a consistent, device-specific noise pattern in every photograph it takes. Used in traditional camera forensics to attribute a photograph to a specific device.
C2PA: Coalition for Content Provenance and Authenticity. An open technical standard for attaching cryptographically signed provenance manifests to media files, recording the creation and editing chain from capture device through publishing. Developed by Adobe, Microsoft, BBC, and others.
Content Credentials: The consumer-facing name for C2PA-compliant provenance manifests, promoted most visibly by Adobe. A Content Credentials badge on an image indicates that a signed, auditable creation history is attached. The manifest can include the generation tool, the human editor, and the date and location where the creator has consented to share them.
Invisible watermark: A signal embedded in an image or audio file that is imperceptible to a human viewer but detectable by an algorithm. In AI-generated content, proposals exist to embed watermarks during model inference to mark all output as synthetic.
Out-of-distribution (OOD) failure: The drop in a classifier's accuracy when test samples come from a different statistical distribution than the training data. AI-generated image detectors suffer OOD failure when a novel generator family not seen during training is encountered.

The GAN fingerprint as a model signature

In traditional digital camera forensics, PRNU is a reliable way to link a photograph to a specific device. The per-pixel sensitivity variation of a camera sensor is stable across photographs and can be estimated by averaging the denoising residuals of many images from the same device, which suppresses scene content and leaves the sensor pattern. Marra et al. (2019) demonstrated that GAN generators have an analogue: each trained model's weights imprint a periodic fingerprint on every image it generates. The fingerprint is model-specific, not architecture-specific. Two ProGAN models trained with different random seeds produce detectable, distinguishable fingerprints.

The practical forensic workflow mirrors PRNU attribution. A reference fingerprint is extracted from a set of known images from the model under investigation, typically by averaging the high-pass residuals of hundreds of generated samples. The fingerprint is then correlated with the residual of the query image. High correlation supports attribution to that model; low correlation weighs against it, although heavy post-processing such as resizing or recompression can also suppress the correlation. This approach can attribute a specific generated image to a specific model instance, which is potentially useful for source tracing in disinformation investigations.

GAN fingerprint extraction and attribution workflow.
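The extraction-and-correlation workflow described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the procedure from Marra et al. (2019): the box-blur high-pass filter stands in for a proper denoiser, the two "models" are simply noise plus a fixed periodic pattern, and all function names are hypothetical.

```python
import numpy as np

def high_pass_residual(img, k=3):
    """Image minus a k x k box-blurred copy (a crude stand-in for a denoiser)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    blurred = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            blurred += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return img - blurred / (k * k)

def estimate_fingerprint(images):
    """Average the residuals of many outputs so content cancels out."""
    return np.mean([high_pass_residual(im) for im in images], axis=0)

def ncc(a, b):
    """Normalised cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Synthetic demonstration only: each "model" adds its own periodic pattern.
rng = np.random.default_rng(0)
yy, xx = np.indices((64, 64))
pattern_a = 0.1 * (-1.0) ** (yy + xx)  # checkerboard, hypothetical model A
pattern_b = 0.1 * (-1.0) ** xx         # vertical stripes, hypothetical model B

def fake_output(pattern):
    """Random 'content' plus the model's fixed pattern."""
    return rng.normal(0.0, 0.2, pattern.shape) + pattern

reference = estimate_fingerprint([fake_output(pattern_a) for _ in range(200)])
score_same = ncc(reference, high_pass_residual(fake_output(pattern_a)))
score_other = ncc(reference, high_pass_residual(fake_output(pattern_b)))
print(f"same model: {score_same:.2f}, other model: {score_other:.2f}")
```

In this toy setting the query from the same model correlates clearly with the reference while the query from the other model stays near zero. Real images need a stronger denoiser and a decision threshold calibrated on known matching and non-matching pairs.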

Diffusion model artefacts and spectral signatures

Latent diffusion models such as Stable Diffusion generate images through a denoising process that operates in latent space, then decode the result to pixel space via a VAE (variational autoencoder) decoder. The decoder's upsampling layers introduce spectral periodicity, similar to but distinct from the GAN checkerboard. More forensically significant is the absence of properties that all genuine photographs carry: camera sensor noise (photon shot noise, dark current), optical vignetting, chromatic aberration, and the spatial correlation structure of natural image statistics in the very high-frequency band.

Property	Genuine photograph	Diffusion model output
Sensor noise	Present: photon shot noise + dark current per pixel	Absent: denoising residual instead
High-frequency spectral slope	Follows natural scene statistics (1/f)	Deviates; flatter or steeper depending on model
Optical artefacts	Vignetting, chromatic aberration, barrel distortion	Absent unless post-processed
EXIF metadata	Authentic device + GPS + settings chain	None, or fabricated
Compression history	Single in-camera JPEG or raw decode	Often re-encoded; no original raw
Spectral periodicity	None from optics (slight from demosaicing)	VAE decoder upsampling periodicity
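The spectral-slope row in the table can be made concrete with an azimuthally averaged power spectrum. The sketch below is one common way to estimate the slope, shown on synthetic noise rather than real photographs; the function names are hypothetical. An amplitude spectrum falling as 1/f corresponds to a power-spectrum slope near -2 on a log-log plot, while white noise gives a slope near 0.

```python
import numpy as np

def radial_power_spectrum(gray):
    """Azimuthally averaged power spectrum of a 2-D greyscale array."""
    h, w = gray.shape
    f = np.fft.fftshift(np.fft.fft2(gray - gray.mean()))
    power = np.abs(f) ** 2
    yy, xx = np.indices((h, w))
    r = np.rint(np.hypot(yy - h // 2, xx - w // 2)).astype(int)
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    radial = sums / np.maximum(counts, 1)
    # Drop the DC bin and keep only radii inside the Nyquist circle.
    return radial[1:min(h, w) // 2]

def spectral_slope(gray):
    """Least-squares slope of log power against log spatial frequency."""
    p = radial_power_spectrum(gray)
    freqs = np.arange(1, len(p) + 1)
    slope, _ = np.polyfit(np.log(freqs), np.log(p + 1e-12), 1)
    return float(slope)

# Synthetic check: white noise versus noise shaped to a 1/f amplitude spectrum.
rng = np.random.default_rng(0)
n = 256
white = rng.normal(size=(n, n))
fy, fx = np.meshgrid(np.fft.fftfreq(n) * n, np.fft.fftfreq(n) * n, indexing="ij")
radius = np.hypot(fy, fx)
radius[0, 0] = 1.0  # avoid dividing by zero at DC
one_over_f = np.real(np.fft.ifft2(np.fft.fft2(white) / radius))

slope_white = spectral_slope(white)
slope_1f = spectral_slope(one_over_f)
print(f"white noise slope: {slope_white:.2f}, 1/f noise slope: {slope_1f:.2f}")
```

A slope measured on a query image only means something relative to slopes measured on comparable genuine photographs (same resolution, similar compression), so in practice the estimate is compared against a reference distribution rather than a fixed number.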

Corvi et al. (2023) published a systematic study of diffusion model artefacts across Stable Diffusion, DALL-E 2, and several other systems, finding that detectors trained on GAN output performed significantly worse on diffusion output. They proposed features specific to the diffusion denoising residual that improved cross-family generalisation, but noted that performance on novel, unseen diffusion model checkpoints was still substantially lower than on held-out test sets from known models.

C2PA and cryptographic provenance manifests

The Coalition for Content Provenance and Authenticity (C2PA) was formed in 2021 by Adobe, Arm, BBC, Intel, Microsoft, and Truepic, with the stated aim of building an open technical standard for media provenance. The core concept is a manifest: a structured record attached to a media file that describes its creation history, with each step signed by the tool or actor responsible for it. A photograph taken with a C2PA-enabled camera gets a manifest signed with the camera's hardware key. Each edit in C2PA-aware software adds a signed update. If the image was generated by an AI tool, that tool records its contribution.

Capture or generation
A C2PA-enabled camera or AI tool creates the initial manifest, signed with a hardware or software key. The manifest can record the device, the timestamp, GPS coordinates if enabled, and, for AI tools, details such as the model version and a prompt hash.
Editing
Each editing application that participates in C2PA adds a signed entry to the manifest recording which operations were applied. Cropping, colour correction, and AI-assisted editing tools each leave a record.
Publishing
The publishing platform can add its own signed entry attesting to when and how it received the file. The full chain from capture to publication is readable by anyone with the public keys of the signing parties.
Verification
A viewer or forensic analyst runs the manifest through a C2PA-compatible verifier. Any broken signature in the chain, from tampering with the image content or the manifest itself, is flagged. The verifier reports what it can confirm and what is unverifiable.
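The four stages above can be illustrated with a toy signed chain. This is a conceptual sketch only and not the C2PA format: real manifests use certificate-based signatures and a binary container, whereas here an HMAC with a shared secret stands in for each party's signature, and every name (`add_entry`, `verify`, the actor labels) is hypothetical.

```python
import hashlib
import hmac
import json

def sign(key, payload):
    """HMAC stand-in for a digital signature over a canonical JSON payload."""
    msg = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def add_entry(manifest, actor, key, action, content):
    """Append a signed entry bound to the content hash and the previous entry."""
    payload = {
        "actor": actor,
        "action": action,
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "prev_signature": manifest[-1]["signature"] if manifest else "",
    }
    return manifest + [{**payload, "signature": sign(key, payload)}]

def verify(manifest, keys, content):
    """Return a list of problems; an empty list means the chain checks out."""
    if not manifest:
        return ["no manifest: no provenance claim"]
    problems = []
    prev_sig = ""
    for i, entry in enumerate(manifest):
        payload = {k: v for k, v in entry.items() if k != "signature"}
        key = keys.get(entry["actor"])
        if key is None:
            problems.append(f"entry {i}: unknown signer")
        elif not hmac.compare_digest(sign(key, payload), entry["signature"]):
            problems.append(f"entry {i}: bad signature")
        if entry["prev_signature"] != prev_sig:
            problems.append(f"entry {i}: broken chain link")
        prev_sig = entry["signature"]
    if manifest[-1]["content_sha256"] != hashlib.sha256(content).hexdigest():
        problems.append("content does not match latest entry")
    return problems

# Capture, then edit, then verify (all keys and bytes are made up).
keys = {"camera": b"camera-secret", "editor": b"editor-secret"}
raw, edited = b"raw pixels", b"cropped pixels"
chain = add_entry([], "camera", keys["camera"], "capture", raw)
chain = add_entry(chain, "editor", keys["editor"], "crop", edited)

ok = verify(chain, keys, edited)
tampered_pixels = verify(chain, keys, b"altered pixels")
forged = [dict(chain[0], action="generate")] + chain[1:]
tampered_manifest = verify(forged, keys, edited)
stripped = verify([], keys, edited)
print(ok, tampered_pixels, tampered_manifest, stripped, sep="\n")
```

The last case mirrors the screen-grab scenario: stripping the manifest does not produce a failed verification, only the absence of any claim, which is why a missing chain is reported as unverifiable rather than as evidence of forgery.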

C2PA does not make forgery impossible. An attacker who controls the signing keys, or who captures a screen-grab of an authentic image to strip the original manifest, circumvents the chain entirely. What C2PA provides is a positive claim: when the chain is intact, the creation history it describes can be verified. When the chain is absent, the image makes no authenticated provenance claim, which is itself information. As of 2024, Leica, Sony, and Nikon have released or announced C2PA-enabled camera firmware, and Adobe Photoshop generates C2PA manifests for exported files.

C2PA provenance chain: capture, edit, publish, verify.

Invisible watermarking and signing proposals

Invisible watermarking proposes embedding a signal in generated images during model inference, before any post-processing, so that all output from a cooperating generator is marked as synthetic. Several technically distinct approaches exist. Spectral watermarking places a low-amplitude pattern at specific spatial frequencies chosen to survive moderate JPEG compression. Latent-space watermarking applies a fixed perturbation to the latent representation before decoding, so the mark appears in every generated image from that model. Semantic watermarking trains the model to include a specific detectable pattern in generated textures without visible distortion.

Meta's Stable Signature (Fernandez et al., 2023): a watermark finetuned into the VAE decoder of a latent diffusion model so that every decoded image carries an imperceptible but persistent mark. Demonstrated survival through JPEG compression and mild geometric transforms.
Google DeepMind SynthID: applies a learned watermark to images generated by Imagen, audio generated by Lyria, and text generated by Gemini. Released publicly for text-to-image and text-to-audio use cases. Claims the mark survives common social media re-compression.
Watermark removal attacks: multiple published works have shown that existing watermarking schemes are vulnerable to adversarial removal via small noise perturbations or diffusion-based regeneration, which reconstructs the semantic content of an image while replacing the frequency-domain signature.
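The embed-and-detect idea, and its fragility, can be shown with a deliberately naive keyed watermark. This toy is not Stable Signature or SynthID and is far weaker than either: it adds a pseudo-random plus/minus pattern in pixel space and detects it by correlation. The function names, the key values, and the threshold of 1.0 used in the final comparison are assumptions made for illustration.

```python
import numpy as np

def keyed_pattern(shape, key):
    """Pseudo-random +/-1 pattern reproducible from an integer key."""
    return np.random.default_rng(key).choice([-1.0, 1.0], size=shape)

def embed(img, key, strength=2.0):
    """Add a low-amplitude keyed pattern to an image with values in 0..255."""
    return np.clip(img + strength * keyed_pattern(img.shape, key), 0, 255)

def detect_score(img, key):
    """Correlation with the keyed pattern: near 0 if unmarked, near `strength` if marked."""
    p = keyed_pattern(img.shape, key)
    return float(((img - img.mean()) * p).mean())

def box_blur3(img):
    """3 x 3 box blur with wrap-around edges, used here as a crude removal attack."""
    shifted = (np.roll(np.roll(img, dy, 0), dx, 1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return sum(shifted) / 9.0

# Synthetic host image (uniform noise), not a real photograph.
rng = np.random.default_rng(0)
host = rng.uniform(50, 200, size=(256, 256))
marked = embed(host, key=1234)

score_marked = detect_score(marked, key=1234)
score_clean = detect_score(host, key=1234)
score_wrong_key = detect_score(marked, key=999)
score_after_blur = detect_score(box_blur3(marked), key=1234)

for name, s in [("marked", score_marked), ("clean", score_clean),
                ("wrong key", score_wrong_key), ("after blur", score_after_blur)]:
    print(f"{name:>10}: {s:+.2f} -> {'detected' if s > 1.0 else 'not detected'}")
```

A single blur is enough to defeat this pixel-space mark, which is why the schemes described above place the signal in frequencies or latent features chosen to survive compression. The removal attacks cited in the last item target those more robust designs in the same spirit: alter the image just enough to break the correlation while keeping the visible content.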

Limits of binary classifiers on out-of-distribution generators

The core limitation of binary AI-vs-real classifiers is the same as for deepfake video detectors: they learn the artefact signature of the generators in their training set. When a novel generator with a different architecture is released, the classifier's features may not activate on the new output, and accuracy can collapse. Wang et al. (2020) illustrate both sides of the problem with CNNDetect, a ResNet-50 classifier trained only on ProGAN output. It achieved near-perfect accuracy on ProGAN test images and, with blur and JPEG augmentation, generalised better than expected to other GAN architectures. Later evaluations, including Corvi et al. (2023), found that detectors of this kind degrade sharply on diffusion model output.

Proposed mitigations include training on augmented data that mimics a wide range of post-processing operations, so the classifier learns to detect features that survive processing rather than artefacts specific to one generator; using frequency-domain features that are more architecture-agnostic; and ensemble approaches that combine signals from multiple detection methods. None fully solves the OOD problem, particularly against diffusion models, which have different spectral properties from the GANs used in most published training sets as of 2022-2023.

Approach	In-distribution accuracy	OOD accuracy	Notes
Pixel-domain CNN (e.g., XceptionNet)	Very high	Low	Learns generator-specific textures
Frequency-domain classifier	High	Moderate	More architecture-agnostic spectral features
GAN fingerprint correlation	High for attribution	Not applicable for unseen models	Requires reference outputs from target model
Noiseprint / residual CNN	High	Moderate	Transfers across some generator families
C2PA manifest check	N/A (provenance, not detection)	Applies to all cooperating generators	Fails if chain absent or circumvented
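The in-distribution and OOD columns above imply that a detector's accuracy should be reported per generator family rather than as one pooled figure. A minimal sketch of such a breakdown follows; the records are made-up toy values for illustration, not measurements, and the generator names and function name are hypothetical.

```python
from collections import defaultdict

def accuracy_by_generator(records, training_generators):
    """Break accuracy down by source and flag whether each source was seen in training.

    records: iterable of (source_name, true_label, predicted_label) tuples.
    training_generators: set of source names present in the detector's training data.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for source, y_true, y_pred in records:
        totals[source] += 1
        hits[source] += int(y_true == y_pred)
    return {
        source: {
            "accuracy": hits[source] / totals[source],
            "n": totals[source],
            "scope": ("in-distribution" if source in training_generators
                      else "out-of-distribution"),
        }
        for source in totals
    }

# Toy records (invented): the detector was trained on "progan" only.
toy_records = [
    ("progan", "fake", "fake"), ("progan", "fake", "fake"),
    ("progan", "fake", "fake"), ("progan", "fake", "fake"),
    ("unseen_model", "fake", "real"), ("unseen_model", "fake", "fake"),
    ("unseen_model", "fake", "real"), ("unseen_model", "fake", "fake"),
]
report = accuracy_by_generator(toy_records, training_generators={"progan"})
for source, row in report.items():
    print(f"{source:>13}: {row['accuracy']:.2f} (n={row['n']}, {row['scope']})")
```

Pooling these toy records would give 75% accuracy, a figure that describes neither group. Keeping the breakdown makes the validated scope of the detector explicit in any report that relies on it.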

Photographic truth claims and the forensic burden

The photographic truth claim, the implicit assertion that a photograph depicts a real event, has been legally and socially significant since the 19th century. Courts in common-law and civil-law jurisdictions have developed evidentiary standards for photographic evidence that presume an authentic capture chain unless challenged. Digital image forensics inherited and extended this framework, developing tools to detect manipulation of genuine photographs.

AI-generated images introduce a qualitatively different challenge. The question is no longer whether a genuine photograph was altered but whether any camera-capture event occurred at all. A fully synthetic image has no original: it was never a photograph. The forensic and evidentiary consequence is that authentication cannot proceed by establishing chain-of-custody back to a capture device, because no capture device exists. Instead, the burden falls on detection of synthesis artefacts, provenance manifest verification, or contextual and OSINT corroboration of the depicted events.

Worked example

Assessing a news photograph with no provenance

Applying detection and provenance methods to a contested conflict-zone image.

A news agency receives a high-resolution photograph purportedly showing civilian infrastructure damage in a conflict zone. No photographer credit is attached, no camera metadata is present, and no C2PA manifest is embedded. The image desk asks for a forensic opinion before publication.

Metadata inspection. No EXIF camera model, no GPS, no creation timestamp. The file was exported from an unknown application. Missing EXIF triggers further analysis but is not diagnostic on its own.
Frequency-domain analysis. The 2-D FFT of sky and shadow regions shows no GAN checkerboard artefact. The high-frequency spectral slope in natural textures (grass, rubble) is consistent with camera-captured content, not the flatter slope typical of diffusion model output.
Noise residual analysis. A Noiseprint map shows consistent fingerprint energy across the image with no regional discontinuity. This is consistent with a genuine photograph from a single device, though it does not exclude sophisticated post-processing or inpainting.
Semantic review. Shadow directions are consistent with solar elevation for the claimed geographic location and time. Building structures and vehicle models are consistent with known infrastructure in the region. No obvious geometry inconsistencies.
OSINT corroboration. A reverse image search finds no prior publication. Geolocation via landmark features and satellite imagery comparison places the view at a specific coordinate consistent with the claimed location. Street-view data from before the conflict shows the same buildings undamaged.
Conclusion. No signal analysis method identified synthetic generation artefacts. The Noiseprint and spectral results are consistent with a genuine photograph. OSINT independently corroborates the claimed location. The absence of EXIF is unexplained but insufficient to withhold publication given convergent evidence. The conclusion is reported with its basis and limitations, not as a binary pass/fail.

Check your understanding

What is the forensic analogy between PRNU and the GAN fingerprint identified by Marra et al. (2019)?

Key Takeaways

GAN models leave a model-specific periodic fingerprint extractable by averaging high-pass residuals, analogous to PRNU in cameras; this can support both detection and source attribution.
Diffusion model outputs differ from genuine photographs in sensor noise structure, spectral slope, and absence of optical artefacts; detectors trained on GAN artefacts perform poorly on diffusion output.
C2PA provides cryptographic provenance manifests signed at each creation step; a valid chain confirms the recorded history but cannot attest to the reality of the depicted content.
Invisible watermarking proposals (SynthID, Stable Signature) embed imperceptible marks at generation time, but are vulnerable to adversarial removal and only cover output from cooperating generators.
Binary AI-vs-real classifiers suffer out-of-distribution failure on novel generator families not seen during training; forensic reports must state the detector's validated scope.
The liar's dividend, dismissing genuine evidence as AI-generated, may cause more systemic harm than direct synthetic media use, and forensic practice must account for it explicitly.

What is the GAN fingerprint discovered by Marra et al. (2019)?

Marra et al. (2019) showed that each GAN model leaves a detectable periodic pattern in its output images, analogous to PRNU in camera sensors. This fingerprint is a property of the specific trained model's weights. By averaging the high-pass residuals of many outputs from one model, the fingerprint can be extracted and used both to classify an image as GAN-generated and to attribute it to a specific model instance.

What is C2PA and how does it support image provenance?

C2PA is an open standard for attaching cryptographically signed provenance manifests to media files. A manifest records the creation chain: the capture device, any editing software, the editor's identity, and whether AI generation tools were used. Because each link is digitally signed, tampering is detectable. The standard is backed by Adobe, Microsoft, BBC, and others, and is being integrated into camera firmware and editing tools.

Why do binary classifiers for AI-generated images perform poorly on unseen generators?

Binary classifiers trained on a fixed set of generator families learn the specific artefact signatures of those families. When a novel generator is released with a different architecture or post-processing pipeline, the classifier's learned features may not activate on the new artefacts, causing accuracy to drop sharply. This out-of-distribution failure is the central limitation of current AI-generated image detectors.

What are invisible watermarking proposals for generative AI output?

Invisible watermarking embeds a signal in a generated image that is imperceptible to a viewer but detectable by an algorithm. Proposals include spectral watermarks, latent-space watermarks, and semantic watermarks trained into the generator. Limitations include vulnerability to adversarial removal and the fact that only content from cooperating generators would carry them.
