Cloud Logging, VM Snapshots and Cloud Incident Response
Cloud forensic logging across AWS, Azure and GCP, EBS and managed-disk snapshot acquisition, NIST 800-61 incident response in cloud accounts, and the CERT-In April 2022 180-day retention rule that anchors Indian cloud cases.
Cloud forensic investigation relies on three log layers available across AWS, Azure, and GCP: control-plane logs (management API calls), data-plane logs (access to data inside resources), and network flow logs. Forensic acquisition replaces physical disk imaging with a five-step snapshot workflow: take a storage-layer snapshot of the managed volume, share it to a dedicated forensic account, create a volume from the snapshot, attach it read-only to a forensic instance, then image and hash it and record the result in a BSA Section 63(4) certificate. Because data-plane logs are off by default on all three platforms, and most default retention windows run 30 to 90 days, log readiness before a breach is the decisive factor in whether a cloud case is solvable. The CERT-In Direction of April 2022 mandates a 180-day retention minimum and a six-hour reporting window for notifiable incidents, applying to all data centres, cloud service providers, VPS providers, and intermediaries serving Indian users.
Cloud forensic investigations begin with logs the investigator did not configure. The disks belong to the provider, the hypervisor is outside the customer's visibility, and the storage volume is distributed across an availability zone. The only durable record of who did what is whatever the customer enabled before the breach. CloudTrail disabled in a region, an Activity Log never exported to a workspace, audit logs retained for 30 days when the attacker dwelled for 60: these are the gaps that determine whether a cloud case is solvable. Acquisition no longer means pulling a physical disk. It means snapshotting the managed volume, sharing it cross-account, attaching it to a forensic instance, hashing it, and writing the BSA Section 63 certificate.
Key takeaways
- The cloud acquisition workflow replaces disk imaging with a five-step process: snapshot the managed volume, share it cross-account, create a volume from the snapshot, attach it read-only to a forensic instance, then image and hash it for the BSA Section 63(4) certificate.
- Log readiness before a breach is the single biggest determinant of whether a cloud case is solvable, because CloudTrail or Activity Log disabled at incident time cannot be reconstructed retroactively.
- Each major cloud provider uses different terminology for the same log layers, so an examiner must learn three vocabularies (AWS, Azure, and GCP) that share one underlying control-plane versus data-plane mental model.
- The CERT-In Direction of April 2022 mandates 180-day log retention for service providers and intermediaries in India, setting the minimum baseline an examiner can demand from a cloud provider during an investigation.
- NIST SP 800-61 incident response phases map onto cloud-specific containment actions, but the absence of physical hardware access means containment relies entirely on IAM policy changes and snapshot controls.
This topic walks the digital forensics cloud incident response workflow as practised in Indian SOCs and CERT-In coordinated cases. The control-plane vs data-plane log split is treated in detail across AWS, Azure and GCP. The forensic snapshot procedure is given as a five-step runbook with the IAM and KMS gotchas that turn a clean acquisition into a contaminated one. NIST SP 800-61 incident response is mapped onto cloud-specific containment. The Indian anchor is the CERT-In Direction of April 2022 that mandates 180-day log retention for service providers and intermediaries, the financial-sector RBI overlays, and the cross-link to network-side defence at Network security, firewalls, IDS, IPsec, SSL/TLS, VPN, PKI, SIEM.
By the end of this topic you will be able to:
- Identify which log surface (control-plane, data-plane, network flow, anomaly findings) feeds each step of a cloud incident response on AWS, Azure, and GCP.
- Execute the five-step forensic snapshot workflow for an EBS, Azure managed disk, or GCP persistent disk, including KMS key handling and chain-of-custody documentation.
- Apply the snapshot-first containment rule and distinguish quarantine-policy containment from credential revocation and instance termination.
- Map NIST SP 800-61 incident response phases onto cloud-specific containment actions such as IAM credential revocation, security group isolation, and S3 bucket policy lockdown.
- Identify the Indian regulatory obligations triggered by a cloud breach, including CERT-In six-hour reporting, DPDP Act notification to the Data Protection Board, and BSA Section 63(4) certification for electronic records.
- Control-plane log: A record of management API calls against cloud resources (create, delete, modify, attach IAM role). AWS CloudTrail, Azure Activity Log, and GCP Admin Activity logs are the canonical examples.
- Data-plane log: A record of operations on the data inside a resource: S3 object reads, KMS decrypts, database queries. Off by default in most services and the single biggest gap in cloud cases.
- VM snapshot: A point-in-time copy of a managed disk taken at the storage layer (EBS, Azure managed disk, GCP persistent disk). Crash-consistent by default; application-consistent only if the OS is quiesced.
- Forensic account: A dedicated cloud account, separate from the production account, that holds shared snapshots, the forensic analysis instance, and the chain-of-custody artefacts. Isolating the analysis prevents attacker IAM from reaching the evidence.
- CERT-In Direction (April 2022): Direction No. 20(3)/2022-CERT-In under Section 70B(6) of the IT Act mandating 180-day log retention by data centres, virtual private server providers, cloud service providers, and intermediaries, with reporting to CERT-In within six hours of a notifiable incident.
- Service Control Policy: An AWS Organizations policy attached to an account or OU that constrains what IAM principals inside the account can do. Used in forensic readiness to deny `cloudtrail:StopLogging`, `cloudtrail:DeleteTrail` and `s3:DeleteBucketPolicy` on the audit bucket.
The cloud log inventory by provider

The first task on any cloud case is to know which logs exist, which are on by default, and which had to be turned on before the incident. The vocabulary is different across AWS, Azure and GCP, but the mental model is the same: a control-plane log records management API calls, a data-plane log records access to the data, a network log records flow tuples at the VPC layer, and a security service log records derived findings from anomaly detection.
| Layer | AWS | Azure | GCP |
|---|---|---|---|
| Control plane | CloudTrail (management events) | Activity Log | Cloud Audit Logs · Admin Activity |
| Data plane | CloudTrail data events (S3, Lambda, DynamoDB) | Resource Logs (storage, key vault, SQL) | Cloud Audit Logs · Data Access |
| Network flow | VPC Flow Logs | NSG Flow Logs | VPC Flow Logs |
| Object access | S3 server access logs, S3 Object Lambda logs | Storage analytics logs | Cloud Storage usage logs |
| Load balancer | ELB / ALB access logs | App Gateway, Front Door diagnostics | HTTP(S) LB logging |
| Anomaly findings | GuardDuty, Security Hub | Microsoft Defender for Cloud, Sentinel | Security Command Center |
| Config drift | AWS Config history | Azure Policy compliance, Resource Graph | Asset Inventory + Recommender |
| Custom / system | CloudWatch Logs | Log Analytics workspace | Cloud Logging |
CloudTrail in AWS is on by default for the last 90 days of management events through the Event History view, but the default retention is short and the events are not searchable at scale until a trail is configured to write to an S3 bucket. Data events (S3 GetObject, Lambda Invoke) are off by default; their absence is the single most common gap in compromised-account cases. Azure Activity Log is on by default for 90 days but does not persist beyond that unless exported to a Log Analytics workspace, an Event Hub, or a storage account. GCP's Admin Activity logs are on by default and retained for 400 days; Data Access logs are off by default for cost reasons and must be enabled per service.
The Indian anchor sits across all three providers. The CERT-In Direction of April 2022 (No. 20(3)/2022-CERT-In) makes 180 days of log retention a statutory minimum for service providers, intermediaries, data centres, VPS providers, and cloud service providers serving Indian users. The same direction sets a six-hour reporting window for notifiable incidents. RBI's Master Direction on Information Technology Governance (2023) pushes scheduled commercial banks to longer retention and to dedicated log sinks. A breach response that finds only 30 days of CloudTrail history on an Indian bank tenant is a compliance failure as much as a forensic one.
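The retention arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not an audit tool: the figures are the default windows quoted in this section, and the function names (`retention_shortfall`, `covers_dwell`) are hypothetical helpers introduced only for this example.

```python
# Default retention figures quoted in the text above, in days.
DEFAULT_RETENTION_DAYS = {
    "AWS CloudTrail Event History": 90,
    "Azure Activity Log": 90,
    "GCP Admin Activity logs": 400,
}

CERT_IN_MINIMUM_DAYS = 180  # CERT-In Direction of April 2022


def retention_shortfall(retained_days: int,
                        required_days: int = CERT_IN_MINIMUM_DAYS) -> int:
    """Return how many days short of the requirement a source is (0 if it meets it)."""
    return max(0, required_days - retained_days)


def covers_dwell(retained_days: int, dwell_days: int) -> bool:
    """True if the retained window reaches back to the start of the attacker's dwell."""
    return retained_days >= dwell_days


for source, days in DEFAULT_RETENTION_DAYS.items():
    gap = retention_shortfall(days)
    print(f"{source}: {days} days retained, {gap} days short of the CERT-In minimum")
```

Run against the 30-day history mentioned above, `retention_shortfall(30)` reports a 150-day gap, and `covers_dwell(30, 60)` is false for an attacker who dwelled for 60 days.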
Retention, immutability and the readiness story
Default retention windows are short by design because logging at cloud scale costs money. The forensic readiness step is to move every log of interest to an immutable, write-once destination before any incident happens. The control surfaces that matter:
- AWS: CloudTrail organisation trail writing to a dedicated audit account S3 bucket. The bucket has S3 Object Lock in compliance mode, KMS encryption with a key in the audit account, and a bucket policy denying `s3:DeleteObject`, `s3:PutObjectLegalHold`, and `s3:DeleteBucketPolicy` to everyone including root. A service control policy on every account in the organisation denies `cloudtrail:StopLogging`, `cloudtrail:DeleteTrail`, `cloudtrail:UpdateTrail`, and `cloudtrail:PutEventSelectors`.
- Azure: Diagnostic settings exporting Activity Log and Resource Logs to a Log Analytics workspace in a separate subscription. Immutability policy on the underlying storage account (time-based retention or legal hold). Azure Policy denies modifications to the diagnostic settings. Sentinel pulls from the workspace for detection.
- GCP: Aggregated log sink at the organisation level routing to a Cloud Storage bucket with bucket lock retention or to a BigQuery dataset with partition expiration. Organisation policies `iam.disableServiceAccountKeyCreation` and `storage.uniformBucketLevelAccess` cut off common attacker paths. Audit log configs explicitly enable Data Access for the high-value services.
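The AWS service control policy described in the list above could look roughly like the following. This is a sketch that only builds and prints the JSON document; the function name and the `Sid` value are hypothetical, and a real deployment would normally add a condition exempting a break-glass role before attaching the policy through AWS Organizations.

```python
import json


def build_cloudtrail_guard_scp() -> dict:
    """Build an SCP document that denies the CloudTrail tampering actions named above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyCloudTrailTampering",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": [
                    "cloudtrail:StopLogging",
                    "cloudtrail:DeleteTrail",
                    "cloudtrail:UpdateTrail",
                    "cloudtrail:PutEventSelectors",
                ],
                "Resource": "*",
            }
        ],
    }


# Print the policy so it can be reviewed before anyone attaches it.
print(json.dumps(build_cloudtrail_guard_scp(), indent=2))
```

Keeping the policy as generated JSON rather than a hand-edited file makes it easy to diff against what is actually attached in each account during a readiness review.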
The point of the immutability story is that an attacker who lands in an account with `cloudtrail:StopLogging` permission will use it. The 2019 Capital One breach used a server-side request forgery to obtain the credentials of an instance role and exfiltrate from S3; the 2024 Snowflake-customer compromises ran for weeks because data access logs were off. In both cases the gap was on the customer side, not the provider side. Forensic readiness fixes the gap before the incident, not during.
Centralised collection is the second half of readiness. SIEM ingestion into Splunk Cloud, Microsoft Sentinel, Sumo Logic, or Elastic Security gives the responder one query surface across the providers. The trade-off is cost: VPC Flow Logs at any real scale generate terabytes a day, and ingestion-by-volume pricing on a SIEM can run into lakhs per month. The standard Indian SOC pattern is to send control-plane and security-service logs in full, and to filter VPC Flow Logs at the source (sampling 1 in 100, or sending only rejected traffic) until an incident makes the full feed worth the bill.
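The flow-log filtering rule can be illustrated with a short Python sketch. It assumes the default (version 2) VPC Flow Logs field order and combines the two options mentioned above: every rejected flow is kept, plus one in every `sample_every` of the remaining records. The sample lines are synthetic, and in production the filter would normally run at the source (for example in the log subscription) rather than after collection.

```python
from typing import Iterable, Iterator

# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_record(line: str) -> dict:
    """Split one space-separated flow log line into a field dictionary."""
    return dict(zip(FIELDS, line.split()))


def select_for_siem(lines: Iterable[str], sample_every: int = 100) -> Iterator[dict]:
    """Yield all REJECT records, plus every Nth record by position as a sample."""
    for index, line in enumerate(lines):
        record = parse_flow_record(line)
        if record.get("action") == "REJECT" or index % sample_every == 0:
            yield record


# Synthetic example lines (documentation IP ranges, made-up interface ID).
sample = [
    "2 123456789012 eni-0abc 10.0.1.5 203.0.113.9 44321 443 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.5 203.0.113.9 44322 443 6 12 900 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 198.51.100.7 10.0.1.5 51515 22 6 1 60 1700000000 1700000060 REJECT OK",
]
for kept in select_for_siem(sample):
    print(kept["srcaddr"], "->", kept["dstaddr"], kept["action"])
```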
VM snapshot forensics across the big three
The cloud equivalent of a write-blocked drive image is the storage-layer snapshot. AWS EBS, Azure managed disks, and GCP persistent disks all expose a snapshot API that captures the block device state at a point in time. The snapshot is incremental on the back end but presents as a full copy. The forensic value is that a snapshot can be taken without disturbing the running VM and shared across accounts to a clean forensic environment.
- Snapshot the running volume. AWS: `aws ec2 create-snapshot --volume-id vol-xxxx --description "forensic case 2026-CSL-417"`. Azure: snapshot the managed disk through the portal or `az snapshot create`. GCP: `gcloud compute disks snapshot`. Crash-consistent by default; application-consistent if the OS filesystem is quiesced (`fsfreeze` on Linux, VSS on Windows) or the VM is paused first.
- Share the snapshot to the forensic account. AWS: `aws ec2 modify-snapshot-attribute --snapshot-id snap-xxxx --create-volume-permission "Add=[{UserId=<forensic-account>}]"`. If the volume is KMS-encrypted, also share the CMK or re-encrypt the snapshot with a forensic-account KMS key; a snapshot encrypted with the default AWS managed key cannot be shared directly and must first be copied under a customer managed key. Azure: the snapshot can be exported as a SAS URL, or the managed disk can be moved to a forensic resource group with RBAC restricted to the case examiner. GCP: the forensic project service account is granted IAM access to the snapshot.
- Create a volume from the snapshot in the forensic account. AWS: `aws ec2 create-volume --snapshot-id snap-xxxx --availability-zone <fz>`. The new volume is created in the availability zone of the forensic instance, which sits in the forensic VPC behind a security group that allows only the analyst workstation.
- Attach to a forensic instance. Attach the volume to a hardened, read-only forensic EC2 (or equivalent) running SIFT Workstation or a custom Ubuntu with dc3dd, ewfacquire, libewf, sleuthkit, Volatility 3, and Plaso. Mount with `mount -o ro,noatime,noexec`, adding `noload` on ext3/ext4 so the journal is not replayed.
- Image and hash. `dc3dd if=/dev/nvme1n1 of=/case/forensic.dd hash=sha256 hash=md5 log=/case/forensic.log` for a raw image, or `ewfacquire` for an E01 container. Record the source snapshot ID, the volume ID, the instance ID, the IAM role, the CMK ARN and the dual hash in the BSA Sec 63(4) certificate.
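The final step of the runbook above, the dual hash and the identifiers that go into the certificate, can be re-verified independently of the imaging tool. The following is a minimal sketch using only the Python standard library; `dual_hash` and `custody_record` are hypothetical helper names, and the record fields simply mirror the identifiers listed in the last step.

```python
import hashlib


def dual_hash(path: str, chunk_size: int = 1024 * 1024) -> dict:
    """Stream a file once and return its SHA-256 and MD5 digests as hex strings."""
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    with open(path, "rb") as image:
        # Read in fixed-size chunks so multi-gigabyte images never sit fully in memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
    return {"sha256": sha256.hexdigest(), "md5": md5.hexdigest()}


def custody_record(image_path: str, snapshot_id: str, volume_id: str,
                   instance_id: str, iam_role: str, cmk_arn: str) -> dict:
    """Bundle the acquisition identifiers with the image hashes for the case file."""
    record = {
        "image_path": image_path,
        "snapshot_id": snapshot_id,
        "volume_id": volume_id,
        "instance_id": instance_id,
        "iam_role": iam_role,
        "cmk_arn": cmk_arn,
    }
    record.update(dual_hash(image_path))
    return record
```

Comparing the digests from `dual_hash` against the values in the `dc3dd` log gives a second, tool-independent confirmation that the image on disk is the one that was acquired.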
Memory acquisition from cloud VMs presents constraints absent in physical investigations because the hypervisor is outside customer visibility. The realistic approaches in 2026:
- AWS: the AWS Forensics Pipeline (open-source reference from AWS Security) takes an EBS snapshot of the root volume and triggers `vmsync` or `avml` (Microsoft's Linux memory tool) on the running instance to dump RAM to an S3 bucket in the forensic account. The dump is then fed to Volatility 3 with the appropriate Linux or Windows symbol tables. The catch is that running the agent on the live instance is itself an action that touches the system, so it must be documented in the runbook.
- Azure: the Azure VM Inspection Pack (Microsoft) pauses the VM at the hypervisor layer and exports a memory dump for the supported sizes. Outside the pack, the standard approach is to run `winpmem` or `avml` inside the guest.
- GCP: no first-party memory acquisition product; the in-guest approach is the only viable one.
Across all three platforms, the correct sequence is to take the disk snapshot first and the memory dump second, because the memory acquisition agent modifies the disk state. Snapshot-first preserves the unaltered disk for the chain-of-custody record, and the memory dump is documented as a known footprint.
Cloud incident response under NIST 800-61
NIST SP 800-61 Revision 2 is the incident response lifecycle that every Indian SOC builds on. The four phases (Preparation, Detection and analysis, Containment, eradication and recovery, Post-incident activity) apply to cloud the same way they apply to on-prem, but the mechanics shift.
- Preparation. Forensic readiness: logging enabled with immutable destinations, runbooks rehearsed, forensic account stood up with cross-account roles pre-provisioned, SCPs in place, IAM Identity Center wired up, emergency break-glass account documented. The CERT-In incident response policy is on file and the six-hour reporting clock is understood.
- Detection and analysis. Alert triage from GuardDuty, Defender for Cloud, Security Command Center, plus SIEM correlations. Scope determination: which accounts, which roles, which resources. The output is an incident timeline with timestamps in UTC and an initial impact statement.
- Containment, eradication, recovery. Containment first via security-group narrowing and IAM credential revocation, then eradication by destroying compromised resources, rotating keys, and rebuilding from clean images, then recovery by restoring traffic to the rebuilt resources.
- Post-incident. Lessons learned, written incident report, IoC sharing with CERT-In and sectoral CERTs (CERT-Fin for BFSI, NCIIPC for critical infrastructure), update of the runbooks. The post-incident review is where the cost of skipped readiness becomes visible.
Cloud containment differs from on-prem containment in the absence of physical access. The first move on a compromised EC2 instance is to replace its security group with a quarantine security group that allows inbound only from the forensic analyst IP and outbound nowhere. The instance is left running long enough for memory acquisition. Because instance profile credentials belong to an IAM role, they are revoked with `aws iam put-role-policy`, adding an explicit Deny on everything to that role. For a leaked IAM user key the equivalent is `aws iam put-user-policy`, or attaching the AWS managed `AWSCompromisedKeyQuarantineV3` policy, which denies the dangerous actions while leaving read access for investigation. The snapshot of the EBS volume is taken before the instance is terminated.
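One common shape for the explicit Deny on an instance role is a policy that denies everything to sessions issued before the containment time, using the `aws:TokenIssueTime` condition key. The sketch below only builds the policy document. The function name is hypothetical, the cutoff must be a timezone-aware `datetime`, and the resulting JSON would be supplied to `aws iam put-role-policy` by the responder.

```python
import json
from datetime import datetime, timezone


def build_revoke_sessions_policy(cutoff: datetime) -> dict:
    """Deny all actions to role sessions whose credentials were issued before `cutoff`."""
    # IAM date conditions expect ISO 8601 timestamps; normalise to UTC first.
    cutoff_utc = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "DateLessThan": {"aws:TokenIssueTime": cutoff_utc}
                },
            }
        ],
    }


# Example: revoke every session issued before the moment containment starts.
policy = build_revoke_sessions_policy(datetime.now(timezone.utc))
print(json.dumps(policy, indent=2))
```

The design choice here is that credentials the instance obtains after the cutoff still work, so a rebuilt or cleaned workload is not locked out, while the credentials the attacker already holds stop working.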
On a compromised S3 bucket, containment is a bucket policy that denies all principals except the investigation role, plus enabling versioning and Object Lock if they were not already enabled. On a compromised user account, the access keys are deactivated (not deleted, because deletion destroys the audit trail of which key did what), MFA is forced, the password is rotated, and active sessions are revoked. On a compromised Azure tenant the equivalent is conditional access lockdown and revocation of refresh tokens through `Revoke-AzureADUserAllRefreshToken`.
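The S3 lockdown described above can be sketched as a bucket policy with a single Deny statement that exempts only the investigation role. The bucket name and role ARN in the example are placeholders, and the function name is hypothetical. Because an explicit Deny like this also locks out administrators, it is worth confirming that the role ARN is correct before applying it.

```python
import json


def build_bucket_lockdown_policy(bucket_name: str, investigation_role_arn: str) -> dict:
    """Deny every S3 action on the bucket to all principals except one IAM role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllExceptInvestigationRole",  # hypothetical statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",      # the bucket itself
                    f"arn:aws:s3:::{bucket_name}/*",    # every object in it
                ],
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": investigation_role_arn}
                },
            }
        ],
    }


# Placeholder names for illustration only.
example = build_bucket_lockdown_policy(
    "example-kyc-bucket",
    "arn:aws:iam::111122223333:role/ExampleInvestigationRole",
)
print(json.dumps(example, indent=2))
```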
An AWS compromised-account walkthrough
The following composite case illustrates the full workflow. An Indian fintech subsidiary running on AWS receives a GuardDuty finding at 02:47 IST: UnauthorizedAccess:IAMUser/InstanceCredentialExfiltrationOutsideAWS. The finding includes the AccessKeyId of an instance profile credential used from an IP in another country. The SOC analyst pulls CloudTrail history filtered on that AccessKeyId.
The investigation walks the following surface, in order:
- CloudTrail by AccessKeyId for the last 30 days, covering both read and write events. The result: 47 calls to `s3:ListBuckets`, 312 calls to `s3:GetObject` against a single bucket containing customer KYC documents, and four calls to `iam:CreateAccessKey` against three other IAM users.
- CloudTrail by source IP to find any other credentials used from the same IP. Two more AccessKeyIds appear, both belonging to long-lived IAM users whose keys were probably leaked from a developer laptop.
- VPC Flow Logs filtered to the EC2 instance that held the original instance profile. Outbound flows to the foreign IP for the past 18 days. The instance is identified, the security group is captured.
- S3 server access logs for the KYC bucket, confirming the GetObject pattern and giving exact byte counts that establish the scope of exfiltration.
- Config history for the instance, showing that the instance metadata service (IMDS) is at v1 (the older, exploitable version). The root cause is an SSRF in a containerised web app that lets the attacker hit `169.254.169.254` and pull the instance role credentials.
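The first two pivots in the list above (by access key, then by source IP) can be sketched against exported CloudTrail records. The example assumes the records have already been loaded as dictionaries with the standard CloudTrail fields `eventName`, `sourceIPAddress`, and `userIdentity.accessKeyId`. The helper names and the sample records are synthetic and exist only for illustration.

```python
from collections import Counter
from typing import Iterable, List, Set


def events_for_key(records: Iterable[dict], access_key_id: str) -> List[dict]:
    """Return the records whose caller used the given access key."""
    return [
        r for r in records
        if r.get("userIdentity", {}).get("accessKeyId") == access_key_id
    ]


def tally_by_event_name(records: Iterable[dict]) -> Counter:
    """Count records per API call name (the CloudTrail eventName field)."""
    return Counter(r.get("eventName") for r in records)


def other_keys_from_ip(records: Iterable[dict], source_ip: str, known_key: str) -> Set[str]:
    """Find access keys, other than the known one, that were used from the same IP."""
    return {
        r["userIdentity"]["accessKeyId"]
        for r in records
        if r.get("sourceIPAddress") == source_ip
        and r.get("userIdentity", {}).get("accessKeyId") not in (None, known_key)
    }


# Synthetic records: key IDs and the IP are placeholders, not real indicators.
records = [
    {"eventName": "ListBuckets", "sourceIPAddress": "203.0.113.9",
     "userIdentity": {"accessKeyId": "ASIAEXAMPLE1"}},
    {"eventName": "GetObject", "sourceIPAddress": "203.0.113.9",
     "userIdentity": {"accessKeyId": "ASIAEXAMPLE1"}},
    {"eventName": "GetObject", "sourceIPAddress": "203.0.113.9",
     "userIdentity": {"accessKeyId": "ASIAEXAMPLE1"}},
    {"eventName": "CreateAccessKey", "sourceIPAddress": "203.0.113.9",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLE2"}},
]
print(tally_by_event_name(events_for_key(records, "ASIAEXAMPLE1")))
print(other_keys_from_ip(records, "203.0.113.9", "ASIAEXAMPLE1"))
```

At real volumes the same pivots are usually run as queries in the SIEM or in Athena over the CloudTrail bucket; the logic stays the same.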
The containment runs in parallel. The compromised instance profile credential is deactivated. The instance security group is replaced with the quarantine SG. The EBS volume is snapshotted, shared cross-account, and attached read-only to a forensic m6i.large for imaging. The two leaked human-user keys are deactivated. The S3 bucket policy is locked to deny all principals except the investigation role. The CERT-In six-hour clock is met by reporting at 06:30 IST with the impact statement, IoC list, and initial scope.
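The six-hour clock in this walkthrough can be checked with simple timestamp arithmetic. The sketch assumes the clock starts at the GuardDuty finding time and uses a hypothetical calendar date, since the case gives only times of day. IST is modelled as a fixed UTC+05:30 offset.

```python
from datetime import datetime, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30), name="IST")
REPORTING_WINDOW = timedelta(hours=6)  # CERT-In Direction of April 2022


def reporting_deadline(noticed_at: datetime) -> datetime:
    """Latest time a report can be filed for an incident noticed at `noticed_at`."""
    return noticed_at + REPORTING_WINDOW


def reported_in_time(noticed_at: datetime, reported_at: datetime) -> bool:
    """True if the report was filed on or before the six-hour deadline."""
    return reported_at <= reporting_deadline(noticed_at)


# Times from the walkthrough; the date itself is a placeholder.
noticed = datetime(2026, 1, 15, 2, 47, tzinfo=IST)
reported = datetime(2026, 1, 15, 6, 30, tzinfo=IST)

print("Deadline (IST):", reporting_deadline(noticed).isoformat())
print("Noticed (UTC): ", noticed.astimezone(timezone.utc).isoformat())
print("Within window: ", reported_in_time(noticed, reported))
```

With a finding at 02:47 IST the deadline falls at 08:47 IST, so the 06:30 IST report is inside the window. Converting to UTC in the same step keeps the incident timeline consistent with the UTC timestamps in the logs.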
The post-incident write-up flags the fixes: enforce IMDSv2 across the account, rotate the leaked developer keys, move long-lived keys to IAM Identity Center with short-lived SSO sessions, narrow the S3 bucket to a VPC endpoint with a bucket policy that denies non-VPC access, and add an SCP that denies `iam:CreateAccessKey` for non-break-glass users. The case is closed with the CERT-In follow-up report and a customer-notification decision under the DPDP Act 2023 obligations.
The Indian regulatory frame and where this fits
The legal and regulatory frame for cloud incidents in India applies on top of the technical workflow. Five anchors govern real investigations.
| Anchor | Source | What it requires | Forensic implication |
|---|---|---|---|
| CERT-In Direction (April 2022) | Dir. 20(3)/2022-CERT-In under IT Act Sec 70B(6) | 180-day log retention; 6-hour incident reporting; NPL/NIC time sync; KYC retention for VPN/VPS providers | Defines the minimum log inventory and the reporting clock |
| RBI Cyber Security Framework | RBI Master Direction on IT Governance 2023 | Tiered controls for banks/NBFCs; SOC; cyber drills; cloud-specific controls under Banking Cloud Framework 2023 | Longer retention and dedicated log sinks for BFSI tenants |
| NCIIPC guidelines | National Critical Information Infrastructure Protection Centre | Reporting and incident response for designated CII entities | Parallel reporting to NCIIPC and CERT-In for power, banking, telecom, transport |
| DPDP Act 2023 | Digital Personal Data Protection Act | Breach notification to the Data Protection Board and to affected data principals | Personal data breach reporting overlay on top of CERT-In |
| BSA 2023 Section 63 | Bharatiya Sakshya Adhiniyam 2023 | Electronic record admissibility with Sec 63(4) certificate | The cloud image and the log extract both need a Sec 63 certificate signed by the person responsible for the system |

The CoWIN data-leak claims of 2023 are the recurring Indian example, though the government's position was that no breach of the CoWIN platform itself was confirmed and that the leaked data appeared to have originated from a Telegram bot fed by older datasets. Whatever the final attribution, the case is a teaching point on what a cloud-era breach response looks like under the CERT-In Direction: a notifiable incident, a six-hour clock, a coordinated reply by the platform owner and CERT-In, and a parallel inquiry by I4C. The network-side defences that fence cloud workloads are covered at Network security, firewalls, IDS, IPsec, SSL/TLS, VPN, PKI, SIEM. The hypervisor and multi-tenant boundary that sits underneath every cloud forensic case is at Cloud technology, virtualization and cloud security architecture; the deeper acquisition story for managed disks and cloud-backed endpoints is at Virtual machine and cloud-backed endpoint forensics.
Which CERT-In direction sets the 180-day log retention and the six-hour incident reporting clock for Indian cloud service providers and intermediaries?
Frequently asked questions
What is the difference between control-plane and data-plane logging in the cloud?
How long are cloud logs retained by default, and what does the CERT-In direction require?
How is a forensic snapshot of a running cloud VM taken without disturbing the workload?
Why is a dedicated forensic account or project used instead of analysing inside the production account?
What is the snapshot-first rule in cloud incident containment?
Which Indian regulatory bodies need to be informed when a cloud incident hits an Indian organisation?
Are cloud forensic images admissible in Indian courts under BSA Section 63?