Email Forensics: Protocols, Header Tracing, Spoofed Mail
SMTP, POP3 and IMAP ports; reading Received headers bottom-up; SPF, DKIM and DMARC checks; PST, OST and mbox parsing; and the Indian IT Act and BNS provisions for prosecuting phishing and spoofed mail.
Email forensics reconstructs who sent a message and whether the From address was forged by reading the chain of Received headers from bottom (origin) to top (final delivery), then cross-referencing SPF, DKIM, and DMARC authentication results written by the inbound mail server. A From address is easily attacker-controlled; the Received chain, Return-Path, and Authentication-Results header are the authoritative evidence. In Indian phishing prosecutions, this analysis supports charges under IT Act Section 66D (cheating by personation) and BNS 2023 Sections 318-319, with the email exhibit admitted under BSA 2023 Section 63.
Email is the primary delivery surface for financial fraud in India. CERT-In quarterly reporting for 2024 and 2025 lists phishing as the top single incident category, with banking-lookalike domains and HR-impersonation as the dominant pretext types. The forensic questions are who sent the message, from which infrastructure, whether the From address was forged, and whether the chain of custody from inbox to charge sheet will withstand Section 63 BSA scrutiny.
Key takeaways
- Reading the Received chain in an email header bottom-up, from the originating server to the final delivery agent, is the correct methodology for tracing the actual sending infrastructure behind a spoofed address.
- SPF, DKIM, and DMARC together form the email authentication layer: a failed SPF check means the sending IP is not authorised for the envelope-sender domain, a failed DKIM check means the signed headers or body were altered in transit or the signature does not verify against the published key, and DMARC sets the policy for what happens on failure.
- SMTP runs on port 25 for server-to-server and submission on 587, with SMTPS on 465 for implicit TLS; POP3 uses 110 and 995, IMAP uses 143 and 993, and the cleartext ports remain common between mail servers.
- On-disk email containers differ by client: PST and OST for Outlook, mbox for Thunderbird and Unix clients, and individual EML or MSG files for exported messages, and each requires a different parsing tool.
- Phishing is the top single category of incidents in CERT-In quarterly notes for 2024 and 2025, with banking-lookalike domains and HR-impersonation the dominant pretext types, making header analysis a high-frequency forensic task.
This topic walks the email evidence stack from wire to witness box. It covers the protocols and their ports, the MUA-to-MTA-to-MDA flow that puts a message on disk, the anatomy of an email header and the bottom-up methodology for reading the Received chain, the SPF, DKIM and DMARC authentication mechanisms, the spoofing indicators that drop out of those checks, the on-disk evidence containers (PST, OST, mbox, EML, MSG) and the tools that parse each, and the Indian statutory frame for prosecuting phishing under IT Act Section 66D and BNS 2023 Section 318. Throughout, the goal is to produce a report that the trial court can read on its own.
By the end of this topic you will be able to:
- Identify the correct ports for SMTP (25 / 587 / 465), POP3 (110 / 995), and IMAP (143 / 993) and explain the forensic significance of each.
- Trace the path a message takes through MUA, submission MTA, relaying MTA, and MDA, and identify the evidence each agent produces.
- Apply the bottom-up method to a Received header chain to identify the true origin server and detect fabricated or discontinuous hops.
- Distinguish the scope and failure modes of SPF, DKIM, and DMARC, and explain why a DMARC fail on a banking domain is the primary spoofing indicator.
- Map an email exhibit to its on-disk container (PST, OST, mbox, EML, MSG), select the appropriate parsing tool, and describe the acquisition and Section 63 BSA certification steps.
- MTA: Mail Transfer Agent. The server that relays mail server-to-server. Postfix, Exim, Sendmail and Microsoft Exchange are the common implementations. Speaks SMTP.
- MUA: Mail User Agent. The client the user sees. Outlook, Thunderbird, Apple Mail, Gmail web, Spark, mobile apps. Speaks SMTP outbound and IMAP or POP3 inbound.
- Received header: An audit line written by each MTA that handles a message, stamped onto the top of the header block. The chain reads bottom-up: the bottom Received is the origin, the top Received is the last hop before the recipient.
- SPF: Sender Policy Framework. A DNS TXT record on the sender domain that lists the IP addresses authorised to send mail for that domain. The receiver compares the connecting IP against the SPF record.
- DKIM: DomainKeys Identified Mail. A cryptographic signature placed in a DKIM-Signature header. The signing key sits on the sending server; the public key is published in DNS. The receiver verifies the signature against the public key.
- DMARC: Domain-based Message Authentication, Reporting and Conformance. A DNS policy that stacks on top of SPF and DKIM, telling receivers what to do (none, quarantine, reject) on failure, and where to send aggregate (rua) and forensic (ruf) reports.
Protocols and ports: SMTP, POP3, IMAP
Email runs on three protocols, each with distinct forensic implications. SMTP carries mail outbound and server-to-server. POP3 and IMAP carry mail inbound to the client. Each has a cleartext default port and a TLS-protected variant.
| Protocol | Role | Cleartext port | Implicit TLS port | STARTTLS variant | What sits on the wire |
|---|---|---|---|---|---|
| SMTP (server-to-server) | Relay between MTAs | 25 | (none in common use) | Opportunistic, on 25 | Envelope, headers, body, in cleartext unless STARTTLS negotiates |
| SMTP submission | Client to MTA, authenticated | (none; 587 starts in cleartext and upgrades) | 465 (SMTPS; formerly legacy, now common again) | 587 with mandatory STARTTLS | Same payload, plus SASL credentials over TLS |
| POP3 | Pull and delete mailbox | 110 | 995 (POP3S) | On 110 | USER/PASS, then message bodies |
| IMAP | Server-side mailbox with folders | 143 | 993 (IMAPS) | On 143 | LOGIN, then per-folder UID and body streams |
The choice between POP3 and IMAP shapes the on-disk evidence picture. POP3 traditionally downloads and deletes; the inbox of evidentiary value sits on the client. IMAP keeps the mailbox on the server and the client holds a synchronised copy (an OST for Outlook against Exchange, a local mailbox for Thunderbird against IMAP). Webmail is yet a third pattern: nothing of evidentiary value sits on the client beyond the browser cache (covered in our Web Browser Forensics topic). The acquisition plan has to match the protocol.
A practical Indian context point: many corporate phishing investigations involve submission on 587 (the attacker authenticated to a compromised mailbox) rather than open-relay on 25. The MTA log on the submission server then carries the SASL username, the source IP and the timestamp, which combined with login telemetry from the same provider often identifies the attacker's foothold account.
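When triaging a packet capture or firewall log, it can help to encode the port table as a lookup. The following is a minimal sketch; the names `MailPort`, `MAIL_PORTS`, `classify_mail_port` and `payload_may_be_readable` are illustrative and not part of any standard tool, and the port assignments are simply those listed in the table above.

```python
from typing import NamedTuple, Optional


class MailPort(NamedTuple):
    protocol: str
    tls_mode: str


# Port assignments copied from the protocol table in this topic.
MAIL_PORTS = {
    25: MailPort("SMTP relay", "opportunistic STARTTLS"),
    587: MailPort("SMTP submission", "STARTTLS"),
    465: MailPort("SMTP submission", "implicit TLS"),
    110: MailPort("POP3", "cleartext or STARTTLS"),
    995: MailPort("POP3", "implicit TLS"),
    143: MailPort("IMAP", "cleartext or STARTTLS"),
    993: MailPort("IMAP", "implicit TLS"),
}


def classify_mail_port(port: int) -> Optional[MailPort]:
    """Return the mail protocol conventionally bound to a port, or None."""
    return MAIL_PORTS.get(port)


def payload_may_be_readable(port: int) -> bool:
    """True if a capture on this port can hold cleartext when TLS was not negotiated."""
    info = classify_mail_port(port)
    return info is not None and info.tls_mode != "implicit TLS"
```

A port number is only a convention: a capture on 25 still has to be inspected for a successful STARTTLS exchange before assuming the payload is readable.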
Email architecture: MUA, MTA, MDA and the MX record
A single message traverses several agents before it lands in the recipient's mailbox. Each agent is a candidate evidence source.
- Sender MUA composes and submits. The user writes the message in Outlook, Thunderbird, Apple Mail or Gmail web. The MUA authenticates to its submission MTA on 587 or 465 with SASL credentials. The MUA may keep a Sent Items copy locally (PST/OST for Outlook, Sent folder for Thunderbird mbox).
- Submission MTA accepts and queues. The submission MTA writes a Received header, applies any DKIM signature, runs outbound policy and queues for relay. The MTA's maillog (Postfix /var/log/mail.log, Sendmail /var/log/maillog, Exchange message-tracking log) captures the queue ID.
- DNS MX lookup. The relaying MTA queries DNS for the recipient domain's MX record. The MX target is the inbound MTA for the recipient.
- Recipient inbound MTA receives. The inbound MTA runs SPF (against the connecting IP), DKIM verification (against the DKIM-Signature header), DMARC alignment, and any local antispam. It writes its own Received header and an Authentication-Results header capturing the SPF / DKIM / DMARC outcomes.
- MDA delivers to mailbox store. The Mail Delivery Agent writes the message to the user's mailbox: Maildir, Microsoft 365 EXO logs and store, Google Workspace store. The mailbox is the canonical evidentiary source on the recipient side.
- Recipient MUA fetches. The recipient's MUA pulls via IMAP or POP3 (or renders via webmail). A second on-disk copy lands in the OST or mbox.
Each MTA that handles the message prepends one Received header as it accepts it. The chain of Received headers therefore reads bottom-up: the bottom-most Received was written first (by the sending MTA) and the top-most was written last (by the receiving MTA, immediately before delivery). Reading top-down inverts the timeline, which is the most common rookie error in header tracing.
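The ordering can be made concrete with Python's standard `email` package, which returns Received headers in the order they appear in the file (newest first). The sketch below reverses that list to get the chronological, bottom-up view. The sample message and all hostnames in `RAW` are invented for illustration.

```python
from email import message_from_string

# Illustrative message: hostnames and addresses are made up.
RAW = """\
Received: from mx.recipient.example by mailstore.recipient.example; Tue, 02 Jan 2024 10:00:05 +0530
Received: from relay.sender.example by mx.recipient.example; Tue, 02 Jan 2024 10:00:03 +0530
Received: from client.sender.example by relay.sender.example; Tue, 02 Jan 2024 10:00:01 +0530
From: Alice <alice@sender.example>
To: Bob <bob@recipient.example>
Subject: test

body
"""


def received_chain_bottom_up(raw_message: str) -> list[str]:
    """Return Received header values in the order they were written (origin first)."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])
    # Headers may be folded across lines; collapse whitespace for readability.
    return [" ".join(hop.split()) for hop in reversed(hops)]


for number, hop in enumerate(received_chain_bottom_up(RAW), start=1):
    print(number, hop)
```

Hop 1 in the output is the origin-side header and the last hop is the final delivery, which is the order the report should narrate.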

The Authentication-Results header, written by the inbound MTA, summarises the SPF, DKIM and DMARC outcomes for that delivery. It is the single most useful line in the header for triage. A line that reads spf=fail dkim=fail dmarc=fail on a message claiming to come from a major Indian bank's domain is the first piece of evidence the report should quote.
Reading the header: anatomy and the bottom-up method
The header block contains audit lines and metadata lines. The audit lines (Received, Authentication-Results, X-Originating-IP, X-Sender-IP) carry the forensic content; the metadata lines (From, To, Cc, Bcc, Date, Subject, Message-ID, Return-Path, Reply-To, MIME-Version, Content-Type) describe the message itself.
| Header | Set by | Investigative use |
|---|---|---|
| Received | Each MTA on the path | Reconstruct the hop trail. Read bottom-up. |
| Return-Path | Inbound MTA from MAIL FROM (the envelope sender) | Compare against the visible From. Mismatch is a spoofing indicator. |
| From | MUA, attacker-controllable | Display name and address shown to the recipient. Not trustworthy on its own. |
| Reply-To | MUA, attacker-controllable | Where replies go. Phishing often uses Reply-To to redirect to a different mailbox. |
| Message-ID | Sending MTA | Unique identifier. Useful for de-duplication and for cross-referencing MTA logs. |
| Date | Sending MUA | Sender-claimed time. Compare against Received timestamps for clock skew or fabrication. |
| X-Originating-IP | Webmail providers (often) | Connecting client IP for webmail send. Empty or RFC1918 is a flag. |
| Authentication-Results | Inbound MTA | SPF / DKIM / DMARC outcomes summarised. |
| DKIM-Signature | Sending MTA | Cryptographic signature. The d= tag is the signing domain. |
| Content-Type / MIME-Version | Sending MUA | Structure of multipart bodies, attachments, alternative HTML/plain. |
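The Return-Path versus From comparison in the table can be sketched with `email.utils.parseaddr` from the standard library. The function names here are hypothetical, and the check is an exact domain comparison, which is stricter than real-world practice: legitimate bulk senders often use a bounce subdomain, so a hit is a lead for the examiner rather than a conclusion.

```python
from email.utils import parseaddr


def address_domain(header_value: str) -> str:
    """Extract the lower-cased domain from a From or Return-Path header value."""
    _display_name, address = parseaddr(header_value)
    return address.rpartition("@")[2].lower()


def from_return_path_mismatch(from_header: str, return_path_header: str) -> bool:
    """True when both domains are present and differ (exact comparison)."""
    from_domain = address_domain(from_header)
    return_path_domain = address_domain(return_path_header)
    return bool(from_domain and return_path_domain) and from_domain != return_path_domain


print(from_return_path_mismatch(
    "HDFC Bank <alerts@hdfc-secure-login.in>",
    "<bounce@cheap-vps-host.example>",
))
```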
The bottom-up method in five steps.
- Locate the bottom-most Received header. This is the origin server. Note its from-clause (the HELO/EHLO name the sender announced), its by-clause (the MTA that accepted it), the connecting IP and the timestamp.
- Walk upward through each subsequent Received. Each row's by-clause should match the next row's from-clause (the chain should be continuous).
- At each hop, sanity-check the timestamp. Hops should be monotonically increasing in time, with hop latencies in seconds, not hours. A negative or wildly large gap is an indicator of fabrication or open-relay abuse.
- Read the top-most Received: this is the last MTA before delivery, usually the inbound MTA for the recipient.
- Cross-reference against the Authentication-Results, the Return-Path, and the visible From. A coherent header has the Return-Path and the From in the same organisational domain, SPF pass for the connecting IP, DKIM pass for d= matching the visible From domain, and DMARC pass.
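Steps two and three of the method above can be roughly automated. The sketch below takes Received values already ordered bottom-up, parses the timestamp after the last semicolon with `email.utils.parsedate_to_datetime`, and flags backward or very large gaps plus by/from discontinuities. The function names, the one-hour threshold and the regular expressions are assumptions made for illustration; real headers vary widely (internal hostnames, HELO names that legitimately differ from the previous by-clause), so findings are triage leads only.

```python
import re
from email.utils import parsedate_to_datetime


def hop_time(received_value: str):
    """Parse the timestamp that follows the last ';' in a Received header."""
    return parsedate_to_datetime(received_value.rsplit(";", 1)[1].strip())


def hop_names(received_value: str):
    """Return (from_name, by_name) announced in a Received header, or None for each."""
    from_match = re.search(r"\bfrom\s+([^\s;()]+)", received_value)
    by_match = re.search(r"\bby\s+([^\s;()]+)", received_value)
    return (
        from_match.group(1).lower() if from_match else None,
        by_match.group(1).lower() if by_match else None,
    )


def check_chain(hops_bottom_up, max_gap_seconds=3600):
    """Return a list of (hop_index, finding) tuples for a bottom-up Received chain."""
    findings = []
    for i in range(1, len(hops_bottom_up)):
        previous, current = hops_bottom_up[i - 1], hops_bottom_up[i]
        gap = (hop_time(current) - hop_time(previous)).total_seconds()
        if gap < 0:
            findings.append((i, "timestamp goes backwards"))
        elif gap > max_gap_seconds:
            findings.append((i, "unusually large gap"))
        _, previous_by = hop_names(previous)
        current_from, _ = hop_names(current)
        if previous_by and current_from and previous_by != current_from:
            findings.append((i, "by/from discontinuity"))
    return findings
```

An empty result does not prove the chain is genuine; it only means these two mechanical checks found nothing to follow up.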
The X-Originating-IP header deserves a special note. Webmail providers (Yahoo Mail historically, several Indian webmail providers still) write the client connecting IP into X-Originating-IP. If this header carries an RFC1918 private address (10.x, 172.16-31.x, 192.168.x) the value is being reported from inside a NAT and the true source IP needs a separate subpoena to the provider. If the header is absent, the provider does not write it.
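A small classifier for the X-Originating-IP value might look like the sketch below. It checks the three RFC1918 ranges explicitly rather than relying on `ipaddress`'s broader `is_private` property, which also covers other reserved ranges. The function name and the returned labels are illustrative.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def classify_originating_ip(header_value):
    """Classify an X-Originating-IP value as absent, unparseable, rfc1918 or not-rfc1918."""
    if header_value is None or not header_value.strip():
        return "absent"
    text = header_value.strip().strip("[]")  # providers often wrap the IP in brackets
    try:
        ip = ipaddress.ip_address(text)
    except ValueError:
        return "unparseable"
    if ip.version == 4 and any(ip in network for network in RFC1918_NETWORKS):
        return "rfc1918"
    return "not-rfc1918"
```

An "rfc1918" or "absent" result is the cue to plan the provider-side request described above, since the header alone cannot yield the true source address.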
SPF, DKIM and DMARC: the authentication stack

SPF, DKIM, and DMARC answer three distinct questions about message authenticity. Each mechanism is published in DNS by the sending domain owner and verified by the receiving inbound MTA. Their failure modes do not overlap.
SPF (Sender Policy Framework) asks: is the connecting IP allowed to send for this domain? The domain owner publishes a TXT record like v=spf1 ip4:203.0.113.0/24 include:_spf.google.com -all. The receiver compares the SMTP connecting IP against the listed IPs and include-chains. A pass means the connecting IP is on the list; a fail means it is not; softfail (~all) means the domain owner is still testing. SPF aligns to the MAIL FROM (Return-Path) domain, not to the visible From.
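To illustrate the left-to-right, first-match evaluation, here is a deliberately partial sketch that understands only the `ip4`, `ip6` and `all` mechanisms of an SPF record supplied as a string. It performs no DNS lookups and does not follow `include`, `a`, `mx` or `redirect`; those terms are returned in a `skipped` list so the examiner knows the result is incomplete. The function name and return shape are assumptions, not the behaviour of any SPF library.

```python
import ipaddress

QUALIFIER_RESULT = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_ip_check(spf_record: str, connecting_ip: str):
    """Return (result, skipped_terms) using only ip4/ip6/all mechanisms."""
    ip = ipaddress.ip_address(connecting_ip)
    terms = spf_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none", []
    skipped = []
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is written
        if term[0] in QUALIFIER_RESULT:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIER_RESULT[qualifier], skipped
        elif name == "all":
            return QUALIFIER_RESULT[qualifier], skipped
        else:
            skipped.append(term)  # include, a, mx, redirect etc. are not evaluated here
    return "neutral", skipped


record = "v=spf1 ip4:203.0.113.0/24 include:_spf.google.com -all"
print(spf_ip_check(record, "203.0.113.7"))
print(spf_ip_check(record, "198.51.100.9"))
```

A "fail" accompanied by a non-empty skipped list is unverified: the connecting IP might still be authorised through one of the unevaluated terms, so a full resolver-based check is needed before the result is quoted.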
DKIM (DomainKeys Identified Mail) asks: was this message cryptographically signed by a key authorised by the domain in the d= tag? The sending MTA computes a signature over selected header fields and the body, places it in the DKIM-Signature header, and the receiver fetches the public key from DNS (selector._domainkey.signing-domain TXT record) and verifies. A pass proves the message has not been altered in flight and that the signing domain controls the signing key.
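The DKIM-Signature header is a semicolon-separated list of tag=value pairs, and the DNS name of the public key is built from the `s=` (selector) and `d=` (domain) tags. The sketch below only parses the tags and builds that name; it does not verify the signature. The sample header, including the selector `sel2024` and the truncated `bh=`/`b=` values, is invented for illustration.

```python
def parse_dkim_tags(header_value: str) -> dict:
    """Split a DKIM-Signature value into a {tag: value} dict, ignoring folding whitespace."""
    tags = {}
    for part in header_value.split(";"):
        name, separator, value = part.partition("=")
        if separator:
            tags[name.strip()] = "".join(value.split())
    return tags


def dkim_key_dns_name(tags: dict) -> str:
    """Build the DNS name whose TXT record holds the public key."""
    return f"{tags['s']}._domainkey.{tags['d']}"


# Illustrative header value; selector, domain and hashes are placeholders.
SAMPLE = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.in; s=sel2024;\r\n"
    " h=from:to:subject:date; bh=AAAA=; b=BBBB"
)

tags = parse_dkim_tags(SAMPLE)
print(tags["d"], tags["h"].split(":"), dkim_key_dns_name(tags))
```

The `h=` list shows which headers the signature covers; a header that is not listed (Reply-To is a common omission) can be altered without breaking a DKIM pass.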
DMARC (Domain-based Message Authentication, Reporting and Conformance) asks: does at least one of SPF or DKIM pass and align to the visible From domain, and if not, what should the receiver do? The domain owner publishes a TXT record like v=DMARC1; p=reject; rua=mailto:dmarc@example.in; ruf=mailto:dmarc-fo@example.in. The p= tag is the policy: none (monitor only), quarantine (deliver to spam), reject (refuse). The rua and ruf tags are the aggregate and forensic reporting addresses.
| Mechanism | Aligned to | Pass condition | Forensic value |
|---|---|---|---|
| SPF | Return-Path / MAIL FROM domain | Connecting IP listed in sender's SPF record | Strong evidence on connecting infrastructure; weak on identity (only the envelope sender domain). |
| DKIM | d= tag in DKIM-Signature header | Signature verifies against the published public key | Proves integrity in flight and key custody; the d= tag may differ from the visible From. |
| DMARC | Visible From domain | SPF or DKIM pass and align to From domain | Strong evidence on the visible From. A DMARC fail on a banking domain is the standout phishing flag. |
| Authentication-Results | (header, not a check: the receiver records all three outcomes) | Summary line; values are spf=pass/fail, dkim=pass/fail, dmarc=pass/fail | First triage line for the examiner. |
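The DMARC row of the table reduces to a short piece of logic: at least one of SPF or DKIM must both pass and align with the visible From domain. The sketch below encodes that rule. Its relaxed-alignment test is an approximation (exact match, or one domain being a subdomain of the other); a faithful implementation compares organisational domains using the Public Suffix List, which is not done here. All function names are hypothetical.

```python
def aligned(auth_domain: str, from_domain: str, strict: bool = False) -> bool:
    """Approximate DMARC alignment between an authenticated domain and the From domain."""
    a = auth_domain.lower().rstrip(".")
    f = from_domain.lower().rstrip(".")
    if a == f:
        return True
    if strict:
        return False
    # Approximation of relaxed alignment; no Public Suffix List lookup is performed.
    return a.endswith("." + f) or f.endswith("." + a)


def dmarc_outcome(from_domain, spf_result, spf_domain, dkim_result, dkim_domain):
    """Return 'pass' if SPF or DKIM both passes and aligns with From, else 'fail'."""
    spf_ok = spf_result == "pass" and aligned(spf_domain, from_domain)
    dkim_ok = dkim_result == "pass" and aligned(dkim_domain, from_domain)
    return "pass" if (spf_ok or dkim_ok) else "fail"


print(dmarc_outcome("hdfc-secure-login.in", "fail", "cheap-vps-host.example", "none", None))
```

Note the third case in practice: SPF and DKIM can both pass for an attacker's own domain and DMARC still fails, because neither aligns with the domain shown in From.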
For an Indian banking-phishing case the typical pattern in the header is From: HDFC Bank <alerts@hdfc-secure-login.in> with Return-Path: <bounce@cheap-vps-host.example> and Authentication-Results: spf=fail (sender IP not in SPF) dkim=none dmarc=fail (p=reject). That single Authentication-Results line is enough to open the spoofing finding in the report; the rest of the analysis is corroboration. CERT-In advisories under the CDB-2024-* series have repeatedly published indicator lists for HDFC, ICICI and State Bank lookalike-domain campaigns, and the I4C portal at cybercrime.gov.in is the citizen-side complaint linkage that the IO references in the FIR.
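For triage across many messages, the three outcomes can be pulled out of the Authentication-Results value with a regular expression. This is a minimal sketch; the function name is illustrative, the sample value is modelled on the pattern described above, and if a header reports several DKIM signatures only the last result is kept.

```python
import re


def parse_auth_results(header_value: str) -> dict:
    """Extract spf/dkim/dmarc outcomes from an Authentication-Results header value."""
    pattern = re.compile(r"\b(spf|dkim|dmarc)=([a-z]+)", re.IGNORECASE)
    return {m.group(1).lower(): m.group(2).lower() for m in pattern.finditer(header_value)}


# Illustrative value; the receiving host name is a placeholder.
SAMPLE_AUTH_RESULTS = (
    "mx.recipient.example; spf=fail (sender IP not in SPF) "
    "smtp.mailfrom=cheap-vps-host.example; dkim=none; "
    "dmarc=fail (p=reject) header.from=hdfc-secure-login.in"
)

print(parse_auth_results(SAMPLE_AUTH_RESULTS))
```

The parsed values are a convenience for sorting; the report should still quote the header verbatim.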
Spoofed mail and phishing artefacts: indicators and pivots
A spoofed-mail finding is a structured argument: the report quotes each indicator with the verbatim header value and connects it to the substantive offence under the IT Act or BNS.
The recurring indicator set:
- From vs Return-Path mismatch. The visible From shows a bank or HR address; the Return-Path shows a generic bounce mailbox at a different domain.
- SPF fail or softfail. The connecting IP is not listed in the SPF record for the Return-Path domain.
- DKIM none or fail. No DKIM-Signature header, or one that does not verify, or one whose d= tag is not the From domain.
- DMARC fail. Authentication-Results line carries dmarc=fail.
- Received chain discontinuity. A hop's by-clause does not match the next hop's from-clause; or a timestamp goes backwards; or a hop appears to be inside an ISP block known to host bulletproof hosting.
- X-Originating-IP RFC1918 or absent. The provider is reporting a private address (NAT) or has not recorded one at all.
- Lookalike From domain. The display name is the bank; the address domain is a one-character typo (hdfo, hdbank), a punycode homoglyph (xn--), or an unrelated TLD (.top, .icu).
- URL points to a different host. The href in the HTML body resolves to a domain different from the display text. A common form is anchor text reading https://www.hdfcbank.in/login with href set to https://hdfc-secure-login.in/login.
- Hidden tracking pixel. A 1x1 image from a third-party host that fires on open. The host is itself a pivot for the wider campaign.
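Of the indicators above, the link mismatch is the easiest to check mechanically. The sketch below uses the standard library's `html.parser` to collect each anchor's href and visible text, and reports pairs where the text looks like a URL but names a different host. Class and function names are hypothetical, and anchors whose text is not a URL (for example "Click here") are ignored.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect (href, visible_text) pairs for every <a> element with an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html_body: str):
    """Return (shown_host, actual_host) pairs where anchor text and href disagree."""
    parser = LinkCollector()
    parser.feed(html_body)
    parser.close()
    mismatches = []
    for href, text in parser.links:
        shown = urlsplit(text).hostname if "://" in text else None
        actual = urlsplit(href).hostname
        if shown and actual and shown != actual:
            mismatches.append((shown, actual))
    return mismatches
```

Each reported pair gives the examiner two pivots: the impersonated host shown to the victim and the attacker-controlled host that actually receives the click.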
The phishing-page evidence pairs with the email evidence. The mail carries the lure; the click trail and the captured page live in the browser. See our Web Browser Forensics topic for the click-side analysis; the two reports are usually annexed together in an Indian phishing charge sheet.
The phishing prosecution path in India runs through three main provisions:
- IT Act 2000 Section 66D (added by the 2008 amendment): cheating by personation by using any communication device or computer resource. The classic phishing-impersonation offence, used widely by central and state cyber-crime units. Imprisonment up to three years and a fine.
- BNS 2023 Section 318 (cheating, the successor to IPC Sections 415, 417, 418 and 420). Used in tandem with IT Act 66D when the cheating results in transfer of property.
- BNS 2023 Section 319 (cheating by personation) is a separate aggravated form, typically invoked where the impersonation is part of the operative element of the cheat.
The procedural anchor is BNSS 2023 (charge sheet under Section 193, FSL report annexure, see our BNSS topic) and the admissibility anchor is BSA 2023 Section 63 (statutory certificate for the email exhibit, see our BSA topic). The I4C portal (cybercrime.gov.in) is the citizen-complaint linkage that often originates the FIR.
On-disk evidence: PST, OST, mbox, EML, MSG and webmail subpoena
Email on disk is stored in a small set of container formats. The examiner must identify the correct format for each client and parse it without altering the original.
| Container | Used by | Holds | Parser of choice |
|---|---|---|---|
| PST | Outlook (cached or archive) | Folders, messages, attachments, calendar, contacts | libpff (open source), Aid4Mail, Outlook Forensics, AXIOM Email |
| OST | Outlook against Exchange / Microsoft 365 | Local synced copy of the Exchange mailbox | libpff (with OST support), Aid4Mail, Magnet AXIOM |
| mbox | Thunderbird, Apple Mail (legacy), Unix MUAs | Concatenated RFC 5322 messages in one file per folder | MailXaminer, Python mailbox module, AXIOM Email |
| EML | Outlook Express, generic export | One message per file | Any text editor, MailXaminer |
| MSG | Outlook native export | One message with attachments, OLE compound file | libpff, Outlook itself, AXIOM |
| Maildir | Postfix-style Unix mailbox | One file per message in cur/new/tmp | Standard text tools, MailXaminer |
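The table lists Python's standard `mailbox` module as one mbox parser. A minimal sketch of using it to inventory an mbox file is shown below; the function name and the choice of fields are illustrative. It should only ever be pointed at a working copy, never at the original evidence file.

```python
import mailbox


def list_mbox_headers(path: str):
    """Return one summary dict per message in an mbox file (run on a working copy)."""
    box = mailbox.mbox(path, create=False)  # create=False: fail if the file is missing
    try:
        rows = []
        for key, message in box.iteritems():
            rows.append({
                "key": key,
                "from": message["From"],
                "message_id": message["Message-ID"],
                "received_hops": len(message.get_all("Received", [])),
            })
        return rows
    finally:
        box.close()
```

The Message-ID column is what later gets matched against MTA logs, and the hop count is a quick way to spot messages whose Received chain is unusually short or long.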
The acquisition discipline is the same for every container: image the file with a hash, work on a copy, never write to the original. PST and OST files routinely sit in %USERPROFILE%\AppData\Local\Microsoft\Outlook on Windows; Thunderbird mbox files sit under the Mail or ImapMail subfolders of the profile directory.
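The hashing step can be sketched with `hashlib` from the standard library. The file is read in chunks so that multi-gigabyte PST or OST files do not have to fit in memory. The function names are illustrative; the paths passed in would be the original container and its working copy.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:  # read-only, binary mode
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(original_path: str, copy_path: str) -> bool:
    """True when the working copy hashes identically to the original."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)
```

The digest recorded at acquisition goes into the chain of custody; re-computing it on the working copy before and after parsing shows that the analysis did not alter the file.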
For webmail, the mailbox is the provider's, not the client's. On-disk artefacts on the client are limited to browser cache, IndexedDB, and any locally-downloaded EML or MSG files. The forensic source of truth is the provider; the legal access path is a formal production request.
- Under Section 94 BNSS (the successor to Section 91 CrPC), read with Section 63 BSA (the successor to Section 65B of the Indian Evidence Act), the investigating officer can require production of documents and electronic records. For Indian-incorporated providers this is the standard route.
- Under IT Act Section 69 the central or state government can direct interception, monitoring or decryption of any information transmitted, received or stored through a computer resource, subject to the procedure and safeguards laid down by the rules.
- For foreign providers (Google, Microsoft, Yahoo, Apple) the cooperative route runs through their published Law Enforcement Request System, in parallel with formal MLAT requests for content data.
MTA-side logs are the third evidence source and the one most often missed. Postfix maillog (/var/log/maillog or /var/log/mail.log), Sendmail's queue and syslog entries, and the Exchange / Microsoft 365 message-tracking log all carry per-message records with queue ID, source IP, recipient list, size and SPF/DKIM outcomes. On the sending side these logs identify the foothold account; on the receiving side they corroborate the headers in the message itself.
- Decide the evidentiary container before touching anything. Webmail-only victim: plan a subpoena (Section 94 BNSS, the successor to Section 91 CrPC) and a header export from the user's inbox. Outlook user: plan a PST/OST image. Thunderbird user: plan an mbox image.
- Image with hashes. Copy the PST, OST or mbox file with SHA-256 hashes. Record the hash list in the chain of custody.
- Parse on a working copy. Use libpff (pffexport), Aid4Mail, MailXaminer or Magnet AXIOM Email. Cross-validate with a second tool for any message that will be quoted in the report.
- Extract full headers. Pull out the Received chain, Return-Path, From, Reply-To, Message-ID, Date, X-Originating-IP, Authentication-Results and DKIM-Signature for each message of interest. Quote verbatim in the report.
- Run header analysis. Read bottom-up. Re-run SPF and DKIM verification with the current DNS records (note the verification date; DNS records change). MxToolbox is the quick web tool; EmailTrackerPro is the dedicated desktop tool.
- Pair with provider records. Subpoena MTA logs from the sender's provider and inbox-side delivery logs from the receiver's provider. Match queue IDs, Message-IDs and timestamps.
- Draft the Section 63 BSA certificate. Sign over the imaged container and the extracted EML / MSG exhibits. The certificate identifies the records with hashes, describes the imaging method, states that the device was working properly, and is signed by the responsible official.
The reporting toolset that most Indian state cyber-forensic units rely on includes MailXaminer for unified PST/OST/mbox/EML analysis, OST-PST Converter for conversion when a parser needs PST input, Aid4Mail for high-volume PST processing, EmailTrackerPro and eMailTracker for header analysis, MxToolbox as a quick online SPF/DKIM/DMARC tester, and Magnet AXIOM Email for integrated workflows. The Indian-context cross-link is that CERT-In phishing advisories (the CDB-2024 series and onwards) publish indicator lists that the examiner can match the case email against; the I4C portal at cybercrime.gov.in is the citizen-side complaint endpoint that often originates the FIR.