Cryptography: Symmetric vs Asymmetric, Substitution, Transposition
CIAN goals, symmetric vs asymmetric trade-offs, the key-distribution problem, classical substitution and transposition ciphers, key types, confusion and diffusion, the forensic attack model, and how CCA India and UPI bind these ideas to real Indian infrastructure.
Cryptography delivers four distinct security properties: confidentiality, integrity, authentication, and non-repudiation. Symmetric ciphers (AES, ChaCha20) use one shared key and are fast enough for bulk data; asymmetric systems (RSA, ECDH) use a public/private key pair that eliminates the key-distribution problem at the cost of speed. In practice every modern protocol combines both families: asymmetric crypto exchanges a one-time session key, then symmetric crypto encrypts all application data. Classical ciphers such as Caesar, Vigenere, and columnar transposition are foundational vocabulary for forensic examiners because they appear in CTF casework, legacy malware obfuscation, and examination syllabi.
Most digital forensics work eventually reaches a cryptographic boundary. A WhatsApp database is sealed with AES-256-GCM. A seized laptop's disk is wrapped under BitLocker's AES-XTS. An intercepted email from a fraud syndicate uses S/MIME with RSA-2048 signatures and AES-128 content encryption. An examiner without working knowledge of these primitives will produce reports that fail cross-examination: the defence expert will ask which mode, which key length, which IV, and whether the signature scheme is deterministic.
Key takeaways
- Cryptography delivers four distinct properties (confidentiality, integrity, authentication, and non-repudiation), and a forensic report that conflates them will fail cross-examination when the defence expert asks which property the algorithm was actually providing.
- Symmetric cryptography is fast and well suited to bulk data encryption, but requires both parties to share a secret in advance, which is the key distribution problem that asymmetric cryptography was designed to solve.
- Asymmetric cryptography relies on a mathematical trapdoor operation that is easy in one direction and computationally infeasible to reverse without a private key, allowing a stranger to encrypt a message to you without prior contact.
- Classical substitution and transposition ciphers remain relevant to forensic examiners because they appear in CTF-style steganography casework and in entry-level questions on the cryptographic foundations the Indian working syllabus expects.
- Most digital forensics work eventually meets a cryptographic boundary, such as a chat database sealed with AES or a disk under full-disk encryption, so examiners must understand how these ciphers work.
This topic builds the foundation for every cryptography topic in Module 7. It starts with what cryptography actually promises (confidentiality, integrity, authentication, non-repudiation), separates symmetric from asymmetric primitives by what they are good at and bad at, walks through the classical substitution and transposition ciphers an examiner still meets in CTF-style steganography casework, names the key types and lifecycle phases that show up in PKI evidence, and closes with the attack taxonomy and the Indian PKI frame. The companion topics on symmetric algorithm internals, on PKI and signatures, and on cryptanalysis and Diffie-Hellman build on this vocabulary.
By the end of this topic you will be able to:
- Identify which of the four CIAN properties (confidentiality, integrity, authentication, non-repudiation) a given cryptographic primitive provides, and explain why AES-CBC-only satisfies only one.
- Explain the key distribution problem and describe how asymmetric key pairs reduce N(N-1)/2 shared secrets to N key pairs.
- Describe the hybrid encryption pattern used by TLS 1.3, UPI, and WhatsApp, including the roles of the asymmetric and symmetric phases.
- Distinguish substitution ciphers (Caesar, Vigenere, Playfair, Hill) from transposition ciphers (rail fence, columnar) and identify the attack that breaks each.
- Define confusion, diffusion, and the avalanche effect as Shannon design criteria, and explain how forward secrecy (ECDHE) limits the damage of a long-term key compromise in intercepted-traffic cases.
- CIAN. Confidentiality, Integrity, Authentication, Non-repudiation. The four properties a cryptosystem can deliver, usually not all at once with a single primitive. Bulk encryption gives confidentiality; a MAC gives integrity and authentication; a digital signature adds non-repudiation.
- Symmetric key. One shared secret used to both encrypt and decrypt. Fast (AES-128 on AES-NI runs at multiple GB/s) but solves nothing if the two parties have no prior secure channel to exchange the key.
- Asymmetric key pair. Mathematically linked public and private keys. The public key encrypts or verifies, the private key decrypts or signs. Several orders of magnitude slower per byte than symmetric primitives, but no shared secret is needed in advance.
- Hybrid encryption. The pattern every real protocol uses: asymmetric crypto exchanges a fresh symmetric session key, then the symmetric primitive does the heavy lifting on the actual data. TLS 1.3, S/MIME, WhatsApp pairwise channels and Aadhaar API requests all use this shape.
- Confusion and diffusion. Shannon's 1949 design principles. Confusion makes the relationship between key and ciphertext as complex as possible; diffusion spreads each plaintext bit's influence across many ciphertext bits. Every modern block cipher is engineered around these two ideas.
- Perfect Forward Secrecy. A property where compromise of a long-term key does not retroactively break past sessions, because each session used an ephemeral key (typically ECDHE) that was discarded after use. TLS 1.3 mandates PFS.
What cryptography is actually for: the CIAN model
Cryptography is not "making things secret." Secrecy is one of four properties, and confusing them in a forensic report is how an expert witness gets discredited.
- Confidentiality. Only the intended recipient can read the plaintext. Achieved with encryption: symmetric (AES, ChaCha20) for bulk data, asymmetric (RSA, ECIES) for small payloads or for wrapping symmetric keys.
- Integrity. The message has not been modified between sender and receiver. Achieved with a Message Authentication Code (HMAC-SHA-256, KMAC) or an AEAD mode like AES-GCM that fuses encryption and integrity into one step.
- Authentication. The sender is who they claim to be. A MAC gives message-origin authentication between the two parties that hold the shared key. A digital signature gives the stronger third-party-verifiable form.
- Non-repudiation. The sender cannot later credibly deny having sent the message. Only digital signatures backed by a trusted PKI give this property; a MAC does not, because either party could have produced it.
The reason these are kept separate is that a single primitive rarely covers all four. Encrypting a file with AES-CBC gives confidentiality and nothing else: an attacker who flips a bit in one ciphertext block flips the corresponding bit in the next plaintext block (and turns the 16-byte block that was modified into garbage on decryption), and the recipient cannot tell. Padding-oracle attacks against AES-CBC-only systems exploit precisely this gap. Wrapping the same payload in AES-GCM gives confidentiality plus integrity plus authentication; adding an RSA or Ed25519 signature on top gives non-repudiation.
The forensic implication is simple. When you report that "the data was encrypted," the next question is always "and was its integrity also protected, and was the sender authenticated?" If the system used a confidentiality-only mode like AES-CBC without a MAC, the chain of evidence has a hole the defence can drive through.
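As a minimal sketch of the integrity and authentication properties described above, the following uses HMAC-SHA-256 from the Python standard library. The key, the message text, and the helper names `make_tag` and `verify_tag` are illustrative assumptions, not taken from any system discussed here.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical shared secret held by both sender and receiver.
shared_key = secrets.token_bytes(32)

message = b"transfer 500 INR to account 1234"
tag = make_tag(shared_key, message)

# An attacker who alters the message cannot produce a matching tag without the key.
tampered = b"transfer 900 INR to account 1234"

print(verify_tag(shared_key, message, tag))   # original message verifies
print(verify_tag(shared_key, tampered, tag))  # altered message is rejected
```

Because both parties hold `shared_key`, either of them could have produced `tag`. That is why a MAC supports integrity and authentication but not non-repudiation.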
Symmetric vs asymmetric: speed, scale and the key distribution problem
The two families of cryptography exist because each addresses a different problem.

| Property | Symmetric | Asymmetric |
|---|---|---|
| Keys | One shared secret per pair | Public key (open) + private key (secret) per party |
| Throughput | AES-NI: 5 to 10 GB/s on a modern CPU core | RSA-2048: ~1000 sign/s, ~25000 verify/s per core |
| Key size for ~128-bit security | 128-bit AES | 3072-bit RSA or 256-bit ECC |
| Use case | Bulk data encryption, full-disk encryption, session traffic | Key exchange, digital signatures, certificate issuance |
| Key distribution | Hard: needs a prior secure channel for the shared key | Easy: the public key can be published openly |
| Scaling for N parties | N(N-1)/2 keys for pairwise | N key pairs total |
The reason symmetric ciphers cannot stand alone in an open network is the key distribution problem. If Alice and Bob want to share a symmetric key, they need a prior secure channel to exchange it. The internet is not a prior secure channel. For 1000 parties wanting pairwise encryption, the symmetric approach demands 1000 times 999 over 2 (which is 499,500) distinct keys, each distributed out of band. The arithmetic gets unmanageable past a handful of parties.
Asymmetric crypto solves this. Each party generates one key pair, publishes the public half (in a certificate signed by a trusted CA) and keeps the private half. Any other party can encrypt to them or verify their signature using only the public key. The total key material to manage is N pairs, not N(N-1)/2 secrets.
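The scaling argument can be checked with a few lines of arithmetic. This is only an illustration of the N(N-1)/2 versus N comparison; the function names are made up for the example.

```python
def pairwise_symmetric_keys(n: int) -> int:
    """Distinct shared secrets needed so every pair of n parties has its own key."""
    return n * (n - 1) // 2


def asymmetric_key_pairs(n: int) -> int:
    """Key pairs needed when each party publishes one public key."""
    return n


for n in (2, 10, 1000):
    print(n, pairwise_symmetric_keys(n), asymmetric_key_pairs(n))
```

For 1000 parties this gives 499,500 pairwise secrets against 1000 key pairs, matching the figure quoted above.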
The cost is speed. On the figures in the table above, an RSA-2048 private-key operation processes under 256 bytes roughly a thousand times per second per core, while AES moves gigabytes per second, so the per-byte gap is four or more orders of magnitude. ECC is faster than RSA at equivalent security but still orders of magnitude slower than symmetric work. So no real-world system uses asymmetric crypto to encrypt bulk data. Every modern protocol is a hybrid system:
- Asymmetric phase. Use RSA, ECDH or ECDHE to either encrypt or agree on a fresh symmetric session key. This step happens once at the start of a session.
- Optional signature step. If non-repudiation matters, the asymmetric phase also produces a digital signature over the handshake transcript, binding the session to a specific certified identity.
- Symmetric phase. Use AES-GCM or ChaCha20-Poly1305 with the session key to encrypt and authenticate every byte of the actual application data. This is where 99.9% of the CPU time goes.
- Session teardown. Discard the session key. Under PFS, even a later compromise of the long-term key cannot decrypt the captured session traffic, because the session key was ephemeral.
The TLS 1.3 handshake is the canonical instance: ECDHE for key exchange, an RSA or ECDSA signature on the handshake, and AES-GCM or ChaCha20-Poly1305 for the record layer. UPI, BHIM, and the Aadhaar API stack all follow this hybrid pattern. For Indian casework, when a witness statement says "the traffic was encrypted with RSA-2048," the technically accurate version is almost always "RSA-2048 wrapped a one-time AES-256 session key, and the actual payload used AES-256."
Classical substitution and transposition ciphers
Modern cryptography as a formal discipline dates to Shannon's 1949 work; classical cryptography covers the preceding four millennia. Forensic examiners encounter classical ciphers in capture-the-flag challenges (NFSU and CDAC training pipelines draw on them), low-effort obfuscation in malware configuration blobs, and historical case material. Understanding how they fail under frequency analysis is part of the standard working syllabus.
Substitution ciphers replace each plaintext symbol with a different ciphertext symbol according to some rule.
- Caesar cipher. Shift each letter by a fixed integer k. Key space is 25. Brute force takes microseconds. Still appears in malware as a low-rent obfuscation of strings.
- Monoalphabetic substitution. Each plaintext letter maps to a fixed ciphertext letter by an arbitrary permutation. Key space is 26 factorial (about 4 times 10^26) which sounds large but collapses under frequency analysis. The al-Kindi method (9th century Arab polymath) counts ciphertext letter frequencies and matches them to known plaintext-language frequencies; English plaintexts give E, T, A, O, I, N as the top letters and the substitution falls in minutes by hand.
- Polyalphabetic substitution (Vigenere). Uses a keyword of length m to apply m different Caesar shifts in rotation. Defeats single-letter frequency analysis. Defeated in turn by the Kasiski examination (find repeated ciphertext substrings, infer m from their spacing) plus per-position frequency analysis.
- Playfair. A 5x5 grid of letters keyed by a passphrase; encrypts digraphs (pairs of letters). Used by British forces through WW1 and WW2 for tactical traffic. Falls to digraph frequency analysis and known-plaintext attacks.
- Hill cipher. Treats blocks of m letters as vectors over Z/26 and multiplies by an invertible m by m key matrix. The first cipher with proper diffusion across letters. Falls instantly to a known-plaintext attack: with m plaintext-ciphertext block pairs whose plaintext blocks form a matrix invertible mod 26, the key matrix is recovered directly by linear algebra.
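The Caesar and frequency-analysis points in the list above can be sketched as follows. This is a toy that handles lowercase English letters only; the function names are illustrative.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def caesar(text: str, shift: int) -> str:
    """Shift each lowercase letter by `shift` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext: str) -> dict:
    """Try all 25 non-trivial shifts and return {shift: candidate plaintext}."""
    return {k: caesar(ciphertext, -k) for k in range(1, 26)}


def top_letters(text: str, n: int = 6) -> list:
    """Most frequent letters, the starting point for frequency analysis."""
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    return [letter for letter, _ in counts.most_common(n)]


ciphertext = caesar("attack at dawn", 3)
print(ciphertext)                  # dwwdfn dw gdzq
print(brute_force(ciphertext)[3])  # attack at dawn
```

An examiner would normally scan the 25 candidates by eye or score them against expected letter frequencies. For a general monoalphabetic substitution, `top_letters` gives the counts that are matched against E, T, A, O, I, N.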
Transposition ciphers keep the symbols and rearrange their order. Frequency of individual letters is unchanged, so single-letter frequency analysis fails. Bigram and trigram analysis still work.

- Rail fence. Write the plaintext in a zigzag across n rails, then read off row by row. Key is the number of rails. Trivially brute-forceable.
- Columnar transposition. Write the plaintext in rows of fixed width, then read columns in a key-defined order. Columnar transposition was a building block of WW1 German field ciphers such as ADFGVX. Defeated by anagramming once the column count is known or guessed.
- Double transposition. Apply columnar transposition twice with different keys. Significantly harder; was actually used by some WW2 resistance circuits.
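A minimal sketch of single columnar transposition as described in the list above, using an unpadded (irregular) grid and reading columns in alphabetical order of the key letters. The keyword and message are the usual textbook example, not casework material.

```python
def column_order(key: str) -> list:
    """Column indices in read-out order: alphabetical by key letter, ties by position."""
    return sorted(range(len(key)), key=lambda i: (key[i], i))


def encrypt(plaintext: str, key: str) -> str:
    width = len(key)
    # Column c holds every width-th character starting at offset c.
    return "".join(plaintext[c::width] for c in column_order(key))


def decrypt(ciphertext: str, key: str) -> str:
    width = len(key)
    full_rows, extra = divmod(len(ciphertext), width)
    columns = [""] * width
    pos = 0
    for c in column_order(key):
        # In an unpadded grid the first `extra` columns are one character longer.
        length = full_rows + (1 if c < extra else 0)
        columns[c] = ciphertext[pos:pos + length]
        pos += length
    rows = full_rows + (1 if extra else 0)
    # Read the rebuilt grid row by row, skipping the missing cells in the last row.
    return "".join(
        columns[c][r] for r in range(rows) for c in range(width) if r < len(columns[c])
    )


plaintext = "WEAREDISCOVEREDFLEEATONCE"
ciphertext = encrypt(plaintext, "ZEBRAS")
print(ciphertext)                     # EVLNACDTESEAROFODEECWIREE
print(decrypt(ciphertext, "ZEBRAS"))  # WEAREDISCOVEREDFLEEATONCE
```

The ciphertext contains exactly the same letters as the plaintext, which is why single-letter frequency counts look like ordinary English and point to transposition rather than substitution.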
The one-time pad sits in a category by itself. The key is a random bitstring at least as long as the message, used once and then destroyed. If those conditions hold, Shannon proved in 1949 that the ciphertext is information-theoretically secure: a brute force over all keys produces every possible plaintext of that length, so the ciphertext reveals nothing beyond its length. The pad is unbreakable. It is also unusable at scale, because the key-management problem is exactly the message-distribution problem it was supposed to solve. The Indian PKI ecosystem under CCA does not use one-time pads; it uses computationally secure primitives because the operational tradeoff is overwhelmingly in their favour.
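The "every plaintext is possible" argument can be made concrete with XOR. In this sketch the two messages are invented for illustration: for any decoy plaintext of the same length, there is a pad under which the intercepted ciphertext decrypts to that decoy.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("one-time pad key must match the message length")
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"MEET AT NOON"
pad = secrets.token_bytes(len(plaintext))  # random, used once, then destroyed
ciphertext = xor_bytes(plaintext, pad)

# A different plaintext of the same length is equally consistent with the ciphertext:
decoy = b"FLEE AT ONCE"
decoy_pad = xor_bytes(ciphertext, decoy)

print(xor_bytes(ciphertext, pad))        # the real message
print(xor_bytes(ciphertext, decoy_pad))  # the decoy, under a different pad
```

Without the pad, an examiner has no way to prefer one candidate over the other, so the ciphertext reveals only the message length.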
Key types, key derivation and key lifecycle
A forensic examiner reviewing a system's key inventory requires precise vocabulary for what each entry represents.
| Key type | Asymmetry | Lifetime | Typical use |
|---|---|---|---|
| Public key | Half of a pair | Long (years, certificate-bound) | Encrypt to recipient; verify signature |
| Private key | Half of a pair | Long (years, must be guarded) | Decrypt to self; sign on own behalf |
| Symmetric key (shared secret) | Same for both parties | Variable | Bulk encryption between two endpoints |
| Session key (ephemeral) | Symmetric, derived per session | Short (minutes to hours) | TLS record layer, IPsec SA, WhatsApp double-ratchet message keys |
| Master key | Either family | Long; not used directly | Input to a KDF that produces working keys |
| Derived key | Symmetric, KDF output | Bound to master and context | Per-purpose keys from one master via HKDF / PBKDF2 / Argon2 |
The three key-derivation function families an examiner sees regularly:
- PBKDF2 (Password-Based Key Derivation Function 2). Salt plus password plus iteration count fed through HMAC-SHA-256 (or SHA-1 in legacy systems) for typically 100,000 to 1,000,000 iterations. Used by WPA2-PSK, LUKS, older BitLocker, iOS Data Protection through iOS 9. Iteration count is the only knob against brute force; weak counts (under 10,000) get broken on cheap hardware.
- HKDF (HMAC-based Key Derivation Function, RFC 5869). Two stages, extract then expand. Used by TLS 1.3, Signal protocol, WhatsApp's double ratchet, modern Noise-protocol implementations. Designed for already-random inputs (a Diffie-Hellman output) rather than low-entropy passwords; iteration count is one, by design.
- Argon2id. A variant of Argon2, the winner of the 2015 Password Hashing Competition. Memory-hard (deliberately expensive in RAM as well as CPU) so GPU and ASIC attackers gain less leverage. Used by modern password managers, recent VeraCrypt versions, and most new wallet software.
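The difference between a password-stretching KDF and HKDF can be sketched with the standard library alone. `hashlib.pbkdf2_hmac` is the stdlib PBKDF2 call. The HKDF functions below are a hand-written illustration of the RFC 5869 extract-then-expand structure with SHA-256, not a vetted implementation. The password, the labels, and the iteration count are assumptions for the example, and Argon2id is omitted because it is not in the standard library.

```python
import hashlib
import hmac
import secrets

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def pbkdf2_key(password: str, salt: bytes, iterations: int = 100_000, length: int = 32) -> bytes:
    """Stretch a low-entropy password; the iteration count is the brute-force knob."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations, dklen=length)


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF stage 1: concentrate the input keying material into a pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF stage 2: expand the pseudorandom key into context-bound output."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF-SHA-256")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Password path: random salt plus many iterations.
salt = secrets.token_bytes(16)
disk_key = pbkdf2_key("correct horse battery staple", salt)

# High-entropy path: one master secret, separate keys per purpose via the info label.
master_secret = secrets.token_bytes(32)  # stands in for a Diffie-Hellman output
prk = hkdf_extract(b"", master_secret)
enc_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"authentication", 32)
```

The `info` label is what binds a derived key to its context: the same master secret yields unrelated keys for encryption and authentication.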
The key lifecycle has five phases:
- Generation. Source the key from a cryptographically strong RNG (Intel RDRAND with reseeding, /dev/urandom on a properly seeded Linux box, a hardware HSM). Weak entropy at generation is the root cause of more PKI failures than every other phase combined.
- Distribution. For asymmetric keys, publish the public half in a CA-signed certificate; protect the private half on the originating device. For symmetric keys, use a key wrapping scheme (RSA-OAEP, ECDH-ES) or a pre-shared-key arrangement under an HSM.
- Storage. Private keys live in a TPM, HSM, secure enclave (Apple Secure Enclave, Android StrongBox, Intel SGX), or at minimum a password-protected PKCS#12 file. Never in plain text on disk if the threat model allows compromise.
- Rotation. Replace keys on a defined schedule (annually for CA signing keys, hours-to-days for TLS session keys, per-message for double-ratchet message keys). Rotation limits damage if a key is later found to be compromised.
- Destruction. Zeroise the key material from memory after use, and from storage when the key is retired. Audit logs record destruction time so a later investigation can prove a key was no longer in service when an event happened.
CCA India's licensing of certifying authorities (e-Mudhra, Sify, NSDL e-Governance, IDRBT, Capricorn, (n)Code Solutions, NIC) mandates audited HSM-based key generation and storage for all class-3 signing keys used on Indian Aadhaar-eSign and DSC infrastructure. The audit trail follows exactly the five-phase lifecycle above and is part of what makes an Indian DSC admissible under IT Act 2000 Section 5.
Design properties: confusion, diffusion, avalanche, forward secrecy
Three structural properties define a well-designed cipher.
- Confusion. The relationship between the key and the ciphertext should be as complex as possible. A single key bit, changed, should affect ciphertext bits in a way that does not look like any simple function. Confusion is delivered in modern ciphers mostly by S-boxes (small lookup tables that perform a nonlinear substitution, like AES's SubBytes, which maps 8 input bits to 8 output bits).
- Diffusion. Each plaintext bit's influence should spread across many ciphertext bits. A single plaintext bit flipped should change roughly half the ciphertext bits. Diffusion comes from linear mixing layers like AES's ShiftRows and MixColumns, or from the bit permutation and half-block mixing of the Feistel rounds in DES/Blowfish.
- Avalanche effect. The empirical metric for diffusion. A good block cipher exhibits a 50% bit-flip rate in the ciphertext when one input bit (plaintext or key) is changed. AES achieves full avalanche within 2 to 3 rounds; the 10 to 14 rounds give substantial security margin.
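The avalanche metric above can be measured empirically. The standard library has no AES, so this sketch substitutes SHA-256, which is a hash function rather than a block cipher but is designed to show the same roughly 50% bit-flip behaviour. The input string is arbitrary.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def avalanche_rate(data: bytes) -> float:
    """Average fraction of output bits that change when one input bit is flipped."""
    base = hashlib.sha256(data).digest()
    total_bits = len(data) * 8
    changed = sum(
        bit_difference(base, hashlib.sha256(flip_bit(data, i)).digest())
        for i in range(total_bits)
    )
    return changed / (total_bits * len(base) * 8)


print(round(avalanche_rate(b"forensic evidence block"), 3))
```

A result close to 0.5 is what a well-designed primitive should show. A custom cipher recovered from malware that scores far from 0.5 under the same test has weak diffusion.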
Strong vs weak keys. Some keys produce statistically inferior ciphertext for structural reasons in the algorithm. DES famously has four "weak keys" where encryption equals decryption (the round subkeys end up identical), plus 12 semi-weak keys with similar pathologies. AES has no known weak keys. Examiner takeaway: when reviewing a custom or legacy cipher in malware analysis, the absence of a documented weak-key analysis is a red flag.
Forward secrecy. A property of a key-exchange protocol, not of a cipher. A protocol has forward secrecy (also called Perfect Forward Secrecy when the ephemeral exchange is on a fresh group element per session) if compromise of a long-term key today does not allow decryption of past captured sessions. The mechanism is to use ephemeral Diffie-Hellman or ECDH for session-key agreement and discard the ephemeral key immediately after use. The long-term key signs the ephemeral exchange but does not decrypt the data, so its later capture cannot retroactively unlock anything.
The forensic implication for intercepted-traffic cases under Indian Telegraph Act warrants: if the traffic was captured under TLS 1.2 without PFS (RSA key exchange), the server's private RSA key, if later obtained, retroactively decrypts the entire session. If the same traffic was captured under TLS 1.3 (which mandates ECDHE), the server's long-term key is useless for retrospective decryption. The same applies to WhatsApp's double ratchet, where each message has its own derived key that is deleted after use, so compromise of an endpoint's current keys does not expose the keys of earlier messages.
The cryptanalytic attack taxonomy
Cryptanalysis models are classified by what the attacker is assumed to know and control, ordered from most to least difficult for the attacker:
| Attack model | Attacker capability | Hardness for attacker |
|---|---|---|
| Ciphertext-only (COA) | Sees only ciphertexts | Hardest |
| Known-plaintext (KPA) | Holds matched plaintext / ciphertext pairs | Harder |
| Chosen-plaintext (CPA) | Can submit plaintexts to an encryption oracle | Medium |
| Chosen-ciphertext (CCA1) | Can submit ciphertexts to a decryption oracle, before seeing the target | Easier |
| Adaptive chosen-ciphertext (CCA2) | Can submit ciphertexts adaptively, even after seeing the target | Easiest |
| Side-channel | Observes timing, power, EM, cache, sound from a real implementation | Sometimes trivial |
Additional attack categories relevant to forensic examiners:
- Brute force (exhaustive key search). Try every possible key. AES-128 has 2^128 keys; even at 10^18 attempts per second (well beyond what any 2026 attacker can sustain), the expected time exceeds the age of the universe. AES-256 is overkill against any non-quantum attacker. 56-bit DES has 2^56 keys and was brute-forced in 56 hours by the EFF DES Cracker in 1998 (DES Challenge II) for under $250,000; a later 1999 collaboration with distributed.net cracked it in 22 hours. Key length is the primary brute-force defence.
- Replay attack. Capture a valid encrypted message and resend it later. Defeated by including a fresh nonce, sequence number or timestamp under the integrity tag. UPI transactions defeat replay with a per-transaction timestamp plus the bank's response idempotency token.
- Side-channel attacks. Real implementations leak information through side channels: timing (key-dependent branches in RSA), power consumption (DPA against smartcards), cache behaviour (Flush+Reload against AES T-tables), electromagnetic emissions (TEMPEST), and acoustics (keyboard typing or CPU coil whine). Mitigations are constant-time implementations, masking, and shielding. AES-NI hardware instructions are constant-time by construction and replaced T-table software AES in production for exactly this reason.
- Differential and linear cryptanalysis. Statistical attacks that exploit non-uniform input/output distributions of the cipher's rounds. Detailed in Cryptanalysis, Cryptographic Attacks and Diffie-Hellman. DES's S-boxes were designed (in part by IBM with NSA review) to resist differential cryptanalysis 15 years before the academic community rediscovered it.
- Key management attacks. The cipher is fine; the keys are stolen. Pre-shared keys checked into Git repositories, private keys with predictable filenames in user home directories, RAM-resident keys captured from a hibernation file. The forensic recovery topic in Data Recovery, File Carving and Recovering Deleted, Hidden & Encrypted Content is largely about exploiting key management failures, not cipher math.
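The brute-force figures in the list above can be checked with a short calculation. The AES rate of 10^18 attempts per second comes from the text. The age of the universe (about 1.38 x 10^10 years) and the 10^11 keys per second used for DES are assumptions chosen for illustration, not measured figures.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate, assumed for comparison


def expected_search_years(key_bits: int, attempts_per_second: float) -> float:
    """Expected exhaustive-search time: on average half the key space is tried."""
    expected_attempts = 2 ** (key_bits - 1)
    return expected_attempts / attempts_per_second / SECONDS_PER_YEAR


aes128_years = expected_search_years(128, 1e18)
des_hours = expected_search_years(56, 1e11) * 365.25 * 24  # hypothetical 10^11 keys/s

print(f"AES-128 at 1e18/s: {aes128_years:.2e} years")
print(f"DES at 1e11/s: {des_hours:.0f} hours")
```

Under these assumptions AES-128 needs on the order of 10^12 years, a few hundred times the age of the universe, while 56-bit DES falls in about four days. That is the same order of magnitude as the 56-hour and 22-hour results quoted above.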
For Indian PKI specifically, NCRB's 2024 hash-evidence guidelines treat the cryptographic primitives as out of scope for cross-examination (SHA-256 and AES-256 are accepted as forensically sound) and concentrate the examiner's testimony on key-handling, audit trail and chain-of-custody questions, which is where real cases actually fail.
Diffie-Hellman, ElGamal and the Indian PKI hook
A full treatment of Diffie-Hellman and its attacks is in Cryptanalysis, Cryptographic Attacks and Diffie-Hellman. The summary an examiner needs at this level:
Diffie-Hellman key exchange (1976). Alice and Bob agree on a prime p and a generator g, public to all. Alice picks a secret integer a and publishes A = g^a mod p. Bob picks a secret b and publishes B = g^b mod p. Each computes the shared secret as B^a mod p = A^b mod p = g^(ab) mod p. An eavesdropper sees A and B but cannot extract a or b without solving the discrete logarithm problem, which is computationally infeasible for properly sized p (2048 bits or more) or for elliptic-curve variants at 256-bit. ECDH (over Curve25519, P-256, P-384) is the modern variant used in TLS 1.3, Signal, WireGuard and the WhatsApp double ratchet.
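The exchange described above maps directly onto Python's three-argument `pow`. This is a toy under stated assumptions: the 127-bit Mersenne prime and the base 3 are picked only because they are easy to write down. The prime is far too small for real use, and 3 has not been checked to generate the whole group.

```python
import secrets

P = 2**127 - 1  # toy prime modulus (assumption for illustration only)
G = 3           # arbitrary public base (assumption)


def keypair():
    """Pick a secret exponent in [2, P-2] and compute the public value G^secret mod P."""
    private = secrets.randbelow(P - 3) + 2
    public = pow(G, private, P)
    return private, public


a, A = keypair()  # Alice keeps a and publishes A
b, B = keypair()  # Bob keeps b and publishes B

alice_secret = pow(B, a, P)  # B^a mod P
bob_secret = pow(A, b, P)    # A^b mod P

print(alice_secret == bob_secret)  # both sides arrive at g^(ab) mod P
```

An eavesdropper sees only `P`, `G`, `A` and `B`. In a real protocol the shared value would then be passed through a KDF to produce the session key, and fresh `a` and `b` per session are what give forward secrecy.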
ElGamal signature scheme (1985). A signature scheme based on the discrete-log problem. Unlike RSA signatures with deterministic padding (PKCS#1 v1.5), ElGamal signatures include a fresh random value per signature, so the same message signed twice produces two different signatures. This randomness is the source of both ElGamal's strength and its operational fragility: if the random value is reused across two signatures, the private key can be extracted in closed form. DSA (Digital Signature Algorithm) is a variant; ECDSA is its elliptic-curve cousin. Bitcoin's notorious early-wallet failures and the 2010 Sony PS3 ECDSA private-key extraction both came from random-value reuse.
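The closed-form key extraction can be shown on a toy DSA-style scheme, the ElGamal variant named above. Every value here is a hypothetical teaching value: the group is tiny (p = 607, q = 101), and the digests are given directly as small integers instead of being computed with a real hash. With signatures (r, s1) and (r, s2) made under the same nonce k, the relations s = k^-1 (h + x r) mod q give k = (h1 - h2) / (s1 - s2) and then x = (s1 k - h1) / r.

```python
# Toy parameters (assumptions, far too small for real use).
P = 607  # prime modulus
Q = 101  # prime subgroup order; Q divides P - 1
G = 64   # 2 ** ((P - 1) // Q) % P, an element of order Q


def sign(h: int, x: int, k: int):
    """Sign digest h with private key x and per-signature nonce k."""
    r = pow(G, k, P) % Q
    s = pow(k, -1, Q) * (h + x * r) % Q
    return r, s


def recover_private_key(h1: int, sig1, h2: int, sig2) -> int:
    """Recover x from two signatures that reused the same nonce."""
    (r1, s1), (r2, s2) = sig1, sig2
    if r1 != r2:
        raise ValueError("signatures do not share a nonce")
    k = (h1 - h2) * pow((s1 - s2) % Q, -1, Q) % Q
    return (s1 * k - h1) * pow(r1, -1, Q) % Q


private_key = 57       # hypothetical signer secret
reused_nonce = 7       # the mistake: same k for both signatures
h1, h2 = 10, 25        # hypothetical message digests, already reduced mod Q

sig1 = sign(h1, private_key, reused_nonce)
sig2 = sign(h2, private_key, reused_nonce)

print(recover_private_key(h1, sig1, h2, sig2))  # 57
```

The tell-tale sign in evidence is two signatures from the same key with an identical `r` component, which is how random-value reuse is usually spotted.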
The Indian PKI ecosystem under CCA India binds these primitives to real infrastructure:
- CCA (the Controller of Certifying Authorities) under MeitY licenses Indian CAs (e-Mudhra, Sify, NSDL e-Gov, IDRBT, NIC, (n)Code, Capricorn, IndiaPKI). Each licensed CA issues Class 1, Class 2, Class 3 and DGFT-class Digital Signature Certificates and operates audited HSM-backed root keys.
- Aadhaar eSign uses one-time-use DSCs bound to Aadhaar authentication; the eSign API issues a signing key for that single sign event and destroys it after use, with the audit logged at the eSign service provider and verifiable by CCA.
- UPI's cryptographic backbone combines AES-256 (data confidentiality), RSA-2048 (initial NPCI-issued certificate-based authentication of payer/payee PSPs) and ECDH (session-key agreement for the actual payment leg). Every UPI transaction is a hybrid construction of the kind described in the symmetric vs asymmetric section above.
- RBI's 2023 PoS encryption mandate requires AES-based encryption on all card-present terminals; the underlying primitives are AES-256-CBC with HMAC-SHA-256 or AES-128-GCM, depending on the terminal vendor.
A forensic examiner reading an Indian eSign audit log or a UPI dispute file needs to translate "RSA-2048 signature over the AES-256 wrapped session key" into the language of the CIAN checklist set out at the start of this topic, prove each property held at the right phase, and identify which key in the lifecycle was used at each step. This is the foundation; Symmetric Cryptosystems: DES, AES, RC4 and Blowfish and Asymmetric Cryptosystems, Hashing, PKI and Digital Signatures build on it.
An Aadhaar API request encrypts the PID block with a fresh AES-256-GCM session key, wraps that session key under UIDAI's RSA-2048 public key, and signs the whole request with the AUA's signing certificate. Which CIAN property is NOT directly provided by this composition?
Frequently asked questions
What does CIAN actually stand for and why does it matter in a forensic report?
Why don't real-world systems just use asymmetric encryption for everything?
Is a one-time pad really unbreakable?
What is the key distribution problem and how does asymmetric crypto solve it?
What is Perfect Forward Secrecy and why does it matter for intercepted-traffic cases?
What is CCA India and what does it do for Indian PKI?
How does the UPI cryptographic backbone fit the symmetric / asymmetric / hybrid pattern?