Cryptography Fundamentals: Symmetric vs Asymmetric, Substitution and Transposition

Most digital forensics work eventually hits a cryptographic boundary. A WhatsApp database is sealed with AES-256-GCM. A seized laptop's disk is wrapped under BitLocker's AES-XTS. An intercepted email between a fraud syndicate uses S/MIME with RSA-2048 signatures and AES-128 content encryption. An examiner who treats crypto as a black box will produce reports that fall apart on cross-examination, because the defence's expert will ask which mode, which key length, which IV, and whether the signature scheme is deterministic.

This topic builds the foundation for every cryptography topic in Module 7. It starts with what cryptography actually promises (confidentiality, integrity, authentication, non-repudiation), separates symmetric from asymmetric primitives by what they are good at and bad at, walks through the classical substitution and transposition ciphers an examiner still meets in CTF-style steganography casework, names the key types and lifecycle phases that show up in PKI evidence, and closes with the attack taxonomy and the Indian PKI frame. The companion topics on symmetric algorithm internals, on PKI and signatures, and on cryptanalysis and Diffie-Hellman build on this vocabulary.

Key terms

CIAN: Confidentiality, Integrity, Authentication, Non-repudiation. The four properties a cryptosystem can deliver, usually not all at once with a single primitive. Bulk encryption gives confidentiality; a MAC gives integrity and authentication; a digital signature adds non-repudiation.
Symmetric key: One shared secret used to both encrypt and decrypt. Fast (AES-128 on AES-NI runs at multiple GB/s) but solves nothing if the two parties have no prior secure channel to exchange the key.
Asymmetric key pair: Mathematically linked public and private keys. The public key encrypts or verifies; the private key decrypts or signs. Orders of magnitude slower per byte than symmetric primitives, but no shared secret is needed in advance.
Hybrid encryption: The pattern every real protocol uses: asymmetric crypto exchanges a fresh symmetric session key, then the symmetric primitive does the heavy lifting on the actual data. TLS 1.3, S/MIME, WhatsApp pairwise channels and Aadhaar API requests all use this shape.
Confusion and diffusion: Shannon's 1949 design principles. Confusion makes the relationship between key and ciphertext as complex as possible; diffusion spreads each plaintext bit's influence across many ciphertext bits. Every modern block cipher is engineered around these two ideas.
Perfect Forward Secrecy: A property where compromise of a long-term key does not retroactively break past sessions, because each session used an ephemeral key (typically ECDHE) that was discarded after use. TLS 1.3 mandates PFS.

What cryptography is actually for: the CIAN model

Four properties. Pick the ones the case needs and check that the algorithm delivers them.

Cryptography is not "making things secret." Secrecy is one of four properties, and confusing them in a forensic report is how an expert witness gets discredited.

Confidentiality. Only the intended recipient can read the plaintext. Achieved with encryption: symmetric (AES, ChaCha20) for bulk data, asymmetric (RSA, ECIES) for small payloads or for wrapping symmetric keys.
Integrity. The message has not been modified between sender and receiver. Achieved with a Message Authentication Code (HMAC-SHA-256, KMAC) or an AEAD mode like AES-GCM that fuses encryption and integrity into one step.
Authentication. The sender is who they claim to be. A MAC authenticates the message origin to whoever holds the shared key, so it convinces the two parties but nobody else. A digital signature gives the stronger third-party-verifiable form.
Non-repudiation. The sender cannot later credibly deny having sent the message. Only digital signatures backed by a trusted PKI give this property; a MAC does not, because either party could have produced it.

The reason these are kept separate is that a single primitive rarely covers all four. Encrypting a file with AES-CBC gives confidentiality and nothing else: an attacker who flips a bit in one ciphertext block flips the same bit in the next plaintext block (at the cost of turning the current 16-byte block into garbage), and the recipient cannot tell. Padding-oracle attacks against AES-CBC-only systems exploit the same missing integrity check. Wrapping the same payload in AES-GCM gives confidentiality plus integrity plus authentication; adding an RSA or Ed25519 signature on top gives non-repudiation.
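The integrity gap described above is what a MAC closes. The following is a minimal sketch using only Python's standard library; the key, the 16-byte blob standing in for a ciphertext, and the function names are illustrative assumptions, not taken from any particular product.

```python
import hashlib
import hmac
import secrets

def make_tag(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA-256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

mac_key = secrets.token_bytes(32)  # hypothetical shared MAC key
# Stand-in for an encrypted payload (assumption: any byte string works here).
ciphertext = bytes.fromhex("00112233445566778899aabbccddeeff")
tag = make_tag(mac_key, ciphertext)

# The attacker flips a single bit in transit.
tampered = bytearray(ciphertext)
tampered[0] ^= 0x01
tampered = bytes(tampered)

print(verify_tag(mac_key, ciphertext, tag))  # True
print(verify_tag(mac_key, tampered, tag))    # False
```

Because both parties hold `mac_key`, either of them could have produced the tag, which is why this construction gives integrity and authentication but not non-repudiation.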

Symmetric vs asymmetric: speed, scale and the key distribution problem

Symmetric is fast and small. Asymmetric is slow but solves the problem nobody else can.

The two families of cryptography exist because each fixes a different problem.

Property	Symmetric	Asymmetric
Keys	One shared secret per pair	Public key (open) + private key (secret) per party
Throughput	AES-NI: 5 to 10 GB/s on a modern CPU core	RSA-2048: ~1000 sign/s, ~25000 verify/s per core
Key size for ~128-bit security	128-bit AES	3072-bit RSA or 256-bit ECC
Use case	Bulk data encryption, full-disk encryption, session traffic	Key exchange, digital signatures, certificate issuance
Key distribution	Hard: needs a prior secure channel for the shared key	Easy: the public key can be published openly
Scaling for N parties	N(N-1)/2 keys for pairwise	N key pairs total
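The last row of the table is easy to underestimate. A short sketch of the arithmetic, with function names chosen here purely for illustration:

```python
def pairwise_symmetric_keys(n: int) -> int:
    """Distinct shared secrets needed so every pair of n parties has its own key."""
    return n * (n - 1) // 2

def asymmetric_key_pairs(n: int) -> int:
    """One public/private key pair per party."""
    return n

for n in (2, 10, 100, 1000):
    print(n, pairwise_symmetric_keys(n), asymmetric_key_pairs(n))
```

At 1000 parties the pairwise symmetric design needs 499500 secrets against 1000 key pairs, which is the scaling argument for using asymmetric primitives at the key-exchange step.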

Classical substitution and transposition ciphers

The pre-Shannon toolkit. Worth knowing because it still shows up in CTFs and in some legacy malware C2 channels.

Modern cryptography post-dates 1949. Classical cryptography is the 4000 years before that. An examiner runs into these schemes in three places: capture-the-flag forensic challenges (NFSU and CDAC training pipelines lean on these), low-effort obfuscation in malware configuration blobs, and historical case material. Knowing how they fall apart under frequency analysis is part of the working syllabus.

Substitution ciphers replace each plaintext symbol with a different ciphertext symbol according to some rule.

Caesar cipher. Shift each letter by a fixed integer k. Key space is 25. Brute force takes microseconds. Still appears in malware as a low-rent obfuscation of strings.
Monoalphabetic substitution. Each plaintext letter maps to a fixed ciphertext letter by an arbitrary permutation. Key space is 26 factorial (about 4 × 10^26), which sounds large but collapses under frequency analysis. The method, credited to the 9th-century Arab mathematician al-Kindi, counts ciphertext letter frequencies and matches them to known plaintext-language frequencies; English plaintexts give E, T, A, O, I, N as the top letters and the substitution falls in minutes by hand.
Polyalphabetic substitution (Vigenere). Uses a keyword of length m to apply m different Caesar shifts in rotation. Defeats single-letter frequency analysis. Defeated in turn by the Kasiski examination (find repeated ciphertext substrings, infer m from their spacing) plus per-position frequency analysis.
Playfair. A 5x5 grid of letters keyed by a passphrase; encrypts digraphs (pairs of letters). Used by British forces through WW1 and WW2 for tactical traffic. Falls to digraph frequency analysis and known-plaintext attacks.
Hill cipher. Treats blocks of m letters as vectors in Z/26 and multiplies by an m by m matrix. The first cipher with proper diffusion across letters. Falls instantly to a known-plaintext attack: given m plaintext-ciphertext block pairs whose plaintext blocks form a matrix invertible mod 26, the key matrix solves directly by linear algebra.
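The Caesar brute force and the frequency idea behind the al-Kindi method can be sketched together. This is a minimal illustration, assuming uppercase English text and a deliberately crude score (a count of the letters E, T, A, O, I, N); very short or unusual texts can mislead it, and the sample message is invented for the example.

```python
import math
import string

ALPHABET = string.ascii_uppercase
COMMON_LETTERS = "ETAOIN"  # top English letters, as listed in the text

def caesar_shift(text: str, k: int) -> str:
    """Shift every A-Z letter by k positions; leave other characters alone."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)

def english_score(text: str) -> int:
    """Crude frequency score: how many letters are among the most common ones."""
    return sum(text.count(letter) for letter in COMMON_LETTERS)

def crack_caesar(ciphertext: str):
    """Try all 25 shifts and keep the one whose output looks most like English."""
    best_k = max(range(1, 26),
                 key=lambda k: english_score(caesar_shift(ciphertext, -k)))
    return best_k, caesar_shift(ciphertext, -best_k)

plaintext = "MEET AT THE OLD STATION AT NINE TONIGHT"
ciphertext = caesar_shift(plaintext, 3)
print(ciphertext)
print(crack_caesar(ciphertext))

# Key-space size of a monoalphabetic substitution, for comparison with Caesar's 25.
print(f"{math.factorial(26):.2e}")
```

The same scoring idea, applied per letter rather than per shift, is what breaks monoalphabetic substitution despite its far larger key space.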

Transposition ciphers keep the symbols and rearrange their order. Frequency of individual letters is unchanged, so single-letter frequency analysis fails. Bigram and trigram analysis still work.

Rail fence. Write the plaintext in a zigzag across n rails, then read off row by row. Key is the number of rails. Trivially brute-forceable.
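A minimal rail fence sketch, including the brute force over the number of rails. The function names are illustrative, and the sample text is a commonly used classroom example with spaces removed.

```python
def rail_pattern(length: int, rails: int) -> list:
    """Rail index (0 = top) visited at each position of the zigzag."""
    if rails < 2:
        return [0] * length
    cycle = 2 * (rails - 1)
    pattern = []
    for i in range(length):
        pos = i % cycle
        pattern.append(pos if pos < rails else cycle - pos)
    return pattern

def rail_encrypt(plaintext: str, rails: int) -> str:
    """Write in a zigzag, then read off rail by rail."""
    pattern = rail_pattern(len(plaintext), rails)
    return "".join(ch for r in range(rails)
                   for ch, p in zip(plaintext, pattern) if p == r)

def rail_decrypt(ciphertext: str, rails: int) -> str:
    """Rebuild the read-off order, then put each letter back in place."""
    pattern = rail_pattern(len(ciphertext), rails)
    order = sorted(range(len(ciphertext)), key=lambda i: (pattern[i], i))
    plain = [""] * len(ciphertext)
    for ch, i in zip(ciphertext, order):
        plain[i] = ch
    return "".join(plain)

secret = rail_encrypt("WEAREDISCOVEREDFLEEATONCE", 3)
print(secret)

# Brute force: the key is only the rail count, so try them all.
for rails in range(2, 6):
    print(rails, rail_decrypt(secret, rails))
```

An examiner reading the brute-force output simply picks the line that is readable; no statistics are needed at this key-space size.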

Key types, key derivation and key lifecycle

Crypto fails at key management, not at math. Know the names.

A forensic examiner reading a system's key inventory needs vocabulary for what they are looking at.

Key type	Asymmetry	Lifetime	Typical use
Public key	Half of a pair	Long (years, certificate-bound)	Encrypt to recipient; verify signature
Private key	Half of a pair	Long (years, must be guarded)	Decrypt to self; sign on own behalf
Symmetric key (shared secret)	Same for both parties	Variable	Bulk encryption between two endpoints
Session key (ephemeral)	Symmetric, derived per session	Short (minutes to hours)	TLS record layer, IPsec SA, WhatsApp double-ratchet message keys
Master key	Either family	Long; not used directly
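The "not used directly" note on the master key can be illustrated with a key-derivation sketch. The code below follows the HKDF extract-and-expand construction (RFC 5869) with HMAC-SHA-256, built from the standard library; the salt, the purpose labels and the variable names are assumptions made for the example, not values from any real system.

```python
import hashlib
import hmac
import secrets

HASH_LEN = 32  # SHA-256 output size in bytes

def hkdf_extract(salt: bytes, input_key: bytes) -> bytes:
    """Condense the input keying material into a pseudorandom key."""
    return hmac.new(salt, input_key, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the pseudorandom key into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]

master_key = secrets.token_bytes(32)      # hypothetical long-lived master key
salt = b"example-salt"                    # illustrative value only
prk = hkdf_extract(salt, master_key)

# Separate working keys per purpose; the master key never touches data.
encryption_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"mac", 32)
print(encryption_key != mac_key)  # True
```

In a key inventory this shows up as one guarded master secret and many short-lived derived keys, each tied to a label that states its purpose.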

Design properties: confusion, diffusion, avalanche, forward secrecy

Shannon, 1949. The vocabulary every cipher designer still uses.

Three structural properties separate a serious cipher from a toy.

Confusion. The relationship between the key and the ciphertext should be as complex as possible. A single key bit, changed, should affect ciphertext bits in a way that does not look like any simple function. Confusion is delivered in modern ciphers mostly by S-boxes (small lookup tables that perform a nonlinear substitution, like AES's 8-bit-in, 8-bit-out SubBytes table).
Diffusion. Each plaintext bit's influence should spread across many ciphertext bits. A single plaintext bit flipped should change roughly half the ciphertext bits. Diffusion comes from linear mixing layers like AES's MixColumns or the Feistel network in DES/Blowfish.
Avalanche effect. The empirical metric for diffusion. A good block cipher flips about 50% of the ciphertext bits on average when one input bit (plaintext or key) is changed. AES achieves full avalanche within 2 to 3 rounds; the 10 to 14 rounds it actually runs (depending on key size) give a substantial security margin.
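The avalanche metric is easy to measure empirically. Python's standard library has no block cipher, so the sketch below swaps in SHA-256 as a stand-in: it is a hash function, not a cipher, but it is designed to show the same roughly 50% bit-flip behaviour. The input string and function names are illustrative.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def avalanche_rate(message: bytes) -> float:
    """Average fraction of output bits that flip when one input bit is flipped.

    Assumption: SHA-256 is used as a stand-in for a block cipher.
    """
    base = hashlib.sha256(message).digest()
    n_bits = len(message) * 8
    total = 0
    for bit in range(n_bits):
        flipped = bytearray(message)
        flipped[bit // 8] ^= 1 << (bit % 8)   # flip exactly one input bit
        total += bit_difference(base, hashlib.sha256(bytes(flipped)).digest())
    return total / (n_bits * 256)             # 256 output bits per digest

rate = avalanche_rate(b"forensic sample!")
print(f"{rate:.3f}")
```

A rate that sits well away from 0.5 for a custom cipher found in malware is a quick, cheap indicator that its diffusion is weak.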

Strong vs weak keys. Some keys produce statistically inferior ciphertext for structural reasons in the algorithm. DES famously has four "weak keys" where encryption equals decryption (the round subkeys end up identical), plus 12 semi-weak keys with similar pathologies. AES has no known weak keys. Examiner takeaway: when reviewing a custom or legacy cipher in malware analysis, the absence of a documented weak-key analysis is a red flag.

Forward secrecy. A property of a key-exchange protocol, not of a cipher. A protocol has forward secrecy (also called Perfect Forward Secrecy when the ephemeral exchange is on a fresh group element per session) if compromise of a long-term key today does not allow decryption of past captured sessions. The mechanism is to use ephemeral Diffie-Hellman or ECDH for session-key agreement and discard the ephemeral key immediately after use. The long-term key signs the ephemeral exchange but does not decrypt the data, so its later capture cannot retroactively unlock anything.

The forensic implication for intercepted-traffic cases under Indian Telegraph Act warrants: if the traffic was captured under TLS 1.2 without PFS (RSA key exchange), the server's private RSA key, if later obtained, retroactively decrypts the entire session. If the same traffic was captured under TLS 1.3 (which mandates ECDHE), the server's long-term key is useless for retrospective decryption. The same applies to WhatsApp's double ratchet, where each message has its own derived key, so compromising an endpoint's current ratchet state does not reveal the keys of earlier messages.

The cryptanalytic attack taxonomy

Six attacker models, from weakest to strongest. Every protocol is rated against the worst it can survive.

Cryptanalysis is rated by what the attacker is assumed to know and can do. From hardest (least powerful attacker) to easiest:

Attack model	Attacker capability	Hardness for attacker
Ciphertext-only (COA)	Sees only ciphertexts	Hardest
Known-plaintext (KPA)	Holds matched plaintext / ciphertext pairs	Harder
Chosen-plaintext (CPA)	Can submit plaintexts to an encryption oracle	Medium
Chosen-ciphertext (CCA1)	Can submit ciphertexts to a decryption oracle, before seeing the target	Easier
Adaptive chosen-ciphertext (CCA2)	Can submit ciphertexts adaptively, even after seeing the target	Easiest
Side-channel	Observes timing, power, EM, cache, sound from a real implementation	Sometimes trivial
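The side-channel row is the one most often met in real implementations. A minimal sketch of a timing leak: the naive comparison below stops at the first mismatching byte, so the amount of work it does reveals how much of a guess was correct. The step counter is an illustrative stand-in for elapsed time, and the tag values are made up.

```python
import hmac

def naive_equal(a: bytes, b: bytes):
    """Early-exit comparison; returns (equal, bytes_compared)."""
    if len(a) != len(b):
        return False, 0
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return False, steps   # leaks the position of the first mismatch
    return True, steps

secret_tag = bytes.fromhex("a1b2c3d4e5f60718")       # hypothetical MAC tag
guess_wrong_first = bytes.fromhex("00b2c3d4e5f60718")
guess_wrong_last = bytes.fromhex("a1b2c3d4e5f60700")

print(naive_equal(secret_tag, guess_wrong_first))  # (False, 1)
print(naive_equal(secret_tag, guess_wrong_last))   # (False, 8)

# The standard-library comparison is designed not to short-circuit on content.
print(hmac.compare_digest(secret_tag, guess_wrong_last))  # False
```

An attacker who can time the naive version can recover a tag byte by byte, which is why the mathematics of the MAC being sound says nothing about whether the deployed code is.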

Diffie-Hellman, ElGamal and the Indian PKI hook

The two asymmetric primitives that built the modern internet, briefly, plus how Indian PKI uses them.

A full treatment of Diffie-Hellman and its attacks is in Cryptanalysis, Cryptographic Attacks and Diffie-Hellman. The summary an examiner needs at this level:

Diffie-Hellman key exchange (1976). Alice and Bob agree on a prime p and a generator g, public to all. Alice picks a secret integer a and publishes A = g^a mod p. Bob picks a secret b and publishes B = g^b mod p. Each computes the shared secret as B^a mod p = A^b mod p = g^(ab) mod p. An eavesdropper sees A and B but cannot extract a or b without solving the discrete logarithm problem, which is computationally infeasible for properly sized p (2048 bits or more) or for elliptic-curve variants at 256-bit. ECDH (over Curve25519, P-256, P-384) is the modern variant used in TLS 1.3, Signal, WireGuard and the WhatsApp double ratchet.
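The exchange above fits in a few lines of Python. This is a toy sketch: the 127-bit prime and the base 3 are assumptions chosen so the numbers stay readable, and they are far too small for real use, where a 2048-bit group or an elliptic-curve variant would be used instead.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption: illustration only, too small)
G = 3            # arbitrary public base for the example

def dh_keypair():
    """Pick a fresh secret exponent and compute the matching public value."""
    private = secrets.randbelow(P - 3) + 2    # secret in [2, P-2]
    public = pow(G, private, P)               # g^a mod p
    return private, public

def dh_shared(private: int, peer_public: int) -> int:
    """Combine our secret with the peer's public value: (g^b)^a mod p."""
    return pow(peer_public, private, P)

a, A = dh_keypair()   # Alice
b, B = dh_keypair()   # Bob

alice_secret = dh_shared(a, B)
bob_secret = dh_shared(b, A)
print(alice_secret == bob_secret)  # True

# Hash the shared value down to a fixed-size symmetric session key.
session_key = hashlib.sha256(alice_secret.to_bytes(16, "big")).digest()
```

If `a` and `b` are generated per session and discarded afterwards, this is the ephemeral pattern that gives forward secrecy: nothing long-lived can recompute `session_key` later.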

ElGamal signature scheme (1985). A signature scheme based on the discrete-log problem. Unlike RSA PKCS#1 v1.5 signatures, which are deterministic (RSA-PSS is randomized), ElGamal signatures include a fresh random value per signature, so the same message signed twice produces two different signatures. This randomness is the source of both ElGamal's strength and its operational fragility: if the random value is reused across two signatures, the private key can be extracted in closed form. DSA (Digital Signature Algorithm) is a variant; ECDSA is its elliptic-curve cousin. Bitcoin's notorious early-wallet failures and the 2010 Sony PS3 ECDSA private-key extraction both came from random-value reuse.
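The closed-form key extraction can be shown on a toy DSA instance. Everything numeric below is an assumption picked for illustration: a tiny group (p = 607, q = 101, g = 64), a made-up private key, and small integers standing in for message digests. Real parameters are hundreds of digits long, but the algebra is the same.

```python
# Toy DSA group: Q divides P - 1 and G has order Q modulo P (illustrative sizes).
P, Q, G = 607, 101, 64

def sign(digest: int, private_key: int, k: int):
    """DSA signing with an explicit per-signature random value k."""
    r = pow(G, k, P) % Q
    s = (pow(k, -1, Q) * (digest + private_key * r)) % Q
    return r, s

def verify(digest: int, public_key: int, r: int, s: int) -> bool:
    w = pow(s, -1, Q)
    v = (pow(G, digest * w % Q, P) * pow(public_key, r * w % Q, P)) % P % Q
    return v == r

def recover_private_key(digest1, sig1, digest2, sig2):
    """Two signatures sharing the same k (hence the same r) leak the key."""
    r, s1 = sig1
    _, s2 = sig2
    k = ((digest1 - digest2) * pow((s1 - s2) % Q, -1, Q)) % Q
    return ((s1 * k - digest1) * pow(r, -1, Q)) % Q

private_key = 77                      # hypothetical signer secret
public_key = pow(G, private_key, P)
digest1, digest2 = 57, 12             # stand-ins for H(m1), H(m2) reduced mod Q
reused_k = 3                          # the mistake: same k for both signatures

sig1 = sign(digest1, private_key, reused_k)
sig2 = sign(digest2, private_key, reused_k)
print(verify(digest1, public_key, *sig1), verify(digest2, public_key, *sig2))
print(recover_private_key(digest1, sig1, digest2, sig2))  # 77
```

For the examiner, two signatures from the same key with identical `r` values are the tell-tale artefact: both signatures verify correctly, yet together they give away the signer's private key.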

The Indian PKI ecosystem under CCA India binds these primitives to real infrastructure:

CCA (the Controller of Certifying Authorities) under MeitY licenses Indian CAs (e-Mudhra, Sify, NSDL e-Gov, IDRBT, NIC, (n)Code, Capricorn, IndiaPKI). Each licensed CA issues Class 1, Class 2, Class 3 and DGFT-class Digital Signature Certificates and operates audited HSM-backed root keys.
Aadhaar eSign uses one-time-use DSCs bound to Aadhaar authentication; the eSign API issues a signing key for that single sign event and destroys it after use, with the audit logged at the eSign service provider and verifiable by CCA.
UPI's cryptographic backbone combines AES-256 (data confidentiality), RSA-2048 (initial NPCI-issued certificate-based authentication of payer/payee PSPs) and ECDH (session-key agreement for the actual payment leg). Every UPI transaction is a hybrid construction of the kind described under hybrid encryption in the key terms above.

Practice

An Aadhaar API request encrypts the PID block with a fresh AES-256-GCM session key, wraps that session key under UIDAI's RSA-2048 public key, and signs the whole request with the AUA's signing certificate. Which CIAN property is NOT directly provided by this composition?

Frequently asked questions

What does CIAN actually stand for and why does it matter in a forensic report?

CIAN is Confidentiality, Integrity, Authentication and Non-repudiation. They are the four security properties a cryptosystem can provide, and a single primitive rarely provides all four. A forensic report that says 'the data was encrypted' is incomplete: an AES-CBC-only payload has confidentiality but no integrity; an AES-GCM payload has confidentiality, integrity and authentication; a digitally signed and AEAD-encrypted payload has all four. The defence's expert will probe these distinctions on cross-examination, so the report should name them explicitly.

Why don't real-world systems just use asymmetric encryption for everything?

Speed. RSA-2048 decryption runs roughly 100,000 times slower than AES-256 per byte on the same hardware. Encrypting a 1 GB file with RSA directly would take tens of minutes per recipient, where AES handles it in under a second. So every real protocol uses a hybrid pattern: asymmetric crypto (RSA, ECDH, ECDHE) for a one-time key exchange and signature at the start of a session, then symmetric crypto (AES-GCM, ChaCha20-Poly1305) for all the actual data. TLS 1.3, UPI, WhatsApp and the Aadhaar API all follow this shape.

Is a one-time pad really unbreakable?

Yes, under four strict conditions: the key is truly random, kept secret, used exactly once, and at least as long as the message. Under those conditions Shannon proved in 1949 that the ciphertext is information-theoretically secure, meaning a brute force over all possible keys produces every possible plaintext of that length, so the ciphertext reveals nothing beyond its length. The catch is that the key-management problem is just as hard as the original message-secrecy problem, which is why one-time pads are used for diplomatic and military channels with tightly controlled physical key distribution, and effectively never on the open internet.
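The "every possible plaintext" point can be demonstrated directly. In this minimal sketch (messages and names are invented for the example), the same ciphertext decrypts to the real message under the real key and to an equally plausible decoy under a different key, so brute force has nothing to distinguish them.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("one-time pad key must match the message length")
    return bytes(x ^ y for x, y in zip(a, b))

message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))   # truly random, used once
ciphertext = xor_bytes(message, key)

print(xor_bytes(ciphertext, key) == message)  # True

# For any other plaintext of the same length there is a key that "explains" it.
decoy = b"RETREAT AT TEN"
decoy_key = xor_bytes(ciphertext, decoy)
print(xor_bytes(ciphertext, decoy_key) == decoy)  # True
```

The guarantee disappears as soon as a key is reused: XORing two ciphertexts made with the same key cancels the key and leaves the XOR of the two plaintexts.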