DNA in Forensic Science: Structure, Genetic Markers, Extraction and Profiling
UGC-NET Paper 2 Unit III notes on DNA structure, STR markers, extraction workflows, PCR + capillary electrophoresis profiling, CODIS, and Indian DNA labs.
DNA evidence is the single highest-weight tool in modern forensic identification, and Unit III of the UGC-NET Forensic Science syllabus bundles four sub-parts into one bullet: the molecule's structure, what makes it usable as a genetic marker, how analysts pull it out of stained substrates, and how the resulting profile is generated and interpreted. Each sub-part has its own MCQ vocabulary, and NTA has historically picked from across all four in a single cycle.
Treat this as the heaviest topic in Unit III. The terms repeat (STR, PCR, CODIS, capillary electrophoresis, Likelihood Ratio), the dates matter (Watson-Crick 1953, Jeffreys 1984, CODIS 1998), and the Indian institutional anchors (CDFD Hyderabad, CFSL Hyderabad DNA division, the DNA Data Bank under DFSS, BSA 2023 and BNSS 2023) are favourite distractors. Lock the marker comparison table and the extraction table; the rest of Paper 2 will reuse them.
- Nucleotide
- Monomer unit of DNA: a deoxyribose sugar, a phosphate group, and one of four nitrogenous bases (A, T, G, C).
- Watson-Crick base pairing
- Adenine pairs with thymine via two hydrogen bonds; guanine pairs with cytosine via three. The pairing rule that makes the double helix predictable and PCR possible.
- Chromosome
- Packaged DNA. Humans have 22 pairs of autosomes plus one pair of sex chromosomes (XX or XY), all housed in the nucleus, plus a small circular mitochondrial genome.
- Autosomal vs mitochondrial DNA
- Autosomal DNA is nuclear, biparentally inherited, present as two copies per cell. Mitochondrial DNA is cytoplasmic, maternally inherited, present in hundreds to thousands of copies per cell.
- STR (Short Tandem Repeat)
- Stretches of 2 to 6 base-pair motifs repeated head-to-tail. Length polymorphism at STR loci is the basis of modern forensic profiling.
- SNP (Single Nucleotide Polymorphism)
- A single-base variation at a defined position. Used for ancestry, phenotyping, and degraded-sample work.
- VNTR
- Variable Number Tandem Repeats. Longer repeat units (10 to 100 bp) than STRs. The original Alec Jeffreys 1984 fingerprinting markers, now superseded by STRs.
- RFLP
- Restriction Fragment Length Polymorphism. The first-generation DNA typing method using restriction enzymes and Southern blotting. Required tens to hundreds of nanograms of intact, high-molecular-weight DNA, far more than PCR-based typing needs.
- PCR
- Polymerase Chain Reaction. Kary Mullis 1983. Exponential amplification of a target sequence using a thermostable Taq polymerase, two primers and thermal cycling.
- Multiplex STR kit
- Single-tube PCR amplifying 15+ STR loci plus amelogenin in one reaction. AmpFlSTR Identifiler Plus and PowerPlex are the workhorses in Indian DNA labs.
- Capillary electrophoresis
- Size separation of fluorescently labelled PCR products through a polymer-filled capillary on an instrument like the ABI 3500. Output is an electropherogram.
- CODIS
- Combined DNA Index System. The FBI's DNA database system, which defined the original 13 core STR loci (expanded to 20 in 2017). The core loci are the most widely referenced STR set worldwide.
DNA structure, the bits forensic science cares about
Watson-Crick 1953, antiparallel strands, and why the non-coding regions matter.
The double-helix model was published by James Watson and Francis Crick in Nature, April 1953, building on the X-ray diffraction work of Rosalind Franklin and Maurice Wilkins (Photograph 51). The structure is a right-handed double helix: two antiparallel polynucleotide strands held together by hydrogen bonds between complementary bases. Adenine pairs with thymine (two H-bonds), guanine with cytosine (three H-bonds). The sugar-phosphate backbone runs on the outside and the bases stack inside. Helix diameter is roughly 2 nm; in the classic model one full turn spans about 10 base pairs and 3.4 nm (closer to 10.5 base pairs per turn in solution).
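The pairing rule and the antiparallel arrangement can be made concrete with a minimal Python sketch. The function names and the example sequence are illustrative assumptions, not part of any forensic software.

```python
# Watson-Crick pairing: A-T (2 hydrogen bonds), G-C (3 hydrogen bonds).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written 5'->3' (hence the reversal)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


print(reverse_complement("GATAC"))  # GTATC
print(hydrogen_bonds("GATAC"))      # 12
```

Because the strands are antiparallel, the partner strand is read in the opposite direction, which is why the sketch reverses the sequence before complementing it.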
For forensic work, the relevant breakdown is coding vs non-coding DNA. Only about 1 to 2 percent of the human genome codes for protein. The rest, often called non-coding or junk DNA, is full of repetitive sequences. Two kinds of repeats matter:
- Tandem repeats: short motifs repeated head-to-tail. VNTRs (10 to 100 bp motifs, repeated 5 to several hundred times) and STRs (2 to 6 bp motifs, repeated 5 to 50 times).
- Single-base variations: SNPs scattered roughly every 1,000 bases.
Forensic profiling targets the non-coding repeats deliberately. They are highly polymorphic (which gives identification power), they do not affect the donor's phenotype (which sidesteps privacy concerns), and the short STR repeats survive partial degradation better than the long VNTRs of the early Jeffreys era. Under Section 39 of the Bharatiya Sakshya Adhiniyam 2023, the resulting profile is admissible as expert opinion in Indian courts, which is why analysts document the marker set and the validation status of the kit in every report.
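An STR allele is named by its repeat count, so the idea of length polymorphism reduces to counting back-to-back copies of a motif. The sketch below is a hedged illustration: the fragment, the flanking bases and the function name are made up, and real allele calling works on fragment size rather than raw sequence.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Largest number of back-to-back copies of `motif` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical fragment: invented flanking sequence around seven AGAT repeats.
fragment = "CCTG" + "AGAT" * 7 + "TTGCA"
print(longest_repeat_run(fragment, "AGAT"))  # 7
```

Two people who carry 7 and 9 copies of the same motif at a locus produce amplicons that differ by two repeat units in length, which is the difference a profile records.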
DNA as a genetic marker
Autosomal STRs, Y-STRs, mtDNA and SNPs. What each does, why STR became the standard.
Four marker classes are tested in NTA papers. Each has a defined chromosomal location, a defined inheritance pattern, and a defined casework niche. Memorise this table; it answers half the marker MCQs you will see.
| Marker | Chromosomal location | Inheritance | Mutation rate | Typical forensic use |
|---|---|---|---|---|
| Autosomal STR | 22 autosome pairs | Biparental (one allele from each parent) | ~10⁻³ per locus per generation | Standard individual identification, paternity, mass disasters |
| Y-STR | Y chromosome (non-recombining region) | Paternal (father to son, unchanged within a lineage) | ~10⁻³ per locus per generation | Male contributor in sexual-assault mixtures, paternal lineage testing |
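| mtDNA | Mitochondrion (circular, ~16,569 bp) | Maternal (mother to all offspring) | ~10⁻² in the control region (5-10× nuclear) | Old bones, hair shafts, severely degraded samples, maternal lineage |
| SNP | Throughout the genome (autosomes, sex chromosomes and mtDNA) | Follows the chromosome carrying it (biparental for autosomal SNPs) | ~10⁻⁸ per site per generation | Ancestry inference, phenotyping, degraded samples |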
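The biparental inheritance rule for autosomal STRs is what makes paternity testing work: at every locus the child must carry one allele found in the mother and one found in the father. The sketch below illustrates that check. The locus names are real STR loci, but the genotypes and function names are hypothetical. Given the mutation rate in the table, a single inconsistent locus would not by itself establish exclusion.

```python
def consistent_with_parents(child, mother, father):
    """True if one child allele can come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def inconsistent_loci(child, mother, father):
    """List the loci where the trio breaks the one-allele-from-each-parent rule."""
    return [
        locus
        for locus in child
        if not consistent_with_parents(child[locus], mother[locus], father[locus])
    ]


# Hypothetical genotypes (repeat numbers) for illustration only.
child = {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24)}
mother = {"D3S1358": (15, 16), "vWA": (16, 16), "FGA": (21, 22)}
alleged_father = {"D3S1358": (17, 18), "vWA": (14, 18), "FGA": (23, 25)}

print(inconsistent_loci(child, mother, alleged_father))  # ['FGA']
```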
DNA extraction
Four mainstream chemistries plus the differential extraction that exists for one specific case type.
Extraction has to do three things: lyse the cell, separate DNA from protein and lipid, and recover the DNA in a buffer compatible with downstream PCR. Sample integrity from the scene through the lab depends on a documented chain of custody; good chemistry cannot rescue a contaminated swab. NTA tests the four method names, their pros and cons, and the special case of differential extraction.
| Method | Principle | Speed | DNA quality | Automation | Best for |
|---|---|---|---|---|---|
| Organic (phenol-chloroform) | Cell lysis with SDS + proteinase K, then phenol-chloroform partitioning. DNA stays in the aqueous phase. | Slow (4+ hours, hazardous reagents) | High yield, long fragments | Manual | Bone, tooth, tissue, reference samples where yield matters more than throughput |
| Chelex 100 | Chelating resin binds divalent metal ions, preventing nuclease activity during boiling. | Fast (under 1 hour) | Crude single-stranded DNA, PCR-compatible | Manual |
PCR amplification and the STR profiling pipeline
From Kary Mullis 1983 to a multiplex CE injection on an ABI 3500.
PCR principle. A target sequence is amplified exponentially over 28 to 32 thermal cycles. Each cycle has three steps: denaturation at ~94 °C (strand separation), annealing at ~55 to 65 °C (primer binding), and extension at ~72 °C (Taq polymerase synthesises the complementary strand). Copy number ideally doubles each cycle, so 30 cycles give a theoretical 2³⁰ ≈ 10⁹-fold amplification, which is how sub-nanogram template produces a detectable quantity of amplicon. Kary Mullis conceived the method in 1983 (it was first published in 1985) and shared the 1993 Nobel Prize in Chemistry.
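The exponential arithmetic is easy to check with a short sketch. The `efficiency` parameter is an assumption added for illustration (1.0 means perfect doubling); real reactions fall somewhat short of it, and the 0.9 value used below is arbitrary.

```python
def amplification_factor(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in copy number after `cycles` rounds of PCR.

    efficiency = 1.0 models perfect doubling; lower values model
    the fraction of templates actually copied in each cycle.
    """
    return (1.0 + efficiency) ** cycles


print(f"{amplification_factor(30):.3e}")  # 1.074e+09
# Compare ideal and assumed 90 percent efficiency across the usual cycle range.
for n in (28, 30, 32):
    print(n, f"{amplification_factor(n):.2e}", f"{amplification_factor(n, 0.9):.2e}")
```

Each extra cycle doubles the ideal yield, which is why cycle number is fixed during kit validation rather than adjusted casually.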
Multiplex STR kits put 15 to 24 primer pairs into a single PCR. One primer of each pair carries a fluorescent dye drawn from a four- to six-colour set (6-FAM, VIC, NED and PET label the amplicons in the AmpFlSTR five-dye chemistry, with LIZ reserved for the size standard). Amplicon size and dye colour together resolve each locus on a single capillary injection. Indian labs typically run AmpFlSTR Identifiler Plus (15 autosomal STRs plus amelogenin), PowerPlex 21, or GlobalFiler (21+ loci, including the CODIS 20). The amelogenin marker is the conventional sex-typing locus: the X amplicon is 6 bp shorter than the Y amplicon, so a male sample shows two peaks and a female shows one.
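The amelogenin logic can be written as a small decision rule. This is a simplified sketch: the function name and the "inconclusive" fallback are assumptions, and the rule takes the allele calls (X, Y) as already made from the 6 bp size difference.

```python
def sex_from_amelogenin(peaks) -> str:
    """Interpret amelogenin allele calls, e.g. {"X"} or {"X", "Y"}."""
    calls = set(peaks)
    if calls == {"X", "Y"}:
        return "male"          # two peaks, 6 bp apart
    if calls == {"X"}:
        return "female"        # single X peak
    return "inconclusive"      # no peaks, Y only, or unexpected calls


print(sex_from_amelogenin(["X", "Y"]))  # male
print(sex_from_amelogenin(["X"]))       # female
```

The inconclusive branch is deliberate: the locus alone should not be over-read when the expected X peak is missing.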
Capillary electrophoresis is the workhorse separation step. The classic Indian setup is an Applied Biosystems 3500 Genetic Analyser, an 8-capillary instrument loaded with POP-4 polymer. Fluorescently labelled amplicons are injected electrokinetically into each capillary, separate by size as they migrate towards the detector, and emit at different wavelengths when the laser excites them. The internal LIZ-labelled size standard runs in the same capillary, so allele sizing is calibrated injection by injection. CE is the high-resolution descendant of the slab-gel methods covered in the electrophoresis topic; the principle is identical, but the separation runs in a narrow polymer-filled capillary rather than a vertical slab gel.
The output is an electropherogram: dye-coloured peaks plotted on a size axis. In a single-source profile each STR locus shows one peak (homozygote) or two peaks (heterozygote), and the GeneMapper ID-X software calls the allele numbers against an allelic ladder.
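Allele calling against a ladder amounts to a nearest-bin lookup. The sketch below is an illustration only, not the GeneMapper ID-X algorithm: the ladder sizes are invented, and the ±0.5 bp tolerance and the "OL" (off-ladder) label are assumed conventions.

```python
# Hypothetical allelic ladder for one tetranucleotide locus: allele -> size in bp.
LADDER = {"6": 160.0, "7": 164.0, "8": 168.0, "9": 172.0, "9.3": 175.0}


def call_allele(fragment_size: float, ladder: dict, tolerance: float = 0.5) -> str:
    """Return the ladder allele whose size is within `tolerance` bp, else 'OL'."""
    best = min(ladder, key=lambda allele: abs(ladder[allele] - fragment_size))
    if abs(ladder[best] - fragment_size) <= tolerance:
        return best
    return "OL"  # off-ladder: needs analyst review


print(call_allele(168.2, LADDER))  # 8
print(call_allele(170.0, LADDER))  # OL
print(call_allele(175.1, LADDER))  # 9.3
```

The "9.3" entry shows why a tight tolerance matters: a microvariant sits only one base away from the next full-repeat allele.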
Profile interpretation
Peak heights, stutter, allelic dropout and the Likelihood Ratio framework.
A clean single-source profile from a high-template sample is straightforward: read the alleles, compare to the reference, declare match or exclusion. The interesting work, and the part NTA likes to probe, is in the artefacts and the statistics.
Peak height and stochastic effects. Each peak's height is reported in relative fluorescence units (RFU). A clean profile has all peaks above the analytical threshold (typically 50 to 100 RFU) and heterozygote balance above 60 percent. Below the stochastic threshold, one of the two alleles at a heterozygous locus may drop out, so the locus reads as a false homozygote.
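The threshold logic above can be sketched as a per-locus check. The numeric thresholds are assumptions for illustration (the 200 RFU stochastic threshold in particular is invented; each lab validates its own values), and the function name is hypothetical.

```python
ANALYTICAL_THRESHOLD = 50    # RFU, assumed; lower end of the typical range
STOCHASTIC_THRESHOLD = 200   # RFU, assumed for illustration only
MIN_HET_BALANCE = 0.60       # lower peak / higher peak


def assess_locus(peak_heights: dict):
    """Apply thresholds to one locus of a presumed single-source profile.

    peak_heights maps allele -> RFU. Returns (called alleles, list of flags).
    """
    called = {a: h for a, h in peak_heights.items() if h >= ANALYTICAL_THRESHOLD}
    flags = []
    if len(called) == 1:
        (height,) = called.values()
        if height < STOCHASTIC_THRESHOLD:
            flags.append("possible dropout of a sister allele")
    elif len(called) == 2:
        low, high = sorted(called.values())
        if low / high < MIN_HET_BALANCE:
            flags.append("heterozygote imbalance")
    return called, flags


print(assess_locus({"12": 900, "14": 820}))   # balanced heterozygote, no flags
print(assess_locus({"12": 150, "14": 30}))    # single weak peak, dropout flag
print(assess_locus({"12": 1000, "14": 400}))  # imbalance flag
```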
Stutter peaks appear one repeat unit shorter than the true allele (n-4 stutter for the tetranucleotide repeats that dominate forensic kits) at roughly 5 to 15 percent of the parent peak height. They arise from strand slippage during PCR extension. Stutter is a major problem when interpreting mixtures because it can mimic a minor contributor.
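A stutter filter follows directly from that description: flag any peak that sits one repeat below a taller peak and falls under a ratio cut-off. The sketch assumes integer allele numbers and a single 15 percent ratio. Validated workflows use locus-specific stutter ratios, so treat the value and the function name as placeholders.

```python
def flag_stutter(peaks: dict, stutter_ratio: float = 0.15) -> list:
    """Return alleles that look like n-1 repeat stutter of a neighbouring peak.

    peaks maps integer allele (repeat number) -> peak height in RFU.
    """
    flagged = []
    for allele, height in peaks.items():
        parent = peaks.get(allele + 1)  # peak one repeat unit longer
        if parent is not None and height <= stutter_ratio * parent:
            flagged.append(allele)
    return sorted(flagged)


print(flag_stutter({14: 90, 15: 1000, 17: 950}))  # [14]
print(flag_stutter({14: 300, 15: 1000}))          # []
```

In the second call the allele 14 peak is 30 percent of its neighbour, too tall for stutter under this cut-off, so in a mixture it would be considered a possible minor-contributor allele.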
Allelic dropout, drop-in, and degradation. Low-template DNA shows allelic dropout (locus appears homozygous when it is actually heterozygous), allelic drop-in (sporadic contamination peaks), and a characteristic ski-slope electropherogram (taller peaks at small loci, shorter at large loci) when the sample is degraded.
Mixture interpretation is the hardest part of casework. Analysts determine the number of contributors, estimate the mixture proportion from peak-height ratios, and either deconvolve or evaluate likelihoods using probabilistic genotyping software like STRmix or EuroForMix. The output is a Likelihood Ratio: the probability of the evidence under the prosecution hypothesis (the suspect is a contributor) divided by the probability under the defence hypothesis (an unknown unrelated person is a contributor). An LR of 10⁹ means the evidence is a billion times more probable under the prosecution hypothesis than under the defence hypothesis. Modern Indian DNA reports increasingly cite LRs rather than simple match probabilities.
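For the simplest case, a single-source profile that matches the suspect, the LR reduces to the reciprocal of the random match probability. The sketch below rests on assumptions not stated above: Hardy-Weinberg genotype frequencies (p² for a homozygote, 2pq for a heterozygote), independence between loci (the product rule), and invented allele frequencies. Mixture LRs from probabilistic genotyping software are far more involved.

```python
from math import prod


def genotype_frequency(p: float, q: float = None) -> float:
    """Hardy-Weinberg expectation: p*p for a homozygote, 2*p*q for a heterozygote."""
    return p * p if q is None else 2 * p * q


def random_match_probability(profile_freqs) -> float:
    """Product rule across loci assumed to be independent."""
    return prod(genotype_frequency(*alleles) for alleles in profile_freqs)


# Hypothetical allele frequencies for a three-locus profile:
# heterozygote, homozygote, heterozygote.
profile = [(0.10, 0.20), (0.25,), (0.05, 0.15)]

rmp = random_match_probability(profile)
lr = 1 / rmp  # P(evidence | suspect is the source) taken as 1

print(f"RMP = {rmp:.2e}")  # RMP = 3.75e-05
print(f"LR  = {lr:,.0f}")  # LR  = 26,667
```

With only three loci the LR is in the tens of thousands; multiplying across 15 to 20 loci is what pushes casework figures into the billions and beyond.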
Indian institutional context
CDFD, CFSL DNA divisions, BSA 2023, BNSS 2023 and the DNA Technology Bill.
The Indian DNA ecosystem is centred on a few apex institutions that NTA names directly. The Centre for DNA Fingerprinting and Diagnostics (CDFD), Hyderabad is an autonomous institute under the Department of Biotechnology, established in 1996; it grew out of the DNA fingerprinting work that Dr Lalji Singh pioneered at CCMB Hyderabad from the late 1980s. The CFSL Hyderabad DNA division serves as the central reference DNA laboratory under the Directorate of Forensic Science Services (DFSS). Other CFSLs at Kolkata, Chandigarh and Delhi run their own DNA units, and most major states have a DNA division in their SFSL. The National Forensic Sciences University (NFSU), Gandhinagar trains the next generation of DNA analysts.
The DNA Data Bank envisaged under DFSS is the Indian equivalent of CODIS, intended to hold convicted-offender profiles, crime-scene profiles, missing-persons profiles and unidentified-deceased profiles. Operational rollout depends on the legislation discussed below.
Statutory framework.
- The Bharatiya Sakshya Adhiniyam 2023 (BSA, replacing the Indian Evidence Act 1872) governs admissibility. Section 39 makes expert opinion (including DNA) admissible; Section 63 deals with the admissibility of electronic records, a category that covers DNA electropherograms generated and stored digitally.
- The Bharatiya Nagarik Suraksha Sanhita 2023 (BNSS, replacing the CrPC 1973) carries the procedural powers. Sections 51 and 52 BNSS allow medical examination of an arrested person, expressly including DNA profiling, and Section 349 BNSS empowers a Magistrate to order specimen signatures, handwriting, finger impressions and voice samples. Collection of fingerprint, footprint, palm-print, photograph and biological samples from arrested and convicted persons is governed by the Criminal Procedure (Identification) Act 2022, which repealed and replaced the Identification of Prisoners Act 1920.
- The DNA Technology (Use and Application) Regulation Bill was introduced in 2019 to create a National DNA Data Bank, Regional DNA Data Banks, a DNA Regulatory Board and a scheme of accreditation for DNA labs. The Bill was withdrawn in 2023 and has not been reintroduced as of this writing; its policy intent, however, is what NTA tests.