
DNA in Forensic Science: Structure, Genetic Markers, Extraction and Profiling

DNA structure, STR markers, extraction workflows, PCR + capillary electrophoresis profiling, CODIS, and Indian DNA labs.


DNA evidence is the single highest-weight tool in modern forensic identification, and Unit III of the UGC-NET Forensic Science syllabus bundles four sub-parts into one bullet: the molecule's structure, what makes it usable as a genetic marker, how analysts pull it out of stained substrates, and how the resulting profile is generated and interpreted. Each sub-part has its own MCQ vocabulary, and NTA has historically picked from across all four in a single cycle.

Treat this as the heaviest topic in Unit III. The terms repeat (STR, PCR, CODIS, capillary electrophoresis, Likelihood Ratio), the dates matter (Watson-Crick 1953, Jeffreys 1984, CODIS 1998), and the Indian institutional anchors (CDFD Hyderabad, CFSL Hyderabad DNA division, the DNA Data Bank under DFSS, BSA 2023 and BNSS 2023) are favourite distractors. Lock the marker comparison table and the extraction table; the rest of Paper 2 will reuse them.

Key terms
Nucleotide
Monomer unit of DNA: a deoxyribose sugar, a phosphate group, and one of four nitrogenous bases (A, T, G, C).
Watson-Crick base pairing
Adenine pairs with thymine via two hydrogen bonds; guanine pairs with cytosine via three. The pairing rule that makes the double helix predictable and PCR possible.
Chromosome
Packaged DNA. Humans have 22 pairs of autosomes plus one pair of sex chromosomes (XX or XY), all housed in the nucleus, plus a small circular mitochondrial genome.
Autosomal vs mitochondrial DNA
Autosomal DNA is nuclear, biparentally inherited, present as two copies per cell. Mitochondrial DNA is cytoplasmic, maternally inherited, present in hundreds to thousands of copies per cell.
STR (Short Tandem Repeat)
Stretches of 2 to 6 base-pair motifs repeated head-to-tail. Length polymorphism at STR loci is the basis of modern forensic profiling.
SNP (Single Nucleotide Polymorphism)
A single-base variation at a defined position. Used for ancestry, phenotyping, and degraded-sample work.
VNTR
Variable Number Tandem Repeats. Longer repeat units (10 to 100 bp) than STRs. The original Alec Jeffreys 1984 fingerprinting markers, now superseded by STRs.
RFLP
Restriction Fragment Length Polymorphism. The first-generation DNA typing method using restriction enzymes and Southern blotting. Required nanograms of intact DNA.
PCR
Polymerase Chain Reaction. Kary Mullis 1983. Exponential amplification of a target sequence using a thermostable Taq polymerase, two primers and thermal cycling.
Multiplex STR kit
Single-tube PCR amplifying 15+ STR loci plus amelogenin in one reaction. AmpFlSTR Identifiler Plus and PowerPlex are the workhorses in Indian DNA labs.
Capillary electrophoresis
Size separation of fluorescently labelled PCR products through a polymer-filled capillary on an instrument like the ABI 3500. Output is an electropherogram.
CODIS
Combined DNA Index System. The FBI database that defined the original 13 core STR loci (expanded to 20 in 2017). The standard reference set globally.

DNA structure, the bits forensic science cares about

Watson-Crick 1953, antiparallel strands, and why the non-coding regions matter.

The double-helix model was published by James Watson and Francis Crick in Nature, April 1953, building on the X-ray diffraction work of Rosalind Franklin and Maurice Wilkins (Photograph 51). The structure is a right-handed double helix: two antiparallel polynucleotide strands held together by hydrogen bonds between complementary bases. Adenine pairs with thymine (two H-bonds), guanine with cytosine (three H-bonds). The sugar-phosphate backbone runs outside, the bases stack inside. Helix diameter is roughly 2 nm; one full turn covers about 10.5 base pairs and 3.4 nm.
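The pairing rule and the antiparallel arrangement can be illustrated with a few lines of Python. This is a minimal sketch; the example sequence is hypothetical and the function names are chosen here for illustration only.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds contributed by each base pair: A-T has two, G-C has three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the partner strand written 5'->3' (strands are antiparallel)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


example = "GATAGATA"  # hypothetical sequence, two copies of a GATA motif
print(reverse_complement(example))  # TATCTATC
print(hydrogen_bonds(example))      # 18
```

Because the partner strand is fully determined by the template, a primer designed against one strand will always find the same binding site, which is the property PCR relies on.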

For forensic work, the relevant breakdown is coding vs non-coding DNA. Only about 1 to 2 percent of the human genome codes for protein. The rest, often called non-coding or junk DNA, is full of repetitive sequences. Two kinds of repeats matter:

  • Tandem repeats: short motifs repeated head-to-tail. VNTRs (10 to 100 bp motifs, repeated 5 to several hundred times) and STRs (2 to 6 bp motifs, repeated 5 to 50 times).
  • Single-base variations: SNPs scattered roughly every 1,000 bases.
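To make the tandem-repeat idea concrete, the sketch below counts back-to-back copies of a motif in two alleles. The GATA motif, the flanking sequences and the repeat counts are invented for illustration and do not describe a real locus.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of head-to-tail copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical flanking sequences around a hypothetical GATA repeat block.
FLANK_5, FLANK_3 = "CCTGA", "TTCAG"
allele_a = FLANK_5 + "GATA" * 7 + FLANK_3
allele_b = FLANK_5 + "GATA" * 9 + FLANK_3

print(longest_tandem_run(allele_a, "GATA"))  # 7
print(longest_tandem_run(allele_b, "GATA"))  # 9
print(len(allele_b) - len(allele_a))         # 8 bp length difference
```

The two alleles differ only in repeat count, so they differ in length by a whole number of motifs. That length difference is what size-based separation later detects.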

Forensic profiling targets the non-coding repeats deliberately. They are highly polymorphic (which gives identification power), they do not affect the donor's phenotype (which sidesteps privacy concerns), and the short STR repeats survive partial degradation better than the long VNTRs of the early Jeffreys era. Under Section 39 of the Bharatiya Sakshya Adhiniyam 2023, the resulting profile is admissible as expert opinion in Indian courts, which is why analysts document the marker set and the validation status of the kit in every report.

DNA as a genetic marker

Autosomal STRs, Y-STRs, mtDNA and SNPs. What each does, why STR became the standard.

Four marker classes are tested in NTA papers. Each has a defined chromosomal location, a defined inheritance pattern, and a defined casework niche. Memorise this table; it answers half the marker MCQs you will see.

Marker | Chromosomal location | Inheritance | Mutation rate | Typical forensic use
Autosomal STR | 22 autosome pairs | Biparental (one allele from each parent) | ~10⁻³ per locus per generation | Standard individual identification, paternity, mass disasters
Y-STR | Y chromosome (non-recombining region) | Paternal (father to son, unchanged within a lineage) | ~10⁻³ per locus per generation | Male contributor in sexual-assault mixtures, paternal lineage testing
mtDNA | Mitochondrion (circular, ~16,569 bp) | Maternal (mother to all offspring) | ~10⁻² in the control region (5-10× nuclear) | Old bones, hair shafts, severely degraded samples, maternal lineage
SNP | Across the genome | Biparental | Very low (effectively stable) | Ancestry inference, externally visible characteristics, highly degraded DNA

Why autosomal STR became the global standard. STRs are short (under 400 bp amplicons), which means a partially degraded sample still gives a typeable signal. They are highly polymorphic; a single locus can have 10 to 30 alleles in a population, so combining 15 to 20 loci yields random match probabilities below 10⁻¹⁵. They multiplex cleanly in a single PCR tube. The CODIS core (originally 13 loci, expanded to 20 in January 2017) and the European ESS set anchor the international convention. Indian labs run AmpFlSTR Identifiler Plus (15 STRs plus amelogenin) and increasingly the GlobalFiler kit (21 autosomal STRs plus sex-determining markers) on ABI 3500 platforms.
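The way many loci combine into a very small random match probability can be sketched with the product rule. This is a simplified illustration that assumes Hardy-Weinberg and linkage equilibrium and applies no population-substructure correction; every allele frequency below is hypothetical.

```python
from math import prod
from typing import Optional


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Hardy-Weinberg genotype frequency: p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if q is None else 2.0 * p * q


def random_match_probability(genotypes) -> float:
    """Multiply per-locus genotype frequencies (product rule, independent loci assumed)."""
    return prod(genotype_frequency(p, q) for p, q in genotypes)


# Hypothetical allele frequencies; (p, None) marks a homozygous locus.
three_loci = [(0.10, 0.20), (0.15, None), (0.05, 0.10)]
rmp = random_match_probability(three_loci)
print(f"{rmp:.1e}")  # 9.0e-06

# Fifteen hypothetical heterozygous loci, each with genotype frequency 0.1.
fifteen_loci = [(0.20, 0.25)] * 15
print(f"{random_match_probability(fifteen_loci):.1e}")  # 1.0e-15
```

Even modest per-locus genotype frequencies of about 0.1 reach the 10⁻¹⁵ range once 15 independent loci are multiplied together.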

Y-STRs are essential for male-female mixtures from sexual-assault evidence because they ignore the female component. They cannot distinguish patrilineal relatives (father, son, brother, paternal uncle all share the same Y haplotype barring mutation), which is a known MCQ trap.

Mitochondrial DNA wins when nuclear DNA has failed. Hundreds to thousands of mtDNA copies per cell mean a hair shaft, a tooth pulp from a 30-year-old case, or charred remains can still type. The high control-region mutation rate gives discrimination, but mtDNA only identifies a maternal lineage, not an individual.

SNPs and SNaPshot assays are used for ancestry, phenotype prediction (eye, hair, skin colour) and for highly degraded DNA where even mini-STRs fail.

DNA extraction

Four mainstream chemistries plus the differential extraction that exists for one specific case type.

Extraction has to do three things: lyse the cell, separate DNA from protein and lipid, and recover the DNA in a buffer compatible with downstream PCR. Sample integrity from the scene through the lab depends on a documented chain of custody; a contaminated swab cannot be rescued by good chemistry. NTA tests the four method names, their pros and cons, and the special case of differential extraction.

Method | Principle | Speed | DNA quality | Automation | Best for
Organic (phenol-chloroform) | Cell lysis with SDS + proteinase K, then phenol-chloroform partitioning. DNA stays in the aqueous phase. | Slow (4+ hours, hazardous reagents) | High yield, long fragments | Manual | Bone, tooth, tissue, reference samples where yield matters more than throughput
Chelex 100 | Chelating resin binds divalent metal ions, preventing nuclease activity during boiling. | Fast (under 1 hour) | Crude single-stranded DNA, PCR-compatible | Manual | Bloodstains, buccal swabs, small reference samples
Silica column (spin-column) | DNA binds silica in high-salt chaotrope, contaminants wash off, elution in low-salt buffer. | Moderate (1-2 hours) | Clean double-stranded DNA, good for STR multiplex | Manual or semi-automated | Wide range of crime-scene samples; QIAamp DNA kits are common in CFSLs
Magnetic bead | DNA binds silica-coated paramagnetic beads, magnet pulls beads to tube wall, wash and elute. | Fast (~30 min on a robot) | Clean, reproducible, automation-friendly | Robotic (Maxwell, EZ1, KingFisher) | High-throughput casework, low-template samples, databasing workflows

Differential extraction is the one method you must name in any sexual-assault evidence question. The vaginal/anal swab carries epithelial cells from the victim and sperm cells from the male contributor. Sperm heads are extraordinarily resistant to lysis because their DNA is packed with disulfide-bonded protamines. The trick:

  1. First lysis with SDS and proteinase K dissolves the epithelial cells. Centrifuge, save the supernatant as the female fraction.
  2. Wash the pellet, then add dithiothreitol (DTT) to reduce the protamine disulfides. Re-lyse with SDS and proteinase K to release the sperm DNA, the male fraction.

Run the two fractions through separate PCR and CE injections. The male profile becomes interpretable without being drowned out by the victim's DNA.

Figure: Differential extraction. SDS/proteinase K lyses epithelial cells first (female fraction); DTT then breaks protamine disulfides to release sperm DNA (male fraction). Each fraction then goes to its own PCR and CE run.

PCR amplification and the STR profiling pipeline

From Kary Mullis 1983 to a multiplex CE injection on an ABI 3500.

PCR principle. A target sequence is amplified exponentially over 28 to 32 thermal cycles. Each cycle has three steps: denaturation at ~94 °C (strand separation), annealing at ~55 to 65 °C (primer binding), extension at ~72 °C (Taq polymerase synthesises the complementary strand). Yield doubles per cycle, so 30 cycles can take 1 pg of template to nanogram quantities of amplicon. Kary Mullis conceived the method in 1983 and won the 1993 Nobel Prize in Chemistry.
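The doubling arithmetic is easy to check numerically. The sketch below assumes ideal amplification unless an efficiency is supplied, uses an approximate average mass of 650 g/mol per base pair of double-stranded DNA, and takes a hypothetical 200 bp amplicon; real reactions plateau and fall short of these figures.

```python
AVOGADRO = 6.022e23       # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one dsDNA base pair


def amplicon_copies(start_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after PCR: N = N0 * (1 + E)^cycles, where E = 1.0 is perfect doubling."""
    return start_copies * (1.0 + efficiency) ** cycles


def amplicon_mass_ng(copies: float, amplicon_bp: int) -> float:
    """Convert a copy number of a double-stranded amplicon into nanograms."""
    grams = copies * amplicon_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e9


copies = amplicon_copies(start_copies=1, cycles=30)
print(f"{copies:.3e} copies")                     # about 1.074e+09 copies
print(f"{amplicon_mass_ng(copies, 200):.2f} ng")  # about 0.23 ng from one starting copy
```

A single starting copy therefore reaches roughly a billion copies after 30 ideal cycles, which is why a handful of template molecules is enough to reach the nanogram range.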

Multiplex STR kits put 15 to 24 primer pairs into a single PCR. Each forward primer carries one of four to six fluorescent dyes (6-FAM, VIC, NED and PET label the AmpFlSTR Identifiler primers, with LIZ reserved for the size standard). Amplicon size + dye colour together resolve each locus on a single capillary injection. Indian labs typically run AmpFlSTR Identifiler Plus (15 autosomal STRs plus amelogenin), PowerPlex 21 or GlobalFiler (21+ loci including CODIS 20). The amelogenin marker is the conventional sex-typing locus: the X amplicon is 6 bp shorter than the Y amplicon, so a male sample shows two peaks and a female shows one.
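The amelogenin logic reduces to checking which of two expected peaks is present. In the sketch below the X amplicon size of 106 bp and the 0.5 bp sizing tolerance are assumptions for illustration; actual sizes depend on the kit. Only the 6 bp X-to-Y difference comes from the text above.

```python
X_SIZE_BP = 106.0             # assumed X amplicon size; kit-dependent
Y_SIZE_BP = X_SIZE_BP + 6.0   # Y amplicon is 6 bp longer than X


def call_amelogenin(peak_sizes_bp, tolerance_bp: float = 0.5) -> str:
    """Classify a sample from the sizes of its amelogenin peaks."""
    has_x = any(abs(size - X_SIZE_BP) <= tolerance_bp for size in peak_sizes_bp)
    has_y = any(abs(size - Y_SIZE_BP) <= tolerance_bp for size in peak_sizes_bp)
    if has_x and has_y:
        return "male (X, Y)"
    if has_x:
        return "female (X, X)"
    # A Y peak without X, or no peaks at all, needs analyst review.
    return "inconclusive"


print(call_amelogenin([106.1, 112.0]))  # male (X, Y)
print(call_amelogenin([105.9]))         # female (X, X)
print(call_amelogenin([]))              # inconclusive
```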

Capillary electrophoresis is the workhorse separation step. The classic Indian setup is an Applied Biosystems 3500 Genetic Analyser, an 8-capillary instrument loaded with POP-4 polymer. Fluorescently labelled amplicons are electrokinetically injected into each capillary, separate by size as they migrate to the detector, and emit at different wavelengths as the laser excites them. The internal LIZ-labelled size standard runs in the same capillary so allele sizing is calibrated injection by injection. CE is fundamentally the high-resolution descendant of slab-gel methods covered in the electrophoresis topic; the separation principle is the same, but the format is a polymer-filled capillary array rather than a vertical slab gel.

The output is an electropherogram: dye-coloured peaks plotted on a size axis. Each STR locus shows one peak (homozygote) or two peaks (heterozygote), and the GeneMapper ID-X software calls the allele numbers against an allelic ladder.

Profile interpretation

Peak heights, stutter, allelic dropout and the Likelihood Ratio framework.

A clean single-source profile from a high-template sample is straightforward: read the alleles, compare to the reference, declare match or exclusion. The interesting work, and the part NTA likes to probe, is in the artefacts and the statistics.

Peak height and stochastic effects. Each peak has a relative fluorescence unit (RFU) value. A clean profile has all peaks above the analytical threshold (typically 50 to 100 RFU) and heterozygote balance above 60 percent. Below the stochastic threshold, one of two alleles at a heterozygous locus may drop out, simulating a false homozygote.
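The threshold logic can be sketched as a small decision function. The 50 RFU analytical threshold and the 60 percent balance figure come from the text above; the 200 RFU stochastic threshold is an assumed value, since each laboratory sets its own through validation.

```python
ANALYTICAL_THRESHOLD_RFU = 50    # lower end of the 50 to 100 RFU range
STOCHASTIC_THRESHOLD_RFU = 200   # assumed value; set per lab by validation
MIN_HET_BALANCE = 0.60           # shorter peak / taller peak


def assess_locus(peak_heights_rfu) -> str:
    """Give a first-pass reading of one locus from its peak heights."""
    called = sorted(
        (h for h in peak_heights_rfu if h >= ANALYTICAL_THRESHOLD_RFU),
        reverse=True,
    )
    if not called:
        return "no result"
    if len(called) == 1:
        # A lone peak below the stochastic threshold may hide a dropped-out partner.
        if called[0] >= STOCHASTIC_THRESHOLD_RFU:
            return "homozygote"
        return "possible dropout"
    if len(called) == 2:
        balance = called[1] / called[0]
        return "heterozygote" if balance >= MIN_HET_BALANCE else "imbalanced heterozygote"
    return "possible mixture"


print(assess_locus([1200]))        # homozygote
print(assess_locus([120]))         # possible dropout
print(assess_locus([1000, 800]))   # heterozygote
print(assess_locus([1000, 400]))   # imbalanced heterozygote
```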

Stutter peaks appear one repeat unit shorter than the true allele (n-4 stutter for the tetranucleotide repeats that dominate forensic kits) at roughly 5 to 15 percent of the parent peak height. They arise from strand slippage during PCR extension. Stutter is a major problem when interpreting mixtures because it can mimic a minor contributor.
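A simple stutter filter flags any peak that sits one repeat below a much taller peak. This is a sketch only: the single 15 percent cutoff is an assumption taken from the upper end of the range quoted above (validated filters are locus-specific), the peak heights are invented, and only whole-number alleles are handled.

```python
def flag_stutter(peaks: dict, max_stutter_ratio: float = 0.15) -> dict:
    """Map each allele to True if it looks like n-1 repeat stutter of a taller peak.

    peaks maps integer allele (repeat number) to peak height in RFU.
    """
    flags = {}
    for allele, height in peaks.items():
        parent_height = peaks.get(allele + 1)  # peak one repeat unit longer
        flags[allele] = (
            parent_height is not None
            and height <= max_stutter_ratio * parent_height
        )
    return flags


# Hypothetical locus: true alleles 12 and 14, each with a small stutter peak.
locus_peaks = {11: 90, 12: 1000, 13: 80, 14: 950}
print(flag_stutter(locus_peaks))
# {11: True, 12: False, 13: True, 14: False}
```

In a mixture, a genuine minor-contributor allele at position 11 or 13 would be flagged the same way, which is exactly the ambiguity described above.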

Allelic dropout, drop-in, and degradation. Low-template DNA shows allelic dropout (locus appears homozygous when it is actually heterozygous), allelic drop-in (sporadic contamination peaks), and a characteristic ski-slope electropherogram (taller peaks at small loci, shorter at large loci) when the sample is degraded.
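The ski-slope pattern can be quantified as the slope of peak height against amplicon size. The sketch below fits an ordinary least-squares line; the peak data and the cutoff of -1 RFU per bp are hypothetical and carry no validated meaning.

```python
def height_slope(peaks) -> float:
    """Least-squares slope of peak height (RFU) against amplicon size (bp)."""
    n = len(peaks)
    mean_size = sum(size for size, _ in peaks) / n
    mean_height = sum(height for _, height in peaks) / n
    sxx = sum((size - mean_size) ** 2 for size, _ in peaks)
    sxy = sum((size - mean_size) * (height - mean_height) for size, height in peaks)
    return sxy / sxx


def looks_degraded(peaks, slope_cutoff: float = -1.0) -> bool:
    """Flag a profile whose peak heights fall steeply with size (assumed cutoff)."""
    return height_slope(peaks) < slope_cutoff


# Hypothetical (size_bp, height_rfu) pairs.
degraded = [(100, 2000), (200, 1500), (300, 1000), (400, 500)]
intact = [(100, 1500), (200, 1480), (300, 1520), (400, 1500)]

print(height_slope(degraded), looks_degraded(degraded))  # -5.0 True
print(looks_degraded(intact))                            # False
```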

Mixture interpretation is the hardest part of casework. Analysts determine the number of contributors, estimate the mixture proportion from peak-height ratios, and either deconvolve or evaluate likelihoods using probabilistic genotyping software like STRmix or EuroForMix. The output is a Likelihood Ratio: the probability of the evidence under the prosecution hypothesis (the suspect is a contributor) divided by the probability under the defence hypothesis (an unknown unrelated person is a contributor). An LR of 10⁹ means the evidence is a billion times more probable under the prosecution hypothesis than under the defence hypothesis. Modern Indian DNA reports increasingly cite LRs rather than simple match probabilities.
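The Likelihood Ratio itself is a simple quotient, shown below for the easiest case of a clean single-source profile. This sketch is not how probabilistic genotyping software works internally; it assumes independent loci, and all numbers are hypothetical.

```python
from math import log10


def likelihood_ratio(p_evidence_given_hp: float, p_evidence_given_hd: float) -> float:
    """LR = P(evidence | prosecution hypothesis) / P(evidence | defence hypothesis)."""
    if p_evidence_given_hd <= 0:
        raise ValueError("P(evidence | Hd) must be positive")
    return p_evidence_given_hp / p_evidence_given_hd


def combined_log10_lr(per_locus_lrs) -> float:
    """Combine independent per-locus LRs by summing their base-10 logarithms."""
    return sum(log10(lr) for lr in per_locus_lrs)


# Single-source match: the evidence is certain under Hp, and under Hd its
# probability is the genotype frequency in the population (hypothetical 1e-9).
lr = likelihood_ratio(1.0, 1e-9)
print(f"LR = {lr:.1e}")  # LR = 1.0e+09

# Three hypothetical per-locus LRs multiply to 1000, i.e. log10(LR) of about 3.
print(round(combined_log10_lr([10, 20, 5]), 6))  # 3.0
```

Reporting on a log scale is convenient because per-locus values add rather than multiply, and the exponent can be read off directly.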

Indian institutional context

CDFD, CFSL DNA divisions, BSA 2023, BNSS 2023 and the DNA Technology Bill.

The Indian DNA ecosystem is centred on a few apex institutions that NTA names directly. The Centre for DNA Fingerprinting and Diagnostics (CDFD), Hyderabad is the autonomous institute under the Department of Biotechnology that grew out of the DNA fingerprinting work Dr Lalji Singh pioneered at CCMB Hyderabad in the late 1980s. The CFSL Hyderabad DNA division serves as the central forensic DNA reference laboratory under DFSS. Other CFSLs at Kolkata, Chandigarh and Delhi run their own DNA units, and most major states have a DNA division in their SFSL. The National Forensic Sciences University (NFSU), Gandhinagar trains the next generation of DNA analysts.

The DNA Data Bank envisaged under DFSS is the Indian equivalent of CODIS, intended to hold convicted-offender profiles, crime-scene profiles, missing-persons profiles and unidentified-deceased profiles. Operational rollout depends on the legislation discussed below.

Statutory framework.

  • The Bharatiya Sakshya Adhiniyam 2023 (BSA, replacing the Indian Evidence Act 1872) governs admissibility. Section 39 makes expert opinion (including DNA) admissible; Section 63 deals with electronic records, which now covers DNA electropherograms generated and stored digitally.
  • The Bharatiya Nagarik Suraksha Sanhita 2023 (BNSS, replacing the CrPC 1973), read with the Criminal Procedure (Identification) Act 2022 (which repealed and replaced the Identification of Prisoners Act 1920), provides for collection of fingerprint, footprint, palm-print, photograph and biological specimens from arrested and convicted persons. Section 349 BNSS is the statutory hook for compelled sample collection.
  • The DNA Technology (Use and Application) Regulation Bill was introduced in 2019 to create a National DNA Data Bank, Regional DNA Data Banks, a DNA Regulatory Board and a scheme of accreditation for DNA labs. The Bill was withdrawn in 2023 and has not been reintroduced as of this writing; the policy intent, however, is what NTA tests.

Three case-law anchors worth knowing: Selvi v. State of Karnataka (2010) on involuntary narco/polygraph/BEAP (DNA sample collection is not a Selvi-protected category once BNSS mandate exists), Nandlal Wasudeo Badwaik v. Lata Nandlal Badwaik (2014) where the Supreme Court held that DNA tests prevail over the presumption of legitimacy under the Evidence Act, and Inayath Ali v. State of Telangana (2022) on consent in paternity-disputed DNA orders.

Who discovered DNA fingerprinting and what marker did they use?
Sir Alec Jeffreys at the University of Leicester, in September 1984. His original method used multi-locus probes targeting VNTRs (Variable Number Tandem Repeats) and was read out by Southern-blot RFLP. Modern forensic profiling no longer uses VNTRs; it uses STRs amplified by PCR and resolved on capillary electrophoresis. The Jeffreys-VNTR vs modern-STR distinction is a favourite NTA distractor.
Why are STRs preferred over VNTRs in forensic DNA profiling today?
Three reasons. First, STR amplicons are short (under 400 bp), so a partially degraded sample still yields a typeable signal where the kilobase-scale VNTR fragments would have broken down. Second, STRs multiplex cleanly: 15 to 24 loci amplify in a single PCR tube. Third, PCR needs only picograms of template, while RFLP-VNTR required nanograms of intact DNA. STR-PCR-CE is faster, more sensitive, and database-compatible across labs (CODIS, NDNAD, ESS).
What is differential extraction and why is it asked in UGC-NET sexual-assault questions?
Sexual-assault swabs carry victim epithelial cells and male sperm cells. A standard lysis would dissolve both and the male profile would be drowned out by the victim's DNA. Differential extraction first lyses the epithelial cells with SDS and proteinase K, saves the supernatant as the female fraction, then uses dithiothreitol (DTT) to reduce sperm-head protamine disulfides and release the sperm DNA as a separate male fraction. Each fraction is amplified and typed independently.
What are the CODIS core STR loci and how many are there?
CODIS (Combined DNA Index System, FBI, launched 1998) originally defined 13 core STR loci that every contributing US lab had to type. In January 2017 the FBI expanded the core to 20 loci to improve discrimination and align with international kits. Indian labs do not contribute to CODIS, but the 20-locus convention drives the design of multiplex kits used here: GlobalFiler, PowerPlex Fusion, and similar.
Which Indian institution is the apex centre for DNA analysis and which statute governs admissibility?
The Centre for DNA Fingerprinting and Diagnostics (CDFD), Hyderabad, established in 1995, is the apex autonomous institute and was the cradle of Indian forensic DNA work under Dr Lalji Singh. Operational casework runs through the CFSL DNA divisions (Hyderabad, Kolkata, Chandigarh, Delhi) under DFSS and through state SFSL DNA units. Admissibility of expert DNA opinion is governed by Section 39 of the Bharatiya Sakshya Adhiniyam 2023. Sample collection from arrested persons is mandated under BNSS 2023 read with the Criminal Procedure (Identification) Act 2022.
