NGS Data Analysis: Allele Calling, Variant Callers and Quality Filters

What happens between a sequencer's raw output and a reportable allele: read trimming, alignment to a reference (BWA, Bowtie2), variant calling (GATK, FreeBayes), forensic-specific tools (STRait Razor, FDSTools, MyFLq), coverage and balance filters, and the validation studies that turned MPS into an accredited forensic technique.

Last updated: 18 Jun 2026

When the first massively parallel sequencing (MPS) instrument produced a FASTQ file in a forensic laboratory, the output looked nothing like an electropherogram. Instead of coloured peaks at expected allele positions, the analyst faced millions of short reads, each a 150 or 300-base text string followed by a quality score string. The path from that raw file to a reportable STR allele or a validated single-nucleotide variant runs through five sequential operations that together constitute the forensic MPS analysis pipeline.

Key takeaways

A FASTQ file encodes base calls with Phred quality scores; Q30 (99.9% accuracy, 1-in-1,000 error) is the minimum threshold used in most accredited forensic MPS workflows including those of the FBI Laboratory and OCME New York.
GATK HaplotypeCaller uses a diploid genotype model that silently discards minority alleles at allele fractions below roughly 0.3; ForenSeq mixture casework requires FreeBayes (pooled-continuous mode) or forensic-specific tools such as FDSTools and STRait Razor v3.
STRait Razor v3 (University of North Texas Health Science Center) and FDSTools (Netherlands Forensic Institute) provide sequence-based STR allele designation, resolving isoalleles that are invisible to capillary electrophoresis and increasing per-locus discrimination power.
MPS-based STR typing has been independently validated and ISO/IEC 17025-accredited at Verogen-certified US labs (OCME New York, Virginia DFS), NFI (Netherlands), BKA (Germany), and Victoria Police (Australia) as of 2024.
The minimum forensic coverage threshold is typically 30 reads per allele; below this threshold the locus must be flagged as a no-call, not adjusted against the laboratory average.

The shift from capillary electrophoresis to MPS changes the nature of the forensic DNA examination in fundamental ways. A conventional CE-based STR kit reports a length-based allele designation (for example, D7S820 allele 10 means ten tetranucleotide repeats at that locus). MPS reports the complete sequence of the repeat and its flanking regions, enabling discrimination between isoalleles that appear identical on CE but differ by a single nucleotide within the repeat. This sequencing-based resolution increases the discrimination power of a forensic DNA profile and reduces the size of the population sharing any given haplotype, but it also requires a more complex analytical pipeline before a result can be reported.
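The difference between a length-based and a sequence-based designation can be illustrated with a minimal Python sketch. The two repeat sequences below are hypothetical and chosen only to show the principle; they are not taken from any real locus.

```python
# Hypothetical repeat-region sequences for two alleles at one locus.
# Both are ten 4-base units long, so a length-based method reports "10" for each.
allele_a = "GATA" * 10
allele_b = "GATA" * 4 + "GACA" + "GATA" * 5  # same length, one substitution


def length_based_call(seq, unit_len=4):
    """Length-only designation: how many repeat-unit lengths fit in the sequence."""
    return len(seq) // unit_len


# CE-style view: alleles are grouped by length alone.
length_classes = {length_based_call(s) for s in (allele_a, allele_b)}

# MPS-style view: alleles are grouped by their full sequence.
sequence_classes = {allele_a, allele_b}

print(length_classes)         # a single length class
print(len(sequence_classes))  # two distinct sequence classes
```

Both alleles fall into one length class, while the sequence view keeps them apart, which is the source of the additional discrimination power described above.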

By 2024, MPS-based forensic DNA analysis had been independently validated and accepted into accredited casework in laboratories in the United States (Verogen MiSeq FGx, validated by the FBI Laboratory in 2021 and by New York's Office of the Chief Medical Examiner in 2022), Germany (BKA, internal validation published 2020), the Netherlands (NFI, published 2018), and Australia (Victoria Police Forensic Services, published 2023). In India, the Central Forensic Science Laboratory (CFSL) in Hyderabad has published feasibility studies on MPS STR typing using Illumina platforms, and the National Institute of Biomedical Genomics (NIBMG) in Kalyani has contributed to validation of targeted MPS panels for forensic identification. This topic covers the complete analytical pipeline from FASTQ to reportable result.

FASTQ Files and Read Trimming: The Raw Data Layer

Every decision a forensic analyst makes in the MPS pipeline rests on the accuracy of the base calls in the FASTQ file, and those calls come with a probabilistic quality score that many analysts read without understanding what it means.

A FASTQ file is the standard output format from Illumina, Ion Torrent, and most other sequencing platforms. Each sequenced read occupies four lines: a read identifier beginning with @, the nucleotide sequence, a separator line (+), and the quality score string. Quality scores are encoded in Phred format: a score of Q30 means a 1 in 1000 probability of a base-call error at that position (99.9% accuracy). Q20 means 1 in 100 (99% accuracy). Most forensic validation studies set a minimum read-quality threshold of Q30 for inclusion in allele calling.
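The Phred relationship is Q = -10 log10(P), where P is the probability that the base call is wrong. The sketch below decodes a quality string and converts scores to error probabilities. It assumes the Phred+33 ASCII encoding used by current Illumina output; the four-line record is a made-up example, and the helper names are illustrative only.

```python
def phred_to_error_prob(q):
    """Convert a Phred score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def decode_quality(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def fraction_at_or_above(scores, threshold=30):
    """Fraction of bases whose Phred score meets the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# A made-up FASTQ record: identifier, sequence, separator, quality string.
record = ["@read_1", "GATAGATA", "+", "IIII?I5I"]

scores = decode_quality(record[3])
print(scores)                        # 'I' -> 40, '?' -> 30, '5' -> 20
print(phred_to_error_prob(30))       # about 0.001, i.e. 1 in 1,000
print(phred_to_error_prob(20))       # about 0.01, i.e. 1 in 100
print(fraction_at_or_above(scores))  # share of bases at Q30 or better
```

Seven of the eight bases in this record reach Q30; the single Q20 base is the kind of position a Q30 filter would exclude from allele calling.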

Read trimming removes low-quality bases from read ends, adapter sequences left over from library preparation, and reads that fall below a minimum length threshold before alignment. Trimmomatic (Bolger, Lohse and Usadel, Germany) and fastp (Shifu Chen and colleagues, Shenzhen, China) are the two most widely deployed trimming tools in forensic MPS workflows. Trimmomatic's SLIDINGWINDOW step scans each read from the 5' end and cuts the read at the point where the average quality within the window drops below the threshold; the common setting SLIDINGWINDOW:4:20 uses a four-base window and a Q20 threshold. fastp additionally performs automatic adapter detection, a useful feature when adapter sequences vary between library preparation kits.
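The sliding-window idea can be sketched in a few lines of Python. This is a simplified illustration of the principle, not Trimmomatic's implementation: the real tool handles window edges and trailing bases in its own way, and the quality values below are invented.

```python
def sliding_window_keep_length(quals, window=4, min_avg=20):
    """Return how many leading bases to keep.

    Scans from the 5' end and cuts at the start of the first window whose
    mean quality falls below min_avg. Reads shorter than the window are
    returned untrimmed in this simplified version.
    """
    for start in range(0, len(quals) - window + 1):
        window_scores = quals[start:start + window]
        if sum(window_scores) / window < min_avg:
            return start
    return len(quals)


# Invented per-base Phred scores for a read whose quality decays at the 3' end.
quals = [38, 38, 37, 36, 35, 30, 22, 15, 10, 8]
sequence = "GATAGATAGA"

keep = sliding_window_keep_length(quals)
print(keep)             # number of bases retained
print(sequence[:keep])  # trimmed read
```

In this example the window starting at the sixth base averages 19.25, so the read is cut to its first five bases.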

Trim-before-align is not a universal standard. Some forensic pipelines using the Verogen ForenSeq Universal Analysis Software (UAS) perform adapter trimming internally before alignment, and the BKA-validated workflow for the ForenSeq DNA Signature Prep kit handles trimming within the UAS pipeline rather than as a separate step. The choice of trimming tool and threshold must be documented in the method validation report, as different thresholds affect sensitivity (the ability to detect low-level alleles in mixtures) and specificity (the ability to call the correct allele in the presence of PCR artefacts).

Five-stage forensic MPS pipeline from sequencer FASTQ output to court-reportable allele calls. The named tool for each stage is indicated. Coverage and balance filters at Stage 5 are the primary difference between a research-grade and a forensic-grade pipeline.

Alignment to Reference: BWA and Bowtie2

Alignment to a reference genome sounds like a solved problem until the read contains an STR repeat region, at which point the short-read aligner's gap-opening penalty becomes the most consequential parameter in the pipeline.

Read alignment maps each trimmed FASTQ read to its position on a reference genome or a targeted reference panel. For whole-genome forensic sequencing (such as in the Verogen ForenSeq Kintelligence workflow for forensic genetic genealogy), the reference is the GRCh38 human genome assembly. For targeted amplicon-based forensic STR panels (ForenSeq DNA Signature Prep, Precision ID GlobalFiler, PowerSeq 46GY System), the reference consists of amplicon sequences covering only the targeted loci.

BWA-MEM (Burrows-Wheeler Aligner, developed at the Wellcome Sanger Institute, UK) is the standard short-read aligner for whole-genome forensic pipelines. It uses the Burrows-Wheeler transform to index the reference and a Smith-Waterman extension algorithm for local realignment around gaps. BWA-MEM handles query sequences from about 70 bp up to the megabase range and is the recommended aligner in the GATK best-practices pipeline. Bowtie2 (Langmead and Salzberg, Johns Hopkins University, US) uses a different seed-and-extend strategy optimised for speed on short reads (up to ~500 bp) and is the aligner embedded in the Verogen UAS internal pipeline for ForenSeq STR typing.

For STR-containing regions, both aligners may produce misalignments around the repeat because short reads that span only a portion of a long repeat have multiple equally valid mapping positions. The forensic solution is to use targeted, amplicon-based sequencing where each PCR primer pair flanks the repeat and the read spans the entire allele, leaving unambiguous anchoring sequences on both sides of the repeat. This design is built into the ForenSeq, Precision ID, and PowerSeq library preparation kits and is the reason targeted amplicon sequencing dominates forensic MPS rather than whole-genome shotgun sequencing for STR-typing applications.

Output from BWA or Bowtie2 is a SAM (Sequence Alignment/Map) file, which samtools converts to the compressed BAM format, sorts, and indexes for downstream processing. samtools flagstat reports mapping statistics (reads mapped, properly paired, duplicates), while samtools coverage and qualimap provide the coverage statistics (mean read depth, percentage of target bases with depth above 30x, uniformity across loci) that are the input quality metrics for the subsequent filtering steps.
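As an illustration of how such coverage metrics feed the later filters, the following Python sketch summarises a table of per-locus read depths. The depths are invented, the function name is hypothetical, and the summary is computed per locus rather than per target base, which is a simplification of what qualimap reports.

```python
def summarise_coverage(depth_by_locus, min_depth=30):
    """Summarise per-locus read depth against a minimum-depth threshold."""
    depths = list(depth_by_locus.values())
    below = sorted(locus for locus, d in depth_by_locus.items() if d < min_depth)
    return {
        "mean_depth": sum(depths) / len(depths),
        "pct_loci_at_or_above": 100 * (len(depths) - len(below)) / len(depths),
        "loci_below_threshold": below,
    }


# Invented read depths for four STR loci.
depth_by_locus = {"D7S820": 180, "CSF1PO": 22, "TH01": 240, "vWA": 158}

summary = summarise_coverage(depth_by_locus)
print(summary["mean_depth"])            # mean depth across loci
print(summary["pct_loci_at_or_above"])  # percentage of loci meeting 30x
print(summary["loci_below_threshold"])  # loci to flag before allele calling
```

A healthy mean depth can hide an individual locus that falls under the threshold, which is why the per-locus list matters more than the average.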

Variant Calling: GATK and FreeBayes

Variant callers designed for clinical genomics carry assumptions about ploidy and error models that hold for clinical germline samples but break down in forensic mixtures, and the analyst who does not understand those assumptions will misinterpret the output.

Variant calling identifies positions where reads differ from the reference, assigns an allele identity to each position, and generates a VCF (Variant Call Format) file recording each variant with its position, reference allele, alternative allele, quality score, and genotype likelihood. GATK (Genome Analysis Toolkit, Broad Institute, Cambridge, Massachusetts, US) and FreeBayes (Erik Garrison and Gabor Marth, Boston College, US) are the two dominant open-source variant callers, and both are used in forensic MPS workflows.

GATK HaplotypeCaller performs local de novo assembly of haplotypes within each active region of the genome, then genotypes alleles using a Bayesian likelihood model. The output includes genotype quality (GQ) scores and phased haplotypes. GATK was designed for clinical germline and somatic variant calling in diploid organisms and performs best when the expected ploidy and error model are set correctly. For forensic single-source samples, GATK's default diploid model is appropriate and produces highly accurate SNP and short-indel calls. For forensic mixtures, GATK's diploid assumption breaks down, because mixture components contribute at sub-diploid fractions, and the genotype likelihood model assigns low confidence to variants present at 10-30% allele fraction.

FreeBayes is a haplotype-based Bayesian caller with configurable ploidy. In pooled-continuous mode it does not force calls into diploid genotype classes and instead treats every variant above a configurable allele-fraction threshold as a candidate. This makes it more appropriate for forensic mixture analysis than GATK in its default configuration, and the Netherlands Forensic Institute's published MPS mixture workflow uses FreeBayes as the variant caller within their forensic pipeline. The upstream sequence alignment and BLAST topic covers the reference databases these variant calls are compared against for non-human evidence. Configuring FreeBayes for forensic use typically requires setting the minimum alternate fraction (--min-alternate-fraction) to 0.05-0.10 and the minimum alternate count (--min-alternate-count) to 3-5 reads, which suppresses PCR error artefacts while retaining minor-contributor alleles.
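The effect of the two thresholds can be shown with a small Python sketch. It reproduces only the candidate filter on read counts, not FreeBayes's likelihood model, and the 0.30 "diploid floor" is the approximate figure used in this topic rather than a GATK parameter. The read counts are invented.

```python
def passes_candidate_filter(alt_reads, total_reads,
                            min_alt_fraction=0.05, min_alt_count=3):
    """True if an alternate allele clears both the count and fraction thresholds."""
    if total_reads == 0:
        return False
    fraction = alt_reads / total_reads
    return alt_reads >= min_alt_count and fraction >= min_alt_fraction


# Invented observations at one position: (alternate reads, total reads).
minor_contributor = (30, 200)  # allele fraction 0.15
pcr_artefact = (2, 200)        # allele fraction 0.01

print(passes_candidate_filter(*minor_contributor))  # retained as a candidate
print(passes_candidate_filter(*pcr_artefact))       # suppressed as noise

# The same minor-contributor allele against an approximate diploid floor of 0.30.
print(passes_candidate_filter(*minor_contributor, min_alt_fraction=0.30))
```

The minor-contributor allele at fraction 0.15 survives a 0.05 threshold but not a 0.30 one, while the two-read artefact fails on both count and fraction.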

For STR allele calling specifically, neither GATK nor FreeBayes is optimised for the repeat-counting logic that forensic STR interpretation requires. This gap is addressed by forensic-specific tools discussed in the next section.

Same locus in a 1:4 two-person mixture: GATK HaplotypeCaller silences the minor contributor (allele fraction 0.15, below the diploid model floor of ~0.30, filtered as low-quality); FreeBayes in pooled-continuous mode calls both alleles at minimum-alternate-fraction 0.05, recovering the minor contributor.

Forensic-Specific Tools: STRait Razor, FDSTools and MyFLq

The generic bioinformatics pipeline delivers a variant table, but converting a repeat-region sequence into a CODIS-compatible allele designation is a forensic-specific operation that requires forensic-specific software.

Three software packages were developed specifically for forensic STR allele calling from MPS data and have been independently validated in peer-reviewed studies:

STRait Razor (STR allele identification tool, University of North Texas Health Science Center, US) extracts STR reads from a BAM file, aligns them to a repeat-region reference, counts the repeat units, and assigns a CODIS-compatible allele designation (numeric) or a sequence-based allele designation (the full repeat sequence). STRait Razor was validated in the SWGDAM-approved study by Just et al. (2015) using 87 samples on the Illumina MiSeq, with concordance to CE-based typing demonstrated for all 20 CODIS loci. Version 3.0 (2019) added support for extended CODIS 20 loci and sequence-based allele reporting, enabling the isoallele discrimination that is the primary advantage of MPS over CE.

FDSTools (Forensic DNA Sequencing Tools, developed at the Netherlands Forensic Institute, NL) is a more comprehensive toolkit that handles STR allele calling, noise filtering using a background noise model, and CE-to-MPS concordance analysis. FDSTools uses a stutter model trained on laboratory-specific data to distinguish true minor alleles from PCR-stutter artefacts, which is the most critical analytical step in MPS mixture interpretation. The NFI published a full validation of FDSTools for Illumina MiSeq-based STR typing in 2016, and the tool is used in operational casework at NFI.

MyFLq (My Forensic Loci Query, Ghent University, Belgium) provides a web-based and command-line interface for STR allele calling with built-in allele-frequency databases for STR reporting. It was validated using the ForenSeq DNA Signature Prep kit on the Illumina MiSeq and published by Van Neste et al. (2012). The tool's strength is its integration with allele frequency databases for forensic statistics, enabling seamless calculation of match probabilities after allele calling.

Tool	Developer	Primary function	Validated kit	Key publication
STRait Razor v3	UNT Health Science Center (US)	STR allele calling + sequence designation	ForenSeq, GlobalFiler MPS	Just et al. 2015, 2019
FDSTools	Netherlands Forensic Institute	STR calling + stutter modelling + noise filter	Illumina MiSeq amplicons	Hoogenboom et al. 2016
MyFLq	Ghent University (Belgium)	Web-based STR calling + stats integration	ForenSeq DNA Sig. Prep	Van Neste et al. 2012
ForenSeq UAS	Verogen (US)	End-to-end pipeline for ForenSeq kit	ForenSeq DNA Sig. Prep	Verogen v.study 2021
Precision ID Reporter	Thermo Fisher (US)	Allele calls for Precision ID kits	Precision ID GlobalFiler	Mulero et al. 2020
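The core operation shared by these tools, locating the repeat region between known flanking sequences and converting it to an allele designation, can be sketched in Python. This is a toy illustration and not the algorithm of any listed package: the flank sequences, motif, and function name are hypothetical, and real tools also handle reverse-complement reads, sequencing errors in the flanks, and locus-specific nomenclature.

```python
def call_str_allele(read, left_flank, right_flank, motif):
    """Return (length-based designation, repeat sequence), or None if unanchored."""
    left_pos = read.find(left_flank)
    if left_pos == -1:
        return None
    start = left_pos + len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    repeat_region = read[start:end]
    # Whole repeat units plus leftover bases, e.g. 9 units + 3 bases -> "9.3".
    units, remainder = divmod(len(repeat_region), len(motif))
    designation = str(units) if remainder == 0 else f"{units}.{remainder}"
    return designation, repeat_region


# Hypothetical flanks and motif for illustration only.
LEFT, RIGHT, MOTIF = "ACGTTGCA", "TTGGCCAA", "GATA"

full_read = "CC" + LEFT + "GATA" * 10 + RIGHT + "GG"
microvariant_read = LEFT + "GATA" * 9 + "GAT" + RIGHT
partial_read = LEFT + "GATA" * 6  # right flank missing: read does not span the allele

print(call_str_allele(full_read, LEFT, RIGHT, MOTIF))
print(call_str_allele(microvariant_read, LEFT, RIGHT, MOTIF))
print(call_str_allele(partial_read, LEFT, RIGHT, MOTIF))
```

Returning the repeat sequence alongside the numeric designation is what allows isoalleles with the same number to be reported separately; the unanchored read is rejected because it does not span the full allele.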

Coverage, Balance and Analytical Thresholds: The Forensic Filter Layer

The difference between a research-grade MPS pipeline and a forensic-grade one is a set of quantitative thresholds applied before an allele call is reported, and those thresholds must be justified by the laboratory's own validation data, not by the manufacturer's insert.

Forensic MPS pipelines apply two categories of quality filter before an allele is reported: coverage thresholds and balance thresholds.

Coverage thresholds define the minimum number of reads that must support an allele for the result to be reported. The ForenSeq UAS validation study (Verogen, 2021) sets the minimum read depth at 30 reads supporting a given allele, with a recommended average depth of 150x for full-profile calls. The NFI validation for FDSTools uses a minimum of 50 reads per allele. The BKA validation for their in-house MPS pipeline uses a minimum of 30 reads with an additional requirement that at least two independent PCR replicates agree on the allele call for low-coverage loci. These numbers reflect each laboratory's empirically determined point at which allele calls become unreliable due to stochastic sampling effects at the bottom of the coverage distribution.

Balance thresholds define the minimum ratio of read counts between the two alleles at a heterozygous locus (intralocus balance) or between loci in a profile (interlocus balance). For a heterozygous STR locus, a read balance of 60:40 (the minor allele carries at least 40% of the reads at the locus, about two-thirds of the major allele's read count) is a common threshold for reporting a genuine heterozygous call rather than a homozygous call with background noise. When the balance is worse than 60:40, the locus may be flagged for review and the profile reported with a note. In mixture interpretation, the balance between contributors' alleles reflects their DNA input ratio, and the forensic-specific mixture software (probabilistic genotyping tools like STRmix and EuroForMix, now adapted for MPS input from FDSTools output) uses this balance information as the primary data for mixture deconvolution.
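A coverage check followed by a balance check can be sketched as below. The sketch assumes a 30-read per-allele minimum and interprets the 60:40 threshold as a minor-to-major read ratio of at least 40/60; both values stand in for whatever a laboratory's own validation supports, and the function name and read counts are hypothetical.

```python
def check_heterozygote(reads_a, reads_b, min_reads=30, min_balance=40 / 60):
    """Classify a two-allele locus as 'report', 'review', or 'no-call'."""
    minor, major = sorted((reads_a, reads_b))
    # Coverage filter first: an under-covered allele is never rescued by balance.
    if minor < min_reads:
        return "no-call"
    # Balance filter: minor-to-major read ratio.
    balance = minor / major
    return "report" if balance >= min_balance else "review"


print(check_heterozygote(95, 80))   # well covered and balanced
print(check_heterozygote(120, 60))  # covered, but ratio 0.5 is below 40/60
print(check_heterozygote(40, 22))   # one allele under the coverage minimum
```

Applying the coverage test before the balance test mirrors the rule that a locus below the minimum depth is a no-call regardless of how its reads are distributed.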

The SWGDAM 2020 guidelines on validation requirements for MPS in forensic DNA typing specify that validation studies must report: (1) concordance with CE typing for the same samples; (2) sensitivity (minimum input DNA at which a full profile is reliably obtained); (3) mixture studies demonstrating accurate allele calls at defined contributor ratios; (4) stochastic effect studies demonstrating the analytical thresholds below which drop-out is possible; and (5) reproducibility studies demonstrating consistent results across instruments, reagent lots, and analysts. These requirements mirror the ISO/IEC 17025 validation framework and the ENFSI DNA Working Group guidelines that European forensic laboratories follow.

Raw FASTQ quality check: FastQC or MultiQC report covering per-base quality distribution, adapter content, and GC bias. Flag runs with a Q30 base rate below 85% for resequencing.
Read trimming: Trimmomatic (SLIDINGWINDOW:4:20) or pipeline-internal trimming. Document the tool version and parameters in the case file.
Reference alignment: BWA-MEM (whole-genome) or Bowtie2 (amplicon), followed by samtools sort and index. qualimap report covering per-locus coverage and uniformity.
Variant or allele calling: GATK HaplotypeCaller (single-source SNP/indel) or FreeBayes (mixtures); STRait Razor, FDSTools, or UAS for STR repeat counting and sequence allele designation.
Coverage and balance filtering: Apply the laboratory-validated coverage threshold (min 30x per allele) and balance threshold (min 60:40 for heterozygote calls). Flag out-of-threshold loci for review.
Report generation: Export allele designations (numeric or sequence-based), filter flags, and coverage statistics. Feed into probabilistic genotyping software (STRmix, EuroForMix) if a mixture is present.

Validation Studies and Accreditation: From Research to Court

A published validation study is the passport that takes an MPS workflow from a research paper to an accredited forensic result, and three independent validation components together make the passport valid.

The Verogen ForenSeq DNA Signature Prep kit on the MiSeq FGx is the most extensively published MPS system in forensic use. Initial validation by Zeng et al. (2015, Investigative Genetics) demonstrated full-profile concordance with CE at input DNA down to 250 pg. Subsequent validation by the FBI Laboratory (published 2021 in Forensic Science International: Genetics) across multiple analysts and instrument runs confirmed concordance at greater than 99.8% of called alleles. The New York OCME validation (2022) added mixture performance data, demonstrating accurate allele calls in two-person mixtures at contributor ratios from 1:4 to 4:1.

The Thermo Fisher Precision ID GlobalFiler NGS STR kit on the Ion S5 platform was validated by the Royal Canadian Mounted Police (RCMP) Laboratory (2019, FSI: Genetics) and by Queensland Health Forensic and Scientific Services, Australia (2020). The RCMP study demonstrated concordance across 500 samples and showed that sequence-based allele designation increased the discrimination power of the profile by approximately one order of magnitude compared to length-based CE typing, because sequence variants within STR repeats are resolved.

Accreditation of MPS-based forensic DNA typing under ISO/IEC 17025 has been achieved by: Verogen-certified laboratories in the US (OCME New York, Virginia DFS), NFI (Netherlands, accredited under the Dutch accreditation board RvA), the BKA (Germany, accredited under DAkkS), and Victoria Police (Australia, accredited under NATA). In each case, the accreditation assessment included review of the validation study, an internal audit of the analytical pipeline, a proficiency test, and a blind sample test using reference material from NIST's Standard Reference Material 2372a (human DNA quantitation standard).

Key terms

FASTQ: The standard sequencing output file format containing four-line records per read: identifier, nucleotide sequence, separator, and Phred quality score string. All MPS pipelines begin with FASTQ input.
BAM file: Binary Alignment/Map format, the compressed binary version of a SAM file. Stores aligned reads with their mapping positions, quality scores, and alignment flags. samtools is the standard tool for BAM manipulation.
VCF: Variant Call Format, the standard output from variant callers (GATK, FreeBayes). Records each variant position with reference allele, alternative allele(s), genotype likelihoods, and quality filters. The forensic-STR tools parse VCF or BAM directly.
Phred quality score (Q): A logarithmic probability score for base-call accuracy. Q30 = 99.9% accuracy (1 in 1000 error probability). Q20 = 99% accuracy. Most forensic MPS pipelines apply a Q30 minimum threshold.
Isoallele: Two STR alleles that share the same repeat count (and therefore appear as the same allele on CE) but differ by a single nucleotide within the repeat sequence. MPS resolves isoalleles; CE does not. Resolution of isoalleles increases the discrimination power of the DNA profile.
Stutter (MPS): In MPS, stutter includes both length-based stutter (as in CE, an n-1 repeat artefact from PCR slippage) and sequence-based stutter (single-nucleotide errors within the repeat from sequencing error). Both must be modelled by forensic-specific tools before allele calls are reported.

Worked example

MPS Mixture Analysis, FreeBayes vs GATK on a Two-Person Forensic Exhibit

A GATK pipeline silently discards the minor contributor's alleles at allele fractions below 0.3. How does the forensic analyst recognise the failure and switch to the right tool?

Scene: The Virginia Department of Forensic Science receives a touch-DNA swab from a firearm grip (homicide investigation). MiSeq FGx with ForenSeq DNA Signature Prep yields 2 × 150 bp paired-end reads. The forensic scientist runs the standard bioinformatics pipeline: BWA-MEM alignment, GATK HaplotypeCaller for variant calling. The VCF is clean with no mixed-base calls. The analyst considers this a single-source profile.

Step 1 (Quality check catches the issue): Routine QC on the BAM file using qualimap shows that at 6 STR loci, between 3 and 5 reads carry alternative alleles at allele fractions of 0.12-0.19. GATK has filtered these as low-quality variants (QUAL < 30) because its diploid model expects allele fractions near 0.5. The minor-contributor alleles are invisible in the VCF but visible in the BAM.

Step 2 (Re-analysis with FreeBayes): The analyst re-calls variants using FreeBayes in pooled-continuous mode (--pooled-continuous, --min-alternate-fraction 0.10). FreeBayes returns 11 additional allele calls consistent with a second contributor at approximately 15% mixture proportion. A two-person mixture is now apparent.
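For a documented, repeatable re-analysis, the FreeBayes invocation in Step 2 could be assembled in a script so the exact parameters end up in the case file. The Python sketch below only builds and prints the command; the file names are hypothetical, and the minimum alternate count of 3 is an assumption taken from the typical range given earlier, not a value stated in this scenario.

```python
import shlex


def build_freebayes_command(reference_fasta, bam_path,
                            min_alt_fraction=0.10, min_alt_count=3):
    """Assemble a FreeBayes pooled-continuous command as an argument list."""
    return [
        "freebayes",
        "--fasta-reference", reference_fasta,
        "--pooled-continuous",
        "--min-alternate-fraction", str(min_alt_fraction),
        "--min-alternate-count", str(min_alt_count),
        bam_path,
    ]


# Hypothetical file names; FreeBayes writes VCF to standard output.
command = build_freebayes_command("amplicon_reference.fa", "exhibit.sorted.bam")
print(shlex.join(command))
```

Keeping the command as an argument list means the same object can be logged verbatim and later passed to a process runner without re-typing the thresholds.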

Step 3 (STR-specific allele calling): STRait Razor v2.0 is applied to the BAM file to call STR repeat counts directly from read length and sequence, bypassing variant callers. It returns 14 loci with 3-4 alleles each, confirming two contributors at the locus level. FDSTools processes the STRait Razor output, applying the laboratory's validated stutter model to distinguish genuine minor-contributor alleles from PCR sequence artefacts.

Step 4 (Mixture deconvolution): The STRmix-MPS module is applied, returning an LR of 8.3 × 10^6 for the second weapon-handler matching the suspect's reference profile. The minor contributor matches the victim.

Conclusion: The case illustrates the most important principle in forensic MPS bioinformatics: clinical genomics variant callers (GATK) are not designed for forensic mixture analysis and will silently discard minority alleles. SWGDAM 2020 MPS guidelines require that any mixture-capable tool be separately validated for forensic use, precisely because off-the-shelf clinical tools introduce systematic error at the forensic allele fraction range.

Frequently asked questions

What is a BAM file and why do forensic MPS pipelines require it as an intermediate step?

A BAM (Binary Alignment/Map) file stores sequencing reads after alignment to a reference, with each read's mapping position, strand orientation, and per-base quality scores. The BAM is the forensic working record: it preserves every read for re-analysis. VCF files (variant call format) summarise only detected variants and discard reads that fall below the caller's thresholds. When minor contributor alleles are filtered by GATK, they are still present in the BAM and can be recovered by reanalysis with FreeBayes or a forensic-specific tool at a lower allele-fraction threshold.

Why is GATK unsuitable for forensic mixture analysis without modification?

GATK HaplotypeCaller is designed for diploid organisms and computes genotype likelihoods assuming allele fractions near 0.5 or 1.0. A minor contributor at 10-20% mixture proportion produces allele fractions outside this range; GATK flags these as low quality and may filter them entirely. FreeBayes in pooled-continuous mode, FDSTools, STRait Razor, and the Verogen UAS pipeline are validated alternatives with user-defined minimum allele fraction thresholds appropriate for forensic mixture analysis. See the [NGS pipeline section on variant calling](/topics/forensic-biotechnology/ngs-data-analysis-allele-calling-and-variant-callers) for the worked example showing GATK silently discarding a minor contributor.

What does a Phred Q30 score mean and why is it the forensic MPS threshold?

Q30 means the base-calling algorithm is 99.9% confident in the call at that position (1 in 1,000 error probability). Below Q30, sequencing errors become numerous enough to mimic genuine rare alleles or minor contributor alleles. MPS runs where more than 15% of bases fall below Q30 are typically flagged for resequencing in published SOPs from Virginia DFS and OCME New York. The [sequence alignment and BLAST](/topics/forensic-biotechnology/sequence-alignment-blast-genbank-mitomap-empop) topic covers how Q-score filtering interacts with downstream database queries.

What is an isoallele and why does MPS detect it when CE cannot?

Isoalleles are STR alleles with the same repeat count (and therefore the same CE migration position) that differ by a single nucleotide within the repeat sequence. CE separates fragments by size only, so two alleles of identical length co-migrate and cannot be told apart regardless of their sequence. MPS reads the full repeat sequence and distinguishes them, increasing the discrimination power of the profile by expanding the number of distinguishable allele combinations per locus. The Royal Canadian Mounted Police validation of the Thermo Fisher Precision ID GlobalFiler NGS kit showed approximately one order of magnitude improvement in discrimination power over length-based CE typing due to isoallele resolution.
