Method Validation, Measurement Uncertainty and Proficiency Testing
The validation parameters every forensic method must establish, the GUM approach to measurement uncertainty, control charts, and the PT schemes that keep an Indian lab honest.
Method validation is the documented proof that an analytical procedure is fit for its intended purpose across a defined matrix, analyte, and concentration range. ISO 17025:2017 clause 7.2.2 makes validation mandatory for every non-standard or laboratory-developed method. Measurement uncertainty quantifies the dispersion of values attributable to the measurand and must accompany every reported result; proficiency testing provides the external check that a laboratory's results agree with those of other accredited laboratories on the same blind sample. Together, these three elements determine whether a forensic analytical certificate withstands scrutiny in court.
Method validation, measurement uncertainty and proficiency testing are the three pillars that decide whether a forensic instrumental result holds up at trial or falls apart on cross-examination. Validation is the documented evidence that the method does what the laboratory says it does. Measurement uncertainty is the honest statement of how wide the confidence interval really is. Proficiency testing is the external check that the laboratory's number agrees with what other accredited labs get on the same blind sample. Take any one away and the chemical examiner's certificate under Bharatiya Sakshya Adhiniyam Section 63 is open to challenge.
Key takeaways
- Method validation, measurement uncertainty, and proficiency testing are the three pillars that determine whether a forensic result holds up under cross-examination in an Indian trial court.
- ISO 17025:2017 clause 7.2.2 makes validation mandatory for any non-standard or laboratory-developed method, which covers almost every forensic toxicology and chemistry method at an Indian SFSL or CFSL.
- Measurement uncertainty must be calculated using the GUM-aligned approach from JCGM 100:2008, because defence counsel in NDPS and BSA Section 63 cross-examinations now routinely ask for the uncertainty figure.
- Proficiency testing is the external check that a laboratory's result agrees with what other accredited labs get on the same blind sample, and sustained PT performance is required to maintain NABL accreditation.
- Shewhart control charts catch instrument drift between external PT rounds, giving the laboratory an internal early-warning system before a bias affects a reported case result.
This page covers the validation parameters codified by ISO 17025:2017 and NABL Document 141, the GUM-aligned uncertainty budget from JCGM 100:2008, the Shewhart control charts that catch instrument drift, and the PT schemes Indian laboratories participate in. The question "What is the measurement uncertainty on this number?" is now routine in NDPS and Section 63 cross-examinations, and a certificate that cannot answer it is open to challenge.
By the end of this topic you will be able to:
- Identify the validation parameters required under ISO 17025:2017 and NABL Document 141 and state the acceptance criteria for each.
- Calculate a measurement uncertainty budget using the GUM (JCGM 100:2008) approach, distinguishing Type A and Type B contributions and combining them in quadrature.
- Interpret Shewhart control chart patterns, including action limits, warning limits, and trend rules, and explain when each triggers investigation.
- Calculate a z-score from a proficiency-testing round and classify the result as satisfactory, questionable, or unsatisfactory under ISO 13528.
- Describe the purpose and scope of at least three proficiency testing schemes used by Indian forensic laboratories.
- Method validation: Documented exercise that proves a method is fit for purpose for a defined matrix, analyte and concentration range. Required by ISO 17025:2017 clause 7.2.2 and NABL Document 141, and inspected on every accreditation visit.
- Measurement uncertainty (MU): The dispersion of values reasonably attributable to the measurand. Calculated through GUM (JCGM 100:2008) and reported as an expanded uncertainty U at a coverage factor (usually k=2) corresponding to roughly 95 percent confidence.
- Accuracy and precision: Accuracy is closeness to the true value, measured as recovery from a spiked CRM against an 80 to 120 percent acceptance window. Precision is the spread of replicates expressed as RSD: intra-day typically 1 to 5 percent for a well-tuned LC-MS/MS, inter-day 2 to 10 percent.
- LOD and LOQ: Limit of detection is three times the SD of the blank, or an S/N of 3:1. Limit of quantitation is ten times the SD of the blank, or an S/N of 10:1. Below LOD the result is reported as 'not detected'; between LOD and LOQ it is reported as 'detected, below LOQ' rather than as a number.
- Proficiency testing (PT): Inter-laboratory comparison scored as a z-score: |z| below 2 satisfactory, 2 to 3 questionable, above 3 unsatisfactory. ISO 17025 clause 7.7.2 requires regular participation.
- Control chart: Time-series plot of a QC measurement with warning limits at ±2σ and action limits at ±3σ from validation data. Patterns such as seven consecutive points on one side of the mean trigger investigation before drift reaches case work.
Method validation: what ISO 17025 and NABL 141 demand
ISO 17025:2017 clause 7.2.2 makes validation mandatory for any non-standard or laboratory-developed method, which covers almost every forensic toxicology and chemistry method at an Indian SFSL or CFSL. NABL Document 141 inspects the file on every accreditation visit. The Bharatiya Sakshya Adhiniyam does not name validation in Section 63, but the courts read into the section a requirement that the method be one a competent expert would accept as reliable, and the file is how a laboratory documents that. The parameter list is consistent across ICH Q2(R2), the FDA Bioanalytical guidance with 2022 ICH M10, AOAC, SOFT and SAMHSA, and the WADA International Standard for Laboratories.
Accuracy is percentage recovery from a spiked CRM, judged against an 80 to 120 percent acceptance window for forensic toxicology. Precision splits into repeatability (intra-day, RSD below 5 percent), intermediate precision (inter-day with different operators, 5 to 15 percent) and reproducibility (inter-laboratory, which is what PT tests). Linearity needs R² above 0.995 across five to seven calibrators in triplicate, with back-calculated concentrations within ±15 percent of nominal (±20 percent at LOQ). A high R² alone is insufficient; an over-fitted polynomial can show R² of 0.999 and still produce back-calculated concentrations well outside acceptance criteria at the curve extremes.
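The point that R² alone is insufficient can be illustrated with a minimal sketch. It assumes an unweighted linear fit and one response per calibrator (a real file would use triplicates and possibly a weighted fit). The function names and the calibrator values are hypothetical and chosen only to show the check.

```python
from statistics import mean

def fit_line(conc, resp):
    """Ordinary least-squares fit: resp = slope * conc + intercept."""
    cx, cy = mean(conc), mean(resp)
    sxx = sum((x - cx) ** 2 for x in conc)
    sxy = sum((x - cx) * (y - cy) for x, y in zip(conc, resp))
    slope = sxy / sxx
    return slope, cy - slope * cx

def r_squared(conc, resp, slope, intercept):
    """Coefficient of determination for the fitted line."""
    cy = mean(resp)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, resp))
    ss_tot = sum((y - cy) ** 2 for y in resp)
    return 1 - ss_res / ss_tot

def back_calc_failures(conc, resp, slope, intercept, loq, tol=15.0, tol_loq=20.0):
    """Return (nominal, percent deviation) for calibrators outside tolerance."""
    failures = []
    for x, y in zip(conc, resp):
        back = (y - intercept) / slope          # back-calculated concentration
        dev = 100 * (back - x) / x              # percent deviation from nominal
        limit = tol_loq if x == loq else tol    # wider window at the LOQ only
        if abs(dev) > limit:
            failures.append((x, round(dev, 1)))
    return failures

# Hypothetical calibrators (ng/mL) and responses (area ratio); lowest level is the LOQ.
conc = [5, 10, 25, 50, 100, 250, 500]
resp = [0.08, 0.11, 0.26, 0.51, 1.01, 2.51, 5.01]

slope, intercept = fit_line(conc, resp)
r2 = r_squared(conc, resp, slope, intercept)
failures = back_calc_failures(conc, resp, slope, intercept, loq=5)
print(f"R2 = {r2:.5f}, failing calibrators: {failures}")
```

With these illustrative numbers R² is well above 0.995, yet the lowest calibrator back-calculates roughly 30 percent high and is flagged. A reviewer has to check for exactly this pattern before accepting a curve.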
LOD and LOQ come from statistics, not from eyeballing a chromatogram. The 3σ method runs ten blank-matrix injections, takes the SD of the response, and reports LOD as three times that SD (expressed in concentration units through the calibration slope); LOQ is the same with a factor of ten. Specificity is the demonstration that no interfering peak appears at the analyte's retention time and m/z transition in a blank matrix or with structural analogues. Robustness is the demonstration that small intentional variations (pH ±0.1, column temperature ±2 °C, flow ±5 percent) do not shift the result outside the precision window. Carry-over checks that a blank following the highest calibrator gives no peak above 20 percent of LOQ. System suitability is the pre-run check that catches an instrument drifting before the batch begins.
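The blank-replicate calculation and the reporting rule can be sketched as follows. The division by the calibration slope is an assumption about how the response SD is converted to concentration, and the blank responses, slope value and function names are hypothetical.

```python
from statistics import stdev

def lod_loq_from_blanks(blank_responses, slope):
    """3-sigma and 10-sigma limits from replicate blank-matrix responses.

    slope is the calibration slope (response per unit concentration),
    assumed here as the way to express the limits in concentration units.
    """
    sd = stdev(blank_responses)  # sample standard deviation (n - 1)
    return 3 * sd / slope, 10 * sd / slope

def report(conc, lod, loq, unit="ng/mL"):
    """Apply the reporting rule: no number is quoted below the LOQ."""
    if conc < lod:
        return "not detected"
    if conc < loq:
        return "detected, below LOQ"
    return f"{conc:.1f} {unit}"

# Ten hypothetical blank-matrix responses (area ratio) and a hypothetical slope.
blanks = [0.0021, 0.0018, 0.0025, 0.0019, 0.0023,
          0.0020, 0.0024, 0.0017, 0.0022, 0.0021]
lod, loq = lod_loq_from_blanks(blanks, slope=0.01)
print(f"LOD = {lod:.3f} ng/mL, LOQ = {loq:.3f} ng/mL")
print(report(2 * lod, lod, loq))
```

A result between the two limits is reported as text, so a number the method cannot support never reaches the certificate.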

Measurement uncertainty: the GUM approach in practice
The Guide to the Expression of Uncertainty in Measurement (GUM, JCGM 100:2008) is the international consensus on how to compute and report uncertainty. NABL has adopted it through TR-001 and ISO 17025 clause 7.6. The BSA Section 63 certificate is now expected, as best practice, to include the expanded uncertainty alongside the reported value.
GUM splits contributions into Type A (statistical evaluation of repeated measurements, the SD of the mean) and Type B (the calibration certificate of the balance, the CRM's stated uncertainty, the volumetric flask's tolerance, temperature variation, matrix inhomogeneity). Each Type B contribution becomes a standard uncertainty by dividing the tolerance by an appropriate divisor (square root of three for a rectangular distribution, two for a stated 95 percent confidence interval). The combined standard uncertainty u_c is propagated through the measurement equation; for the multiplicative model most forensic methods use, u_c/c is the square root of the sum of (u_i/x_i)² across every contribution. The expanded uncertainty U is u_c multiplied by a coverage factor k, with k=2 giving roughly 95 percent confidence. The convention is "ethanol 110 ± 5 mg/100 mL (k=2)".
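Written out, the relations described above take the following form. The symbols a (half-width of a tolerance) and U_95 (a stated 95 percent interval) are notation introduced here for illustration; u_i, x_i, u_c, c, k and U follow the text.

```latex
% Type B conversions to a standard uncertainty
u_i = \frac{a}{\sqrt{3}} \quad \text{(rectangular distribution)}, \qquad
u_i = \frac{U_{95}}{2} \quad \text{(stated 95 percent interval)}

% Combined relative standard uncertainty for a multiplicative model
\frac{u_c}{c} = \sqrt{\sum_i \left( \frac{u_i}{x_i} \right)^{2}}

% Expanded uncertainty
U = k \, u_c, \qquad k = 2 \ \text{for roughly 95 percent confidence}
```

Read backwards, the reporting example "ethanol 110 ± 5 mg/100 mL (k=2)" implies u_c = 5 / 2 = 2.5 mg/100 mL, or about 2.3 percent of the reported value.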
Bottom-up, the strict GUM method, identifies every contribution from first principles and computes a full budget; rigorous but laborious. Top-down uses validation data, CRM bias and PT performance to estimate combined uncertainty empirically. NABL accepts top-down for routine forensic work. For a typical LC-MS/MS toxicology method, contributions are repeatability (1 to 3 percent), calibration curve (0.5 to 2 percent), reference standard (0.3 to 1 percent for Sigma, 0.1 to 0.5 percent for IPC), sample inhomogeneity (0.5 to 2 percent for whole blood) and matrix effect on ionisation (1 to 3 percent with a deuterated IS). Combined in quadrature, u_c typically lands at 2 to 4 percent and expanded U at k=2 at 4 to 8 percent.
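As a rough check on those figures, the quadrature combination can be sketched in a few lines. The budget values are hypothetical mid-range picks from the ranges quoted above, not data from any particular method, and the function name is illustrative.

```python
from math import sqrt

def combine_in_quadrature(relative_uncertainties):
    """Root-sum-of-squares of relative standard uncertainties (percent)."""
    return sqrt(sum(u ** 2 for u in relative_uncertainties))

# Hypothetical budget, each entry a relative standard uncertainty in percent.
budget = {
    "repeatability": 2.0,
    "calibration curve": 1.0,
    "reference standard": 0.5,
    "sample inhomogeneity": 1.0,
    "matrix effect": 2.0,
}

u_c = combine_in_quadrature(budget.values())   # combined standard uncertainty
U = 2 * u_c                                    # expanded uncertainty at k = 2
print(f"u_c = {u_c:.1f} percent, U (k=2) = {U:.1f} percent")
```

With these inputs u_c comes to about 3.2 percent and U to about 6.4 percent, inside the typical ranges stated above. Because the terms add as squares, the two largest contributions (repeatability and matrix effect) dominate the total, which is where effort to reduce uncertainty pays off.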
Control charts and proficiency testing: catching drift and bias
A validated method is not a permanent guarantee. Instruments drift, reference standards age, columns degrade. The Shewhart control chart catches this: a time-series plot of a QC measurement (spiked blank or CRM) against date, with warning limits at ±2σ and action limits at ±3σ from validation data. A point outside the action limits triggers investigation. Trend rules catch drift that single-point limits miss: seven consecutive points on one side of the mean indicate a systematic shift even within warning limits, and a steady trend of six or more points often traces to an ageing column or a dimming detector lamp. The Western Electric rules formalise these patterns. The CUSUM chart catches a 0.5σ shift in five to ten measurements where Shewhart needs twenty to thirty. CFSL Chandigarh runs NIST SRM 1577c monthly on the ICP-MS panel and tracks lead, arsenic, mercury and cadmium on individual Shewhart charts.
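The limit and trend rules described above can be sketched as a single pass over the QC series. This is a simplified illustration covering only the rules named in the text (it is not a full Western Electric implementation), and the function name, default run lengths and QC values are hypothetical.

```python
def check_control_chart(values, mean, sigma, run_length=7, trend_length=6):
    """Return (index, message) flags for a Shewhart chart of QC results."""
    flags = []
    side_run, last_side = 0, 0      # consecutive points on one side of the mean
    trend_run, last_dir = 1, 0      # points in the current rising/falling trend
    for i, v in enumerate(values):
        z = (v - mean) / sigma
        if abs(z) > 3:
            flags.append((i, "action limit exceeded"))
        elif abs(z) > 2:
            flags.append((i, "warning limit exceeded"))

        side = (z > 0) - (z < 0)    # +1 above the mean, -1 below, 0 on it
        if side != 0 and side == last_side:
            side_run += 1
        else:
            side_run = 1 if side != 0 else 0
        last_side = side
        if side_run == run_length:
            flags.append((i, f"{run_length} consecutive points on one side of the mean"))

        if i > 0:
            diff = v - values[i - 1]
            direction = (diff > 0) - (diff < 0)
            if direction != 0 and direction == last_dir:
                trend_run += 1
            else:
                trend_run = 2 if direction != 0 else 1
            last_dir = direction
            if trend_run == trend_length:
                flags.append((i, f"steady trend of {trend_length} points"))
    return flags

# Hypothetical QC recoveries (percent) with validation mean 100 and sigma 2.
drifting = [97, 98, 99, 100.5, 101.5, 102.5]
print(check_control_chart(drifting, mean=100, sigma=2))
```

In the drifting series every point sits inside the warning limits, yet the six-point rising trend is flagged at the last index. That is the kind of drift single-point limits miss.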
Internal QC catches drift but not bias. A laboratory that consistently quantitates morphine 12 percent low because of an extraction step that strips the analyte will get tight Shewhart charts and a wrong number every time. The only way to catch systematic bias is proficiency testing.
The standard scoring statistic is the z-score: the participant's result minus the assigned value, divided by the target SD. A |z| below 2 is satisfactory, 2 to 3 is questionable, and above 3 is unsatisfactory. The target SD comes from the CRM's certified uncertainty, the consensus SD of participants (Algorithm A from ISO 13528, robust to outliers), or a fitness-for-purpose value. The En score incorporates each laboratory's stated MU. ISO 17025 clause 7.7.2 requires regular PT participation; NABL 141 expects at least annual participation for every major analyte class. An unsatisfactory result triggers investigation, root-cause analysis, CAPA, and retest where the scheme allows. Repeated unsatisfactory results without effective CAPA can lead to the affected scope being withdrawn from accreditation.
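The scoring arithmetic can be sketched as below. The treatment of values falling exactly on 2 or 3 is an assumption (the text gives only the bands). The En formula shown is the usual form using expanded uncertainties, which the text does not spell out. All numeric inputs are hypothetical.

```python
from math import sqrt

def z_score(result, assigned, target_sd):
    """Participant result minus assigned value, divided by the target SD."""
    return (result - assigned) / target_sd

def classify_z(z):
    """Band a z-score; handling of exactly 2 and exactly 3 is an assumption."""
    a = abs(z)
    if a <= 2:
        return "satisfactory"
    if a < 3:
        return "questionable"
    return "unsatisfactory"

def en_score(result, assigned, U_lab, U_ref):
    """En score from the expanded uncertainties of lab and reference value."""
    return (result - assigned) / sqrt(U_lab ** 2 + U_ref ** 2)

# Hypothetical round: assigned morphine value 100 ng/mL, target SD 5 ng/mL,
# and a laboratory that reads 12 percent low.
z = z_score(88, 100, 5)
print(f"z = {z:.1f}: {classify_z(z)}")
```

A laboratory running 12 percent low against a 5 percent target SD scores z = -2.4, which is questionable. Internal control charts built on the same biased extraction would not have revealed this.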
| Provider | Scope | Frequency | Indian participants |
|---|---|---|---|
| NABL PT scheme | Forensic toxicology, chemistry, environmental | Annual | All NABL-accredited SFSLs and CFSLs |
| Collaborative Testing Services (CTS, USA) | Forensic chemistry, toxicology, trace, firearms | Twice yearly | CFSL Chandigarh, CFSL Hyderabad, FSL Madhuban |
| UNODC International Collaborative Exercises | Seized drug identification and quantitation | Twice yearly | CFSL Chandigarh, CDSCO regional labs |
| GeT-RM (USA) | Forensic DNA, mitochondrial, Y-STR | Annual | CFSL DNA divisions, state DNA units |
| WADA EQAS | Anti-doping, threshold substances, IRMS | Three times per year | NDTL Delhi (only WADA-accredited Indian lab) |
| SoHT proficiency | Hair toxicology, drugs in keratinised matrices | Annual | Specialist hair-toxicology units |

Common pitfalls in Indian validation and PT practice
Over-fitted calibration curves are the most common pitfall: R² reads 0.9999 but back-calculated concentrations at the lowest and highest calibrators deviate by 25 percent because the analyst fitted a quadratic where a linear model belonged. Skipping intermediate precision is the second; files often report only intra-day RSD from six replicates run within an hour, and the inter-day RSD across three to five days never makes it into the file even though that is the variation case work actually experiences. Taking LOD and LOQ from a single low-concentration injection is the third; the chromatographer eyeballs the lowest standard at an S/N of around 10:1 and writes that into the LOQ field instead of running the 3σ and 10σ calculation from blank-replicate SD.
Failing to revalidate after a meaningful method change is the fourth. A new column lot, a change in mobile phase composition, or an internal standard swap to a deuterated isotopologue should each trigger at least a partial revalidation. A change visible in the SOP revision history without a corresponding revalidation record is a finding at accreditation visits and a potential challenge point in court. Omitting the MU statement from the Section 63 certificate is the fifth pitfall, and the one with the most direct courtroom consequence. The validation file has the calculation and the SOP references it, but a certificate template written before the requirement became standard practice may omit it. Updating the template to include the expanded uncertainty and coverage factor resolves this.
Frequently asked questions
What is method validation and why does ISO 17025 require it?
What is the difference between accuracy and precision?
How is measurement uncertainty calculated using GUM?
What does a z-score above 3 mean in proficiency testing?
Which PT schemes do Indian forensic laboratories actually participate in?
Why must a Section 63 certificate include a measurement uncertainty statement?
What is a control chart and what does it tell a forensic analyst?