Multimedia Authentication and Deepfake Forensics | Medium | Premium

Multimedia Forensics: Voice Spectrography and MFCC Features

Published: 26 May 2026
Questions: 30
Duration: 30 min
Faculty-reviewed
Updated: 26 May 2026

About this mock

This set drills the acoustic and computational foundations of forensic voice examination as tested in UGC-NET Forensic Science Paper II Unit VIII.

Wide-band spectrography (analysis filter bandwidth 300 Hz) resolves formant bars F1 through F4 clearly but smears individual pitch harmonics; narrow-band spectrography (45 Hz bandwidth) resolves individual harmonics and tracks the fundamental frequency F0 but blurs formant structure. The trade-off arises because time resolution is roughly the reciprocal of the analysis bandwidth (about 3 ms at 300 Hz, about 22 ms at 45 Hz). Knowing which bandwidth to choose for a given evidential question is a daily decision in a forensic audio unit.

Pitch tracking algorithms covered here include autocorrelation, cepstral peak picking, and the YIN algorithm; each has a different error mode when the voice is creaky or whispery. Formant analysis maps the resonant frequencies of the vocal tract, with F1 inversely related to vowel height and F2 related to vowel backness, giving each speaker a characteristic vowel space.

MFCC extraction follows the canonical pipeline: a pre-emphasis filter (coefficient 0.97) boosts high frequencies; the signal is cut into frames of 20 to 40 ms with 50 percent overlap; a Hamming window reduces spectral leakage; the FFT converts each frame to the frequency domain; a Mel-scale filterbank maps the spectrum to perceptual frequency bins; log compression approximates the compressive dynamic-range response of the auditory system; and the discrete cosine transform (DCT) decorrelates the log filterbank energies, of which the first 13 cepstral coefficients are conventionally kept. The Mel scale (Stevens, Volkmann, and Newman 1937) maps frequency so that equal distances on the scale correspond to equal perceived pitch intervals. Delta and delta-delta coefficients append first- and second-order temporal derivatives to capture speaking rate and spectral dynamics.

Linear predictive coding (LPC) models speech production as a source-filter system; the filter order governs how many formant peaks the model can represent. Voice activity detection (VAD) removes silence frames before feature extraction. Phonetic alignment with tools such as Praat and HTK anchors acoustic measurements to specific phones.
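The MFCC pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the HTK or any textbook reference implementation: the function names (`hz_to_mel`, `mel_filterbank`, `mfcc`), the 26-filter bank, the 512-point FFT, and the 25 ms frame length are assumptions chosen for the example, and the Mel mapping used is the common 2595 log10(1 + f/700) approximation rather than the original 1937 data.

```python
import numpy as np

def hz_to_mel(f):
    # Common analytic approximation of the Mel scale (an assumption here).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters whose edges are equally spaced on the Mel axis.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):      # rising slope
            fb[i - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling slope
            fb[i - 1, k] = (right - k) / (right - centre)
    return fb

def mfcc(signal, sr, frame_ms=25.0, n_filters=26, n_ceps=13,
         n_fft=512, alpha=0.97):
    # 1. Pre-emphasis: y[n] = x[n] - alpha * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # 2. Framing with 50 percent overlap
    frame_len = int(round(sr * frame_ms / 1000.0))
    hop = frame_len // 2
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # 3. Hamming window to reduce spectral leakage
    frames = frames * np.hamming(frame_len)
    # 4. FFT -> power spectrum (frames are zero-padded to n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 5. Mel filterbank energies
    energies = power @ mel_filterbank(n_filters, n_fft, sr).T
    # 6. Log compression (floored to avoid log(0))
    log_e = np.log(np.maximum(energies, 1e-10))
    # 7. DCT-II (unnormalised); keep the first n_ceps coefficients
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_e @ basis.T

# Usage: one second of a synthetic 220 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2 * np.pi * 220 * t), 16000)
print(features.shape)  # one row per frame, 13 coefficients per row
```

Delta and delta-delta features would be computed from the rows of `features` as differences across neighbouring frames; that step is left out here to keep the sketch short.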

This mock is aimed at UGC-NET Forensic Science Paper II aspirants covering Unit VIII, NFSU MSc students in multimedia forensics, CFSL and state FSL audio-forensics trainees, and candidates preparing for IAFPA-aligned competency assessments. Work by CDAC speech-research groups and the guidelines of the ENFSI Forensic Speech and Audio Analysis Working Group inform the cohort-selection and reliability questions in this set.

Topics covered:

Wide-band vs narrow-band spectrogram: analysis bandwidth and resolution trade-off
Pitch tracking algorithms: autocorrelation, cepstral peak picking, YIN
Formant analysis: F1/F2/F3/F4, vowel space, and speaker comparison
MFCC pipeline: pre-emphasis, framing, Hamming window, FFT, Mel filterbank, log, DCT
Mel scale: perceptual frequency mapping and its forensic motivation
Delta and delta-delta MFCC: temporal derivative features
LPC: source-filter model, prediction order, and residual signal
VAD, cohort selection, Indian language phonetics, Praat and HTK alignment
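As an illustration of the simplest of the pitch trackers listed above, the following sketch estimates F0 for a single voiced frame by picking the strongest autocorrelation peak within a plausible lag range. It is plain autocorrelation, not YIN or cepstral peak picking, and the function name `estimate_f0_autocorr` and the 60 to 400 Hz search range are assumptions made for the example. On creaky or whispery voice a picker like this can lock onto a multiple of the true period or onto noise, which is the kind of error mode the questions probe.

```python
import numpy as np

def estimate_f0_autocorr(frame, sr, f0_min=60.0, f0_max=400.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak."""
    frame = frame - np.mean(frame)               # remove DC offset
    # Keep non-negative lags only: ac[k] is the correlation at lag k.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / f0_max)                   # shortest period searched
    lag_max = min(int(sr / f0_min), len(ac) - 1) # longest period searched
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / lag

# Usage: a 40 ms frame of a synthetic 200 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(int(0.040 * sr)) / sr
f0 = estimate_f0_autocorr(np.sin(2 * np.pi * 200 * t), sr)
print(round(f0, 1))
```

The frame must span at least two pitch periods for the peak to be meaningful, which is one reason pitch analysis uses longer windows than wide-band formant analysis.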

Work through each question before checking the explanation, and revisit every wrong answer against the cited Rose, Hollien, Maher, and Rabiner and Schafer references. Allow 30 minutes.

Sources & references

Questions in this mock are written and verified against the following sources. Citations are recorded per question and shown in the explanation after submission.

Rabiner, Lawrence R. and Schafer, Ronald W. -- Theory and Applications of Digital Speech Processing, Pearson, 2011
Chapter 6: Mel-Frequency Cepstral Coefficients -- Mel scale formula, Davis and Mermelstein 1980, HTK implementation
cited in 17 questions
Rose, Philip -- Forensic Speaker Identification, CRC Press, 2002
Chapter 5: Speaker-Specific Features -- F3 and higher formants as anatomical individuality markers in speaker comparison
cited in 7 questions
Hollien, Harry -- The Acoustics of Crime: The New Science of Forensic Phonetics, Plenum Press, 1990
Chapter 5: Spectrographic Analysis -- Narrow-band and wide-band spectrogram configurations for harmonic and formant analysis
cited in 3 questions
Maher, Robert C. -- Principles of Forensic Audio Analysis, Springer, 2018
Chapter 4: Spectrographic Analysis -- Time-frequency uncertainty principle and spectrogram resolution trade-off
cited in 3 questions

How our mocks are built

Questions are written and edited by the ForensicSpot team and cited from peer-reviewed forensic textbooks, official syllabi and primary case law. Each one is verified before publishing. Detailed explanations show after you submit, so the test stays a real test. See a mistake? Tell us.

Common questions

What does the Multimedia Forensics: Voice Spectrography and MFCC Features mock cover?

How many questions and how long is the test?

30 multiple-choice questions, 30 minutes total. Difficulty: medium. Tier: Premium.

Who is this mock for?

Forensic science students and aspirants who want timed, exam-style practice on Multimedia Authentication and Deepfake Forensics (NET), with explanations and verified source citations. It is useful for postgraduate entrance preparation and for BSc/MSc forensic students testing their recall under time pressure.

Are the questions reviewed?

Each question carries a verified source citation. Faculty review for individual questions is in progress.

Do I need an account to take this mock?

Yes, a free ForensicSpot account is required to start a timed attempt. It lets you save progress, see per-question explanations after submission, and track your topic-level performance over time.
