Closed-corpus trap

Definition

The error of validating a stylometric method exclusively on texts from authors whose profiles were used to build the model, inflating apparent accuracy beyond what is achievable on truly unseen authors.
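One way to check an evaluation design for this trap is to measure how many test texts come from authors the model has never seen. The sketch below is illustrative only: the function names (`closed_split`, `open_split`, `unseen_author_share`) and the toy `samples` list are hypothetical and not taken from any stylometry package. It contrasts a split in which every author appears on both sides with one that holds out whole authors.

```python
from collections import defaultdict

def closed_split(samples, test_every=2):
    """Closed-corpus split: every author contributes texts to train AND test."""
    by_author = defaultdict(list)
    for author, text_id in samples:
        by_author[author].append(text_id)
    train, test = [], []
    for author, ids in by_author.items():
        for i, text_id in enumerate(ids):
            # Every `test_every`-th text of each author goes to the test set.
            (test if i % test_every == 0 else train).append((author, text_id))
    return train, test

def open_split(samples, held_out_authors):
    """Open split: all texts by the held-out authors are kept out of training."""
    train = [s for s in samples if s[0] not in held_out_authors]
    test = [s for s in samples if s[0] in held_out_authors]
    return train, test

def unseen_author_share(train, test):
    """Fraction of test texts whose author has no text in the training set."""
    known = {author for author, _ in train}
    if not test:
        return 0.0
    return sum(1 for author, _ in test if author not in known) / len(test)

# Hypothetical toy corpus: (author label, text identifier) pairs.
samples = [("A", "a1"), ("A", "a2"), ("B", "b1"),
           ("B", "b2"), ("C", "c1"), ("C", "c2")]

closed_share = unseen_author_share(*closed_split(samples))
open_share = unseen_author_share(*open_split(samples, {"C"}))
print(closed_share, open_share)
```

A share of zero means the reported accuracy says nothing about unseen authors. Holding out entire authors, as in `open_split`, gives the evaluation at least some cases where the true author is absent from the model.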

Related terms

Burrows's Delta
A distance metric introduced by John Burrows in 2002 that measures how far a text's standardised word-frequency profile deviates from the mean...
Cosine Delta
A variant of Burrows's Delta that uses cosine similarity instead of Manhattan distance. Research by Argamon and later Eder has shown it...
Cross-validation
A validation strategy that repeatedly partitions a labelled corpus into training and test subsets to estimate the method's error rate on unseen...
Principal component analysis (PCA)
A multivariate statistical method used in fire debris research to reduce chromatographic data matrices to principal components that capture major variance. Used...
Rolling Delta
Application of the Delta metric to a moving window of text, producing a curve showing how stylistic similarity to each candidate changes...
