Cross-validation
Definition
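A validation strategy that repeatedly partitions a labelled corpus into training and test subsets, fitting the method on each training subset and scoring it on the matching held-out test subset, to estimate the method's error rate on unseen data. It is a key safeguard against over-fitting.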
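The partitioning described above can be sketched in a few lines of Python. This is only an illustrative k-fold variant, not a reference implementation: the function names `k_fold_splits` and `cross_validated_error`, and the `fit` and `predict` callables, are hypothetical placeholders for whatever attribution method is being validated.

```python
from typing import Callable, List, Sequence, Tuple


def k_fold_splits(n_items: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering range(n_items)."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    splits = []
    for fold in range(k):
        # Every k-th item, starting at `fold`, is held out for testing.
        test = list(range(fold, n_items, k))
        held_out = set(test)
        train = [i for i in range(n_items) if i not in held_out]
        splits.append((train, test))
    return splits


def cross_validated_error(
    texts: Sequence,
    labels: Sequence,
    k: int,
    fit: Callable,
    predict: Callable,
) -> float:
    """Estimate the error rate as the share of held-out items misclassified."""
    errors = 0
    for train, test in k_fold_splits(len(texts), k):
        # The model only ever sees the training subset of this fold.
        model = fit([texts[i] for i in train], [labels[i] for i in train])
        for i in test:
            if predict(model, texts[i]) != labels[i]:
                errors += 1
    # Each item is held out exactly once, so divide by the corpus size.
    return errors / len(texts)
```

Because every text is scored only by a model that never saw it during fitting, the resulting figure is a less optimistic estimate than the error measured on the training data itself. In practice the corpus would usually be shuffled or stratified by author before splitting; that step is omitted here for brevity.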
Related terms
- Burrows's Delta: A distance metric introduced by John Burrows in 2002 that measures how far a text's standardised word-frequency profile deviates from the mean...
- Closed-corpus trap: The error of validating a stylometric method exclusively on texts from authors whose profiles were used to build the model, inflating apparent...
- Cosine Delta: A variant of Burrows's Delta that uses cosine similarity instead of Manhattan distance. Research by Argamon and later Eder has shown it...
- Principal component analysis (PCA): A multivariate statistical method used in fire debris research to reduce chromatographic data matrices to principal components that capture major variance. Used...
- Rolling Delta: Application of the Delta metric to a moving window of text, producing a curve showing how stylistic similarity to each candidate changes...
Explained in
- Stylometry and Statistical Distance Methods