Function words

Definition

Grammatical words (prepositions, conjunctions, articles, pronouns) that carry little independent content meaning but occur at high frequency in any text. Because writers use them largely without conscious thought, their distributional patterns across a corpus are more resistant to stylistic manipulation than content-word choices.
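As a rough illustration of what a "distributional pattern" can look like in practice, the sketch below computes the relative frequency of a few function words in a text. The short word list, the simple tokenizer, and the name `function_word_profile` are assumptions made for this example only; they are not a standard list or a prescribed forensic method.

```python
import re
from collections import Counter

# Illustrative list only (taken from the examples in this entry);
# real analyses typically use far longer function-word lists.
FUNCTION_WORDS = ["the", "a", "in", "on", "for", "and", "but", "or", "he", "she", "it"]


def function_word_profile(text):
    """Return each listed function word's share of all tokens in `text`."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {word: 0.0 for word in FUNCTION_WORDS}
    return {word: counts[word] / total for word in FUNCTION_WORDS}


profile = function_word_profile("The cat sat on the mat, and it slept in the sun.")
print(profile["the"])  # 3 of the 12 tokens are "the"
```

Profiles like this, computed over each text in a corpus, give one numerical view of how often a writer reaches for each function word.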

Examples
Articles (the, a), prepositions (in, on, for), conjunctions (and, but, or), pronouns (he, she, it)
Frequency
Used at high frequency across any text regardless of topic or style
Attribution strength
Among the strongest features available for authorship identification

Common questions

Why are function words useful for figuring out who wrote something?

Function words like articles, prepositions, and conjunctions are used unconsciously and at high frequency in any text. Because people don't think about using them, they're harder to fake or copy from someone else, making them strong markers of how a particular person writes.

Can someone change their function words to hide their writing style?

Not easily. Unlike content words, function words are grammatically required and largely automatic. People can deliberately choose interesting adjectives or avoid certain topics, but they can't easily change their unconscious use of words like 'the', 'and', 'but', or 'in' across a whole text.

What's the difference between function words and content words?

Function words are grammatical tools like prepositions, conjunctions, articles, and pronouns. They have little meaning on their own but are essential for structure. Content words are nouns, verbs, adjectives, and adverbs that carry the main ideas. Function words appear in every text; content words depend on the topic.
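The distinction can be sketched in a few lines of code by checking each token against a closed list of function words and treating everything else as a content word. The list here is a small, hypothetical sample drawn from the examples in this entry, and `split_tokens` is a name invented for illustration; a realistic list would be much longer.

```python
import re

# Small illustrative set (an assumption, not a complete inventory).
FUNCTION_WORDS = {"the", "a", "in", "on", "for", "and", "but", "or", "he", "she", "it"}


def split_tokens(text):
    """Split the tokens of `text` into (function words, content words)."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    function = [t for t in tokens if t in FUNCTION_WORDS]
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return function, content


function, content = split_tokens("She walked in the garden, but it rained.")
print(function)  # ['she', 'in', 'the', 'but', 'it']
print(content)   # ['walked', 'garden', 'rained']
```

Changing the topic of the sentence would change the second list but leave the first list largely intact.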

Related terms

Idiolect
The language variety specific to an individual, comprising their characteristic vocabulary, syntactic preferences, spelling habits, punctuation patterns, and discourse-level style. Authorship attribution...
Closed-set attribution
An attribution task where the true author is assumed to be one of a defined list of candidates. The system ranks candidates;...
Corpus
A principled, structured collection of texts or transcripts used as the basis for systematic frequency analysis. In forensic work a comparison corpus...
Dialect
A variety of language defined by a geographic region or social group, characterised by systematic differences in pronunciation, vocabulary, and grammar from...
Discourse structure
The way a text or conversation is organised above the sentence level: the sequence of moves in an argument, the turn-taking structure...
Feature extraction
The process of converting raw text into a numerical vector of linguistic measurements. The choice of features determines what signal the classifier...
N-gram
A contiguous sequence of n items (characters, words, or part-of-speech tags) extracted from text. Character n-grams and word n-grams are both standard...
Open-set attribution
An attribution task where the true author may or may not appear in the candidate pool. The system must both rank candidates...
Register
The variety of language associated with a particular situation, task, or relationship. Register varies along dimensions of formality, technicality, and interactional mode....
