Large language model (LLM)
Definition
A neural network trained on large text corpora to predict the likely next token in a sequence. At inference, it generates text one token at a time by sampling from a probability distribution over its vocabulary, producing output that is statistically smooth and consistent with its training distribution.
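The sampling step in the definition can be illustrated with a minimal sketch. It assumes a toy four-word vocabulary and hand-picked scores (logits); a real model computes these scores with a neural network over a vocabulary of many thousands of tokens. The names `softmax`, `sample_next_token`, and the `temperature` parameter are illustrative choices, not part of any particular library.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over the vocabulary."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(vocab, logits, temperature=1.0, rng=None):
    """Draw one token at random, weighted by its softmax probability."""
    rng = rng or random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical vocabulary and scores, chosen only for illustration.
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)
token = sample_next_token(vocab, logits, rng=random.Random(0))
print(dict(zip(vocab, probs)), token)
```

Lowering `temperature` concentrates probability on the highest-scoring token, which makes the output more predictable; raising it spreads probability across more of the vocabulary.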
Related terms
- AI detection classifier
- A machine-learning system trained to discriminate between human-written and LLM-generated text by measuring features such as perplexity, burstiness, and n-gram probabilities. Current...
- Burstiness
- The variance in sentence-level complexity across a passage. Human writing tends to mix complex and simple sentences unevenly; LLM output tends to...
- Human-LLM collaboration
- Text production in which a human contributes prompts, editing decisions, and intentional content while an LLM generates or transforms the prose. The...
- Perplexity
- A measure of how surprising a sequence of words is to a language model. LLMs tend to generate low-perplexity text (predictable word...
- Stylometric signature
- The set of measurable linguistic features (function-word frequencies, sentence length, punctuation patterns, vocabulary richness) that characterises a specific writer's output...
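The perplexity and burstiness entries above can be made concrete with a short sketch. It uses the standard definition of perplexity as the exponential of the average negative log-probability per token, and it treats burstiness as the variance of per-sentence scores, following the wording of the entry. The probabilities and sentence scores are made up for illustration, and using per-sentence perplexity as the "complexity" score is an assumption rather than something the glossary specifies.

```python
import math
import statistics


def perplexity(token_probs):
    """Exponential of the mean negative log-probability of the tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def burstiness(sentence_scores):
    """Population variance of per-sentence complexity scores."""
    return statistics.pvariance(sentence_scores)


# Hypothetical probabilities a model assigned to each token it observed.
predictable = [0.9, 0.8, 0.9, 0.85]  # model is rarely surprised
surprising = [0.2, 0.05, 0.3, 0.1]   # model is often surprised

print(perplexity(predictable), perplexity(surprising))

# Hypothetical per-sentence scores for two passages.
print(burstiness([12.0, 45.0, 8.0, 60.0]), burstiness([20.0, 22.0, 21.0, 19.0]))
```

Under these assumptions, the predictable sequence yields the lower perplexity, and the passage with uneven sentence scores yields the higher burstiness.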
Explained in
- AI-Generated Text: Authorship, Detection, and the New Evidential Frontier