Large language models are incapable of being truly random, even when you give them complete nonsense as input.
April 29, 2026
Original Paper
The Randomness Floor: Measuring Intrinsic Non-Randomness in Language Model Token Distributions
arXiv · 2604.22771
The Takeaway
Researchers found that 88% to 93% of a model's token distribution is determined by its internal weights rather than by the user's prompt. This "randomness floor" means models have a permanent, hard-coded preference for certain words and patterns: no matter what you ask, or how much noise you add to the input, the model reverts to these built-in statistical biases. The finding suggests that LLMs are not flexible mirrors of their input but are rigidly shaped by their training, and it helps explain why AI personalities are so difficult to change through prompting alone.
From the abstract
Language models cannot be random. This paper introduces Entropic Deviation (ED), the normalised KL divergence between a model's token distribution and the uniform distribution, and measures it systematically across 31,200 generations spanning seven models, two architectures (transformer and state space), nine prompt categories, three temperatures, and five languages. Under semantically neutral prompts (empty strings, random characters, nonsense syllables) transformers still exhibit ED of approxi
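The Entropic Deviation metric described in the abstract can be sketched in a few lines. This is a minimal illustration based only on the definition given above (normalised KL divergence from the uniform distribution); the function name and the normalisation by log V are assumptions, not the paper's code.

```python
import math

def entropic_deviation(probs):
    """Entropic Deviation (ED), as described in the abstract:
    the KL divergence between a token distribution p and the
    uniform distribution over the same vocabulary, normalised
    (here, by log V, an assumed convention) to lie in [0, 1].

    D_KL(p || u) = sum_i p_i * log(p_i * V) = log V - H(p)
    ED = 0  -> perfectly uniform (maximally random)
    ED -> 1 -> the distribution collapses onto one token
    """
    v = len(probs)
    kl = sum(p * math.log(p * v) for p in probs if p > 0.0)
    return kl / math.log(v)

# A uniform distribution over four tokens: ED = 0.
uniform = [0.25, 0.25, 0.25, 0.25]

# A sharply peaked distribution: ED approaches 1.
peaked = [0.97, 0.01, 0.01, 0.01]
```

Under this reading, the paper's claim is that even for "semantically neutral" prompts, a transformer's next-token distribution sits far above ED = 0: the weights alone impose a floor on how uniform the output can ever be.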