Introduces the Compression-Consistency Principle, arguing that LLMs prefer truth only when false alternatives are structurally harder to compress.
March 13, 2026
Original Paper
Compression Favors Consistency, Not Truth: When and Why Language Models Prefer Correct Information
arXiv · 2603.11749
The Takeaway
This reframes 'truth' in LLMs as a structural byproduct of compression efficiency rather than a semantic property. It explains why a truth bias emerges from next-token prediction and suggests a roadmap for improving model accuracy through architectural constraints rather than data volume alone.
From the abstract
Why do language models sometimes prefer correct statements even when trained on mixed-quality data? We introduce the Compression-Consistency Principle: next-token prediction favors hypotheses that allow shorter and more internally consistent descriptions of the training data. Truth bias emerges only when false alternatives are structurally harder to compress. We test this using small GPT-2-style character-level transformers (3.5M–86M parameters) on synthetic math corpora with controlled mixtures…
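The core intuition behind the principle can be illustrated without a transformer at all. A minimal sketch (my own toy example, not the paper's experiment): an internally consistent arithmetic corpus, where every answer follows one rule, admits a shorter description than a corpus in which half the answers are replaced with random values. An off-the-shelf compressor like `zlib` stands in for "description length" here.

```python
import random
import zlib

# Consistent corpus: every line obeys the single rule a + b = c.
consistent = "\n".join(
    f"{a}+{b}={a + b}" for a in range(30) for b in range(30)
)

# Inconsistent corpus: half the answers are random, so no single
# rule explains the data and its description length grows.
random.seed(0)
inconsistent = "\n".join(
    f"{a}+{b}={a + b if random.random() < 0.5 else random.randint(0, 60)}"
    for a in range(30) for b in range(30)
)

def compression_ratio(text: str) -> float:
    """Compressed size over raw size: lower means more compressible."""
    raw = text.encode()
    return len(zlib.compress(raw, 9)) / len(raw)

print(f"consistent:   {compression_ratio(consistent):.3f}")
print(f"inconsistent: {compression_ratio(inconsistent):.3f}")
```

The consistent corpus compresses to a smaller fraction of its raw size, mirroring the claim that truth bias appears when the false alternatives are structurally harder to compress. The paper's actual experiments use trained character-level transformers, not a generic compressor; this sketch only makes the compressibility gap concrete.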