Discovers that LLM hidden states undergo geometric 'warping' at digit-count boundaries, mimicking human psychological perception.
March 31, 2026
Original Paper
Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries
arXiv · 2603.28258
The Takeaway
The study shows that structural tokenization discontinuities (e.g., the digit-count jump from 9 to 10) create discrete 'category' boundaries in the model's internal geometry. This challenges the idea that LLMs represent numbers as continuous magnitudes and suggests that architectural choices shape semantic representation independently of training data.
From the abstract
Categorical perception (CP) -- enhanced discriminability at category boundaries -- is among the most studied phenomena in perceptual psychology. This paper reports that analogous geometric warping occurs in the hidden-state representations of large language models (LLMs) processing Arabic numerals. Using representational similarity analysis across six models from five architecture families, the study finds that a CP-additive model (log-distance plus a boundary boost) fits the representational geometry …
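The CP-additive model named in the abstract can be sketched with synthetic data. This is a minimal illustration, not the paper's code: the feature names, the toy number range, and the synthetic "observed" dissimilarity matrix (standing in for RSA distances between hidden states) are all assumptions for demonstration.

```python
import numpy as np

# Hypothetical sketch of a CP-additive dissimilarity model:
# predicted distance = a * |log(i) - log(j)| + b * [i and j differ in digit count]

def cp_additive_features(numbers):
    """Build the two predictor matrices: log-distance and digit-count boundary."""
    n = len(numbers)
    log_dist = np.zeros((n, n))
    boundary = np.zeros((n, n))
    for x in range(n):
        for y in range(n):
            i, j = numbers[x], numbers[y]
            log_dist[x, y] = abs(np.log(i) - np.log(j))
            boundary[x, y] = float(len(str(i)) != len(str(j)))
    return log_dist, boundary

numbers = list(range(1, 21))  # spans the 9 -> 10 digit-count boundary
log_dist, boundary = cp_additive_features(numbers)

# Synthetic "observed" dissimilarities: log-distance plus a boundary boost,
# with noise; in the paper these would come from hidden-state RSA.
rng = np.random.default_rng(0)
observed = 1.0 * log_dist + 0.5 * boundary + 0.05 * rng.standard_normal(log_dist.shape)

# Recover the weights by least squares on the upper triangle of the matrices.
iu = np.triu_indices(len(numbers), k=1)
X = np.column_stack([log_dist[iu], boundary[iu]])
a, b = np.linalg.lstsq(X, observed[iu], rcond=None)[0]
print(f"log-distance weight a = {a:.2f}, boundary boost b = {b:.2f}")
```

A nonzero fitted boundary boost `b` is what distinguishes categorical warping from a purely continuous (log-distance only) account of numeral representation.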