Transformers hide semantic meaning in quiet regions of their internal space to prevent it from interfering with the loud, high-variance signals of grammar.
AI models organize their internal representations like a radio spanning multiple frequencies. Grammar and syntax occupy the loud, high-variance signals, while the meaning of words is tucked into low-variance spectral regions. This separation lets the model change what a sentence says without breaking its grammar, and it helps explain why models stay fluent even when they are talking nonsense. Practitioners can exploit this dual geometry to steer models more precisely by targeting the quiet, low-variance regions where concepts live.
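As a rough illustration of that last point, here is a minimal sketch (not taken from the paper) of steering restricted to low-variance directions. It assumes you have already collected residual-stream activations yourself: `hidden_pos` and `hidden_neg` for prompts with and without the target concept, and `hidden_all` for a broad sample of text; all names are placeholders.

```python
# Hedged sketch: restrict a difference-of-means concept vector to the
# low-variance ("quiet") eigen-directions of the residual stream, leaving
# high-variance syntactic directions untouched.
import numpy as np

def quiet_steering_vector(hidden_pos, hidden_neg, hidden_all, keep_frac=0.5):
    """Project the concept direction onto the bottom `keep_frac` of the
    residual-stream covariance spectrum. Inputs are (n_samples, d_model)."""
    # Raw concept direction: difference of class means.
    v = hidden_pos.mean(axis=0) - hidden_neg.mean(axis=0)

    # Spectrum of the residual stream over a broad sample of activations.
    cov = np.cov(hidden_all, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order

    # Keep only the low-variance (quiet) directions.
    k = int(keep_frac * len(eigvals))
    quiet_basis = eigvecs[:, :k]             # (d_model, k)

    # Orthogonal projection of v onto the quiet subspace, then normalize.
    v_quiet = quiet_basis @ (quiet_basis.T @ v)
    return v_quiet / (np.linalg.norm(v_quiet) + 1e-8)

# Usage: add `alpha * v_quiet` to the residual stream at a chosen layer during
# generation (e.g., via a forward hook in your framework of choice).
```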
Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations
arXiv · 2605.01609
We test whether the causal inner product of \citet{park2024linear} -- defined by the unembedding covariance $\Sigma$ -- enables cross-lingual concept transport. Across 17 models and 4 language pairs, a matched-spectrum randomization test finds that Whitened Causal Alignment is indistinguishable from spectral regularization alone ($p = 0.95$). However, this failure reveals a broader phenomenon: anti-concentration is observed in residual-stream difference-of-means vectors across five architecture
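For orientation, a hedged sketch of the two ingredients the abstract names, under my reading of them rather than the paper's own code: the causal inner product of Park et al. (2024), defined by the unembedding covariance $\Sigma$, and a matched-spectrum null that keeps $\Sigma$'s eigenvalue spectrum but randomizes its eigenvectors. `W_U` is a placeholder for the model's unembedding matrix of shape (vocab_size, d_model).

```python
# Hedged sketch, not the paper's implementation.
import numpy as np

def causal_inner_product(x, y, W_U):
    """<x, y>_C = x^T Sigma^{-1} y, with Sigma the unembedding covariance."""
    sigma = np.cov(W_U, rowvar=False)          # (d_model, d_model)
    return x @ np.linalg.solve(sigma, y)

def matched_spectrum_sigma(W_U, rng):
    """Null model: same eigenvalue spectrum as Sigma, random eigenvectors.
    Comparing alignment under Sigma vs. many such nulls asks whether the
    specific eigenvector structure matters beyond the spectrum alone."""
    sigma = np.cov(W_U, rowvar=False)
    eigvals = np.linalg.eigvalsh(sigma)
    q, _ = np.linalg.qr(rng.standard_normal(sigma.shape))  # random orthogonal basis
    return q @ np.diag(eigvals) @ q.T
```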