We found a 10-dimensional 'dark manifold' in LLMs that acts as a dial for hallucinations.
April 14, 2026
Original Paper
Mechanistic Steering of Low-Variance Subspaces for Improved Factual Calibration in Large Language Models
SSRN · 6450939
The Takeaway
By identifying this specific activation subspace, researchers reduced hallucination rates from 14.2% to 2.1%. It proves that factual uncertainty is a distinct, steerable signal within the model's internal geometry.
From the abstract
The reliability of Large Language Models (LLMs) in high-stakes extraction tasks is limited by stochastic hallucination. We investigate the hypothesis that specific low-variance activation subspaces, systematically suppressed by Layer Normalization, encode latent signals of epistemic uncertainty. Utilizing Singular Value Decomposition (SVD) on hidden states from a 32B-parameter transformer, we identify a 10-dimensional "dark" manifold where activation variance correlates with factual inconsistency.
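The abstract names the ingredients (SVD on hidden states, a 10-dimensional low-variance subspace, a variance signal that tracks inconsistency, and steering) without giving code, so here is a minimal sketch of what that pipeline could look like. Everything below is an assumption for illustration: the function names, the choice of PyTorch, the `alpha` steering knob, and the idea of fitting the basis on a calibration set of residual-stream activations are ours, not the authors'.

```python
import torch

def low_variance_basis(hidden_states: torch.Tensor, k: int = 10) -> torch.Tensor:
    """Orthonormal basis (d_model x k) for the k lowest-variance directions
    of a batch of hidden states, found via SVD on the centered matrix.
    Assumes n_samples >= d_model so the full spectrum is resolved."""
    centered = hidden_states - hidden_states.mean(dim=0, keepdim=True)
    # Singular values come back in descending order, so the last k right
    # singular vectors span the lowest-variance ("dark") subspace.
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    return vh[-k:].T  # (d_model, k)

def uncertainty_score(h: torch.Tensor, basis: torch.Tensor) -> torch.Tensor:
    """Activation energy inside the dark subspace; per the abstract, this
    kind of variance signal correlates with factual inconsistency."""
    coords = h @ basis            # (..., k) coordinates in the subspace
    return coords.pow(2).sum(-1)  # squared norm of the projection

def steer(h: torch.Tensor, basis: torch.Tensor, alpha: float = 0.0) -> torch.Tensor:
    """Rescale the component of h lying in the subspace by alpha:
    alpha=0 projects it out entirely, alpha=1 leaves h unchanged."""
    inside = (h @ basis) @ basis.T
    return h - inside + alpha * inside
```

In practice one would presumably fit a basis per layer from activations collected over a calibration corpus, then tune `alpha` against a factuality benchmark; that per-layer, benchmark-driven loop is our guess at how "steering" would be operationalized, not a detail given in the abstract.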