LLMs have a 'semantic bottleneck': an intermediate layer where meaning is represented the same way whether the input arrived in English, French, or Chinese.
April 16, 2026
Original Paper
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
arXiv · 2604.12710
The Takeaway
This research identifies a specific intermediate layer in LLMs where the internal representation of an idea is largely the same regardless of the language used to express it. This 'semantic bottleneck' acts as a language-agnostic conceptual space inside the model: rather than merely translating, the model appears to map inputs from every language onto shared representations. That matters for safety because current alignment is biased toward high-resource languages, which leaves models far more vulnerable to harmful queries posed in low-resource ones. If alignment is applied at the bottleneck itself, a single intervention can, in principle, protect the model across languages simultaneously. For practitioners, this points toward 'language-agnostic' alignment: train a safety mechanism once at the shared semantic layer instead of once per language.
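To make the "train once, protect many languages" idea concrete, here is a minimal sketch, not the paper's method: fit a single linear probe on bottleneck-layer representations of English prompts, then reuse it unchanged on prompts in other languages. The model name, layer index, pooling, and the probe itself are illustrative assumptions.

```python
# Sketch of a language-agnostic safety probe at an assumed bottleneck layer.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "Qwen/Qwen2.5-0.5B"  # stand-in; any multilingual decoder with hidden states
BOTTLENECK_LAYER = 12             # assumed index of the semantic bottleneck

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def bottleneck_embedding(text: str) -> torch.Tensor:
    """Mean-pool the hidden states of the assumed bottleneck layer."""
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**inputs)
    hidden = out.hidden_states[BOTTLENECK_LAYER]  # shape (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

# Train the probe once, on English prompts labelled safe (0) / unsafe (1) ...
english_prompts = ["How do I bake sourdough bread?", "How do I pick someone's lock?"]
labels = [0, 1]
X = torch.stack([bottleneck_embedding(p) for p in english_prompts]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)

# ... then reuse the same probe, unchanged, on a prompt in another language.
query_zh = "如何撬开别人的锁？"  # "How do I pick someone's lock?"
score = probe.predict_proba(bottleneck_embedding(query_zh).numpy().reshape(1, -1))[0, 1]
print(f"unsafe probability: {score:.2f}")
```

If the bottleneck hypothesis holds, the probe's decision boundary should transfer across languages because the inputs it sees are already language-agnostic.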
From the abstract
Large language models (LLMs) often demonstrate strong safety performance in high-resource languages, yet exhibit severe vulnerabilities when queried in low-resource languages. We attribute this gap to a mismatch between language-agnostic semantic understanding ability and language-dominant safety alignment biased toward high-resource languages. Consistent with this hypothesis, we empirically identify the semantic bottleneck in LLMs, an intermediate layer in which the geometry of model representations […]
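As a rough illustration of how such a bottleneck layer might be located empirically, the sketch below compares mean-pooled hidden states of translation pairs at every layer and reports where cross-lingual cosine similarity peaks. The model, the pooling, and cosine similarity as the geometry measure are assumptions made for illustration; the paper's actual analysis may differ.

```python
# Sketch: find the layer where representations of translation pairs are most similar.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2.5-0.5B"  # stand-in multilingual decoder

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def layer_embeddings(text: str) -> list[torch.Tensor]:
    """Mean-pooled representation of `text` at the embedding layer and every block."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states  # tuple of (1, seq_len, dim)
    return [h.mean(dim=1).squeeze(0) for h in hidden_states]

# Tiny illustrative set of English-French translation pairs.
pairs = [
    ("The weather is nice today.", "Il fait beau aujourd'hui."),
    ("I would like a cup of coffee.", "Je voudrais une tasse de café."),
]

num_layers = model.config.num_hidden_layers + 1  # embeddings + each transformer block
sims = torch.zeros(num_layers)
for en, fr in pairs:
    en_layers, fr_layers = layer_embeddings(en), layer_embeddings(fr)
    for i, (a, b) in enumerate(zip(en_layers, fr_layers)):
        sims[i] += torch.nn.functional.cosine_similarity(a, b, dim=0)
sims /= len(pairs)

bottleneck = int(sims.argmax())
print(f"cross-lingual similarity peaks at layer {bottleneck}: {sims[bottleneck]:.3f}")
```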