SeriesFusion
Science, curated & edited by AI

Disgust is processed far more weakly and diffusely in large language models than any other primary emotion.

Language models process emotional text in a consistent three-phase flow, with emotion-related features emerging only in the final phase. The internal representation of disgust appears weaker and more diffuse than the dense representations of joy or anger. This unevenness suggests that a model may be more sensitive to some human sentiments than others during reasoning tasks. Mapping these features across layers shows where the model shifts from processing syntax to inferring emotion. Developers could use these insights to fine-tune specific emotional sensitivities or to debug why a model misses subtle social cues.
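To make the layer-by-layer mapping concrete, here is a minimal sketch of how one might count emotion-selective sparse autoencoder features at each layer. It is an illustration under stated assumptions, not the paper's method: the function names (`sae_encode`, `selective_feature_count`), the identity encoder weights, the `margin` threshold, and the toy activations are all hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sae_encode(x: Vector, w_enc: Sequence[Vector], b_enc: Vector) -> List[float]:
    """Encoder half of a sparse autoencoder: ReLU(W_enc @ x + b_enc)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_enc, b_enc)
    ]


def mean_features(batch: Sequence[Vector], w_enc, b_enc) -> List[float]:
    """Average each SAE feature's activation over a batch of inputs."""
    encoded = [sae_encode(x, w_enc, b_enc) for x in batch]
    return [sum(col) / len(encoded) for col in zip(*encoded)]


def selective_feature_count(emotional, neutral, w_enc, b_enc, margin=0.5) -> int:
    """Count features that fire more on emotional inputs than on neutral
    inputs by more than `margin` (a hypothetical threshold)."""
    emo = mean_features(emotional, w_enc, b_enc)
    neu = mean_features(neutral, w_enc, b_enc)
    return sum(1 for e, n in zip(emo, neu) if e - n > margin)


# Toy setup (assumption): one identity SAE shared by all layers, and
# made-up 2-dimensional activations for three layers of a model.
W_ENC = [[1.0, 0.0], [0.0, 1.0]]
B_ENC = [0.0, 0.0]

layers = [
    {"emotional": [[1.0, 0.0]], "neutral": [[1.0, 0.0]]},  # early layer
    {"emotional": [[1.0, 0.2]], "neutral": [[1.0, 0.0]]},  # middle layer
    {"emotional": [[1.0, 2.0]], "neutral": [[1.0, 0.0]]},  # final layer
]

counts = [
    selective_feature_count(layer["emotional"], layer["neutral"], W_ENC, B_ENC)
    for layer in layers
]
print(counts)  # [0, 0, 1]
```

In this toy data, an emotion-selective feature appears only at the last layer, mirroring the qualitative pattern described above. A real analysis would use trained SAEs and activations recorded from an actual model.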

Original Paper

From Syntax to Emotion: A Mechanistic Analysis of Emotion Inference in LLMs

Bangzhao Shu, Arinjay Singh, Mai ElSherief

arXiv  ·  2604.25866

Large language models (LLMs) are increasingly used in emotionally sensitive human-AI applications, yet little is known about how emotion recognition is internally represented. In this work, we investigate the internal mechanisms of emotion recognition in LLMs using sparse autoencoders (SAEs). By analyzing sparse feature activations across layers, we identify a consistent three-phase information flow, in which emotion-related features emerge only in the final phase. We further show that emotion r