AI & ML · Nature Is Weird

You can now 'flip a switch' inside an LLM to shift its personality from neurotic to agreeable.

April 14, 2026

Original Paper

Psychological Concept Neurons: Can Neural Control Bias Probing and Shift Generation in LLMs?

Yuto Harada, Hiro Taiyo Hamada

arXiv · 2604.11802

The Takeaway

The researchers report identifying specific 'concept neurons' associated with the Big Five personality traits. Directly intervening on the activations of these neurons shifts what the model generates, which is causal evidence that these psychological traits are at least partly localized inside the network rather than existing only as diffuse emergent patterns.
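To make the idea of "stimulating" a neuron concrete, here is a minimal sketch on a toy two-layer network in NumPy. It is not the paper's method or code: the network, the weights, the two output labels, the way `concept_neurons` are picked, and the `boost` value are all assumptions chosen for illustration. It only shows the general mechanic of adding a constant to a few hidden activations and watching the output move.

```python
import numpy as np

def forward(x, W1, W2, neuron_ids=None, boost=0.0):
    """Toy two-layer network. Optionally adds `boost` to selected hidden units."""
    h = np.maximum(0.0, x @ W1)        # hidden activations (ReLU)
    if neuron_ids is not None:
        h = h.copy()
        h[..., neuron_ids] += boost    # the intervention: shift the chosen neurons
    return h @ W2                      # output logits

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 16))          # hypothetical input-to-hidden weights
x = rng.normal(size=(8,))              # hypothetical input representation

# Hypothetical hidden-to-output weights for two illustrative labels:
# column 0 = "neurotic", column 1 = "agreeable".
W2 = np.stack([np.linspace(-1.0, 1.0, 16), np.linspace(1.0, -1.0, 16)], axis=1)

# Assumption: treat the three hidden units whose output weights most favor
# label 1 over label 0 as the "concept neurons" for this toy example.
diff = W2[:, 1] - W2[:, 0]
concept_neurons = np.argsort(diff)[-3:]

baseline = forward(x, W1, W2)
steered = forward(x, W1, W2, concept_neurons, boost=5.0)

# How much the intervention moved the "agreeable minus neurotic" margin.
margin_shift = (steered[1] - steered[0]) - (baseline[1] - baseline[0])
print("margin shift:", margin_shift)
```

In this toy setup the shift depends only on the boosted units and their output weights, so it is the same for any input. A real LLM has many layers and nonlinearities downstream of the intervention, so the effect there would be input-dependent and has to be measured empirically.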

From the abstract

Using psychological constructs such as the Big Five, large language models (LLMs) can imitate specific personality profiles and predict a user's personality. While LLMs can exhibit behaviors consistent with these constructs, it remains unclear where and how they are represented inside the model and how they relate to behavioral outputs. To address this gap, we focus on questionnaire-operationalized Big Five concepts, analyze the formation and localization of their internal representations, and u

