Large language models store information about different people or objects in separate, orthogonal "slots" within a single token's activation.
April 24, 2026
Original Paper
Slot Machines: How LLMs Keep Track of Multiple Entities
arXiv · 2604.21139
The Takeaway
The model separates the current entity and the previous entity into a structured layout that keeps the two from getting mixed up. Even though a token holds both pieces of information, the model can only access factual memories tied to the specific slot it is currently focusing on. This explains why an AI might know a fact in one part of a sentence but fail to use it just a few words later. Researchers previously viewed model memory as a giant soup of associations, but it functions more like a filing cabinet with strict access rules. Understanding these slots allows for more precise control over how models retrieve and apply knowledge during complex tasks. This mechanistic insight could lead to better ways of training AI to track multiple actors in a story.
From the abstract
Language models must bind entities to the attributes they possess and maintain several such binding relationships within a context. We study how multiple entities are represented across token positions and whether single tokens can carry bindings for more than one entity. We introduce a multi-slot probing approach that disentangles a single token's residual stream activation to recover information about both the currently described entity and the immediately preceding one. […]
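The multi-slot probing idea can be illustrated with a small, self-contained sketch. The code below does not use the paper's models or data; it assumes (for illustration only) that a token's activation is a sum of two roughly orthogonal entity codes plus noise, and shows that two separate linear probes can each recover a different entity from the same vector. All names and dimensions here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a token's residual-stream activation is assumed to
# encode two entities in (roughly) orthogonal subspaces, i.e. "slots".
d, n, n_entities = 64, 2000, 4

# Random orthonormal directions for each slot (an assumption, not the
# paper's learned geometry).
Q, _ = np.linalg.qr(rng.normal(size=(d, 2 * n_entities)))
slot_current, slot_previous = Q[:, :n_entities], Q[:, n_entities:]

cur = rng.integers(0, n_entities, size=n)   # currently described entity
prev = rng.integers(0, n_entities, size=n)  # immediately preceding entity

# Activation = current-entity code + previous-entity code + noise.
X = (slot_current[:, cur].T
     + slot_previous[:, prev].T
     + 0.1 * rng.normal(size=(n, d)))

def fit_linear_probe(X, y, n_classes):
    """Least-squares linear probe: regress one-hot labels on activations."""
    Y = np.eye(n_classes)[y]
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

train, test = slice(0, 1500), slice(1500, None)
W_cur = fit_linear_probe(X[train], cur[train], n_entities)
W_prev = fit_linear_probe(X[train], prev[train], n_entities)

acc_cur = np.mean((X[test] @ W_cur).argmax(1) == cur[test])
acc_prev = np.mean((X[test] @ W_prev).argmax(1) == prev[test])

# Both probes decode accurately from the same vectors: one token
# activation, two separately readable slots.
print(f"current-entity probe accuracy:  {acc_cur:.2f}")
print(f"previous-entity probe accuracy: {acc_prev:.2f}")
```

Because the two slots occupy orthogonal subspaces in this toy setup, each probe reads out its slot with near-perfect accuracy while ignoring the other, which is the core intuition behind disentangling a single token's activation into multiple bindings.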