AI & ML · Nature Is Weird

The 'black box' of in-context learning has been cracked open to reveal four distinct algorithmic phases that switch depending on the diversity of the data.

April 17, 2026

Original Paper

Distinct mechanisms underlying in-context learning in transformers

arXiv · 2604.12151

The Takeaway

In-context learning (ICL) was previously treated as a mysterious emergent property of scale. This work identifies specialized multi-layer subcircuits that handle different algorithmic stages of learning on the fly. Depending on the diversity of the input, the model activates distinct computational pathways to process information. Understanding these discrete 'gears' could let engineers optimize models for specific few-shot tasks by targeting the responsible circuits. It transforms our view of LLMs from monolithic predictors into complex, multi-mode algorithmic machines.
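As a concrete anchor for the setting quoted in the abstract below, here is a minimal sketch of that kind of training distribution: token sequences drawn from a finite set of discrete Markov chains, where the number of chains acts as the diversity knob. The sizes (`N_STATES`, `N_CHAINS`, `SEQ_LEN`) and helper names are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not taken from the paper.
N_STATES, N_CHAINS, SEQ_LEN = 8, 16, 256

# The finite training set S: each element is a row-stochastic
# transition matrix with rows drawn from a symmetric Dirichlet.
S = [rng.dirichlet(np.ones(N_STATES), size=N_STATES) for _ in range(N_CHAINS)]

def sample_sequence(T, length):
    """Sample a token sequence from one Markov chain T."""
    seq = [int(rng.integers(N_STATES))]
    for _ in range(length - 1):
        seq.append(int(rng.choice(N_STATES, p=T[seq[-1]])))
    return seq

# Each training example comes from a single chain in S; the network must
# adapt its next-token predictions to that chain's statistics in context.
batch = [sample_sequence(S[rng.integers(N_CHAINS)], SEQ_LEN) for _ in range(32)]
```

In this toy setup, sweeping `N_CHAINS` is the natural way to vary the data diversity that, per the takeaway, switches the trained network between its phases.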

From the abstract

Modern distributed networks, notably transformers, acquire a remarkable ability (termed 'in-context learning') to adapt their computation to input statistics, such that a fixed network can be applied to data from a broad range of systems. Here, we provide a complete mechanistic characterization of this behavior in transformers trained on a finite set $S$ of discrete Markov chains. The transformer displays four algorithmic phases, characterized by whether the network memorizes and generalizes […]
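The memorization-versus-generalization split the abstract ends on can be made concrete with two idealized baseline predictors: a 'memorizer' that Bayes-averages over a known finite set of chains, and a 'generalizer' that estimates bigram statistics from the context alone. This is a sketch of the two candidate strategies under stated assumptions, not the paper's actual circuits; the function names and the add-constant smoothing are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_chain(n_states):
    """A random row-stochastic transition matrix (rows ~ Dirichlet)."""
    return rng.dirichlet(np.ones(n_states), size=n_states)

def memorizing_predictor(seq, known_chains):
    """'Memorization': Bayes-average the known chains, weighting each
    by the likelihood it assigns to the observed context."""
    log_liks = np.array([
        sum(np.log(T[a, b]) for a, b in zip(seq[:-1], seq[1:]))
        for T in known_chains
    ])
    w = np.exp(log_liks - log_liks.max())
    w /= w.sum()
    # Posterior-weighted next-token distribution from the current state.
    return sum(wi * T[seq[-1]] for wi, T in zip(w, known_chains))

def generalizing_predictor(seq, n_states, smoothing=1.0):
    """'Generalization': estimate bigram transition statistics from
    the context alone, with add-constant smoothing."""
    counts = np.full((n_states, n_states), smoothing)
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1.0
    probs = counts / counts.sum(axis=1, keepdims=True)
    return probs[seq[-1]]

# Demo: a context drawn from a chain the memorizer has seen before.
n_states = 4
known = [random_chain(n_states) for _ in range(8)]
truth = known[3]
seq = [int(rng.integers(n_states))]
for _ in range(63):
    seq.append(int(rng.choice(n_states, p=truth[seq[-1]])))

print("memorizer :", np.round(memorizing_predictor(seq, known), 3))
print("in-context:", np.round(generalizing_predictor(seq, n_states), 3))
print("true row  :", np.round(truth[seq[-1]], 3))
```

In this toy setting, the memorizer locks onto the true transition row from a short context when the generating chain is in the known set, while the in-context estimator needs enough bigram evidence; on a chain outside the set, only the estimator stays accurate. That trade-off is exactly what varying the diversity of the training data navigates.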