Provides a formal proof and empirical evidence that Transformers can learn symbolic rules entirely absent from their training data, debunking the 'stochastic parrot' hypothesis that they only interpolate.
March 19, 2026
Original Paper
Transformers Can Learn Rules They've Never Seen: Proof of Computation Beyond Interpolation
arXiv · 2603.17019
The Takeaway
The paper demonstrates that, with multi-step unrolling, small Transformers can compute XOR and symbolic rule chains that are linearly inseparable and provably unsolvable by nearest-neighbor interpolation. This suggests reasoning capabilities emerge from architectural computation rather than pattern memorization alone.
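The interpolation failure mode is easy to see on parity, the chained-XOR task: every string's nearest neighbor in Hamming distance differs by exactly one bit, which flips the parity. A minimal sketch (not the paper's experimental setup; the train/test split here is a hypothetical one):

```python
# Illustrative sketch: 1-nearest-neighbor interpolation fails on parity
# (chained XOR). The closest training example under Hamming distance
# differs by one bit, so its parity is always the opposite label.
from itertools import product

n = 8
strings = list(product([0, 1], repeat=n))
parity = {bits: sum(bits) % 2 for bits in strings}

# Hypothetical held-out split: train on strings starting with 0,
# test on strings starting with 1.
train = [s for s in strings if s[0] == 0]
test = [s for s in strings if s[0] == 1]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

correct = 0
for s in test:
    nn = min(train, key=lambda t: hamming(s, t))  # nearest training example
    correct += parity[nn] == parity[s]

print(f"1-NN accuracy on parity: {correct / len(test):.2f}")
```

Every test string's unique nearest neighbor is itself with the first bit flipped, so the classifier is wrong on every example: accuracy 0.00. A model that actually computes the XOR chain, rather than matching to similar examples, solves the same split trivially.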
From the abstract
A central question in the LLM debate is whether transformers can infer rules absent from training, or whether apparent generalisation reduces to similarity-based interpolation over observed examples. We test a strong interpolation-only hypothesis in two controlled settings: one where interpolation is ruled out by construction and proof, and one where success requires emitting intermediate symbolic derivations rather than only final answers. In Experiment 1, we use a cellular automaton with a pur…
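The excerpt cuts off before describing the automaton, so the paper's exact rule is not specified here. As a generic illustration of the kind of purely symbolic, step-by-step update a cellular automaton applies, consider Rule 90, whose transition is just the XOR of each cell's two neighbors:

```python
# Hedged illustration (not necessarily the paper's automaton): Rule 90,
# an elementary cellular automaton whose update is a pure XOR of each
# cell's neighbors. Computing step t+k from step t requires actually
# applying the rule k times; there is no similar "nearby" example to
# copy an answer from.
def rule90_step(cells):
    n = len(cells)
    # Each cell becomes the XOR of its left and right neighbors (wrapping).
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]

state = [0, 0, 0, 1, 0, 0, 0]  # single live cell
for _ in range(3):
    state = rule90_step(state)
print(state)  # → [1, 0, 1, 0, 1, 0, 1]
```

Multi-step unrolling, as described in the takeaway, mirrors this loop: the model emits intermediate states rather than jumping directly to the final configuration.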