RSM achieves 20x faster training for recursive reasoning models and enables test-time scaling for up to 20,000 refinement steps.
March 18, 2026
Original Paper
Form Follows Function: Recursive Stem Model
arXiv · 2603.15641
AI-generated illustration
The Takeaway
It solves the high training cost of recursive models by detaching history and using stochastic depth, allowing small networks to solve NP-hard puzzles with massive test-time compute scaling without retraining.
From the abstract
Recursive reasoning models such as Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM) show that small, weight-shared networks can solve compute-heavy and NP puzzles by iteratively refining latent states, but their training typically relies on deep supervision and/or long unrolls that increase wall-clock cost and can bias the model toward greedy intermediate behavior. We introduce Recursive Stem Model (RSM), a recursive reasoning approach that keeps the TRM-style backbone while cha