Researchers identified a 'critique vector' in the latent space of Large Reasoning Models that can be steered to improve self-correction and test-time scaling.
March 18, 2026
Original Paper
Decoding the Critique Mechanism in Large Reasoning Models
arXiv · 2603.16331
The Takeaway
The paper moves beyond treating the chain of thought as a black box and identifies an internal mechanism for error detection. Steering this latent representation improves reasoning performance at inference time, with no additional training cost. A hedged sketch of what such steering could look like follows below.
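A minimal sketch of activation steering with a pre-computed "critique vector", assuming a Llama-style decoder and PyTorch forward hooks. The checkpoint name, layer index, steering strength, and the saved vector file are all hypothetical placeholders, not details from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "placeholder/reasoning-model"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

layer_idx = 20                                   # assumed layer hosting the critique direction
alpha = 4.0                                      # steering strength, tuned empirically
critique_vec = torch.load("critique_vector.pt")  # hypothetical pre-computed artifact

def steer(module, inputs, output):
    # Add the scaled critique direction to the layer's residual stream.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + alpha * critique_vec.to(hidden.dtype)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.model.layers[layer_idx].register_forward_hook(steer)
try:
    prompt = "Solve step by step: 17 * 24 = ?"
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=256)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()  # always detach the hook after generation
```

The key design choice is additive intervention: the vector nudges the residual stream toward "critique mode" at every decoding step without retraining any weights.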
From the abstract
Large Reasoning Models (LRMs) exhibit backtracking and self-verification mechanisms that enable them to revise intermediate steps and reach correct solutions, yielding strong performance on complex logical benchmarks. We hypothesize that such behaviors are beneficial only when the model has sufficiently strong "critique" ability to detect its own mistakes. This work systematically investigates how current LRMs recover from errors by inserting arithmetic mistakes into their intermediate reasoning steps.
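A toy sketch of the error-injection probe the abstract describes: corrupt the result of one arithmetic step in a generated chain of thought, then regenerate from that point and check whether the model recovers the correct answer. The function name, regex, and corruption offsets are illustrative assumptions, not the paper's protocol.

```python
import random
import re

def inject_arithmetic_error(cot: str, seed: int = 0) -> str:
    """Perturb the result of one 'a op b = c' step in a reasoning trace."""
    rng = random.Random(seed)
    steps = list(re.finditer(r"(\d+)\s*([+\-*])\s*(\d+)\s*=\s*(\d+)", cot))
    if not steps:
        return cot  # nothing to corrupt
    m = rng.choice(steps)
    wrong = int(m.group(4)) + rng.choice([-3, -1, 1, 3])  # small nonzero corruption
    return cot[:m.start(4)] + str(wrong) + cot[m.end(4):]

cot = "First, 17 * 24 = 408. Then 408 + 10 = 418, so the answer is 418."
print(inject_arithmetic_error(cot))
# Feed the prefix up to the corrupted step back to the model and compare the
# regenerated final answer against ground truth to score error recovery.
```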