Proposes replacing backpropagation with recursive Bayesian filtering for training dynamical systems and Transformers.
March 17, 2026
Original Paper
From Gradients to Riccati Geometry: Kalman World Models for Single-Pass Learning
arXiv · 2603.13423
The Takeaway
Kalman World Models (KWM) offer a gradient-free training paradigm in which parameter updates are performed via Kalman-gain corrections rather than gradient steps. This provides a principled path to online, continual adaptation with improved stability compared to standard reverse-mode differentiation.
From the abstract
Backpropagation dominates modern machine learning, yet it is not the only principled method for optimizing dynamical systems. We propose Kalman World Models (KWM), a class of learned state-space models trained via recursive Bayesian filtering rather than reverse-mode automatic differentiation. Instead of gradient descent updates, we replace parameter learning with Kalman-style gain adaptation. Training becomes online filtering; error signals become innovations. We further extend this framework t[…]
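To make the core idea concrete, here is a minimal sketch (not the paper's implementation, and simplified to a linear model) of what "parameter learning as Kalman gain adaptation" can look like: the weights are treated as the hidden state of a Kalman filter, each training example is an observation, the prediction error is the innovation, and the Kalman gain plays the role of the learning rate. All variable names and noise settings below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: recover w_true for a linear model y = w·x + noise,
# in a single online pass, with no gradients.
dim = 3
w_true = np.array([2.0, -1.0, 0.5])

w = np.zeros(dim)        # state estimate = the model parameters
P = np.eye(dim)          # parameter covariance (uncertainty over w)
q = 1e-4                 # process-noise scale: lets parameters drift (continual adaptation)
r = 0.1                  # observation-noise variance

for _ in range(500):
    x = rng.normal(size=dim)                     # input row acts as the observation matrix H
    y = w_true @ x + rng.normal(scale=np.sqrt(r))

    P = P + q * np.eye(dim)                      # predict step: parameters assumed near-static
    innovation = y - w @ x                       # error signal = innovation
    S = x @ P @ x + r                            # innovation variance
    K = P @ x / S                                # Kalman gain (replaces the learning rate)
    w = w + K * innovation                       # single-pass, gradient-free update
    P = P - np.outer(K, x @ P)                   # covariance update

print(np.round(w, 2))
```

Note the design difference from SGD: the step size is not a tuned hyperparameter but is derived from the running covariance `P`, which shrinks as the filter becomes confident, while the process noise `q` keeps the filter adaptive to drifting targets.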