SpectralGuard identifies a 'memory collapse' vulnerability in State Space Models (such as Mamba), in which adversarial inputs drive the spectral radius of the transition operator toward zero.
March 16, 2026
Original Paper
SpectralGuard: Detecting Memory Collapse Attacks in State Space Models
arXiv · 2603.12414
The Takeaway
The paper reveals a silent failure mode in recurrent foundation models: an attacker can destroy reasoning capacity without affecting output fluency. The proposed real-time monitor offers a security layer for deploying SSMs in production.
From the abstract
State Space Models (SSMs) such as Mamba achieve linear-time sequence processing through input-dependent recurrence, but this mechanism introduces a critical safety vulnerability. We show that the spectral radius rho(A-bar) of the discretized transition operator governs effective memory horizon: when an adversary drives rho toward zero through gradient-based Hidden State Poisoning, memory collapses from millions of tokens to mere dozens, silently destroying reasoning capacity without triggering o
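To make the link between spectral radius and memory horizon concrete, here is a minimal sketch rather than the paper's method. It assumes a diagonal state matrix A with negative entries and a zero-order-hold discretization, A-bar = exp(delta * A), as in Mamba-style SSMs. The horizon definition (steps until rho^t drops below a tolerance `eps`), the function names, the example values, and the alarm threshold `rho_min` are all hypothetical choices made for illustration.

```python
import math

def spectral_radius(a_diag, delta):
    """rho(A-bar) for a diagonal A, assuming A-bar = exp(delta * A).

    For a diagonal matrix the spectral radius is the largest absolute
    diagonal entry of A-bar.
    """
    return max(abs(math.exp(delta * a)) for a in a_diag)

def memory_horizon(rho, eps=1e-3):
    """Illustrative horizon: number of steps t at which rho**t reaches eps."""
    if rho >= 1.0:
        return math.inf  # no decay, so the state never forgets
    if rho <= 0.0:
        return 0.0       # state is wiped in a single step
    return math.log(eps) / math.log(rho)

def is_collapsed(rho, rho_min=0.9):
    """Hypothetical alarm: flag a step whose spectral radius is below rho_min."""
    return rho < rho_min

# Hypothetical diagonal entries of A (all negative, so the system is stable).
a_diag = [-1.0, -0.5, -0.01]

# A small step size keeps rho near 1; a very large one pushes rho toward 0.
rho_benign = spectral_radius(a_diag, delta=0.001)
rho_attacked = spectral_radius(a_diag, delta=50.0)

print(f"benign:   rho={rho_benign:.6f}, horizon={memory_horizon(rho_benign):.0f} steps")
print(f"attacked: rho={rho_attacked:.6f}, horizon={memory_horizon(rho_attacked):.1f} steps")
print("alarm on attacked step:", is_collapsed(rho_attacked))
```

In this toy setting, the only thing an input controls is the step size `delta`. Inflating it shrinks every diagonal entry of A-bar at once, which is why a small drop in rho corresponds to a very large drop in how far back the state can remember.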