Achieves 'zero forgetting' in continual learning by stacking frozen domain-specific MoE-LoRA adapters with a meta-router.
April 2, 2026
Original Paper
Brainstacks: Cross-Domain Cognitive Capabilities via Frozen MoE-LoRA Stacks for Continual LLM Learning
arXiv · 2604.01152
The Takeaway
Instead of full fine-tuning or a single LoRA adapter, Brainstacks treats domain expertise as additive, combinable 'brain stacks' that let a model gain new capabilities without ever degrading on earlier ones, breaking the traditional stability-plasticity trade-off.
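A minimal sketch of this additive composition, assuming the simplest possible reading: a frozen base weight, several frozen LoRA deltas (one per domain stack), and a meta-router that mixes them at inference. All parameter shapes, the `meta_route` scoring rule, and the rsLoRA-style `alpha / sqrt(r)` scale are illustrative placeholders, not the paper's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_stacks = 16, 4, 3   # toy model width, LoRA rank, number of frozen stacks
alpha = 8.0                 # LoRA scaling hyperparameter

# Shared frozen base projection: never updated once trained.
W_base = rng.standard_normal((d, d)) * 0.02

# Each frozen "brain stack" contributes a low-rank delta B @ A.
stacks = [(rng.standard_normal((d, r)) * 0.02,   # B: (d, r)
           rng.standard_normal((r, d)) * 0.02)   # A: (r, d)
          for _ in range(n_stacks)]

def meta_route(x):
    # Hypothetical meta-router: softmax over a per-stack relevance score.
    scores = np.array([float(x @ (B @ A) @ x) for B, A in stacks])
    e = np.exp(scores - scores.max())
    return e / e.sum()

def forward(x):
    weights = meta_route(x)
    # rsLoRA scales deltas by alpha / sqrt(r) rather than the
    # standard LoRA alpha / r, which stabilizes higher ranks.
    delta = sum(w * (alpha / np.sqrt(r)) * (B @ A)
                for w, (B, A) in zip(weights, stacks))
    # Additive composition on the frozen base at inference.
    return x @ (W_base + delta)

y = forward(rng.standard_normal(d))
```

Because every stack and the base stay frozen, adding a new domain means appending a new `(B, A)` pair; nothing already learned can be overwritten, which is the mechanism behind the 'zero forgetting' claim.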
From the abstract
We present Brainstacks, a modular architecture for continual multi-domain fine-tuning of large language models that packages domain expertise as frozen adapter stacks composing additively on a shared frozen base at inference. It comprises five interlocking components: (1) MoE-LoRA with Shazeer-style noisy top-2 routing across all seven transformer projections under QLoRA 4-bit quantization with rsLoRA scaling; (2) an inner loop performing residual boosting by freezing trained stacks and adding new ones; (3)
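Component (1)'s gating mechanism follows the noisy top-k scheme of Shazeer et al. (2017): input-dependent Gaussian noise is added to the gate logits during training, then only the top-2 experts are kept and renormalized. A sketch under those assumptions (the gate matrices and dimensions below are placeholders, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, k = 16, 8, 2

# Hypothetical gate parameters: one matrix for the clean logits,
# one controlling the magnitude of the per-expert training noise.
W_gate  = rng.standard_normal((d_model, n_experts)) * 0.02
W_noise = rng.standard_normal((d_model, n_experts)) * 0.02

def softplus(z):
    return np.log1p(np.exp(z))

def noisy_top2_gate(x, train=True):
    clean = x @ W_gate
    if train:
        # Input-dependent noise scale, kept positive via softplus.
        noise = rng.standard_normal(n_experts) * softplus(x @ W_noise)
        logits = clean + noise
    else:
        logits = clean
    top = np.argsort(logits)[-k:]        # indices of the top-2 experts
    gates = np.zeros(n_experts)
    e = np.exp(logits[top] - logits[top].max())
    gates[top] = e / e.sum()             # softmax over the top-2 only
    return gates

g = noisy_top2_gate(rng.standard_normal(d_model))
```

The noise encourages load balancing across experts during training; at inference (`train=False`) the gate is deterministic, so each token is served by exactly two LoRA experts per projection.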