AI & ML · New Capability

SOMA provides a plug-and-play memory and orchestration system that increases Vision-Language-Action (VLA) robot success rates by over 50% without fine-tuning.

March 26, 2026

Original Paper

SOMA: Strategic Orchestration and Memory-Augmented System for Vision-Language-Action Model Robustness via In-Context Adaptation

Zhuoran Li, Zhiyang Li, Kaijun Zhou, Jinyu Gu

arXiv · 2603.24060

The Takeaway

Current VLAs are often brittle "one-shot" controllers. SOMA adds long-term memory and causal failure attribution via retrieval-augmented generation (RAG) and the Model Context Protocol (MCP), letting robots learn from their own environment interactions and adapt to out-of-distribution tasks in real time.
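The paper does not publish SOMA's internals here, but the core idea of in-context adaptation over a frozen policy can be sketched minimally: store notes about past failures as an episodic memory, retrieve the most relevant ones for the current observation, and inject them into the frozen policy's context instead of updating its weights. All names below (`EpisodicMemory`, `act_with_memory`) are hypothetical illustrations, not SOMA's actual API.

```python
import numpy as np

class EpisodicMemory:
    """Toy episodic memory: stores (embedding, note) pairs and retrieves
    the most similar past episodes by cosine similarity (a stand-in for
    the RAG component; a real system would use a vector database)."""

    def __init__(self):
        self.embeddings = []
        self.notes = []

    def add(self, embedding, note):
        self.embeddings.append(np.asarray(embedding, dtype=float))
        self.notes.append(note)

    def retrieve(self, query, k=2):
        if not self.embeddings:
            return []
        mat = np.stack(self.embeddings)
        q = np.asarray(query, dtype=float)
        sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q) + 1e-9)
        top = np.argsort(-sims)[:k]
        return [self.notes[i] for i in top]

def act_with_memory(frozen_policy, observation_embedding, instruction, memory):
    """Augment the instruction with retrieved failure notes, then call the
    frozen policy. No gradient updates: adaptation is purely in-context."""
    hints = memory.retrieve(observation_embedding, k=1)
    context = instruction + " | past lessons: " + "; ".join(hints)
    return frozen_policy(context)

# Hypothetical usage: after a failed grasp, a note is written back to memory;
# the next attempt in a similar visual state retrieves that lesson.
memory = EpisodicMemory()
memory.add([1.0, 0.0], "gripper slipped on glossy mug")
memory.add([0.0, 1.0], "handle occluded; reorient camera first")
action = act_with_memory(lambda ctx: ctx, [0.9, 0.1], "pick up the mug", memory)
print(action)
```

The key design point mirrored here is that the policy itself stays frozen; only the context it receives changes, which is why the abstract can claim robustness gains without fine-tuning.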

From the abstract

Despite the promise of Vision-Language-Action (VLA) models as generalist robotic controllers, their robustness against perceptual noise and environmental variations in out-of-distribution (OOD) tasks remains fundamentally limited by the absence of long-term memory, causal failure attribution, and dynamic intervention capability. To address this, we propose SOMA, a Strategic Orchestration and Memory-Augmented System that upgrades frozen VLA policies for robust in-context adaptation without parameter updates.