AI & ML Breaks Assumption

Shows that simple sequential fine-tuning with LoRA outperforms specialized continual-learning algorithms for continual reinforcement learning in VLA models.

March 13, 2026

Original Paper

Simple Recipe Works: Vision-Language-Action Models are Natural Continual Learners with Reinforcement Learning

Jiaheng Hu, Jay Shim, Chen Tang, Yoonchang Sung, Bo Liu, Peter Stone, Roberto Martin-Martin

arXiv · 2603.11653

The Takeaway

It debunks the assumption that large embodied models require sophisticated continual-learning strategies to avoid catastrophic forgetting. This simplifies the recipe for building self-improving robots: scale and parameter-efficient tuning naturally stabilize the learning process.
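To make the recipe concrete, here is a minimal toy sketch of the idea in plain NumPy (not the paper's code): a frozen base weight matrix is adapted through a low-rank LoRA-style update, and that same adapter is trained sequentially across tasks with vanilla gradient steps, with no replay buffer, regularizer, or per-task heads. All names, dimensions, and the synthetic tasks are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of sequential LoRA-style fine-tuning.
# The pretrained weight W stays frozen; only the low-rank
# factors A and B are updated, task after task.

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 4, 2

W = rng.normal(size=(d_out, d_in))        # frozen "pretrained" weight
A = np.zeros((d_out, rank))               # A starts at zero, so the
B = rng.normal(size=(rank, d_in)) * 0.01  # adapter is initially a no-op


def forward(x):
    return (W + A @ B) @ x                # effective weight = W + AB


def train_on_task(X, Y, lr=0.01, steps=100):
    """Naive sequential fine-tuning: SGD on the adapter only."""
    global A, B
    for _ in range(steps):
        for x, y in zip(X, Y):
            err = forward(x) - y                # residual for squared loss
            A -= lr * np.outer(err, B @ x)      # dL/dA
            B -= lr * np.outer(A.T @ err, x)    # dL/dB


W_before = W.copy()
for task in range(3):                           # three synthetic "tasks"
    X = rng.normal(size=(16, d_in))
    T = rng.normal(size=(d_out, d_in))          # task-specific target map
    Y = X @ T.T
    train_on_task(X, Y)

assert np.allclose(W, W_before)                 # base model never touched
print("trainable params:", A.size + B.size, "vs full:", W.size)
```

The point of the sketch is structural, not empirical: the trainable adapter is a small fraction of the full parameter count, and the pretrained weights are provably untouched after any number of tasks, which is one intuition for why naive sequential fine-tuning can be stable at scale.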

From the abstract

Continual Reinforcement Learning (CRL) for Vision-Language-Action (VLA) models is a promising direction toward self-improving embodied agents that can adapt in open-ended, evolving environments. However, conventional wisdom from continual learning suggests that naive Sequential Fine-Tuning (Seq. FT) leads to catastrophic forgetting, necessitating complex CRL strategies. In this work, we take a step back and conduct a systematic study of CRL for large pretrained VLAs across three models and five c…