AI & ML · New Capability

Replaces the 'predict-then-plan' pipeline in autonomous driving with an interleaved vision-language-action (VLA) model that generates future frames and ego actions alternately, step by step.

March 31, 2026

Original Paper

Uni-World VLA: Interleaved World Modeling and Planning for Autonomous Driving

Qiqi Liu, Huan Xu, Jingyu Li, Bin Sun, Zhihui Hao, Dangen She, Xiatian Zhu, Li Zhang

arXiv · 2603.27287

The Takeaway

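By tightly interleaving future-frame prediction with trajectory planning, the model keeps each planning step conditioned on an evolving imagined future. This is meant to curb the 'imagination drift' that arises when a world model rolls out the whole future before any planning happens, and to allow more adaptive decisions in dynamic traffic than open-loop imagination supports.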
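The difference between the two control flows can be sketched in a toy one-dimensional car-following setting. This is not the paper's model: `Frame`, `predict_next_frame`, `plan_action`, the constants, and the fixed nominal ego speed used during the open-loop rollout are all hypothetical stand-ins chosen for illustration, and the order of planning and prediction within a step is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Toy stand-in for a camera frame: just two positions in meters."""
    ego: float
    lead: float


# Illustrative constants (assumptions, not values from the paper).
LEAD_SPEED = 1.0   # lead vehicle advances this far per step
SAFE_GAP = 2.0     # gap the planner tries to keep at the end of each step
MAX_SPEED = 3.0    # upper bound on ego speed


def predict_next_frame(frame: Frame, ego_speed: float) -> Frame:
    """Toy world model: advance both vehicles by one time step."""
    return Frame(frame.ego + ego_speed, frame.lead + LEAD_SPEED)


def plan_action(frame: Frame) -> float:
    """Toy planner: go as fast as possible while ending the step SAFE_GAP behind the lead."""
    allowed = (frame.lead + LEAD_SPEED) - SAFE_GAP - frame.ego
    return max(0.0, min(MAX_SPEED, allowed))


def predict_then_plan(start: Frame, horizon: int, nominal_speed: float) -> List[float]:
    """Imagine the full rollout first, then plan from frames that ignore the plan."""
    frames = [start]
    for _ in range(horizon):
        frames.append(predict_next_frame(frames[-1], nominal_speed))
    return [plan_action(f) for f in frames[:-1]]


def interleaved(start: Frame, horizon: int) -> List[float]:
    """Alternate planning and prediction so each frame reflects the actions chosen so far."""
    frame, actions = start, []
    for _ in range(horizon):
        action = plan_action(frame)
        actions.append(action)
        frame = predict_next_frame(frame, action)
    return actions


def min_gap(start: Frame, actions: List[float]) -> float:
    """Execute the actions in the toy world and report the smallest gap to the lead."""
    frame, gaps = start, []
    for action in actions:
        frame = predict_next_frame(frame, action)
        gaps.append(frame.lead - frame.ego)
    return min(gaps)


if __name__ == "__main__":
    start = Frame(ego=0.0, lead=5.0)
    open_loop = predict_then_plan(start, horizon=4, nominal_speed=0.0)
    closed = interleaved(start, horizon=4)
    print("predict-then-plan:", open_loop, "min gap:", min_gap(start, open_loop))
    print("interleaved:      ", closed, "min gap:", min_gap(start, closed))
```

In this toy setup the open-loop rollout imagines the ego vehicle standing still, so the plan built on it keeps accelerating and overruns the lead vehicle, while the interleaved loop holds the safe gap because each imagined frame already reflects the actions chosen so far.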

From the abstract

Autonomous driving requires reasoning about how the environment evolves and planning actions accordingly. Existing world-model-based approaches typically predict future scenes first and plan afterwards, resulting in open-loop imagination that may drift from the actual decision process. In this paper, we present Uni-World VLA, a unified vision-language-action (VLA) model that tightly interleaves future frame prediction and trajectory planning. Instead of generating a full world rollout before pla

