Reduces reaction latency in flow-based VLA models by 10x, enabling real-time responsiveness on consumer GPUs.
March 20, 2026
Original Paper
FASTER: Rethinking Real-Time Flow VLAs
arXiv · 2603.19199
The Takeaway
Identifies standard action-chunking schedules as the bottleneck for robot reaction time and introduces a Horizon-Aware Schedule that prioritizes immediate actions. This lets generalist policies perform highly dynamic tasks, such as table tennis, that were previously too latency-sensitive for large VLAs.
From the abstract
Real-time execution is crucial for deploying Vision-Language-Action (VLA) models in the physical world. Existing asynchronous inference methods primarily optimize trajectory smoothness, but neglect the critical latency in reacting to environmental changes. By rethinking the notion of reaction in action chunking policies, this paper presents a systematic analysis of the factors governing reaction time. We show that reaction time follows a uniform distribution determined jointly by the Time to Fir