AI & ML New Capability

Swim2Real uses a vision-language model (VLM) as a closed-loop feedback mechanism to calibrate a complex robotic simulator directly from video.

March 24, 2026

Original Paper

Swim2Real: VLM-Guided System Identification for Sim-to-Real Transfer

Kevin Qiu, Kyle Walker, Mike Y. Michelis, Marek Cygan, Josie Hughes

arXiv · 2603.20827

The Takeaway

Swim2Real replaces hand-designed Sim-to-Real search stages with an automated VLM feedback loop that interprets visual discrepancies between simulated and real swimming to tune 16 simulation parameters. This enables zero-shot transfer for soft aquatic robots, a setting where simplified fluid models leave a persistent sim-to-real gap.
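The loop described above can be pictured with a minimal sketch. This is not the authors' pipeline: `simulate`, `stub_feedback`, and `calibrate` are hypothetical names, the simulator is a toy whose observation is simply its parameter vector, and a numeric comparison stands in for the VLM, which in the real system would judge videos. Only the parameter count (16) comes from the paper.

```python
import random

N_PARAMS = 16  # the paper's simulator has 16 parameters; everything else is assumed


def simulate(params):
    """Hypothetical toy simulator: the 'observation' is the parameter vector itself."""
    return list(params)


def stub_feedback(sim_obs, real_obs, tol=1e-3):
    """Stand-in for the VLM (assumption): returns one hint per parameter,
    +1 to increase, -1 to decrease, 0 when sim and real already agree."""
    hints = []
    for s, r in zip(sim_obs, real_obs):
        if abs(s - r) <= tol:
            hints.append(0)
        else:
            hints.append(1 if s < r else -1)
    return hints


def calibrate(real_obs, initial, rounds=40, step=0.5, decay=0.8):
    """Closed loop: simulate, ask for feedback, nudge parameters, repeat."""
    params = list(initial)
    for _ in range(rounds):
        hints = stub_feedback(simulate(params), real_obs)
        if not any(hints):
            break  # feedback reports no remaining discrepancy
        params = [p + step * h for p, h in zip(params, hints)]
        step *= decay  # shrink the update so the loop settles
    return params


def max_error(a, b):
    """Largest per-parameter gap between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


rng = random.Random(0)
real = [rng.random() for _ in range(N_PARAMS)]  # synthetic "real" observation
start = [0.5] * N_PARAMS
fitted = calibrate(real, start)
print(max_error(start, real), max_error(fitted, real))
```

The point of the sketch is the structure, not the update rule: the feedback source only reports the direction of each discrepancy, and the loop has no hand-designed search stages. How the actual system turns VLM output into parameter updates is not stated in the excerpt.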

From the abstract

We present Swim2Real, a pipeline that calibrates a 16-parameter robotic fish simulator from swimming videos using vision-language model (VLM) feedback, requiring no hand-designed search stages. Calibrating soft aquatic robots is particularly challenging because nonlinear fluid-structure coupling makes the parameter landscape chaotic, simplified fluid models introduce a persistent sim-to-real gap, and controlled aquatic experiments are difficult to reproduce. Prior work on this platform required