AI & ML Paradigm Shift

ICPRL enables vision-language models to acquire physical intuition and adapt their policies in-context through trial-and-error interaction.

March 17, 2026

Original Paper

ICPRL: Acquiring Physical Intuition from Interactive Control

Xinrun Xu, Pi Bu, Ye Wang, Börje F. Karlsson, Ziming Wang, Tengtao Song, Qi Zhu, Jun Song, Shuo Zhang, Zhiming Ding, Bo Zheng

arXiv · 2603.13295

The Takeaway

ICPRL moves physical reasoning from static perception to active, vision-grounded reinforcement learning that requires no weight updates. This allows robots or agents to adapt to novel physical puzzles and environments purely through their interaction history.

From the abstract

VLMs excel at static perception but falter at interactive reasoning in dynamic physical environments, which demands planning and adaptation to changing outcomes. Existing physical reasoning methods often depend on abstract symbolic inputs or lack the ability to learn and adapt from direct, pixel-based visual interaction in novel scenarios. We introduce ICPRL (In-Context Physical Reinforcement Learning), a framework inspired by In-Context Reinforcement Learning (ICRL) that empowers VLMs to acquire physical intuition and adapt their policies in-context through trial-and-error interaction.
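To make the in-context adaptation idea concrete, here is a minimal sketch of an ICRL-style loop under stated assumptions: the frozen model is replaced by a hypothetical `stub_vlm_policy` stand-in (not the paper's actual VLM or prompt format), and the "environment" is a toy one-step puzzle. The key property illustrated is that learning happens only by appending transitions to the context, never by updating weights.

```python
def stub_vlm_policy(context, actions):
    """Hypothetical stand-in for a frozen VLM conditioned on its prompt.

    It reads the in-context interaction history, tries each untried action
    once, then greedily picks the action with the best mean reward so far.
    No parameters are ever updated; all adaptation lives in `context`.
    """
    stats = {}
    for _obs, act, rew in context:
        stats.setdefault(act, []).append(rew)
    untried = [a for a in actions if a not in stats]
    if untried:
        return untried[0]  # explore actions the history hasn't covered
    return max(actions, key=lambda a: sum(stats[a]) / len(stats[a]))

def run_in_context_rl(env_step, actions, trials=10):
    """Trial-and-error loop: act, observe reward, append to the context."""
    context = []  # interaction history kept in the prompt, not in weights
    for _ in range(trials):
        obs = "puzzle_state"  # placeholder for a pixel-based observation
        act = stub_vlm_policy(context, actions)
        rew = env_step(act)
        context.append((obs, act, rew))
    return context

# Toy physical puzzle: only "push_left" moves the object (reward 1.0).
rewards = {"push_left": 1.0, "push_right": 0.0, "lift": 0.0}
history = run_in_context_rl(lambda a: rewards[a], list(rewards))
```

After the three exploratory trials, every later transition in `history` selects `push_left`, showing the policy adapting purely from accumulated interaction history.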