MotionAnymesh automatically transforms static 3D meshes into simulation-ready, articulated digital twins for robotics using vision-language models grounded in physical priors.
March 16, 2026
Original Paper
MotionAnymesh: Physics-Grounded Articulation for Simulation-Ready Digital Twins
arXiv · 2603.12936
The Takeaway
Converting static assets into interactable ones is a massive bottleneck in robotics simulation. This zero-shot framework eliminates manual rigging and avoids mesh inter-penetration during simulation, accelerating the creation of training environments for embodied AI.
From the abstract
Converting static 3D meshes into interactable articulated assets is crucial for embodied AI and robotic simulation. However, existing zero-shot pipelines struggle with complex assets due to a critical lack of physical grounding. Specifically, ungrounded Vision-Language Models (VLMs) frequently suffer from kinematic hallucinations, while unconstrained joint estimation inevitably leads to catastrophic mesh inter-penetration during physical simulation. To bridge this gap, we propose MotionAnymesh,
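To see why an unconstrained joint estimate leads to inter-penetration, consider a hallucinated hinge placed at the wrong pivot. The sketch below is purely illustrative (not the paper's method): a 2D top-down toy with a cabinet body and a door panel, where rotating the door about a wrongly estimated pivot swings part of the panel into the body, while a physically plausible edge pivot swings it clear. The vertex-inside-box test is a deliberate simplification of real collision checking.

```python
import math

def rotate_about_pivot(points, pivot, angle):
    """Rotate 2D points about a pivot (top-down view of a hinge joint)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(pivot[0] + c * (x - pivot[0]) - s * (y - pivot[1]),
             pivot[1] + s * (x - pivot[0]) + c * (y - pivot[1]))
            for x, y in points]

def penetrates(points, box_min, box_max, eps=1e-9):
    """True if any vertex lies strictly inside the static box.
    (A crude stand-in for mesh inter-penetration detection.)"""
    return any(box_min[0] + eps < x < box_max[0] - eps and
               box_min[1] + eps < y < box_max[1] - eps
               for x, y in points)

# Cabinet body occupies [0,1] x [0,1]; the door is a thin panel on its front face.
body_min, body_max = (0.0, 0.0), (1.0, 1.0)
door = [(0.0, 1.0), (1.0, 1.0), (1.0, 1.05), (0.0, 1.05)]

# Hallucinated joint: pivot at the door's center -> half the panel swings into the body.
bad = rotate_about_pivot(door, (0.5, 1.025), math.radians(60))
# Plausible joint: pivot on the door's edge -> the panel swings clear.
good = rotate_about_pivot(door, (0.0, 1.0), math.radians(60))

print(penetrates(bad, body_min, body_max))   # True: inter-penetration
print(penetrates(good, body_min, body_max))  # False
```

A grounded pipeline, in effect, has to rule out articulations like the first one before they ever reach the physics engine.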