AI & ML New Capability

MotionAnymesh automatically transforms static 3D meshes into simulation-ready, articulated digital twins for robotics using vision-language models grounded in physical priors.

March 16, 2026

Original Paper

MotionAnymesh: Physics-Grounded Articulation for Simulation-Ready Digital Twins

WenBo Xu, Liu Liu, Li Zhang, Dan Guo, RuoNan Liu

arXiv · 2603.12936

The Takeaway

Converting static assets into interactable ones is a major bottleneck in robotics simulation; this zero-shot framework eliminates manual rigging and avoids mesh inter-penetration during physical simulation, accelerating the creation of training environments for embodied AI.

From the abstract

Converting static 3D meshes into interactable articulated assets is crucial for embodied AI and robotic simulation. However, existing zero-shot pipelines struggle with complex assets due to a critical lack of physical grounding. Specifically, ungrounded Vision-Language Models (VLMs) frequently suffer from kinematic hallucinations, while unconstrained joint estimation inevitably leads to catastrophic mesh inter-penetration during physical simulation. To bridge this gap, we propose MotionAnymesh, …