Learning Surgical Robot Policies from Videos via World Modeling

Published: Dec 29, 2025 03:03
ArXiv

Analysis

This paper addresses data scarcity in surgical robotics by combining unlabeled surgical videos with world modeling. It introduces SurgWorld, a world model for surgical physical AI, and uses it to generate synthetic paired video-action data. Surgical vision-language-action (VLA) policies trained with this augmented data outperform models trained only on real demonstrations when evaluated on a real surgical robot platform, which suggests a scalable path toward autonomous surgical skill acquisition.
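The augmentation idea can be illustrated with a minimal sketch of mixing real demonstrations with synthetic video-action pairs before policy training. This summary does not say how the paper combines the two sources, so everything here is an assumption made for illustration: the `Episode` container, the `mix_datasets` helper, and the `synthetic_ratio` parameter are hypothetical names, not part of SurgWorld.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Episode:
    """One paired video-action sample (hypothetical container, not from the paper)."""
    video_id: str
    actions: Sequence[Sequence[float]]  # one action vector per video frame
    synthetic: bool                     # True if produced by a world model


def mix_datasets(
    real: Sequence[Episode],
    synthetic: Sequence[Episode],
    synthetic_ratio: float,
    seed: int = 0,
) -> List[Episode]:
    """Keep every real episode and add synthetic ones (assumed mixing scheme).

    Synthetic episodes are sampled without replacement so that they make up
    roughly `synthetic_ratio` of the result, capped by how many are available.
    """
    if not 0.0 <= synthetic_ratio < 1.0:
        raise ValueError("synthetic_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_syn / (n_real + n_syn) = synthetic_ratio for n_syn.
    wanted = round(len(real) * synthetic_ratio / (1.0 - synthetic_ratio))
    n_syn = min(wanted, len(synthetic))
    mixed = list(real) + rng.sample(list(synthetic), n_syn)
    rng.shuffle(mixed)
    return mixed
```

Calling `mix_datasets(real, synthetic, synthetic_ratio=0.5)` would return a shuffled list with as many synthetic episodes as real ones, provided enough synthetic episodes exist. Keeping all real episodes and treating the synthetic share as a tunable parameter is a design choice of this sketch, not a reported detail of the paper.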

Reference

“We demonstrate that a surgical VLA policy trained with these augmented data significantly outperforms models trained only on real demonstrations on a real surgical robot platform.”