View-on-Graph: Zero-Shot 3D Visual Grounding Using Vision-Language Reasoning
Published: Dec 10, 2025 00:59
•ArXiv
Analysis
The paper appears to present a new approach to 3D visual grounding, the task of locating an object in a 3D scene from a natural-language description, that works without prior training on specific object-scene pairs. This zero-shot capability is based on vision-language reasoning over scene graphs. Because it removes the need for task-specific training data, it would be a significant advance for the field.
Reference
“The core of the research involves zero-shot 3D visual grounding.”