SpatialMosaic: A Dataset for Multi-View Spatial Reasoning with Partial Visibility
Analysis
Key Takeaways
- Addresses the limitations of existing MLLMs in handling partial visibility and occlusion.
- Introduces a new dataset (SpatialMosaic) and benchmark (SpatialMosaic-Bench) for multi-view spatial reasoning.
- Proposes a hybrid framework (SpatialMosaicVLM) to integrate 3D reconstruction models.
- Focuses on scalability and real-world applicability.
“The paper introduces SpatialMosaic, a comprehensive instruction-tuning dataset featuring 2M QA pairs, and SpatialMosaic-Bench, a challenging benchmark for evaluating multi-view spatial reasoning under realistic and challenging scenarios, consisting of 1M QA pairs across 6 tasks.”