SE360: Semantic Edit in 360° Panoramas via Hierarchical Data Construction
Published: Dec 24, 2025 05:00 · 1 min read · ArXiv Vision
Analysis
This paper introduces SE360, a framework for semantic editing of 360° panoramas. The core innovation is an autonomous data generation pipeline that leverages a Vision-Language Model (VLM) and adaptive projection adjustment to create semantically meaningful, geometrically consistent editing pairs from unlabeled panoramas. A two-stage data refinement strategy further improves realism and reduces overfitting. By outperforming existing approaches in both visual quality and semantic accuracy, SE360 marks a significant advance in instruction-based editing for panoramic images. A Transformer-based diffusion model trained on the constructed dataset supports flexible object editing guided by text, a mask, or a reference image, making it a versatile tool for panorama manipulation.
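The paper's exact adaptive projection scheme is not reproduced here, but pipelines of this kind typically rely on reprojecting equirectangular panoramas into distortion-free perspective views for per-region processing. As a rough, hedged illustration only (the function name, parameters, and nearest-neighbor sampling are our own choices, not SE360's), the sketch below extracts a gnomonic perspective view from an equirectangular image with NumPy:

```python
import numpy as np

def perspective_from_equirect(pano, fov_deg=90.0, yaw_deg=0.0,
                              pitch_deg=0.0, out_hw=(256, 256)):
    """Sample a perspective (gnomonic) view from an equirectangular panorama.

    pano: (H, W, C) equirectangular image; yaw/pitch select the view center.
    Illustrative sketch only; uses nearest-neighbor sampling for brevity.
    """
    H, W = pano.shape[:2]
    h, w = out_hw
    f = 0.5 * w / np.tan(np.radians(fov_deg) / 2)  # focal length in pixels

    # Unit ray through each output pixel in camera coordinates (z forward).
    xs = np.arange(w) - (w - 1) / 2
    ys = np.arange(h) - (h - 1) / 2
    x, y = np.meshgrid(xs, ys)
    dirs = np.stack([x, y, np.full_like(x, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate rays by pitch (about x), then yaw (about y).
    p, q = np.radians(pitch_deg), np.radians(yaw_deg)
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(p), -np.sin(p)],
                   [0, np.sin(p),  np.cos(p)]])
    Ry = np.array([[ np.cos(q), 0, np.sin(q)],
                   [0, 1, 0],
                   [-np.sin(q), 0, np.cos(q)]])
    dirs = dirs @ (Ry @ Rx).T

    # Convert ray directions to equirectangular pixel coordinates.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])        # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))       # [-pi/2, pi/2]
    u = ((lon / np.pi + 1) / 2 * (W - 1)).astype(int)
    v = ((lat / (np.pi / 2) + 1) / 2 * (H - 1)).astype(int)
    return pano[np.clip(v, 0, H - 1), np.clip(u, 0, W - 1)]
```

Inverting this mapping (pasting an edited perspective crop back into the panorama) is what would keep locally edited content geometrically consistent with the surrounding 360° image.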
Key Takeaways
- Introduces SE360, a framework for semantic editing of 360° panoramas.
- Employs an autonomous data generation pipeline using a VLM and adaptive projection.
- Achieves improved visual quality and semantic accuracy compared to existing methods.
Reference
“At its core is a novel coarse-to-fine autonomous data generation pipeline without manual intervention.”