Instant 3D Scene Editing from Unposed Images
Published: Dec 31, 2025 18:59 • 1 min read • ArXiv
Analysis
This paper introduces Edit3r, a feed-forward framework for fast, photorealistic 3D scene editing directly from unposed, view-inconsistent images. Its key innovation is that it bypasses per-scene optimization and pose estimation, which enables real-time performance. To train on edited images that are inconsistent across views, the authors use a SAM2-based recoloring strategy and an asymmetric input strategy. The paper also introduces DL3DV-Edit-Bench, a benchmark for evaluating this kind of editing. The main practical benefit is speed: removing the optimization step makes 3D scene editing considerably faster than existing methods, and therefore more accessible and practical.
Key Takeaways
- Edit3r is a feed-forward framework for instant 3D scene editing.
- It works directly from unposed, view-inconsistent images.
- It avoids per-scene optimization and pose estimation, enabling fast rendering.
- It uses a SAM2-based recoloring strategy and an asymmetric input strategy for training.
- The paper introduces DL3DV-Edit-Bench for evaluation.
Reference
“Edit3r directly predicts instruction-aligned 3D edits, enabling fast and photorealistic rendering without optimization or pose estimation.”