Research Paper • #Computer Vision, Image Processing, Intrinsic Image Decomposition, Transformers • 🔬 Research • Analyzed: Jan 3, 2026 16:01
IDT: Multi-View Intrinsic Decomposition with a Physically Grounded Transformer
Published: Dec 29, 2025 18:24 • ArXiv
Analysis
This paper introduces IDT, a feed-forward, transformer-based framework for multi-view intrinsic image decomposition. Existing methods suffer from inconsistency across views; IDT addresses this by reasoning jointly over multiple input images. Its key contribution is a physically grounded image formation model that decomposes each image into diffuse reflectance, diffuse shading, and specular shading, which makes the decomposition interpretable and controllable. The two main advances are this structured factorization of light transport and the focus on multi-view consistency.
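To make the three-factor model concrete, here is a minimal sketch of how such factors are commonly recombined into an image. It assumes the widely used form image = diffuse reflectance × diffuse shading + specular shading, applied per pixel and per channel; the summary does not state the paper's exact equation, so treat this as an illustrative assumption. The names `recompose_pixel` and `recompose` are hypothetical and are not taken from the IDT code.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # linear RGB values


def recompose_pixel(reflectance: Pixel, diffuse: Pixel, specular: Pixel) -> Pixel:
    """Assumed model: reflectance * diffuse shading + specular shading, per channel."""
    r, g, b = (a * d + s for a, d, s in zip(reflectance, diffuse, specular))
    return (r, g, b)


def recompose(reflectance: List[Pixel], diffuse: List[Pixel],
              specular: List[Pixel]) -> List[Pixel]:
    """Recombine three factor maps, each given as a flat list of pixels."""
    return [recompose_pixel(a, d, s)
            for a, d, s in zip(reflectance, diffuse, specular)]


# Toy two-pixel "image": grey surface, uniform lighting, one highlight.
reflectance = [(0.5, 0.5, 0.5), (0.5, 0.5, 0.5)]
diffuse = [(0.8, 0.8, 0.8), (0.8, 0.8, 0.8)]
specular = [(0.3, 0.3, 0.3), (0.0, 0.0, 0.0)]

image = recompose(reflectance, diffuse, specular)

# Controllable edit: drop the specular layer to get a matte rendering.
no_specular = [(0.0, 0.0, 0.0)] * len(specular)
matte = recompose(reflectance, diffuse, no_specular)

# Controllable edit: recolor the surface while keeping the lighting.
red_surface = [(0.9, 0.1, 0.1)] * len(reflectance)
recolored = recompose(red_surface, diffuse, specular)
```

Because each factor enters the sum separately, editing one of them (removing highlights, recoloring the surface) leaves the others untouched, which is the sense in which such a decomposition is controllable.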
Key Takeaways
- Proposes IDT, a feed-forward transformer for multi-view intrinsic image decomposition.
- Employs a physically grounded image formation model for interpretable decomposition.
- Achieves improved multi-view consistency compared to prior methods.
- Decomposes images into diffuse reflectance, diffuse shading, and specular shading.
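One simple way to think about multi-view consistency is to check how much the predicted reflectance of the same surface point varies from view to view. The sketch below is not the paper's evaluation metric. It assumes that point correspondences across views are already known and that reflectance is reduced to a single grey value per point; `cross_view_inconsistency` is a hypothetical helper.

```python
from statistics import pstdev
from typing import List


def cross_view_inconsistency(values_per_view: List[List[float]]) -> float:
    """Mean per-point standard deviation of reflectance across views.

    values_per_view[v][n] is the reflectance predicted in view v for
    surface point n (correspondences are assumed to be given).
    A value of 0.0 means every view agrees on every point.
    """
    per_point = list(zip(*values_per_view))  # regroup values by surface point
    spreads = [pstdev(point_values) for point_values in per_point]
    return sum(spreads) / len(spreads)


# Three views that agree on all three surface points.
consistent = [
    [0.2, 0.5, 0.8],
    [0.2, 0.5, 0.8],
    [0.2, 0.5, 0.8],
]

# Three views whose predictions drift by +0.1 per view.
drifting = [
    [0.2, 0.5, 0.8],
    [0.3, 0.6, 0.9],
    [0.4, 0.7, 1.0],
]

score_consistent = cross_view_inconsistency(consistent)
score_drifting = cross_view_inconsistency(drifting)
```

A method that processes each view independently has nothing that ties these values together, whereas a model that reasons over all views jointly can be expected to drive such a score toward zero.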
Reference
“IDT produces view-consistent intrinsic factors in a single forward pass, without iterative generative sampling.”