F2IDiff: Super-resolution with Feature-to-Image Diffusion

Paper | Tags: Image Super-Resolution, Diffusion Models, Computer Vision | Research | Analyzed: Jan 3, 2026 09:26

Published: Dec 30, 2025 21:37


Analysis

This paper addresses the limitations of text-to-image diffusion models when they are applied to single image super-resolution (SISR) in real-world settings, particularly smartphone photography. It highlights hallucinations as a key problem and argues that SISR needs more precise conditioning features than text provides. The core contribution is F2IDiff, a foundation model (FM) conditioned on lower-level DINOv2 features, which aims to improve SISR performance while minimizing undesirable artifacts.
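To make the idea of feature conditioning concrete, here is a minimal toy sketch in plain Python. It is not the paper's architecture: the summary does not say how F2IDiff injects the features, so cross-attention is assumed here only because it is a common conditioning mechanism in diffusion models. The names `cross_attention` and `conditioned_denoise_step` are hypothetical, the feature tokens are random stand-ins for DINOv2 patch features, and learned projection weights are omitted.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Each query token attends over the context tokens.

    queries: noisy latent tokens of the image being restored.
    context: conditioning tokens (assumed stand-ins for image-encoder
             patch features; a real model would use learned projections).
    """
    d = len(context[0])
    out = []
    for q in queries:
        # Scaled dot-product score between this query and every context token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in context]
        weights = softmax(scores)
        # Attention output: weighted average of the context tokens.
        out.append([sum(w * k[j] for w, k in zip(weights, context))
                    for j in range(d)])
    return out


def conditioned_denoise_step(latent_tokens, feature_tokens, step_size=0.1):
    """Toy update: move each latent token toward its attended feature mix."""
    attended = cross_attention(latent_tokens, feature_tokens)
    return [[x + step_size * (a - x) for x, a in zip(tok, att)]
            for tok, att in zip(latent_tokens, attended)]


rng = random.Random(0)
dim = 4
# Hypothetical data: 6 conditioning tokens and 3 latent tokens of width 4.
feature_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
latent_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]

updated = conditioned_denoise_step(latent_tokens, feature_tokens)
print(len(updated), len(updated[0]))  # prints "3 4"
```

The point of the sketch is only that the conditioning signal consists of spatially local image features rather than a text embedding, so the update at each latent token is driven by content taken from the input image itself.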

Key Takeaways

• Proposes F2IDiff, a novel SISR approach using DINOv2 features for improved conditioning.
• Addresses the limitations of using text-based features in SISR for high-fidelity images.
• Aims to reduce hallucinations and improve the quality of super-resolved images in real-world scenarios, especially for smartphone photography.

Reference / Citation

"The paper introduces an SISR network built on a FM with lower-level feature conditioning, specifically DINOv2 features, which we call a Feature-to-Image Diffusion (F2IDiff) Foundation Model (FM)."

ArXiv, Dec 30, 2025 21:37

* Cited for critical analysis under Article 32.
