Research | Generative Models | Blog | Analyzed: Dec 29, 2025 01:43

Paper Reading: Back to Basics - Let Denoising Generative

Published: Nov 26, 2025 06:37
1 min read
Zenn CV

Analysis

This article discusses a research paper by Tianhong Li and Kaiming He on the difficulty of building self-contained diffusion models that operate directly in pixel space, where predicting noise means predicting a high-dimensional target. The authors propose having the network predict the clean image itself, on the premise that natural images lie on a low-dimensional manifold while noise does not. They report that directly predicting images in high-dimensional space and then compressing them to lower dimensions leads to improved accuracy. The motivation stems from two limitations of current diffusion models: their reliance on the latent space provided by a VAE, and their prediction of noise or flow at each time step.
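The contrast between predicting noise and predicting the image can be made concrete with a toy NumPy sketch. It is an illustration, not the authors' implementation. It assumes a linear interpolation z_t = t * x + (1 - t) * eps between a clean sample x and Gaussian noise eps, a convention chosen here for simplicity, and it uses a random linear subspace as a stand-in for the low-dimensional manifold. All names (`make_noisy`, `eps_from_x`, `v_from_x`, `subspace_energy`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 256, 4  # ambient dimension and assumed "manifold" dimension (toy values)

# Orthonormal basis of a random d-dimensional subspace of R^D.
Q, _ = np.linalg.qr(rng.standard_normal((D, d)))

x = Q @ rng.standard_normal(d)   # clean sample: lies entirely in the subspace
eps = rng.standard_normal(D)     # Gaussian noise: spread over all D dimensions


def subspace_energy(v):
    """Fraction of the squared norm of v that lies inside the subspace."""
    return float(np.sum((Q.T @ v) ** 2) / np.sum(v ** 2))


def make_noisy(x, eps, t):
    """Assumed interpolation: t=1 is the clean sample, t=0 is pure noise."""
    return t * x + (1.0 - t) * eps


def eps_from_x(z_t, x_pred, t):
    """Noise implied by an image prediction under the assumed interpolation."""
    return (z_t - t * x_pred) / (1.0 - t)


def v_from_x(z_t, x_pred, t):
    """Flow (velocity) x - eps implied by an image prediction."""
    return (x_pred - z_t) / (1.0 - t)


t = 0.3
z_t = make_noisy(x, eps, t)

# With a perfect image prediction, noise and flow are both recoverable,
# so the three targets carry the same information given z_t and t.
eps_rec = eps_from_x(z_t, x, t)
v_rec = v_from_x(z_t, x, t)

frac_x = subspace_energy(x)      # the image target is fully low-dimensional
frac_eps = subspace_energy(eps)  # the noise target is mostly off-subspace
print(f"energy in subspace: image={frac_x:.3f}, noise={frac_eps:.3f}")
```

In this toy setting the three targets are interchangeable once z_t and t are known, but only the image target is confined to the low-dimensional subspace. The noise and flow targets occupy the full ambient space, which is the property the summary above points to as the source of difficulty in pixel space.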
