Paper Reading: Back to Basics - Let Denoising Generative
Analysis
This article discusses a research paper by Tianhong Li and Kaiming He on why self-contained generative models are hard to build directly in pixel space: the noise that the network is usually asked to predict is high-dimensional. The authors propose shifting focus to predicting the image itself, leveraging the properties of low-dimensional manifolds. The underlying idea is that natural images lie near a low-dimensional manifold, whereas noise spreads across the full pixel space. They report that predicting the image directly in high-dimensional space, combined with compression to lower dimensions, leads to improved accuracy. The motivation comes from two limitations of current diffusion models: their reliance on a latent space provided by a VAE, and the practice of predicting noise or flow at each time step.
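To make the difference between prediction targets concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a linear interpolation schedule z_t = t * x + (1 - t) * eps, which is a common flow-style convention that this summary does not confirm, and the function names and toy data are hypothetical. It shows that an image prediction can be converted into the equivalent noise or flow prediction, and that data confined to a low-dimensional subspace has much lower rank than noise of the same shape.

```python
import numpy as np


def noisy_sample(x, eps, t):
    # Assumed schedule: t=0 is pure noise, t=1 is the clean image.
    return t * x + (1.0 - t) * eps


def eps_from_x(z_t, x_pred, t):
    # Noise implied by an image prediction under the assumed schedule.
    return (z_t - t * x_pred) / (1.0 - t)


def v_from_x(z_t, x_pred, t):
    # Flow (velocity) x - eps implied by the same image prediction.
    return (x_pred - z_t) / (1.0 - t)


rng = np.random.default_rng(0)

# Toy stand-in for images: 64 samples in 16 dimensions that all lie on a
# 2-dimensional linear subspace (a crude proxy for a low-dimensional manifold).
basis = rng.standard_normal((2, 16))
x = rng.standard_normal((64, 2)) @ basis
eps = rng.standard_normal(x.shape)  # noise is not confined to that subspace

t = 0.3
z_t = noisy_sample(x, eps, t)

# With a perfect image prediction, the other two targets follow algebraically.
eps_hat = eps_from_x(z_t, x, t)
v_hat = v_from_x(z_t, x, t)

# Compare how many dimensions the data and the noise actually occupy.
rank_x = np.linalg.matrix_rank(x)
rank_eps = np.linalg.matrix_rank(eps)
print("rank of data:", rank_x, "| rank of noise:", rank_eps)
```

On paper the three targets are interchangeable. The point the summary makes is about which quantity the network is asked to output: the image target stays on the low-dimensional structure, while the noise and flow targets do not.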
Key Takeaways
- The research explores an alternative approach to generative modeling by directly predicting images.
- The study highlights the challenges of high-dimensional noise prediction in pixel space.
- The findings suggest that compressing high-dimensional image predictions to lower dimensions can improve accuracy.