Internal Guidance for Diffusion Transformers
Analysis
Key Takeaways
- Proposes Internal Guidance (IG) as a novel method for improving diffusion model image generation.
- IG uses auxiliary supervision during training and extrapolates intermediate layer outputs during sampling.
- Achieves state-of-the-art FID scores on ImageNet 256x256, especially when combined with classifier-free guidance (CFG).
- Demonstrates improved training efficiency and generation quality compared to existing methods.
“LightningDiT-XL/1+IG achieves FID=1.34 which achieves a large margin between all of these methods. Combined with CFG, LightningDiT-XL/1+IG achieves the current state-of-the-art FID of 1.19.”
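The sampling-time extrapolation mentioned in the takeaways can be illustrated with a minimal sketch. The summary does not give the exact formula, so the form below is an assumption borrowed from how CFG-style guidance is usually written: start from the intermediate-layer prediction and move past the final-layer prediction by a guidance weight `w`. The function name `internal_guidance` and its parameters are hypothetical, not taken from the paper or its code.

```python
import numpy as np


def internal_guidance(intermediate_pred, final_pred, w):
    """Hypothetical sketch of IG-style extrapolation (assumed form, not the paper's code).

    intermediate_pred: denoising prediction read out from an intermediate layer
                       (the head trained with auxiliary supervision).
    final_pred:        prediction from the final layer of the same network.
    w:                 guidance weight; w = 1 returns final_pred unchanged,
                       w > 1 extrapolates away from the intermediate prediction.
    """
    intermediate_pred = np.asarray(intermediate_pred, dtype=float)
    final_pred = np.asarray(final_pred, dtype=float)
    # Assumed CFG-like rule: weak + w * (strong - weak)
    return intermediate_pred + w * (final_pred - intermediate_pred)


# Toy usage with two-element "predictions"
guided = internal_guidance([0.0, 1.0], [1.0, 3.0], w=2.0)
```

Under this assumed rule the weaker intermediate output plays the role that the unconditional branch plays in CFG, which would explain why the two kinds of guidance can be combined, as the quoted result suggests.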