Internal Guidance for Diffusion Transformers
Published: Dec 30, 2025 12:16 • 1 min read • ArXiv
Analysis
This paper introduces Internal Guidance (IG), a guidance strategy for diffusion models that aims to improve image generation quality. It targets the limitations of existing guidance methods, both Classifier-Free Guidance (CFG) and approaches that rely on a degraded version of the model. IG adds auxiliary supervision during training and, during sampling, extrapolates from intermediate layer outputs. The paper reports gains in both training efficiency and generation quality on ImageNet 256x256: LightningDiT-XL/1+IG reaches an FID of 1.34, and 1.19 when combined with CFG, which the authors describe as the current state of the art. The method's simplicity is part of its appeal.
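The sampling-time idea can be illustrated with a minimal sketch. The summary does not give the paper's exact formula, so the CFG-style extrapolation below, the function name `internal_guidance`, and the `scale` parameter are all assumptions made for illustration, not the authors' implementation. Plain Python lists stand in for the model's final-layer and intermediate-layer predictions.

```python
def internal_guidance(final_out, intermediate_out, scale):
    """Push the prediction away from the intermediate-layer output.

    ASSUMPTION: a CFG-style rule, intermediate + scale * (final - intermediate).
    scale = 1.0 returns the final output unchanged; scale > 1.0 extrapolates
    past it. The paper's actual rule may differ.
    """
    return [m + scale * (f - m) for f, m in zip(final_out, intermediate_out)]


# Toy stand-ins for two predictions of the same quantity (hypothetical values).
final_out = [1.0, 2.0, 3.0]
intermediate_out = [0.5, 2.0, 2.0]

guided = internal_guidance(final_out, intermediate_out, scale=1.5)
print(guided)  # [1.25, 2.0, 3.5]
```

Under this assumed form, the intermediate output plays the role that the unconditional prediction plays in CFG, which is one way to read why the two kinds of guidance can be combined.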
Key Takeaways
- Proposes Internal Guidance (IG) as a novel method for improving diffusion model image generation.
- IG uses auxiliary supervision during training and extrapolates intermediate layer outputs during sampling.
- Achieves state-of-the-art FID scores on ImageNet 256x256, especially when combined with CFG.
- Demonstrates improved training efficiency and generation quality compared to existing methods.
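The training-side takeaway, auxiliary supervision, might look roughly like the sketch below. The summary does not specify the loss, so the mean-squared-error objective, the `aux_weight` parameter, and the function names are hypothetical choices for illustration only. The point is simply that an intermediate-layer prediction is supervised against the same target as the final output.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def training_loss(final_pred, intermediate_pred, target, aux_weight=0.5):
    """Main loss on the final output plus a weighted auxiliary loss.

    ASSUMPTION: both terms use MSE and are combined with a fixed weight;
    the paper may use a different objective or weighting.
    """
    return mse(final_pred, target) + aux_weight * mse(intermediate_pred, target)


# Hypothetical toy values: the final prediction is exact, the intermediate is not.
target = [1.0, 2.0]
final_pred = [1.0, 2.0]
intermediate_pred = [0.0, 2.0]

loss = training_loss(final_pred, intermediate_pred, target)
print(loss)  # 0.25
```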
Reference
“LightningDiT-XL/1+IG achieves FID=1.34 which achieves a large margin between all of these methods. Combined with CFG, LightningDiT-XL/1+IG achieves the current state-of-the-art FID of 1.19.”