Accelerating Diffusion Transformers with Fidelity Optimization

Published:Dec 29, 2025 07:36
1 min read
ArXiv

Analysis

This paper addresses the slow inference speed of Diffusion Transformers (DiT) in image and video generation. It introduces a novel fidelity-optimization plugin called CEM (Cumulative Error Minimization) to improve the performance of existing acceleration methods. CEM aims to minimize cumulative errors during the denoising process, leading to improved generation fidelity. The method is model-agnostic, easily integrated, and shows strong generalization across various models and tasks. The results demonstrate significant improvements in generation quality, outperforming original models in some cases.

Reference

CEM significantly improves generation fidelity of existing acceleration models, and outperforms the original generation performance on FLUX.1-dev, PixArt-$α$, StableDiffusion1.5 and Hunyuan.