Real-Time Interactive Human Avatars with Streaming Diffusion Models
Published: Dec 26, 2025 15:41
1 min read · ArXiv
Analysis
This paper addresses the challenge of creating real-time, interactive human avatars, a crucial area in digital human research. Existing diffusion-based methods are computationally expensive and unsuitable for streaming, while current interactive approaches remain limited in scope. The proposed two-stage framework combines autoregressive adaptation with acceleration, and introduces components such as a Reference Sink and a Consistency-Aware Discriminator, to generate high-fidelity avatars with natural gestures and behaviors in real time. The paper's significance lies in its potential to enable more engaging and realistic digital human interaction.
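The streaming idea behind such a framework can be sketched in toy form: video latents are generated chunk by chunk, and each chunk conditions on a fixed reference embedding that is never evicted from the context (the role a "Reference Sink" plausibly plays) plus a sliding window of recently generated frames. Everything below — the function name, the averaging "denoiser", and the shapes — is an illustrative stand-in, not the paper's actual model:

```python
import numpy as np

def stream_avatar_chunks(reference, audio_frames, chunk_size=4, context_size=8):
    """Toy autoregressive streaming loop (illustrative only).

    Generates latent frames chunk by chunk; each chunk conditions on the
    fixed reference (the 'sink', never evicted) plus a sliding window of
    the most recent generated frames.
    """
    generated = []
    for start in range(0, len(audio_frames), chunk_size):
        chunk_cond = audio_frames[start:start + chunk_size]
        # Context = reference sink (always present) + recent frames (sliding window)
        context = [reference] + generated[-context_size:]
        context_mean = np.mean(context, axis=0)
        # Stand-in for an accelerated few-step diffusion denoiser
        new_frames = [0.5 * context_mean + 0.5 * cond for cond in chunk_cond]
        generated.extend(new_frames)
    return generated

# Example: 10 conditioning frames streamed in chunks of 4
ref = np.ones(3)
audio = [np.full(3, float(i)) for i in range(10)]
out = stream_avatar_chunks(ref, audio)
```

Because the reference is re-inserted into every chunk's context, long-horizon drift away from the subject's identity is bounded — a plausible motivation for keeping a dedicated sink rather than relying on the sliding window alone.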
Key Takeaways
- Addresses the limitations of existing diffusion-based avatar generation methods for real-time applications.
- Proposes a novel two-stage framework for efficient and high-quality avatar generation.
- Introduces key components like Reference Sink and Consistency-Aware Discriminator to ensure stability and consistency.
- Achieves state-of-the-art performance in generation quality, real-time efficiency, and interaction naturalness.
Reference
“The paper proposes a two-stage autoregressive adaptation and acceleration framework to adapt a high-fidelity human video diffusion model for real-time, interactive streaming.”