Adversarial Attacks on Text-to-Video Models

Published:Dec 30, 2025 03:00
1 min read
ArXiv

Analysis

This paper addresses a critical, yet under-explored, area of research: the adversarial robustness of Text-to-Video (T2V) diffusion models. It introduces a novel framework, T2VAttack, to evaluate and expose vulnerabilities in these models. The focus on both semantic and temporal aspects, along with the proposed attack methods (T2VAttack-S and T2VAttack-I), provides a comprehensive approach to understanding and mitigating these vulnerabilities. The evaluation on multiple state-of-the-art models is crucial for demonstrating the practical implications of the findings.

Reference

Even minor prompt modifications, such as the substitution or insertion of a single word, can cause substantial degradation in semantic fidelity and temporal dynamics, highlighting critical vulnerabilities in current T2V diffusion models.