Adversarial Attacks on Text-to-Video Models
Analysis
Key Takeaways
- Introduces T2VAttack, a framework for adversarial attacks on Text-to-Video (T2V) models.
- Focuses on both semantic and temporal aspects of video generation.
- Proposes two attack methods: T2VAttack-S (synonym substitution) and T2VAttack-I (word insertion).
- Evaluates the adversarial robustness of several state-of-the-art T2V models.
- Demonstrates that even small prompt modifications can significantly degrade video quality.
“Even minor prompt modifications, such as the substitution or insertion of a single word, can cause substantial degradation in semantic fidelity and temporal dynamics, highlighting critical vulnerabilities in current T2V diffusion models.”
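To make the two perturbation types concrete, the following is a minimal sketch of how single-word substitution and insertion candidates could be enumerated for a prompt. It is not the paper's implementation: the synonym table, the insertion word list, and the names `substitution_candidates`, `insertion_candidates`, and `pick_most_damaging` are all hypothetical, and the `score` callback is only a stand-in for whatever semantic or temporal quality metric an attack would try to lower.

```python
from typing import Callable, Dict, List

# Hypothetical toy synonym table (assumption); a real attack would draw on
# a proper lexical resource or a language model.
SYNONYMS: Dict[str, List[str]] = {
    "dog": ["hound", "canine"],
    "runs": ["sprints", "jogs"],
}

# Hypothetical pool of words to try inserting (assumption).
INSERT_WORDS: List[str] = ["slowly", "static"]


def substitution_candidates(prompt: str) -> List[str]:
    """Return every prompt that differs from `prompt` by one synonym swap."""
    words = prompt.split()
    candidates = []
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word.lower(), []):
            candidates.append(" ".join(words[:i] + [synonym] + words[i + 1:]))
    return candidates


def insertion_candidates(prompt: str) -> List[str]:
    """Return every prompt that differs from `prompt` by one inserted word."""
    words = prompt.split()
    candidates = []
    for i in range(len(words) + 1):  # every gap, including both ends
        for extra in INSERT_WORDS:
            candidates.append(" ".join(words[:i] + [extra] + words[i:]))
    return candidates


def pick_most_damaging(candidates: List[str],
                       score: Callable[[str], float]) -> str:
    """Return the candidate with the lowest score.

    `score` is a hypothetical stand-in for a video-quality metric computed
    on the video a T2V model generates from the candidate prompt.
    """
    return min(candidates, key=score)
```

In practice the expensive step is `score`, since each candidate prompt would require generating and evaluating a video; the enumeration above only illustrates how small the search space of single-word edits can be.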