Universal Targeted Attack on Audio-Language Models
Analysis
Key Takeaways
- Identifies a vulnerability in audio-language models at the encoder level.
- Proposes a universal, targeted, latent-space attack.
- Attack generalizes across inputs and speakers.
- Demonstrates high attack success rates with minimal distortion.
- Highlights a previously underexplored attack surface.
“The paper demonstrates consistently high attack success rates with minimal perceptual distortion, revealing a critical and previously underexplored attack surface at the encoder level of multimodal systems.”
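To make the idea of a universal, targeted, latent-space perturbation more concrete, here is a minimal toy sketch. It is not the paper's method: the encoder is assumed to be a plain linear map `W`, the distortion budget is assumed to be an L-infinity bound `eps`, and every name (`latent_loss`, `universal_perturbation`, and so on) is hypothetical. A single perturbation `delta` is shared by all inputs and is optimized so that the encoded, perturbed inputs move toward one chosen target latent.

```python
def matvec(W, v):
    """Apply the toy linear 'encoder' W (list of rows) to vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sign(g):
    """Return -1, 0, or 1 depending on the sign of g."""
    return (g > 0) - (g < 0)

def latent_loss(W, inputs, delta, target):
    """Mean squared distance between encoded perturbed inputs and the target latent."""
    total = 0.0
    for x in inputs:
        z = matvec(W, [xi + di for xi, di in zip(x, delta)])
        total += sum((zi - ti) ** 2 for zi, ti in zip(z, target))
    return total / len(inputs)

def loss_grad(W, inputs, delta, target):
    """Gradient of latent_loss with respect to delta (exact for a linear encoder)."""
    n = len(delta)
    grad = [0.0] * n
    for x in inputs:
        z = matvec(W, [xi + di for xi, di in zip(x, delta)])
        resid = [zi - ti for zi, ti in zip(z, target)]
        for j in range(n):
            grad[j] += 2.0 * sum(W[k][j] * resid[k] for k in range(len(W)))
    return [g / len(inputs) for g in grad]

def universal_perturbation(W, inputs, target, eps=0.05, step=0.005, iters=200):
    """Signed-gradient descent on one shared delta, clipped to |delta_j| <= eps."""
    delta = [0.0] * len(inputs[0])
    for _ in range(iters):
        grad = loss_grad(W, inputs, delta, target)
        # Step against the gradient sign, then project back into the eps-ball.
        delta = [max(-eps, min(eps, d - step * sign(g)))
                 for d, g in zip(delta, grad)]
    return delta

if __name__ == "__main__":
    # Assumed toy data: two 3-sample "waveforms" and a 2-dimensional latent space.
    W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    inputs = [[0.1, 0.2, 0.0], [-0.1, 0.0, 0.3]]
    target = [1.0, -1.0]
    delta = universal_perturbation(W, inputs, target)
    print("loss before:", latent_loss(W, inputs, [0.0] * 3, target))
    print("loss after: ", latent_loss(W, inputs, delta, target))
```

In this sketch "universal" means the same `delta` is reused for every input, "targeted" means the loss pulls toward one specific latent rather than merely away from the clean encoding, and the `eps` clip stands in for a perceptual distortion constraint. A real audio encoder is nonlinear, so the gradient would come from automatic differentiation rather than the closed form used here.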