Multimodal Concept Erasure Benchmark for Diffusion Models

Published:Dec 28, 2025 10:58
1 min read
ArXiv

Analysis

This paper introduces M-ErasureBench, a novel benchmark for evaluating concept erasure methods in diffusion models across multiple input modalities (text, embeddings, latents). It highlights the limitations of existing methods, particularly when dealing with modalities beyond text prompts, and proposes a new method, IRECE, to improve robustness. The work is significant because it addresses a critical vulnerability in generative models related to harmful content generation and copyright infringement, offering a more comprehensive evaluation framework and a practical solution.

Reference

Existing methods achieve strong erasure performance against text prompts but largely fail under learned embeddings and inverted latents, with Concept Reproduction Rate (CRR) exceeding 90% in the white-box setting.