Retriever Backdoors Pose a Practical Threat to Code Generation

Published:Dec 25, 2025 13:53
1 min read
ArXiv

Analysis

This paper highlights a critical and previously underexplored security vulnerability in Retrieval-Augmented Code Generation (RACG) systems. It introduces a novel and stealthy backdoor attack targeting the retriever component, demonstrating that existing defenses are insufficient. The research reveals a significant risk of generating vulnerable code, emphasizing the need for robust security measures in software development.

Reference

By injecting vulnerable code equivalent to only 0.05% of the entire knowledge base size, an attacker can successfully manipulate the backdoored retriever to rank the vulnerable code in its top-5 results in 51.29% of cases.