RAG Security Risks — Retrieval-Augmented Generation Vulnerabilities
Retrieval-Augmented Generation (RAG) introduces unique security risks because it connects LLMs to external data sources. Attackers can poison the knowledge base with malicious content, use indirect prompt injection via retrieved documents, exfiltrate data through crafted queries, or manipulate retrieval rankings to surface attacker-controlled information.
Attack Vectors
Knowledge base poisoning: If the RAG system indexes user-generated content, emails, or web pages, attackers can inject content designed to manipulate LLM responses when retrieved. Indirect prompt injection: Hidden instructions in documents that activate when the document is retrieved and fed to the LLM. Data exfiltration: Crafted queries designed to retrieve and expose sensitive documents from the knowledge base. Retrieval manipulation: SEO-like techniques to ensure attacker-controlled documents rank highest for specific queries.
Common RAG Vulnerabilities
No access control on retrieved documents: The RAG system retrieves documents the user should not have access to. Mixing trusted and untrusted sources: User-uploaded documents are treated with the same trust level as curated knowledge. No output filtering: Retrieved sensitive data passes through to the LLM response without redaction. Embedding injection: Adversarial text designed to have high similarity to target queries in the embedding space.
Defense Strategies
Implement document-level access control that mirrors your existing permissions model. Separate trusted (curated) and untrusted (user-uploaded) document collections. Sanitize retrieved documents before injecting into the LLM prompt. Use output filtering to detect and redact sensitive data patterns. Monitor retrieval patterns for anomalies. Apply the principle of least privilege — only retrieve documents relevant to the user's role.
Related Questions
- What Is Indirect Prompt Injection
- What Is Prompt Injection
- Llm Data Poisoning
- Llm Hallucination Security
Scan your system prompt with LochBot — free, client-side, no data sent anywhere.