What are the main security risks of RAG?

The main risks are: knowledge base poisoning (injecting malicious content), indirect prompt injection via retrieved documents, data exfiltration through crafted queries, and missing access controls that let users access documents they should not see.

How do I secure a RAG pipeline?

Implement document-level access control, separate trusted and untrusted sources, sanitize retrieved content before passing to the LLM, filter sensitive data from outputs, and monitor for anomalous retrieval patterns. Treat retrieved documents as untrusted input.

Can RAG systems leak private documents?

Yes. If the RAG system does not enforce access control at retrieval time, a user can craft queries that retrieve documents they should not have access to. The LLM may then include content from these documents in its response.

RAG Security Risks — Retrieval-Augmented Generation Vulnerabilities

Retrieval-Augmented Generation (RAG) introduces unique security risks because it connects LLMs to external data sources. Attackers can poison the knowledge base with malicious content, use indirect prompt injection via retrieved documents, exfiltrate data through crafted queries, or manipulate retrieval rankings to surface attacker-controlled information.

Attack Vectors

Knowledge base poisoning: If the RAG system indexes user-generated content, emails, or web pages, attackers can inject content designed to manipulate LLM responses when retrieved. Indirect prompt injection: Hidden instructions in documents that activate when the document is retrieved and fed to the LLM. Data exfiltration: Crafted queries designed to retrieve and expose sensitive documents from the knowledge base. Retrieval manipulation: SEO-like techniques to ensure attacker-controlled documents rank highest for specific queries.

Common RAG Vulnerabilities

No access control on retrieved documents: The RAG system retrieves documents the user should not have access to. Mixing trusted and untrusted sources: User-uploaded documents are treated with the same trust level as curated knowledge. No output filtering: Retrieved sensitive data passes through to the LLM response without redaction. Embedding injection: Adversarial text designed to have high similarity to target queries in the embedding space.

Defense Strategies

Implement document-level access control that mirrors your existing permissions model. Separate trusted (curated) and untrusted (user-uploaded) document collections. Sanitize retrieved documents before injecting into the LLM prompt. Use output filtering to detect and redact sensitive data patterns. Monitor retrieval patterns for anomalies. Apply the principle of least privilege — only retrieve documents relevant to the user's role.

RAG Security Risks — Retrieval-Augmented Generation Vulnerabilities

Attack Vectors

Common RAG Vulnerabilities

Defense Strategies

Related Questions

Frequently Asked Questions