Data, RAG & Memory Security
Data is the lifeblood of agentic systems—and the primary target for attackers. Protect it through classification, access controls, and integrity verification.
8.1 Data Classification and Access Control
Define and enforce a simple classification scheme, e.g.:
- Public
- Internal
- Confidential
- Regulated (GDPR, HIPAA, PCI, local data-protection laws)
For each class, define:
- Who can access it (roles, departments, tenants)
- Where it may be processed (on-prem / specific regions)
- How long it may be retained
Implementation
- Record/document-level filters:
- Always attach tenant/user context to queries.
- Enforce filters server-side (never trust client or model to self-filter).
- Avoid cross-tenant RAG indices when possible.
- If shared, enforce tenant filters in the query and re-check in code.
8.2 RAG Integrity and Indirect Prompt Injection (XPIA) Defenses
RAG content is a persistent indirect prompt injection vector. When agents retrieve and process documents, those documents can contain malicious instructions that hijack agent behavior. Hardening both ingestion and retrieval is critical.
Ingestion Controls
- Restrict who can edit high-impact corpora (e.g., configuration, policy docs).
- Require approvals for:
- Content heavily used by agents
- Regulated or sensitive docs
- Log and review changes to key sources.
- Optionally:
- Hash and sign critical documents; verify at retrieval time.
- Maintain versions and rollback capability.
Retrieval Controls
- Apply row/document-level access controls:
- Only retrieve documents the current user is allowed to see.
- Limit:
- Number of retrieved documents
- Maximum size per document
- Tag documents with:
- Tenant, classification, origin, last editor, last review date.
In Prompt Construction
- Explicitly separate untrusted RAG snippets from system instructions.
- Label them as untrusted context, and tell the model not to follow instructions within them.
8.3 Memory Tiers and Poisoning Defenses
Define memory types:
1. Session Memory
- Lives for a conversation or short task.
- Cleared after completion or short timeout.
2. Short-Term Memory
- Spans multiple sessions (hours–days) for continuity.
- Auto-expiring and limited in size.
3. Long-Term Memory / Knowledge
- RAG collections, profiles, configuration.
- Curated, versioned, and usually human-reviewed.
Promotion Rules
- Only promote data to long-term memory if:
- Source is trusted, or
- It passes validation workflows (consistency checks, approvals).
- For high-impact information (e.g., policy or config):
- Require manual review.
Poisoning Detection
- Monitor agent behavior over time:
- Sudden shifts in tone, recommendations, or policies may indicate KB poisoning.
- Keep snapshots of important memory sets so you can roll back.
8.4 PII, Secrets, and Retention
PII and secrets must not be casually fed into third-party models or stored long-term.
PII & Secrets Detection
- Use detectors to find PII/PHI/financial data and secrets in:
- Prompts
- Logs
- RAG ingestion streams
- Redact or tokenize as required by policy.
Data Minimization for Models
For third-party LLMs:
- Avoid sending raw identifiers (names, IDs).
- Use pseudonyms or tokens where possible.
- Turn off training/retention features, or use dedicated, non-training endpoints.
Retention and Deletion
- Different retention periods per data type and classification.
- Support deletion/erasure requests (e.g., GDPR/CCPA) by:
- Deleting or anonymizing chat logs, embeddings, and related artifacts.
- Ensure logs maintain security value but minimize personal data.
8.5 Database Mediation
Agents should not have raw SQL access.
Introduce a database mediation layer:
- Expose safe, domain-specific operations (e.g.,
get_sales_summary,find_customer_by_name) as tools. - Use parameterized queries only.
- Enforce limits on:
- Rows returned
- Query complexity
- Frequency and volume
For analytical agents:
- Consider pre-aggregated marts or views rather than direct access to transactional tables.