Summary Checklist
For each agentic application, teams should be able to answer "yes" to all the following.
Architecture and Agents
- We have a documented threat model including agentic risks (prompt injection, RAG poisoning, tool misuse, cross-tenant leakage).
- Each agent has a clear, narrow scope and minimal tool set.
- The orchestration layer, not the model, enforces security and policy.
- Plan–verify–execute and/or controlled breakpoints are used for high-risk actions.
- Multi-agent workflows and communications are explicitly defined, schema-validated, and restricted.
Identity and Access
- Agents are first-class identities with owners, roles, and environments.
- RBAC/ABAC rules constrain data and tools per agent, user, and tenant.
- High-privilege access is granted just-in-time and is short-lived.
- All elevation events and high-risk actions are logged and monitored.
Data, RAG, and Memory
- Data is classified and access-controlled at record/document level.
- RAG and long-term memory are protected against unauthorized edits and poisoning.
- We minimize what is sent to models and avoid sending secrets.
- PII and secrets are detected and appropriately redacted or tokenized before storage.
- Retention and deletion policies meet regulatory and contractual requirements.
Tools, MCP, and External APIs
- Tools are atomic, single-purpose, and schema-validated server-side.
- Per-agent and per-tenant tool allowlists are enforced with default deny.
- High-impact tools require human approval or propose/confirm flows.
- Code execution tools run in hardened sandboxes with no default network.
- MCP servers and plugins go through security review and are centrally registered.
- Network egress from agents and tools is restricted to approved endpoints.
Frontend and UX
- Standard web security (auth, CSRF, XSS, CSP) is implemented.
- All model output is treated as untrusted and rendered safely.
- The UI clearly communicates agent capabilities and limitations.
- Domain-specific disclaimers are present where needed (health, legal, financial).
- Users can flag harmful or incorrect outputs and escalate to humans.
Infrastructure and Model Gateway
- Agent workloads run in hardened, least-privilege containers.
- Network segmentation prevents agents from reaching core data stores directly.
- A model gateway mediates all LLM traffic with auth, rate limits, and logging.
- High-risk tools are isolated with additional sandboxing and resource limits.
- Supply chain risks are managed (SBOMs, scanning, model version tracking).
- Cloud provider resources are configured following security best practices.
Guardrails and Responsible AI
- Pre-input, mid-execution, and post-output guardrails are implemented and configurable per product.
- RAI harm categories relevant to the use case have been assessed and mitigations implemented.
- Domain-specific constraints (financial, health, legal, security) are encoded as policies/guardrails.
- Hallucination/grounding checks are in place where correctness is critical.
- Bias/fairness considerations are explicitly addressed in sensitive contexts.
Monitoring and Incident Response
- Comprehensive, structured, privacy-aware audit logging is enabled end-to-end.
- Behavioral baselines and anomaly detection for agents and tools are in place.
- Alerts exist for AI-specific threats (injection attempts, tool misuse, exfiltration, behavior shifts).
- AI-specific incident response runbooks are defined and integrated with security operations.
- There is clear operational ownership and on-call coverage for agentic systems.
SDLC, Testing, and Red Teaming
- AI/agent threat modeling is part of design reviews.
- Automated security tests (SAST, DAST, dependency, container) are run in CI/CD.
- AI-specific tests (prompt injection, RAG poisoning, tool misuse, RAI harms) are part of regression suites.
- Periodic adversarial red teaming is performed and findings are addressed.
- Continuous evaluation uses incidents, metrics, and assessments to harden defenses.