AI/LLM Cloud/AppSec Governance | Jobs About Contact

Agentic AI Security Guide / Summary Checklist

Agentic AI Security

Overview
Summary Checklist
Risk Landscape
Core Design Principles
Secure Architecture
Identity & Access Control
Frontend & UX Security
Orchestration & Tools
Data, RAG & Memory
Guardrails & RAI
Infrastructure
Monitoring & IR
SDLC & Testing

Summary Checklist

For each agentic application, teams should be able to answer "yes" to all the following.

Architecture and Agents

We have a documented threat model including agentic risks (prompt injection, RAG poisoning, tool misuse, cross-tenant leakage).
Each agent has a clear, narrow scope and minimal tool set.
The orchestration layer, not the model, enforces security and policy.
Plan–verify–execute and/or controlled breakpoints are used for high-risk actions.
Multi-agent workflows and communications are explicitly defined, schema-validated, and restricted.

Identity and Access

Agents are first-class identities with owners, roles, and environments.
RBAC/ABAC rules constrain data and tools per agent, user, and tenant.
High-privilege access is granted just-in-time and is short-lived.
All elevation events and high-risk actions are logged and monitored.

Data, RAG, and Memory

Data is classified and access-controlled at record/document level.
RAG and long-term memory are protected against unauthorized edits and poisoning.
We minimize what is sent to models and avoid sending secrets.
PII and secrets are detected and appropriately redacted or tokenized before storage.
Retention and deletion policies meet regulatory and contractual requirements.

Tools, MCP, and External APIs

Tools are atomic, single-purpose, and schema-validated server-side.
Per-agent and per-tenant tool allowlists are enforced with default deny.
High-impact tools require human approval or propose/confirm flows.
Code execution tools run in hardened sandboxes with no default network.
MCP servers and plugins go through security review and are centrally registered.
Network egress from agents and tools is restricted to approved endpoints.

Frontend and UX

Standard web security (auth, CSRF, XSS, CSP) is implemented.
All model output is treated as untrusted and rendered safely.
The UI clearly communicates agent capabilities and limitations.
Domain-specific disclaimers are present where needed (health, legal, financial).
Users can flag harmful or incorrect outputs and escalate to humans.

Infrastructure and Model Gateway

Agent workloads run in hardened, least-privilege containers.
Network segmentation prevents agents from reaching core data stores directly.
A model gateway mediates all LLM traffic with auth, rate limits, and logging.
High-risk tools are isolated with additional sandboxing and resource limits.
Supply chain risks are managed (SBOMs, scanning, model version tracking).
Cloud provider resources are configured following security best practices.

Guardrails and Responsible AI

Pre-input, mid-execution, and post-output guardrails are implemented and configurable per product.
RAI harm categories relevant to the use case have been assessed and mitigations implemented.
Domain-specific constraints (financial, health, legal, security) are encoded as policies/guardrails.
Hallucination/grounding checks are in place where correctness is critical.
Bias/fairness considerations are explicitly addressed in sensitive contexts.

Monitoring and Incident Response

Comprehensive, structured, privacy-aware audit logging is enabled end-to-end.
Behavioral baselines and anomaly detection for agents and tools are in place.
Alerts exist for AI-specific threats (injection attempts, tool misuse, exfiltration, behavior shifts).
AI-specific incident response runbooks are defined and integrated with security operations.
There is clear operational ownership and on-call coverage for agentic systems.

SDLC, Testing, and Red Teaming

AI/agent threat modeling is part of design reviews.
Automated security tests (SAST, DAST, dependency, container) are run in CI/CD.
AI-specific tests (prompt injection, RAG poisoning, tool misuse, RAI harms) are part of regression suites.
Periodic adversarial red teaming is performed and findings are addressed.
Continuous evaluation uses incidents, metrics, and assessments to harden defenses.

Previous Next

On this page

Home
⋅
About
⋅
AI/LLM
⋅
Cloud/Appsec
⋅
Governance
⋅
Contact

Red Teaming
⋅
IoT
⋅
Threat Modeling
⋅
SDL

Copyright © World-class AI, Cloud, and Application Security Testing and Governance. All Rights Reserved

Phone: 1-888-869-6708
Notices: Privacy Policy

info@casaba.com