Chapter 11
Monitoring and Incident Response
You cannot defend what you cannot see. Implement telemetry across the full agent lifecycle and build incident response procedures that account for AI-specific failure modes. Without visibility, attacks go undetected and incidents spiral.
11.1 Structured Telemetry and Immutable Audit
For each agentic task, log at minimum these four categories:
Identity
- User ID (or pseudonymous ID), role, tenant/organization
- Agent ID and version
Request
- Timestamp, environment, region
- User prompt (sanitized - PII and secrets redacted)
- High-level context such as retrieved document IDs, not full content
Actions
- Tools called and parameters (sanitized)
- Data domains touched (e.g., which tables, collections, or indices)
- Guardrails triggered and decisions taken
Outcome
- Final agent output (sanitized)
- Status: success, blocked, or error
- Any policy violations or escalations
Store logs in append-only or tamper-evident storage. Use WORM (write once, read many), hash-chaining, or signed logs. Align retention periods with your regulatory requirements. If someone can modify or delete audit logs after the fact, you have lost your ability to investigate incidents reliably.
11.2 Behavioral Monitoring and AI-Specific Detection
Static rules are not enough. You need to establish baselines per agent and watch for deviations.
Establish Baselines
For each agent, track:
- Normal tool usage frequency and mix
- Typical data volumes and classifications accessed
- Usual response lengths, latency, and behavioral patterns
Monitor for Anomalies
- Sudden spikes in high-risk tool calls, bulk data exports, or guardrail violations
- Activity at unusual times or from unexpected geographies
- Sudden shifts in agent behavior - tone changes, altered recommendations, or systematic policy deviations
Detect AI-Specific Threats
- Prompt injection and jailbreak attempts - Repeated attempts to override system instructions
- Cross-tenant access attempts - Any probe for data outside the current tenant boundary
- Data exfiltration patterns - Unusually large responses, repeated "list all" requests, or attempts to encode data in output
- Tool abuse - Misuse of code-execution or generic HTTP tools beyond their intended scope
11.3 Automated Safeguards
Detection without response is just an expensive logging exercise. Build automatic controls that act on what you detect.
Circuit Breakers for Agents
If error rates or policy violation rates cross defined thresholds, disable the agent automatically or switch to a degraded mode - read-only, no tool access. Do not let a malfunctioning agent keep operating at full capability while you figure out what went wrong.
Adaptive Security Posture
When threat levels are elevated:
- Disable risky tools temporarily
- Tighten rate limits
- Force human approval for actions that would normally be automated
Quarantine Modes
For suspicious users or tenants, move them to stricter policies and manual review. This limits blast radius while you investigate, without shutting down the entire system.
11.4 AI-Specific Incident Response
AI incidents are real incidents. Treat them as first-class concerns, integrated with your existing security operations.
Common Incident Classes
- Data leakage - PII, secrets, or confidential data exposed through agent outputs
- Tool misuse - Unauthorized changes made through agent tool calls
- RAG or memory poisoning - Corrupted retrieval data or manipulated agent memory
- Unsafe or harmful outputs - Toxic, biased, or dangerous content reaching production users
- Provider compromise or misconfiguration - Issues at the model provider or infrastructure level
For Each Class, Define a Runbook
1. Detection. Which alerts or metrics indicate the problem? Define specific thresholds and signals so your team knows what to look for.
2. Containment. Disable affected agents or tools. Revoke or rotate compromised credentials. Apply network lockdown if needed. Speed matters here.
3. Triage and analysis. Determine the scope: which tenants, users, and data were affected, over what time period, and through which flows. Identify root cause - was it prompt injection, misconfiguration, a code bug, or infrastructure compromise?
4. Remediation. Fix the code, policies, or patch the affected components. Clean or roll back poisoned memory and RAG indices. Restore systems under stricter observation until you have confidence the fix holds.
5. Communication. Notify internal stakeholders immediately. For regulated industries or contractual obligations, notify customers and regulators as required by applicable law and agreements.
6. Learning and improvement. Update your threat models, test suites, guardrails, and runbooks based on what you learned. Every incident should make the system harder to attack next time.
11.5 Agentic Incident Response - What Changes
The runbook structure in 11.4 applies to any AI incident. But agentic systems introduce specific challenges that traditional IR playbooks - built for human-speed events and deterministic system behavior - are not designed to handle. The incident response frameworks most teams have in place weren't designed for systems that make decisions. We're working through what that means on engagements.
Forensics for Agent Decisions
Standard logs capture API calls and system events, not reasoning chains or goal state. When you investigate an agentic incident, you need to reconstruct why the agent did what it did - not just what it did. This requires decision-layer logging: the full context window at each step, the tools considered and selected, the reasoning that led to each action. Without this, your forensic investigation is limited to input-output pairs with a black box in between.
Blast Radius Containment at Machine Speed
A compromised agent can affect multiple downstream agents, services, and data stores at machine speed. By the time a human analyst notices the anomaly, the damage chain may already be complete. IR playbooks need pre-defined automated containment triggers - circuit breakers that fire based on behavioral signals, not just human escalation paths. NHI governance enables fast containment: if every agent has a distinct identity and scoped credentials, you can isolate a compromised agent without disrupting the entire system.
Memory and Context Store Forensics
After a suspected memory poisoning incident, the memory store is evidence. It must be preserved before re-baselining - snapshot it, secure the snapshot, then clean the production store. If you re-baseline first, you destroy the forensic record of what was poisoned, when, and through what vector.
Rogue Agent Containment
Stopping the agent process is not sufficient. When an agent goes rogue, you also need to audit its task queue (what was it about to do?), check downstream effects (what did it trigger that is still in flight?), and verify that no sub-agents or delegated tasks are continuing to execute. A killed agent with live downstream work is a partial containment.
Action Replay
The ability to replay recorded agent actions in an isolated environment is an emerging practice worth building toward. When an incident occurs, replay the agent's action sequence to understand its behavior, test whether the same sequence would trigger cascading failures, and validate that your containment measures would have caught it earlier. This requires the decision-layer logging described above - you cannot replay what you did not record.
Need help with monitoring and incident response?
We help teams design telemetry architectures and build AI-specific incident response playbooks. If your agents are running in production, let's make sure you can see what they are doing and respond when something goes wrong.
Get in touchAgentic AI Security Guide V1.1 · Changelog