MCP & Tool Integration Security

Organizations building MCP servers or plugin surfaces face a distinct challenge: they're designing a trust boundary where an AI they don't control interacts with their infrastructure. The AI client calling your MCP server may be compromised, manipulated, or simply behaving in unexpected ways. Your server needs to handle all of that gracefully.

Casaba has performed focused MCP security assessments using scenario-based, adversarial workflows designed to mimic realistic use of autonomous agents calling MCP tools.

Five attack surfaces

Authentication and Authorization Model

How does the MCP server verify who (or what) is calling it? We assess whether the auth model properly distinguishes between different AI clients, users, and permission levels, and whether token scoping is tight enough.

Schema Validation and Input Sanitization

The MCP server defines what tools are available and what parameters they accept. We test whether schema enforcement is actually server-side, whether unexpected parameter types or values can slip through, and whether tool definitions leak information about internal systems.

Trust Boundary Between AI Client and Your Product

The AI calling your MCP server is not under your control. We test what happens when a compromised or manipulated AI sends malicious tool calls, attempts to chain operations in unintended sequences, or probes for capabilities beyond what's advertised.

Data Exposure Through Tool Responses

MCP tool responses flow back into an external AI's context. We evaluate whether responses leak more data than intended, whether sensitive fields are properly filtered, and whether an attacker controlling the AI client can use iterative queries to extract data in aggregate.

Rate Limiting and Abuse Resistance

An AI client can call tools much faster and more systematically than a human. We assess whether the MCP server is resilient to automated enumeration, bulk data extraction, and denial-of-service through rapid tool invocation.

Scenario-based adversarial workflows

Testing is performed using scenario-based, adversarial workflows designed to mimic realistic use of autonomous agents calling MCP tools. We emphasize what the agent and MCP tools actually do at runtime.

Our approach includes configuration review by behavior (validating how MCP servers are defined, how tool schemas are exposed, and what guardrails apply at invocation time), untrusted-content simulations, tool-chaining analysis, data handling checks, control-plane abuse scenarios, and logging and audit review.

We build custom test harnesses tailored to each MCP server's tool definitions and expected interaction patterns.

Typical finding patterns

Indirect Prompt Injection via Tool Output

MCP responses that can steer the agent into taking actions not required by the user's original task.

Privilege and Boundary Crossing

MCP tools used to reach capabilities outside their intended scope, either directly or by chaining into other tools. Configuration mechanisms that become code execution pathways.

Unconstrained External Dependencies

Third-party MCP servers that introduce supply-chain style risk through uncontrolled content, behavior, or updates.

Exfiltration Through Adjacent Services

Data sent outward not through the MCP server directly, but through adjacent allowed services or side channels.

Weak Policy Enforcement and Unsafe Defaults

Administrators unable to express least-privilege policies per tool or server, or policies that aren't enforced reliably at runtime.

Hardening recommendations we commonly make

Treat All MCP Output as Untrusted

Separate data from instructions, apply structured parsing, use safety classifiers to detect injection-like patterns.

Constrain Tool Invocation by Intent

Require the agent to justify tool calls against the user task, block or require confirmation for actions that exceed requested scope.

Per-Tool Egress Controls

Per-tool and per-server egress controls with least-privilege allowlists and strong defaults.

Strong Server Isolation

Minimal privileges for MCP servers, separate from orchestration and high-impact tools.

Harden the Control Plane

Restrict who can add or modify servers, validate configuration inputs, prevent configuration mechanisms from becoming execution pathways.

Reduce Tool-Chaining Blast Radius

Step budgets, recursion limits, and circuit breakers for unexpected tool loops or suspicious escalation patterns.

Every AI system that connects to your server

MCP is becoming the standard interface between AI systems and external services. Organizations building MCP servers are creating trust boundaries that will be probed by every AI system that connects to them. A vulnerability in an MCP server doesn't just affect one AI client - it affects every system that integrates with it.

Need your MCP server tested?

We've been assessing AI tool integrations since the beginning. Let's talk about yours.

Get in touch