Infrastructure & Sandboxing
Infrastructure security provides the foundation for all other controls. Isolate components, harden containers, and leverage cloud-native security features.
10.1 Execution Isolation
Differentiate between:
Standard Services
Orchestrator, model gateway, many tools:
- Container best practices:
  - Non-root users
  - Dropped Linux capabilities
  - Read-only root file systems where possible
  - Regular image scanning and patching
High-Risk Tools
Code execution, document parsing of untrusted binaries, browser automation:
- Extra isolation:
  - A sandboxed runtime or lightweight VM: gVisor (user-space kernel), Firecracker or Kata Containers (microVMs), or similar
  - No network access by default; enable it selectively when necessary
  - Strict CPU, memory, and time limits to prevent DoS
  - Ephemeral environments purged after each run
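The resource limits and ephemeral workspace listed above can be sketched in Python. This is a minimal illustration that assumes a Linux host; the function name `run_untrusted`, the helper `_cap`, and the default limits are hypothetical. Process-level limits like these complement a sandbox such as gVisor or Firecracker but do not replace one, and this sketch does not block network access.

```python
import os
import resource
import subprocess
import sys
import tempfile


def _cap(kind: int, value: int) -> None:
    """Set soft and hard limits to `value`, never exceeding the current hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def run_untrusted(code: str, cpu_seconds: int = 2,
                  memory_bytes: int = 1024 * 1024 * 1024,
                  wall_seconds: float = 10.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child process with CPU, memory, and time limits."""

    def apply_limits() -> None:
        # Runs in the child process just before the interpreter is exec'd.
        _cap(resource.RLIMIT_CPU, cpu_seconds)   # CPU time, in seconds
        _cap(resource.RLIMIT_AS, memory_bytes)   # virtual address space, in bytes

    # Pass an allowlisted environment so secrets in the parent's env do not leak.
    env = {k: os.environ[k] for k in ("PATH", "LD_LIBRARY_PATH") if k in os.environ}

    # The working directory is created per run and deleted when the block exits.
    with tempfile.TemporaryDirectory() as workdir:
        return subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            cwd=workdir,
            env=env,
            capture_output=True,
            text=True,
            timeout=wall_seconds,  # raises TimeoutExpired and kills the child
            preexec_fn=apply_limits,
        )
```

A caller would treat a non-zero return code or a `subprocess.TimeoutExpired` exception as a failed run and discard any partial output.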
10.2 Kubernetes and Service Mesh
- Use namespaces to separate:
  - Agent workloads
  - Core services
  - Tool services
- Apply NetworkPolicies (or service mesh authZ) so:
  - Only approved services can call the model gateway and tools.
  - Agents cannot directly talk to databases or internal admin services.
- Use service-to-service authN (mTLS, JWTs) for internal calls.
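As an illustration of the NetworkPolicy rule above, the following Python sketch builds a manifest that admits ingress to the model gateway pods only from an approved namespace. The namespace names (`core-services`, `agents`), the pod label, and the function name are assumptions made for the example. The `kubernetes.io/metadata.name` label is set automatically on namespaces by recent Kubernetes versions, and enforcement depends on a CNI plugin that implements NetworkPolicy.

```python
import json


def allow_ingress_from_namespaces(name, namespace, pod_labels, allowed_namespaces):
    """Build a NetworkPolicy that admits ingress only from the listed namespaces."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods selected here reject any ingress not matched by a rule below.
            "podSelector": {"matchLabels": pod_labels},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [
                    {"namespaceSelector": {
                        "matchLabels": {"kubernetes.io/metadata.name": ns}}}
                    for ns in allowed_namespaces
                ]
            }],
        },
    }


# Hypothetical names: only pods in the "agents" namespace may reach the gateway.
policy = allow_ingress_from_namespaces(
    name="model-gateway-ingress",
    namespace="core-services",
    pod_labels={"app": "model-gateway"},
    allowed_namespaces=["agents"],
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.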
10.3 Model Gateway and Plane Segregation
Introduce a model gateway that centralizes:
- Provider credentials
- Rate limiting
- Request and response logging
- Allowlist of services permitted to call models
Segregate:
- Control plane: Orchestration, policies, configuration, governance
- Data plane: Inference traffic, tool invocation, data I/O
Restrict control plane APIs to a small set of admin services and teams, and audit all changes.
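The gateway's admission path (allowlist, rate limiting, and logging) could look roughly like the sketch below. The class names `ModelGateway` and `TokenBucket` and the token-bucket algorithm are illustrative choices, not something this guide prescribes. The sketch assumes the `caller` string comes from an authenticated identity (for example an mTLS certificate or a validated JWT) rather than from a self-reported header.

```python
import logging
import time
from typing import Dict, Iterable, Tuple

log = logging.getLogger("model_gateway")


class TokenBucket:
    """Simple token bucket: `capacity` burst, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def try_take(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class ModelGateway:
    """Admission checks only; forwarding to the model provider is out of scope."""

    def __init__(self, allowed_callers: Iterable[str],
                 capacity: int = 5, refill_per_second: float = 1.0) -> None:
        self._allowed = set(allowed_callers)
        self._capacity = capacity
        self._refill = refill_per_second
        self._buckets: Dict[str, TokenBucket] = {}

    def admit(self, caller: str) -> Tuple[bool, str]:
        if caller not in self._allowed:
            decision = (False, "caller not on allowlist")
        else:
            bucket = self._buckets.get(caller)
            if bucket is None:
                bucket = self._buckets[caller] = TokenBucket(self._capacity, self._refill)
            decision = (True, "ok") if bucket.try_take() else (False, "rate limit exceeded")
        # Every decision is logged, including denials.
        log.info("caller=%s admitted=%s reason=%s", caller, decision[0], decision[1])
        return decision
```

Keeping one bucket per caller means a noisy agent exhausts only its own quota. Provider credentials would live behind `admit` and never be handed to callers.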
10.4 Supply Chain and Model Provenance
- Maintain SBOMs for:
  - Base images
  - Key libraries and frameworks (LLM SDKs, vector DBs, guardrail engines)
- Regularly scan for vulnerabilities and outdated components.
- Track model versions:
  - Provider, model name, version
  - Training policies (as disclosed), model cards, evaluation results
- Correlate behavioral changes with model or framework updates.
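One lightweight way to track model versions is a small provenance record per deployment, with a diff between consecutive records. The sketch below is illustrative only: the `ModelRecord` fields mirror the items listed above, while the field names, the `changed_fields` helper, and the example values are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ModelRecord:
    provider: str
    model_name: str
    version: str
    model_card_url: str = ""   # link to the provider's model card, if published
    eval_report: str = ""      # identifier of the internal evaluation run


def changed_fields(before: ModelRecord, after: ModelRecord) -> Dict[str, Tuple[str, str]]:
    """Return {field: (old, new)} for every field that differs between records."""
    old, new = asdict(before), asdict(after)
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


# Placeholder values for illustration.
previous = ModelRecord("example-provider", "example-model", "2024-01")
current = ModelRecord("example-provider", "example-model", "2024-06")
print(changed_fields(previous, current))
```

Storing the diff alongside the deployment event gives reviewers a concrete change to line up against any shift in agent behavior.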
10.5 Cloud Provider-Specific Recommendations
When deploying agentic AI systems on major cloud platforms, leverage provider-native services that align with security best practices.
Azure
Identity and Access Management
- Microsoft Entra ID (formerly Azure Active Directory) for user authentication with Conditional Access policies
- Managed Identities for agent service accounts (avoid storing credentials)
- Azure RBAC for resource-level access control
- Privileged Identity Management (PIM) for just-in-time admin access
Secrets Management
- Azure Key Vault with private endpoints, Azure RBAC authorization (in preference to legacy vault access policies), and automated key rotation
Container Orchestration
- Azure Kubernetes Service (AKS) with Azure Network Policies, Azure Policy for Kubernetes, Microsoft Entra Workload ID, and confidential containers for sensitive workloads
Network Security
- Azure Virtual Network with NSGs, Azure Firewall, Private Endpoints for all PaaS services
- Azure Private Link for secure access to Azure services
Model Gateway
- Azure API Management with OAuth 2.0/JWT validation, rate limiting, request/response logging, and private VNET integration
AWS
Identity and Access Management
- AWS IAM Identity Center for user authentication
- IAM Roles with least-privilege policies, SCPs, and permission boundaries
- AWS IAM Access Analyzer to identify overly permissive policies
Secrets Management
- AWS Secrets Manager with automatic rotation and VPC endpoints
Container Orchestration
- Amazon EKS with EKS Pod Identity, network policies (Calico or the Amazon VPC CNI), security groups for Pods, and Fargate for serverless pods
Network Security
- Amazon VPC with Security Groups, Network ACLs, VPC endpoints for AWS services
- AWS Network Firewall and AWS WAF for advanced filtering
Model Gateway
- Amazon API Gateway with IAM authorization, usage plans, VPC Link for private integration
Google Cloud Platform (GCP)
Identity and Access Management
- Cloud Identity for user authentication
- IAM service accounts bound to GKE pods through Workload Identity Federation for GKE (formerly Workload Identity)
- IAM Conditions for fine-grained access control
- VPC Service Controls for data perimeter enforcement
Secrets Management
- Google Secret Manager with IAM-based access control and versioning
Container Orchestration
- Google Kubernetes Engine (GKE) with Workload Identity Federation for GKE, Network Policies, Binary Authorization, and GKE Autopilot
- Cloud Run for stateless container workloads
Network Security
- VPC with firewall rules, Private Google Access, VPC Service Controls
- Cloud Armor for DDoS protection and WAF
Model Gateway
- Cloud Endpoints or Apigee with service-to-service authentication, rate limiting, and Cloud Armor integration
Cross-Cloud Considerations
When operating across multiple clouds:
- Unified Identity: Use OIDC/SAML federation; consider HashiCorp Vault for cross-cloud secrets
- Network Connectivity: AWS Direct Connect, Azure ExpressRoute, or Cloud Interconnect for private connectivity
- Observability: Centralize logs in a SIEM with consistent formatting and correlation IDs
- Data Residency: Clearly define which regions handle what data classifications
- Disaster Recovery: Start with multi-region within a single cloud; extend to multi-cloud only for critical systems, and run regular DR drills
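For the observability point above, consistent formatting and correlation IDs can be sketched with Python's standard `logging` module. The field names (`ts`, `cloud`, `correlation_id`) and the helper `build_logger` are assumptions; an actual schema would follow whatever the chosen SIEM expects.

```python
import io
import json
import logging
import time
from contextvars import ContextVar

# Holds the ID of the request currently being handled; "-" when none is set.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line with the same keys in every cloud."""

    converter = time.gmtime  # format timestamps in UTC

    def __init__(self, cloud: str) -> None:
        super().__init__()
        self.cloud = cloud

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "logger": record.name,
            "cloud": self.cloud,
            "correlation_id": correlation_id.get(),
            "message": record.getMessage(),
        })


def build_logger(name: str, cloud: str, stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(cloud))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


# Example: the ID would normally be read from the incoming request.
buffer = io.StringIO()
logger = build_logger("agent.tools", cloud="aws", stream=buffer)
correlation_id.set("req-1234")
logger.info("tool invoked")
print(buffer.getvalue().strip())
```

Because the correlation ID lives in a context variable, it follows a request across function calls and async tasks without being passed explicitly, and the same ID can be forwarded to downstream services in other clouds.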