Generative AI Security

With the increasing integration of Artificial Intelligence (AI), particularly Large Language Models (LLMs), into almost every type of application, the integrity of these models has become paramount. Casaba's AI security assurance team digs deep into AI-powered products to make sure they can't be hacked, manipulated, tampered with, or run amok once they enter the real world. Our team has the extensive experience needed to carry out complex testing methods for AI systems, including LLMs, as well as enforce governance requirements.

Through a rigorous testing process that examines the entire product ecosystem – from AI models and LLM programs to plugins and supporting Cloud infrastructure and web apps – Casaba's experts conduct hardcore technical penetration tests (black box, gray box, white box), vulnerability assessments, prompt injection (indirect or cross-boundary prompt injection attacks), Responsible AI (RAI) tests, and compliance assessments. This comprehensive governance and testing framework helps identify and mitigate any potential risks like subtle vulnerabilities, design flaws, ineffective guardrails, flawed training models, and problematic plugins, ensuring all issues are addressed before the product goes public.


While other security companies play catch-up, our team has been working behind the scenes to test and secure the world's top AI and LLM services and platforms, giving us a significant advantage in understanding the security complexities of this new technology that others will be facing for years to come.


We have extensive hands-on experience troubleshooting and testing the many AI-related security challenges and problems that most of the world has yet to face. We stay attuned to the latest academic and industry research and incorporate the most relevant techniques into our testing methodology.


Through our comprehensive testing methods, we've seen first-hand the impact potential of AI-related security vulnerabilities and complex exploit chains that pose real-world threats.


We've developed cutting-edge custom tooling to automate and accelerate security testing for prompt injection and RAI violations. When new research papers or methods get released we evaluate them for inclusion in our automation.

What to Expect with Casaba's AI Assurance Process:

1. Initial Scoping

We assess the attack surface and define key security objectives to meet your specific needs. We understand the context this technology operates in, and will thoughtfully identify the most important features for testing. We present a detailed proposal with fixed pricing.

2. Kickoff

We dive deeper into the architecture and code with the dev team, to develop an intelligently prioritized test plan, setting priorities, communication channels, and a weekly status meeting.

3. Execution

We carry out comprehensive, in-depth testing of the product, from targeted code review to surgical runtime testing and infrastructure analysis, we are looking for meaningful issues that you care about.

4. Reporting

Key findings about vulnerabilities such as successful prompt engineering and jailbreaking are provided to the customer in a detailed written report, along with thematic and design issues we identified, followed by an in-person or remote presentation from our team.

"Casaba Security's AI/LLM security testing services have been invaluable to our organization. Their expertise and comprehensive approach helped us identify and mitigate critical vulnerabilities in our AI systems. We highly recommend their services."

Casaba's Unique AI Penetration Testing Approach

Our penetration testing is meticulously structured into Black Box, Gray Box, and White Box testing, each incorporating the latest research findings to provide an exhaustive analysis of LLM vulnerabilities. From universal jailbreaking techniques to systematic approaches like Greedy Coordinate Gradient, our penetration testing services incorporate cutting-edge methods to offer a comprehensive, research-informed analysis of LLM security. We continuously update our methodologies with the latest research findings and techniques to ensure that our services remain at the forefront of LLM security, providing our clients with the most up-to-date, thorough, and effective security analyses.

Detect Hidden Vulnerabilities

Find and fix all software-based vulnerabilities in your LLM, including technical and process-related issues in the OWASP LLM Top 10.

Secure from Prompt Injections

Address critical weaknesses in your LLM's ability to understand and react to user inputs – including direct injection as well as the most subtle attempts at indirect manipulation and prompt injection.

Block Jailbreaking

Prevent dangerous lapses in your product's output security controls which could allow it to share sensitive or prohibited information.

Prevent Harmful Content

Avoid rogue behavior from your LLM by enabling strong security controls to guide and limit its outputs.

Responsible AI (RAI) Compliance

Achieve the highest level of Responsible AI design to ensure your product will always behave in a safe, trustworthy, and ethical way.

Validate Security Controls

Root out design flaws, weaknesses, and lax guardrails through extensive pen-testing that puts your LLM through the ultimate stress test.

Audit Training Data

Ensure that the core data at the heart of your LLM is sound, safe, and accurate.

Prevent Plugin Risks

Maintain the integrity of your LLM from potentially flawed or risky plugins, and interactions between separate components.

Black Box Testing

The only way to determine how your AI will react under a real-world attack is by exposing it to simulated attacks ahead of time. Our penetration testing team incorporates advanced research methodologies and dynamic testing tools like Burp Suite to probe and stress-test every element of the LLM, from malicious prompts to appsec, all from the perspective of an attacker without internal knowledge of the system.

A critical risk with Generative AI and LLMs is their susceptibility to manipulation, which is why we conduct robust prompt engineering and jailbreaking tests to see how the LLM reacts to unexpected or manipulative inputs that real attackers use. Our prompt-based attack strategy incorporates novel techniques as they continue to emerge. Techniques like Pretending, Attention Shifting, Privilege Escalation and more - each of which represents a unique challenge for LLMs, as these methods are used by attackers to manipulate conversation context or user intent to exploit LLM vulnerabilities.

LLM vs. LLM Testing

Threat actors can use other LLMs to attack your product and this is a critical part of our black box testing regimen. We combine our advanced methodologies for black box testing with our in-house developed tools to run a comprehensive range of LLM vs. LLM tests to see how your product will stand up to aggressive interaction and potential exploitation by other LLMs.

Gray Box Testing

Comprehensive infrastructure evaluation is vital for achieving a secure product. Using a gray-box approach, we work closely with your development team to understand the intricacies of system prompts and user input integration in your Generative AI and LLM product. This enables us to identify and remediate all the potential hotspots for prompt injection attacks.

Throughout this process, our team seeks out critical vulnerabilities such as resource overconsumption, unsafe credential handling, and tooling-related vulnerabilities which are high-risk but often neglected in LLM security.

Our approach includes complex, highly targeted prompt injection methods which are specially designed to test your AI and LLM's ability to discern and counteract subtle adversarial attempts that might be embedded within seemingly innocuous inputs.

White Box Testing

This takes a deeply informed and privileged approach to the testing process, where we leverage our access to model weights to conduct a thorough analysis of your LLM. Here, research techniques such as the Greedy Coordinate Gradient (GCG) are pivotal. Inspired by the Greedy coordinate descent (GCD) method, this approach evaluates all possible single-token substitutions, using gradients to find promising candidates. This technique, an extension of the AutoPrompt method, has shown remarkable effectiveness in optimizing "jailbreaking suffixes," achieving significant success rates in research studies.

We also employ gradient-based distributional attacks like GBDA and HotFlip. GBDA, for instance, uses Gumbel-Softmax approximation to make adversarial loss optimization differentiable, a novel approach that has proven effective in manipulating token probabilities for adversarial attacks. HotFlip, on the other hand, employs a different strategy, treating text operations as inputs in vector space and optimizing adversarial loss through a series of calculated text manipulations.

AI/LLM Governance

Casaba excels in the realm of AI governance, guiding teams through the sophisticated process of launching AI products and services with confidence. This journey begins with strategic discussions on infrastructure, deployment environments, resource allocation, and defining the core features of your service. Once a suitable model is chosen for your needs, we craft precise prompts to optimize AI performance for specific tasks. Following model development, we employ advanced automated tools to scale testing efficiently, ensuring the products meet the highest standards of safety, reliability, and security.

Attention then shifts to the user interface, whether it's a command-line application, chatbot, or another tool, to ensure clarity and enhance productivity. This step is crucial in making AI tools a beneficial addition to workflows rather than a hindrance.

Before any tool goes live, it must earn the green light from product owners and management. To this end, an internal board of trusted employees conducts thorough, impartial reviews of each feature, ensuring strict adherence to our established guidelines.

Leveraging Casaba's deep understanding of industry standards, tools, and best practices is foundational in paving the way for the successful deployment and future-proofing of your AI products.

Trusted for over 20 years

Our reputation speaks for itself, delivering expertise and quality known throughout the industry, we are the team to call when you want the confidence that your project will be done right.