AI adoption in the enterprise is not creeping forward. It’s sprinting.
In many organizations, it’s closer to “build first, figure out the risk later” than many would care to admit. New copilots, internal assistants, and increasingly autonomous agents are being wired directly into data, workflows, and decision-making processes. The business sees speed and advantage. Security sees… well, a growing list of unanswered questions.
Here’s the uncomfortable truth: most enterprises are making consequential AI security decisions without a reliable way to evaluate whether the controls they’re buying actually work.
That’s not a knock on buyers. It’s a gap in the market.
Right now, AI security is full of confident claims, polished demos, and tidy architecture diagrams. But those don’t tell you how a system behaves under pressure, how it fails, or whether the controls hold up when someone actively tries to break them. And if I have learned anything in my decades at NSS Labs testing this stuff, it is that if there is a vulnerability, it will be exploited.
So we decided to take a step back and ask a very simple question: what does “good” actually look like?
That question led to a two-part research series from NSS Labs focused on how enterprises should think about—and evaluate—AI security.
The first paper, “AI Security Beyond the Model,” makes a point that sounds obvious once you say it out loud: the model is only part of the problem. The real risk lives in everything around it—the data it can touch, the instructions it can be manipulated with, the tools it can call, and the permissions it inherits. If those aren’t controlled properly, even a well-aligned model can do the wrong thing, quickly and at scale.
The second paper, “Evaluating Enterprise AI Security,” takes that idea and turns it into something buyers can actually use. It lays out the questions that should be asked in every evaluation, the red flags that should raise eyebrows, and the criteria that help separate meaningful controls from wishful thinking.
A big part of that conversation centers on runtime guardrails.
Not the model. Not the training process. The controls that sit around the model and determine what actually happens in production.
These are the mechanisms that enforce policy, limit access, constrain agent behavior, and—crucially—provide us with a solid trail of evidence. Because sooner or later, something will go wrong. When it does, “the model decided to do it” is not going to satisfy anyone in legal, compliance, or the boardroom.
If that sounds a bit like traditional security thinking, that’s because it is. We’re just applying it to a new class of systems that behave in less predictable ways.
There’s also a broader point here. AI security is evolving quickly, but the way we evaluate it hasn’t caught up. Without clearer expectations, buyers are left comparing apples to… well, marketing slides. And vendors with genuinely strong capabilities don’t have a consistent way to prove it.
That’s not a healthy place for the industry to be. So to address the problem, NSS Labs has been working hard behind the scenes with the major players in the AI Protections System (AIPS) space to define a new and comprehensive test methodology that will allow us to apply the NSS Labs usual stringent testing and evaluation approach to this emerging market.
Our goal with this work isn’t to declare winners or define a single “right” approach. It’s to raise the bar on how these systems are assessed. To move the conversation from “this looks good in a demo” to “this holds up under scrutiny.”
Because AI isn’t slowing down. If anything, it’s accelerating. And the gap between deployment and accountability isn’t going to close on its own.
If we want AI to be trusted at enterprise scale, we need to get serious about how we evaluate the controls that make it safe to use.
That starts with asking better questions—and expecting better answers. And the way to do that is through truly independent third-party testing. A brand-new comprehensive test and validation methodology, published today, evaluates AIPS products across the core areas that matter most in real enterprise deployments, including protection against prompt injection, prevention of harmful or unauthorized output, evasion techniques, resilience under stress and adverse conditions, policy and filter efficacy, security of agentic behavior and tool invocation, observability and auditability, and performance impact.
Each test dimension is designed to represent realistic risks that enterprise customers may encounter when deploying AI systems connected to users, enterprise data, tools, APIs, and business processes. The goal is to provide enterprise buyers, security leaders, and product vendors with a clear, repeatable, and technically rigorous basis for measuring how effectively an AIPS performs under conditions that reflect real-world use and abuse scenarios
If you’re working through these challenges now, the two white papers are designed to give you a practical starting point: what matters, what to test, and what good should look like when you find it. The goal for the AIPS test methodology is to provide enterprise buyers, security leaders, and product vendors with a with a clear, repeatable, and technically rigorous basis for measuring how effectively an AIPS performs under conditions that reflect real-world use and abuse scenarios.
Once testing is completed later this year, the final reports will provide incredible insight into how security vendors are addressing these problems.