AI Security Is Moving Fast. Evaluation Isn’t. That’s a Problem.

AI adoption in the enterprise is not creeping forward. It’s sprinting.

In many organizations, it’s closer to “build first, figure out the risk later” than many would care to admit. New copilots, internal assistants, and increasingly autonomous agents are being wired directly into data, workflows, and decision-making processes. The business sees speed and advantage. Security sees… well, a growing list of unanswered questions.

Here’s the uncomfortable truth: most enterprises are making consequential AI security decisions without a reliable way to evaluate whether the controls they’re buying actually work.

That’s not a knock on buyers. It’s a gap in the market.

Right now, AI security is full of confident claims, polished demos, and tidy architecture diagrams. But those don’t tell you how a system behaves under pressure, how it fails, or whether the controls hold up when someone actively tries to break them. And if I have learned anything in my decades at NSS Labs testing this stuff, it is that if there is a vulnerability, it will be exploited.

So we decided to take a step back and ask a very simple question: what does “good” actually look like?

That question led to a two-part research series from NSS Labs focused on how enterprises should think about—and evaluate—AI security.

The first paper, “AI Security Beyond the Model,” makes a point that sounds obvious once you say it out loud: the model is only part of the problem. The real risk lives in everything around it—the data it can touch, the instructions it can be manipulated with, the tools it can call, and the permissions it inherits. If those aren’t controlled properly, even a well-aligned model can do the wrong thing, quickly and at scale.

The second paper, “Evaluating Enterprise AI Security,” takes that idea and turns it into something buyers can actually use. It lays out the questions that should be asked in every evaluation, the red flags that should raise eyebrows, and the criteria that help separate meaningful controls from wishful thinking.

A big part of that conversation centers on runtime guardrails.

Not the model. Not the training process. The controls that sit around the model and determine what actually happens in production.

These are the mechanisms that enforce policy, limit access, constrain agent behavior, and—crucially—provide us with a solid trail of evidence. Because sooner or later, something will go wrong. When it does, “the model decided to do it” is not going to satisfy anyone in legal, compliance, or the boardroom.

If that sounds a bit like traditional security thinking, that’s because it is. We’re just applying it to a new class of systems that behave in less predictable ways.

There’s also a broader point here. AI security is evolving quickly, but the way we evaluate it hasn’t caught up. Without clearer expectations, buyers are left comparing apples to… well, marketing slides. And vendors with genuinely strong capabilities don’t have a consistent way to prove it.

That’s not a healthy place for the industry to be. So to address the problem, NSS Labs has been working hard behind the scenes with the major players in the AI Protections System (AIPS) space to define a new and comprehensive test methodology that will allow us to apply the NSS Labs usual stringent testing and evaluation approach to this emerging market.

Our goal with this work isn’t to declare winners or define a single “right” approach. It’s to raise the bar on how these systems are assessed. To move the conversation from “this looks good in a demo” to “this holds up under scrutiny.”

Because AI isn’t slowing down. If anything, it’s accelerating. And the gap between deployment and accountability isn’t going to close on its own.

If we want AI to be trusted at enterprise scale, we need to get serious about how we evaluate the controls that make it safe to use.

That starts with asking better questions—and expecting better answers. And the way to do that is through truly independent third-party testing. A brand-new comprehensive test and validation methodology, published today, evaluates AIPS products across the core areas that matter most in real enterprise deployments, including protection against prompt injection, prevention of harmful or unauthorized output, evasion techniques, resilience under stress and adverse conditions, policy and filter efficacy, security of agentic behavior and tool invocation, observability and auditability, and performance impact.

Each test dimension is designed to represent realistic risks that enterprise customers may encounter when deploying AI systems connected to users, enterprise data, tools, APIs, and business processes. The goal is to provide enterprise buyers, security leaders, and product vendors with a clear, repeatable, and technically rigorous basis for measuring how effectively an AIPS performs under conditions that reflect real-world use and abuse scenarios

If you’re working through these challenges now, the two white papers are designed to give you a practical starting point: what matters, what to test, and what good should look like when you find it. The goal for the AIPS test methodology is to provide enterprise buyers, security leaders, and product vendors with a with a clear, repeatable, and technically rigorous basis for measuring how effectively an AIPS performs under conditions that reflect real-world use and abuse scenarios.

Once testing is completed later this year, the final reports will provide incredible insight into how security vendors are addressing these problems.

The AI Automation Arms Race: Why Defense Is Not Symmetrical

The security industry likes to tell itself a comforting story: as attackers adopt artificial intelligence, defenders will respond in kind, and the balance of power will remain roughly equal. AI on both sides, the thinking goes, should cancel out.

This assumption is wrong — and potentially dangerous.

As with most other areas of security, the use of AI in practice is deeply asymmetrical. Attackers benefit disproportionately from automation, while defenders struggle to translate AI adoption into meaningful risk reduction. The result is not an arms race between equals, but a widening gap between the speed at which attacks evolve and the pace at which enterprises can govern, understand, and respond to them.

Automation Favors the Offense

Attackers have always benefited from scale and, unfortunately, AI simply amplifies that advantage.

Modern attack campaigns will utilize automation not to invent entirely new exploits, but to industrialize existing ones. Known vulnerabilities, misconfigurations, and weak identity controls are now stitched together by AI-assisted tooling that adapts quickly, probes relentlessly, and exploits opportunity at machine speed. These campaigns are quieter, more persistent, and harder to distinguish from background noise.

Critically, attackers do not need perfect precision. They benefit from volume, iteration, and probabilistic success. A small improvement in targeting or evasion, multiplied across thousands of attempts, yields meaningful results and AI excels at this kind of iterative optimization.

Why Defensive AI Struggles to Keep Up

On the defensive side, AI is often deployed as an enhancement to existing tools, offering faster detection, better prioritization, and smarter correlation within an enterprise SIEM, for example. These are valuable improvements, but they do not change the fundamental constraints under which defenders must operate.

Security teams are accountable for outcomes, needing to explain decisions, justify actions, and demonstrate control. False positives disrupt business. False negatives create risk. Every automated response must be defensible after the fact to management, auditors, regulators, or customers.

This asymmetry matters. An attacker can afford to be wrong repeatedly, trusting that eventually a shot will be on target. Defenders, just like the poor old goalkeeper in soccer, cannot afford to be wrong once.

AI-powered detection may surface more signals, but without clear governance, visibility, and control, those signals quickly become noise. Automation without accountability simply accelerates confusion.

The Persistence of “Old” Attack Surfaces

One of the more uncomfortable realities emerging from recent incidents is that many AI-enabled attacks still rely on very traditional weaknesses, such as exposed services, misconfigured cloud environments, weak access controls, unpatched software, and multiple evasion techniques. AI does not replace these attack vectors but instead makes them easier to discover and exploit at scale.

To any seasoned security professional this sounds very familiar. The old saying goes: “there is nothing new under the sun” and the danger is not that AI introduces entirely new classes of risk overnight (although it inevitably will – see our white papers AI Security Beyond the Model: What Enterprises Need to Care About — and Whyand Evaluating Enterprise AI Security: Questions Every Buyer Should Be Able to Answer”). The real danger in the short term is that it quietly magnifies existing weaknesses until they become systemic failures.

Why Symmetry Is the Wrong Mental Model

The idea of a balanced AI arms race assumes that both sides gain comparable benefits from automation. In reality, the incentives and constraints are fundamentally different.

Attackers optimize for opportunity and speed, while defenders optimize for stability, correctness, and trust. AI aligns naturally with the former, aligning with the latter only when paired with strong governance, observability, and control.

This is why simply “adding AI” to security tools does not meaningfully close the arms race gap. Without clear policies, auditability, and predictable behavior under stress, automation can undermine confidence rather than strengthen it.

Reframing the Defensive Objective

The goal of AI security should not be to match threat actors’ algorithm for algorithm, since that race is unwinnable and misses the point.

AI-enabled systems must be designed and evaluated not just for detection capability, but for how they behave when assumptions break, when inputs are ambiguous, dependencies fail, or automation makes the wrong decision.

This requires shifting emphasis away from novelty and toward discipline and focusing on visibility into how decisions are made, evidence that controls behave predictably under stress, and governance that aligns automation with enterprise risk tolerance.

What CISOs Should Do Now

  • Treat AI-enabled security controls as governed systems, not intelligent features. Demand clarity on how automation is authorized, constrained, and audited.
  • Insist on observability and accountability. If your team cannot reconstruct why an automated decision was made, you cannot defend it to regulators, boards, or customers.
  • Pressure-test failure modes. Ask how controls behave when dependencies degrade, data is ambiguous, or models behave unexpectedly.
  • Resist the temptation to equate speed with strength. Automation that outpaces governance increases risk rather than reducing it.

Enterprises that succeed in this environment will not be those that deploy the most AI, but those that govern it best.

The uncomfortable truth is that AI makes security failures more likely when organizations do not understand their own systems. The answer is not less automation, but better control over how automation is introduced, evaluated, and held accountable.

The arms race metaphor obscures this reality. Defense is not about symmetry, but responsibility and accountability. And in an AI-enabled world, responsibility is the hardest capability to automate.

When AI Finds Every Bug

The discovery clock just accelerated

On April 7, Anthropic announced that its newest model, Claude Mythos Preview, had autonomously discovered thousands of high and critical severity zero-day vulnerabilities across every major operating system and web browser—many hiding in plain sight for over a decade. A 27-year-old bug in OpenBSD. A 16-year-old flaw in FFmpeg that automated fuzzers had hit five million times without catching. And Mythos doesn’t just find vulnerabilities—it writes the exploits, succeeding on over 83% of first attempts where previous models achieved close to zero.

That is good news in one sense: software vendors and maintainers may be able to identify and patch flaws earlier, and reduce the time that dangerous defects remain unknown.

But that same acceleration also accelerates threat landscape changes for everyone else. The same class of AI capability that can help defenders find weaknesses can also help bad actors understand them faster, build exploits around them faster, and operationalize attacks at far greater speed and scale.

This is a watershed moment. But while the headlines focus on offensive implications, the downstream consequences for enterprise security operations are just as profound.

The deployment clock has not

The optimistic reading is simple: vulnerabilities found earlier get patched earlier. Project Glasswing—Anthropic’s well thought-out consortium with AWS, Apple, Cisco, Google, Microsoft, and others—is already putting Mythos to work scanning critical codebases. But enterprise security leaders know the uncomfortable truth: a patch being available and a patch being deployed are two very different things.

Even when software owners produce fixes more quickly, enterprise deployment lifecycles still have to contend with regression testing, change windows, operational dependencies, rollback planning, uptime requirements, and the broader risks that come with touching mission-critical systems.

For most enterprises, especially those operating essential services or critical infrastructure, upgrade patterns will still need to follow established risk-mitigation best practices. The cost of an unstable production change can be just as severe as the vulnerability itself.

Layered defenses under increasing pressure

That leaves a familiar but increasingly compressed exposure gap: the period between when a vulnerability is known and when an organization can safely deploy the patched version into production.

As has been true for years, enterprises will need to rely on layered defenses to help close that gap. Firewalls, IDS/IPS, segmentation controls, and related security systems will remain essential in providing mitigation protection while upgrades are planned, tested, and rolled out safely.

What changes now is the pace and volume. If AI sharply increases the rate at which vulnerabilities are discovered, then the burden on protective controls will grow with it. Those systems will need to implement and deploy signatures, policies, and other mitigation mechanisms more quickly and more often.

Deployment has to be not just faster, but smarter

In this environment, simply deploying a mitigation faster is no longer enough. A signature added to an IPS, a rule pushed to a firewall, or a policy configured on paper does not by itself prove that the control is effective against the exploit path it is supposed to stop. An overly broad signature triggering false positives will interrupt legitimate business.

Those mitigation measures can be made smarter through validation that they are effective in protecting against the consequential vulnerabilities and exploits being targeted. Otherwise, organizations are not demonstrating risk reduction; they are assuming it.

This distinction will matter more as AI also helps adversaries accelerate exploit development and surrounding tooling. The attack side of the equation will move faster, which means control effectiveness must be measured more often and with greater rigor. Absent that, we will have a case where enterprises try to move fast but end up breaking things.

Why oversight pressure will rise

A hyper-accelerated threat environment will not only affect security teams. Lawmakers, regulators, insurers, auditors, boards, and other oversight bodies focused on enterprise risk are watching the same headlines. When AI can autonomously compromise critical infrastructure software, tolerance for vague assurances evaporates.

Their questions will increasingly shift from broad policy statements to operational evidence. Questions will evolve from “Do you have security controls?” to “What have you done to prevent this, and what is the evidence it’s working?”.

As a result, requirements for proof of continuous validation testing and evaluation of deployed security controls are likely to become more common across corporate governance, risk management, compliance, insurance underwriting, and sector-specific oversight agencies.

Continuous validation becomes a core operating discipline

This is why continuous validation testing of security controls will become more critical in enterprise IT operations.

Continuous validation gives organizations a practical way to bridge the gap between faster vulnerability discovery and the slower, smarter, necessary discipline of safe production change. It helps prove that compensating controls are providing real mitigation while the enterprise works through responsible remediation cycles.

In the age of AI-accelerated vulnerability discovery and exploit creation, continuous validation will become central not only to security operations, but also to meeting corporate GRC obligations, supporting regulatory readiness, and demonstrating defensible cyber resilience.

Organizations that build this into their operational cadence, GRC programs, and vendor accountability frameworks will be able to answer the hard questions when regulators and boards come asking. Those that don’t will find themselves defending assumptions in a world that no longer accepts them.

Continuous validation isn’t the future of enterprise security operations. It’s the present. Mythos just made it impossible to ignore, and Project Glasswing presents a great opportunity to respond.

What enterprise leaders should do now

Priority Why it matters
Preserve disciplined change management Do not trade mission-critical availability for superficial patch velocity. Upgrades still need controlled testing and rollout.
Strengthen layered defenses Firewall and IDS/IPS mitigations will be asked to carry more of the burden during the exposure window.
Validate mitigations continuously Controls need evidence-based testing to prove they actually block the vulnerabilities and exploits they target.
Prepare for evidence demands Regulators, insurers, boards, and auditors will increasingly expect proof, not just claims, that controls are working.

When Firewalls Fail Gracefully

The latest NSS Labs Enterprise Firewall Comparative Report was published this month and, as usual, provided a deep insight into the state of the enterprise firewall market.

Seven of the most widely deployed products were tested using real-world attack scenarios, enterprise-grade workloads, and adversarial evasion techniques to measure their resilience, reliability, and performance.

The results reveal a security landscape that remains uneven: most products blocked the majority of exploits and malware, but a few stumbled when exposed to modern, and not so modern, evasion techniques.

However, the story doesn’t end with the Comparative Security Map – it is also a case study in vendor accountability. How vendors respond when weaknesses are exposed in independent tests such as this tells us a lot about how they are likely to support their enterprise customers in a pinch. It also tells us how seriously they take engineering challenges that could result in serious failures, or even breaches, when installed in live environments.

Palo Alto Networks and Fortinet, though not the highest-scoring participants, stand out precisely because they treated the findings as an opportunity to rectify shortcomings in their products that could have a serious impact on their customers. Within days of publication, both vendors confirmed patches for the issues identified and scheduled retests for the affected products. That kind of responsiveness deserves as much attention as raw test scores.

The Test That Matters

NSS Labs enterprise-firewall evaluations are the most comprehensive in the industry. The 2025 round measured not only exploit and malware detection, but also resilience against 53 evasion categories, false-positive accuracy, TLS/SSL handling, and sustained throughput under realistic enterprise workloads.

In other words, this isn’t a marketing test with cherry-picked “perfect” network traffic and well-known basic exploits and malware. Each firewall was deployed in-line between trusted and untrusted networks, then stress-tested with:

  • A broad range of “real world” network traffic designed to emulate typical enterprise traffic, both encrypted and plain text.
  • 3,326 exploit samples from vulnerabilities found in the wild in enterprise environments.
  • 11,311 malware samples drawn from active campaigns.
  • 5,752 evasion variations spanning 53 evasion categories, crafted to bypass defenses.
  • 55 performance stress tests spanning HTTP, HTTPS, and UDP traffic, created to measure throughput, stability, and reliability under stress.

This combination produces an in-depth view of security efficacy, together with an evaluation of performance using mixtures of real-world traffic. In today’s enterprise networks, where more than 95 percent of web sessions are HTTPS, it is important for firewalls to be able to handle encrypted traffic.

How the Vendors Fared

Three of the seven firewalls achieved Recommended ratings: Check Point, Juniper Networks, and Versa Networks. All delivered security effectiveness above 99 percent with false-positive accuracy in the high 90s.

Three vendors received Caution ratings: Cisco, Fortinet, and Palo Alto Networks. Their placement wasn’t due to catastrophic malware or exploit detection failures, since each still handled most malicious payloads effectively, but because of critical failures in their ability to resist low-level evasion techniques.

This continues to be an issue today, just as it was at the inception of NSS Labs 1.0 in 2007. You might think that we should be seeing 100% resistance by now, but instead coverage appears to be cyclical. It seems that vendors will work hard to build robust code that handles evasions well, but later engineering teams deprioritize that area of development, or complex new features simply break it.

Two key points are evident:

  1. Evasion handling is a powerful differentiator today, just as it has always been.
  2. Throughput disparities can be significant, especially when encrypted traffic is thrown into the mix.

What Went Wrong—and Right

While malware and exploit detection rates across the board were excellent (most above 99 percent), the evasion results exposed real-world risk. A single missed evasion can allow bad actors to reuse entire classes of exploits allowing malicious traffic to go undetected.

Cisco failed one critical TCP-segmentation evasion, reducing its exploit-evasion resistance to 40 percent; Fortinet missed one transport-layer variant, scoring 60 percent; and Palo Alto Networks failed both network and transport-layer categories, resulting in 0 percent exploit-evasion resistance.

Why Responsiveness Matters

However, it is not all about pure test results, but rather how a vendor responds to those results that really matters. That defines the kind of relationship they are likely to have with their customers, and how seriously they take their engineering mission. In cybersecurity, perfection is fleeting. Every product eventually encounters a configuration bug or parser flaw. What separates mature vendors from pretenders is how quickly and transparently they respond.

Palo Alto Networks and Fortinet publicly acknowledged the test outcomes, issued software updates within a couple of weeks, and scheduled retesting. That is what enterprise customers should be looking for from their security partners: transparency and the willingness to participate in independent tests in the first place, followed by the desire to act on the results of those tests to improve their product expediently where necessary.

NSS Labs urges enterprises to hold vendors accountable and demand transparency. Vendors who view testing as collaboration rather than confrontation, will build lasting trust as well as solid products.

Performance Under Pressure

Security effectiveness means little if performance tanks under real workloads. NSS Labs Rated Throughput metric weights encrypted traffic at 95 percent, mirroring modern conditions. Versa achieved the highest sustained throughput (7.6 Gbps) with strong security; Juniper balanced speed and protection; Fortinet offered excellent value; Palo Alto trailed but excelled in accuracy.

False Positives: The Hidden Cost

NSS Labs replaced its previous price-per-protected-megabit metric with false-positive accuracy as a more meaningful measure of operational overhead. Cisco’s 80 percent accuracy implies legitimate traffic was incorrectly blocked one-fifth of the time, which may cause issues in live deployments. Conversely, Palo Alto, Versa, and Fortinet all exceeded 99 percent in terms of resistance to false positive scenarios.

The New Baseline: Encryption Everywhere

With more than 95 percent of global web traffic encrypted, enterprise firewalls need to be able to handle it without suffering significant performance degradation. All firewalls handled decryption properly, but some paid steep penalties in terms of performance. Versa and Juniper maintained 80–90 percent efficiency, while Palo Alto and Cisco lagged near 70 percent.

Beyond the Scoreboard

At first glance, a Caution rating in the CSM might appear damning, but within weeks those numbers will likely change as fixes are validated and re-tested. Resilience isn’t static; what defines market leadership is the ability to recover quickly, transparently, and collaboratively.

Independent testing remains the crucible through which trust is forged. The vendors who embrace scrutiny, fix what’s broken, and invite another round of validation are the ones enterprises should bet their networks on.

Because in the end, cybersecurity isn’t about being flawless. It’s about being fast, honest, and relentless in pursuit of better protection.

Beyond Assumptions: Why Validation is the Next Frontier in Cybersecurity Defense

Last week, CISA published an incident response report detailing how a federal civilian executive branch (FCEB) agency was breached through exploitation of a known and documented vulnerability in GeoServer (https://www.cisa.gov/news-events/alerts/2025/09/23/cisa-releases-advisory-lessons-learned-incident-response-engagement). This was not a “sophisticated zero-day,” but a widely reported weakness defenders have been aware of for some time (https://nvd.nist.gov/vuln/detail/cve-2024-36401).

This breach underscores a sobering reality: attackers don’t need innovation when defenders rely on assumptions.

Known Exploits, Unknown Effectiveness

Each time an advisory like this is released, many CISOs and CTOs are left asking the same question: “Would this have worked against us?”

The uncomfortable truth is that in many environments, the answer is uncertain. Security leaders often deploy products with the expectation of protection, but without direct validation, those expectations may not hold under real-world attack conditions.

This is why we at NSS Labs regularly evaluate security products against actual exploit samples going as far back as 10 years—including the very vulnerability used in this breach. That type of validation gives defenders evidence, not just hope, that their technologies will withstand known threats.

The Flaw in “Defense-in-Depth by Assumption”

Defense-in-depth is a well-established strategy. Multiple layers of technology—firewalls, intrusion prevention, endpoint agents, and monitoring—create redundancy and resilience. But the mere presence of these controls is not enough.

  • Deployment ≠ Effectiveness. A product installed in the stack doesn’t guarantee it will perform as intended.
  • Context Matters. Effectiveness depends on how controls are configured, tuned, and integrated into the environment.
  • Silent Gaps Exist. Without validation, security teams may not realize that certain attack vectors bypass defenses entirely.

The CISA advisory makes clear: organizations cannot rely on “best practice” architectures alone. They must prove their defenses actually work.

Validation as the Next Frontier

Cybersecurity has long emphasized the importance of prevention and detection. The next frontier is validation: treating effectiveness as a measurable, verifiable outcome.

Validation is not theoretical—it’s practical. The critical difference between assumption and assurance is data. Testing security products against real exploits, simulating adversarial behavior, and quantifiably measuring whether defenses hold provides assurance that investment translates into protection. Independent testing bodies such as NSS Labs help provide this evidence, bridging the gap between vendor claims and operational reality.

A Practical Checklist for CISOs

Security leaders looking to strengthen their posture against known vulnerabilities can use the following framework:

  1. Inventory What Matters
    1. Catalog critical applications, platforms, and workloads.
    2. Prioritize those most tied to mission outcomes.
  2. Map Defenses to Assets
    1. Identify which controls protect which workloads.
    2. Look for overlaps, blind spots, and single points of failure.
  3. Validate Against Exploit Samples
    1. Test defenses against real-world exploits and malware, not just lab simulations.
    2. Leverage independent testing where available.
  4. Simulate Adversarial Behavior
    1. Ensure that each component in the defensive chain is independently tested and validated.
    2. Focus on tactics, techniques, and procedures (TTPs) that correspond to those utilized by threat actors in the current threat landscape.
  5. Make Validation Continuous
    1. Move from one-time testing to an ongoing validation cycle.
    2. Adjust configurations, patching, and investments based on results.

Final Thoughts

The lesson from this breach is clear: known vulnerabilities remain one of the most significant risks to enterprise security. Defense-in-depth alone is insufficient if it is built on untested assumptions.

The industry must embrace validation as a core pillar of cybersecurity strategy. By demanding measurable proof of effectiveness—through independent testing, adversarial approaches, and continuous validation—CISOs and CTOs can move from assumption to assurance.

The goal is simple yet profound: to ensure that when the next advisory is issued, leaders can answer with confidence:

“Yes—our defenses will hold.”

Reintroducing NSS Labs: A New Chapter in Cybersecurity Assurance

AUSTIN, TX – July 9, 2025 — NSS Labs, long known as the gold standard for independent security product testing, is back—with a new ownership structure, revitalized leadership, and a mission laser-focused on the evolving demands of today’s cybersecurity and advanced technology ecosystem. Led by Founder and CEO Vikram Phatak, NSS Labs 2.0 is now wholly owned and operated by its senior partners and executive team.

The reimagined NSS Labs will deliver confidential, data-driven testing services tailored to three core audiences:

  • Enterprises will benefit from objective testing and continuous validation of security technologies—on-premises, in the cloud, or delivered via third-party services. These assessments support strategic initiatives including risk governance, supply chain validation, vendor accountability, and regulatory compliance.
  • Security Vendors will gain rigorous third-party validation of product efficacy using real-world attack scenarios. Testing data will help refine product strategy, accelerate go-to-market timelines, and support credible claims in a competitive market.
  • Service Providers will receive evaluations built for multi-vendor environments, helping them benchmark offerings, support procurement decisions, and communicate service value with independently validated data. Testing also supports operational readiness and future roadmap development.

In addition to its commercial services, NSS Labs is also the Official Testing Partner of CyberRatings.org, the non-profit that publishes public test results and research on cybersecurity technologies. NSS Labs contributes by developing test methodologies, authoring both individual and comparative reports, and producing educational and thought leadership content for the broader cybersecurity community.

Originally founded in 2007, NSS Labs quickly became a globally trusted authority, shaping product development, procurement, and strategy decisions across the cybersecurity ecosystem. After private equity ceased operations in 2020, Phatak acquired select assets and intellectual property from the custodian of NSS Labs 1.0 and recognizing renewed demand from stakeholders, reassembled veteran talent and new industry leaders to launch NSS Labs 2.0.

“We’re not just relaunching NSS Labs—we’re rebuilding it for the future,” said Phatak. “We’ve preserved the integrity and rigor that put NSS Labs on the map, and supercharged it with interactive tools, modern methodologies, and a team with decades of hands-on experience evaluating cybersecurity products across a wide range of categories—including endpoint, cloud, AI, and post-quantum cryptography.”

What’s New at NSS Labs 2.0

  • Interactive, Data-Driven Tools: Stakeholders can engage with test data through intuitive interfaces, enabling real-time comparison and deeper insights into product performance.
  • Expanded Testing Portfolio: Beyond traditional technologies like enterprise firewalls and SD-WAN, NSS Labs now evaluates advanced solutions such as AI/ML-powered tools, ransomware defenses, and post-quantum cryptographic systems.
  • Tailored Services by Audience: Purpose-built programs for enterprises, vendors, and service providers combine transparency, speed, and rigorous technical validation.

A Legacy That Inspires Confidence

NSS Labs 2.0 draws on decades of hands-on experience testing hundreds of cybersecurity products across diverse categories. Each evaluation begins with the development of robust, detailed methodologies, followed by rigorous real-world testing to ensure products meet the demands of today’s fast-evolving threat landscape.

The lab’s live, real-time infrastructure tests against a broad array of exploits, evasions, malware, and malicious URLs—mirroring the complexity of real-world attacks. Unique to NSS Labs are thousands of hand-crafted evasions that mimic sophisticated threat actor techniques, designed specifically to bypass detection systems. Scalable cloud infrastructure simulates large enterprise environments, enabling high-fidelity assessments under realistic conditions. 

Led by Visionaries, Powered by Experts

The new NSS Labs is guided by a leadership team with deep expertise in cybersecurity testing, strategy, and innovation:

  • Vikram Phatak, Chief Executive Officer & Founder
  • Ian Foo, Chief Technology Officer & EVP of Product
  • Carma Austin, Chief Strategy Officer
  • Cathy Main, Chief Marketing & Communications Officer
  • Tim Otto, Vice President of Test Operations
  • Thomas Skybakmoen, Vice President of Research

To learn more about NSS Labs and its services, visit www.nsslabs.com.

About NSS Labs

NSS Labs delivers research-backed insights through its advanced testing platforms, empowering enterprises, security vendors, and service providers to make informed, evidence-based cybersecurity decisions. By handling the heavy lifting of testing for effectiveness, performance, and suitability, NSS Labs helps clients move beyond assumptions to gain actionable clarity. Its auditing and governance services offer continuous assurance that deployed security technologies are performing as expected—protecting investments and supporting accountability.