When AI Finds Every Bug

The discovery clock just accelerated

On April 7, Anthropic announced that its newest model, Claude Mythos Preview, had autonomously discovered thousands of high- and critical-severity zero-day vulnerabilities across every major operating system and web browser—many hiding in plain sight for over a decade. A 27-year-old bug in OpenBSD. A 16-year-old flaw in FFmpeg that automated fuzzers had hit five million times without catching. And Mythos doesn’t just find vulnerabilities—it writes the exploits, succeeding on over 83% of first attempts where previous models achieved close to zero.

That is good news in one sense: software vendors and maintainers may be able to identify and patch flaws earlier, and reduce the time that dangerous defects remain unknown.

But that acceleration cuts both ways: it speeds up change in the threat landscape for everyone else. The same class of AI capability that helps defenders find weaknesses can also help bad actors understand those weaknesses faster, build exploits around them faster, and operationalize attacks at far greater speed and scale.

This is a watershed moment. But while the headlines focus on offensive implications, the downstream consequences for enterprise security operations are just as profound.

The deployment clock has not

The optimistic reading is simple: vulnerabilities found earlier get patched earlier. Project Glasswing—Anthropic’s well thought-out consortium with AWS, Apple, Cisco, Google, Microsoft, and others—is already putting Mythos to work scanning critical codebases. But enterprise security leaders know the uncomfortable truth: a patch being available and a patch being deployed are two very different things.

Even when software owners produce fixes more quickly, enterprise deployment lifecycles still have to contend with regression testing, change windows, operational dependencies, rollback planning, uptime requirements, and the broader risks that come with touching mission-critical systems.

For most enterprises, especially those operating essential services or critical infrastructure, upgrade patterns will still need to follow established risk-mitigation best practices. The cost of an unstable production change can be just as severe as the vulnerability itself.

Layered defenses under increasing pressure

That leaves a familiar but increasingly compressed exposure gap: the period between when a vulnerability is known and when an organization can safely deploy the patched version into production.

As has been true for years, enterprises will need to rely on layered defenses to help close that gap. Firewalls, IDS/IPS, segmentation controls, and related security systems will remain essential in providing mitigation protection while upgrades are planned, tested, and rolled out safely.

What changes now is the pace and volume. If AI sharply increases the rate at which vulnerabilities are discovered, then the burden on protective controls will grow with it. Those systems will need to implement and deploy signatures, policies, and other mitigation mechanisms more quickly and more often.

Deployment has to be not just faster, but smarter

In this environment, simply deploying a mitigation faster is no longer enough. A signature added to an IPS, a rule pushed to a firewall, or a policy configured on paper does not by itself prove that the control is effective against the exploit path it is supposed to stop. An overly broad signature triggering false positives will interrupt legitimate business.

Those mitigation measures become smarter when they are validated as effective against the specific vulnerabilities and exploits being targeted. Otherwise, organizations are not demonstrating risk reduction; they are assuming it.
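To make the distinction between assumed and demonstrated risk reduction concrete, here is a minimal sketch of how a validation run might be scored. All names, data structures, and test cases are hypothetical and illustrative only; they do not reproduce any vendor's or testing lab's actual methodology.

```python
# Hypothetical sketch: scoring a control-validation run.
# Each test case records whether the sample was malicious and whether
# the control blocked it; the two metrics below are the evidence an
# organization would show instead of assuming the control works.
from dataclasses import dataclass


@dataclass
class ValidationResult:
    test_id: str
    malicious: bool   # was the sample an exploit/evasion attempt?
    blocked: bool     # did the control stop it?


def score(results: list[ValidationResult]) -> dict[str, float]:
    """Summarize evidence of control effectiveness."""
    malicious = [r for r in results if r.malicious]
    benign = [r for r in results if not r.malicious]
    blocked_bad = sum(r.blocked for r in malicious)
    passed_good = sum(not r.blocked for r in benign)
    return {
        # fraction of malicious samples actually stopped
        "block_rate": blocked_bad / len(malicious) if malicious else 0.0,
        # fraction of legitimate traffic allowed through
        "fp_accuracy": passed_good / len(benign) if benign else 0.0,
    }


# Illustrative run: one missed exploit and one false positive.
results = [
    ValidationResult("cve-sim-1", malicious=True, blocked=True),
    ValidationResult("cve-sim-2", malicious=True, blocked=False),
    ValidationResult("benign-1", malicious=False, blocked=False),
    ValidationResult("benign-2", malicious=False, blocked=True),
]
summary = score(results)
print(summary)  # {'block_rate': 0.5, 'fp_accuracy': 0.5}
```

The point of a report like this is that a signature "deployed" but never exercised against real samples would show up here as missing evidence, not as protection.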

This distinction will matter more as AI also helps adversaries accelerate exploit development and surrounding tooling. The attack side of the equation will move faster, which means control effectiveness must be measured more often and with greater rigor. Absent that, we will have a case where enterprises try to move fast but end up breaking things.

Why oversight pressure will rise

A hyper-accelerated threat environment will not only affect security teams. Lawmakers, regulators, insurers, auditors, boards, and other oversight bodies focused on enterprise risk are watching the same headlines. When AI can autonomously compromise critical infrastructure software, tolerance for vague assurances evaporates.

Their questions will increasingly shift from broad policy statements to operational evidence: from “Do you have security controls?” to “What have you done to prevent this, and what is the evidence it’s working?”

As a result, requirements for proof of continuous validation testing and evaluation of deployed security controls are likely to become more common across corporate governance, risk management, compliance, insurance underwriting, and sector-specific oversight agencies.

Continuous validation becomes a core operating discipline

This is why continuous validation testing of security controls will become more critical in enterprise IT operations.

Continuous validation gives organizations a practical way to bridge the gap between faster vulnerability discovery and the slower, smarter, necessary discipline of safe production change. It helps prove that compensating controls are providing real mitigation while the enterprise works through responsible remediation cycles.

In the age of AI-accelerated vulnerability discovery and exploit creation, continuous validation will become central not only to security operations, but also to meeting corporate GRC obligations, supporting regulatory readiness, and demonstrating defensible cyber resilience.

Organizations that build this into their operational cadence, GRC programs, and vendor accountability frameworks will be able to answer the hard questions when regulators and boards come asking. Those that don’t will find themselves defending assumptions in a world that no longer accepts them.

Continuous validation isn’t the future of enterprise security operations. It’s the present. Mythos just made it impossible to ignore, and Project Glasswing presents a great opportunity to respond.

What enterprise leaders should do now

  • Preserve disciplined change management: do not trade mission-critical availability for superficial patch velocity. Upgrades still need controlled testing and rollout.
  • Strengthen layered defenses: firewall and IDS/IPS mitigations will be asked to carry more of the burden during the exposure window.
  • Validate mitigations continuously: controls need evidence-based testing to prove they actually block the vulnerabilities and exploits they target.
  • Prepare for evidence demands: regulators, insurers, boards, and auditors will increasingly expect proof, not just claims, that controls are working.

NSS Labs Appoints Industry Veteran Dominick Delfino as Executive Advisor

Austin, TX – March 24, 2026 – NSS Labs, the leading authority in independent cybersecurity product validation, today announced the appointment of Dominick Delfino as Executive Advisor. A seasoned technology leader with more than 25 years of experience at Google Cloud, Nutanix, Pure Storage, and Cisco, Delfino will provide strategic guidance to the NSS Labs leadership team as the company expands its testing capabilities for the next generation of AI-driven cybersecurity.

Delfino joins NSS Labs at a pivotal moment for enterprises, where the rise of sophisticated, automated threats has made independent, real-world validation of security efficacy more critical than ever.

Most recently, Delfino served as Global Vice President of Cybersecurity Sales at Google Cloud, where he led the global go-to-market strategy for the company’s security portfolio, including the integration of Mandiant. His distinguished career also includes serving as Chief Revenue Officer at Nutanix and Pure Storage, as well as holding senior leadership roles at VMware and Cisco.

“Dominick is a distinguished leader in the technology and security space,” said Vikram Phatak, CEO of NSS Labs. “His experience scaling global organizations and his deep understanding of the cloud and security landscape from his time at Google Cloud and VMware will be invaluable. Dominick understands exactly what enterprise customers need, and his guidance will be instrumental as we grow our enterprise programs.”

“Throughout my career, I’ve witnessed how difficult it is for organizations to separate marketing claims from actual security performance,” said Delfino. “NSS Labs has always stood for transparency and data-driven truth in a crowded marketplace. I am thrilled to be helping the team scale and ensure that enterprises have the right tools to deliver independent, real-world validation of their security controls.”

As Executive Advisor, Delfino will focus on accelerating NSS Labs global sales, enhancing strategic partnerships, and aligning the company’s roadmap with the rapidly shifting requirements of AI.

SDxCentral: Palo Alto Networks and Fortinet Given All Clear After Firewall Hiccups

Palo Alto Networks and Fortinet have received a clean bill of health for their firewall protections, while the jury is still out on current Cisco defenses.

CyberRatings.org recommended both Palo Alto and Fortinet after new tests confirmed they had patched evasions previously discovered by the security testing firm.

In tests carried out at the start of the month by CyberRatings’ testing partner NSS Labs, researchers found they were able to bypass protection using Layer 4 TCP evasions in both Palo Alto’s PAN-OS (version 11.2.8-c537) and Fortinet’s IPS (v7.01154), as well as to bypass PAN-OS using Layer 3 IP evasions.

Both firms reacted quickly, with Palo Alto developing an updated PAN-OS firmware package (PAN-OS 11.2.10-c37) and Fortinet deploying an updated IPS package (v7.01165 (33.00064)) to fix the vulnerabilities.

Read the full article here.

When Firewalls Fail Gracefully

The latest NSS Labs Enterprise Firewall Comparative Report was published this month and, as usual, provided a deep insight into the state of the enterprise firewall market.

Seven of the most widely deployed products were tested using real-world attack scenarios, enterprise-grade workloads, and adversarial evasion techniques to measure their resilience, reliability, and performance.

The results reveal a security landscape that remains uneven: most products blocked the majority of exploits and malware, but a few stumbled when exposed to modern, and not so modern, evasion techniques.

However, the story doesn’t end with the Comparative Security Map – it is also a case study in vendor accountability. How vendors respond when weaknesses are exposed in independent tests such as this tells us a lot about how they are likely to support their enterprise customers in a pinch. It also tells us how seriously they take engineering challenges that could result in serious failures, or even breaches, when installed in live environments.

Palo Alto Networks and Fortinet, though not the highest-scoring participants, stand out precisely because they treated the findings as an opportunity to rectify shortcomings in their products that could have a serious impact on their customers. Within days of publication, both vendors confirmed patches for the issues identified and scheduled retests for the affected products. That kind of responsiveness deserves as much attention as raw test scores.

The Test That Matters

NSS Labs’ enterprise-firewall evaluations are the most comprehensive in the industry. The 2025 round measured not only exploit and malware detection, but also resilience against 53 evasion categories, false-positive accuracy, TLS/SSL handling, and sustained throughput under realistic enterprise workloads.

In other words, this isn’t a marketing test with cherry-picked “perfect” network traffic and well-known basic exploits and malware. Each firewall was deployed in-line between trusted and untrusted networks, then stress-tested with:

  • A broad range of “real world” network traffic designed to emulate typical enterprise traffic, both encrypted and plain text.
  • 3,326 exploit samples from vulnerabilities found in the wild in enterprise environments.
  • 11,311 malware samples drawn from active campaigns.
  • 5,752 evasion variations spanning 53 evasion categories, crafted to bypass defenses.
  • 55 performance stress tests spanning HTTP, HTTPS, and UDP traffic, created to measure throughput, stability, and reliability under stress.

This combination produces an in-depth view of security efficacy, together with an evaluation of performance using mixtures of real-world traffic. In today’s enterprise networks, where more than 95 percent of web sessions are HTTPS, it is important for firewalls to be able to handle encrypted traffic.
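One way to see how a test corpus of this shape rolls up into a single headline figure is to weight per-category block rates by sample count. The block rates below are hypothetical placeholders, and this blending is an assumption for illustration; it does not reproduce the actual NSS Labs scoring methodology.

```python
# Illustrative only: blending per-category block rates into one
# effectiveness figure, weighted by sample count. Block rates here
# are hypothetical; only the sample counts come from the article.
samples = {
    "exploits": (3326, 0.995),   # (sample count, assumed block rate)
    "malware":  (11311, 0.998),
    "evasions": (5752, 0.960),
}

total = sum(n for n, _ in samples.values())
effectiveness = sum(n * rate for n, rate in samples.values()) / total
print(f"{effectiveness:.1%}")  # 98.7%
```

The takeaway from such a weighting is that a weak evasion score drags down an otherwise excellent detection record, which is exactly the pattern the Caution ratings below reflect.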

How the Vendors Fared

Three of the seven firewalls achieved Recommended ratings: Check Point, Juniper Networks, and Versa Networks. All delivered security effectiveness above 99 percent with false-positive accuracy in the high 90s.

Three vendors received Caution ratings: Cisco, Fortinet, and Palo Alto Networks. Their placement was due not to catastrophic malware or exploit detection failures, since each still handled most malicious payloads effectively, but to critical failures in their ability to resist low-level evasion techniques.

This continues to be an issue today, just as it was at the inception of NSS Labs 1.0 in 2007. You might think that we should be seeing 100% resistance by now, but instead coverage appears to be cyclical. It seems that vendors will work hard to build robust code that handles evasions well, but later engineering teams deprioritize that area of development, or complex new features simply break it.

Two key points are evident:

  1. Evasion handling is a powerful differentiator today, just as it has always been.
  2. Throughput disparities can be significant, especially when encrypted traffic is thrown into the mix.

What Went Wrong—and Right

While malware and exploit detection rates across the board were excellent (most above 99 percent), the evasion results exposed real-world risk. A single missed evasion can allow bad actors to deliver entire classes of exploits undetected.

Cisco failed one critical TCP-segmentation evasion, reducing its exploit-evasion resistance to 40 percent; Fortinet missed one transport-layer variant, scoring 60 percent; and Palo Alto Networks failed both network and transport-layer categories, resulting in 0 percent exploit-evasion resistance.

Why Responsiveness Matters

However, it is not the raw test results alone that matter, but how a vendor responds to them. That response defines the kind of relationship they are likely to have with their customers, and how seriously they take their engineering mission. In cybersecurity, perfection is fleeting. Every product eventually encounters a configuration bug or parser flaw. What separates mature vendors from pretenders is how quickly and transparently they respond.

Palo Alto Networks and Fortinet publicly acknowledged the test outcomes, issued software updates within a couple of weeks, and scheduled retesting. That is what enterprise customers should be looking for from their security partners: transparency and the willingness to participate in independent tests in the first place, followed by the willingness to act on those results and improve their products promptly where necessary.

NSS Labs urges enterprises to hold vendors accountable and demand transparency. Vendors who view testing as collaboration rather than confrontation will build lasting trust as well as solid products.

Performance Under Pressure

Security effectiveness means little if performance tanks under real workloads. NSS Labs’ Rated Throughput metric weights encrypted traffic at 95 percent, mirroring modern conditions. Versa achieved the highest sustained throughput (7.6 Gbps) with strong security; Juniper balanced speed and protection; Fortinet offered excellent value; Palo Alto trailed but excelled in accuracy.
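The 95 percent weighting described above can be sketched as a simple blend. The exact formula NSS Labs uses is not reproduced here, and the throughput numbers are hypothetical; the sketch only shows why a firewall that is fast on plaintext but slow on TLS scores poorly once encrypted traffic dominates the weighting.

```python
# Sketch of a 95/5 encrypted-vs-plaintext throughput blend, assuming
# a straight weighted average; the published metric may differ.
def rated_throughput(https_gbps: float, http_gbps: float,
                     encrypted_weight: float = 0.95) -> float:
    return encrypted_weight * https_gbps + (1 - encrypted_weight) * http_gbps


# Hypothetical device: 10 Gbps on plaintext, only 2 Gbps under TLS.
print(round(rated_throughput(2.0, 10.0), 2))  # 2.4
```

Under this weighting, raw plaintext speed contributes almost nothing to the rated figure, which is the point of mirroring a 95-percent-encrypted world.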

False Positives: The Hidden Cost

NSS Labs replaced its previous price-per-protected-megabit metric with false-positive accuracy as a more meaningful measure of operational overhead. Cisco’s 80 percent accuracy implies legitimate traffic was incorrectly blocked one-fifth of the time, which may cause issues in live deployments. Conversely, Palo Alto, Versa, and Fortinet all exceeded 99 percent false-positive accuracy.

The New Baseline: Encryption Everywhere

With more than 95 percent of global web traffic encrypted, enterprise firewalls need to be able to handle it without suffering significant performance degradation. All firewalls handled decryption properly, but some paid steep penalties in terms of performance. Versa and Juniper maintained 80–90 percent efficiency, while Palo Alto and Cisco lagged near 70 percent.

Beyond the Scoreboard

At first glance, a Caution rating in the CSM might appear damning, but within weeks those numbers will likely change as fixes are validated and re-tested. Resilience isn’t static; what defines market leadership is the ability to recover quickly, transparently, and collaboratively.

Independent testing remains the crucible through which trust is forged. The vendors who embrace scrutiny, fix what’s broken, and invite another round of validation are the ones enterprises should bet their networks on.

Because in the end, cybersecurity isn’t about being flawless. It’s about being fast, honest, and relentless in pursuit of better protection.