Following publication of our new breach detection systems (BDS) test results, FireEye responded somewhat forcefully in a blog post by Manish Gupta.
Not everyone can end up in the top right quadrant of the NSS Labs Security Value Map™ (SVM), so it is not unusual for someone to be unhappy. It is, however, unusual for someone to behave the way FireEye did in this instance. Normally we would not respond to such attacks, but there are a number of untruths and misdirections in their blog post that we feel we must address.
For that reason I have pulled out the FireEye claims in the table below and have responded to each one in turn:
FireEye claim | NSS response |
--- | --- |
“We declined to participate in this test” | Untrue. When this test started 7 months ago, FireEye was a willing participant. |
“The FireEye product they used was not even fully functional, leveraged an old version of our software and didn’t have access to our threat intelligence” | Untrue, see above. It was a fully functional product installed and configured by FireEye engineers. |
“We did participate in the BDS test in 2013” | True. This is the same test! Started in 2013, finished in 2014. |
“We insisted that the only way to properly test was to run in a REAL environment” | We agree! NSS Labs uses a real, live environment, with real PCs going to real, live malicious URLs, opening malicious emails, etc. All exploits that are run and all malware that is dropped are live on the Internet at the time of the test. This is made clear in our published methodology. |
“NSS declined to change their testing methodology” | Partially true. FireEye asked us to change the methodology AFTER they saw the test results. Clearly we cannot do that. All vendors (and our enterprise clients) are invited to participate in the development of new test methodologies before the test begins. Even FireEye was invited, and they did participate. |
“FireEye detected 201 of 348 total samples. Of the 147 “missed” samples 11 were non-malicious, 19 were corrupted, 117 were duplicates” | Seems as though someone is cherry-picking the data, or just doesn’t understand how the tests or scoring work. 201/348 is only 57.7%. NSS testing found that FireEye detected 94.5% of attacks (329 of 348). This is because we give credit for a detected breach, period (see the scoring sketch after this table). They get credit if they: (1) detected the successful exploit; (2) detected the callback to command and control (C&C); (3) detected the malware that was then (a) downloaded and (b) installed/executed; (4) detected the exfiltration of data to the C&C; or (5) detected the compromised host’s attempt to attack other PCs on the network (usually via a P2P SMB2 attack). Duplicates are to be expected given the nature of the crimeware ecosystem. It would appear that, in this case, “corrupted” is code for “it didn’t run in our sandbox”. In other words, they didn’t have 64-bit support or the right emulation in the guest. |
“Understanding advanced threats still represents a black hole for many in today’s security industry” | We agree! |
“NSS mostly relied on VirusTotal to download payloads” | Totally untrue. Hard to see where this claim even comes from. |
“The NSS sample set doesn’t include Unknowns, Complex Malware (Encoded/Encrypted Exploit Code & Payload), and APTs.” | Untrue. And if everything was so easy, why did they miss attacks? If you cannot detect it in the lab, you cannot detect it in the field. Further, our evasion testing showed that FireEye had issues detecting attacks that were compressed using self-extracting technology (i.e., 7-Zip). |
“An aside: the oldest sample on VirusTotal is from 2006” | What does the age of samples on VirusTotal have to do with this test? Moving on... Regardless, every single exploit and malware sample in the NSS test was actively in use as part of a current malware campaign at the time of testing. If FireEye missed a sample that has been around since 2006, then maybe that is WHY the bad guys are using it? |
“The other vendors in the NSS report are built for detecting known malware” | Interesting observation. What exactly is “unknown malware” except malware that is going to be known tomorrow? How can FireEye claim to detect “unknown” malware but then miss “known” malware? If the product is adept at detecting “unknown” malware, shouldn’t it be able to detect “known” malware by the same methods? This whole line of argument is a setup. It tries to imply that NSS (or anyone) cannot possibly test the FireEye product effectively, because as soon as they find malware on the internet, it becomes “known.” Alternatively, NSS could create 0-day malware/exploits, but then we would be accused of running a “lab” test and FireEye would claim they would have caught them in the “real world.” It’s a Catch-22. The problem is that, by FireEye’s definition, the efficacy of the product can never be proven, since they claim to detect all “unknown” malware yet can never tell you about the stuff they missed. The essence of FireEye’s complaint is that we did, in fact, tell you about the things they missed. |
“Even for Payloads, NSS doesn’t perform Forensics Analysis to understand if the sample is malicious, goodware or corrupt” | Untrue. We perform both static and dynamic analysis on the entire test set. |
“NSS gives a positive score as long as a vendor sees the sample on the wire, even if the sample is not actually malicious” | Untrue – read the methodology. For one thing, we have a false positive test that would penalize products, not reward them, for alerting on non-malicious samples. |
“The NSS test confused Adware, Spyware, & APTs and accounted for Adware and Spyware as APTs” | Untrue, everything was classified accurately. |
“There were no zero day exploits in the test sample” | True. Zero-days were excluded from the first round of testing at the specific request of certain vendors, and even some of our clients, because they wanted this test to be “real world” (their words). The only way to introduce a 0-day (one that no vendor is aware of) is for NSS to create it, to which FireEye objected. Another Catch-22. The point is that this argument is a red herring designed to discredit the methodology rather than focus on weaknesses in the product. If we assume 0-days are the most difficult to detect, then any product claiming reliable 0-day detection should be getting 100% on the “known” stuff, right? Why can’t something be detected just because it is “known”? |
“This year, we have already uncovered two zero days” | Nobody is saying the FireEye product is without value. Our test simply points out that the product is not perfect and that competitors have leapfrogged FireEye in the BDS market. |
“Unlike our customers, the FireEye appliances were NOT connected to our Dynamic Threat Intelligence cloud to get latest content updates, virtual machines and detection capabilities” | Untrue. FireEye engineers set up the product. |
“Their testing methodology for BDS is also more suited to testing IPS products” | Untrue. Compare the methodologies for yourselves and decide. |
“Our product’s efficacy is proven by how well we protect customers in real-world deployments.” | This is a fallacious argument. Testing a security product thoroughly does not consist of putting it in your network and waiting to see how many things it catches. This test is not about how many things you catch; it is about how many things you miss. How do you know what you don’t know? That is where NSS comes in. |
“We respect NSS and the work they do” | Thank you! We can feel the love. |
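
For readers who want to see how the “credit for a detected breach at any stage” rule adds up, here is a minimal sketch in Python. The stage names, function names, and example records below are assumptions made purely for illustration; they are not the actual NSS test harness or its data format.

```python
# Illustrative only: a toy version of the "credit for a detected breach at
# any stage" scoring rule described in the table above. Names and data
# layout are assumptions for this sketch, not the real test harness.

STAGES = ("exploit", "callback", "malware_download", "exfiltration", "lateral_movement")

def attack_detected(alerts):
    """An attack counts as a detected breach if the product alerted on at least one stage."""
    return any(alerts.get(stage, False) for stage in STAGES)

def detection_rate(results):
    """Fraction of attacks credited as detected breaches."""
    return sum(attack_detected(alerts) for alerts in results) / len(results)

# Two hypothetical attacks: one caught only at the C&C callback stage,
# one missed at every stage.
example = [
    {"exploit": False, "callback": True},
    {"exploit": False, "callback": False},
]
print(detection_rate(example))  # 0.5
print(f"{329 / 348:.1%}")       # the 94.5% figure cited in the table
```

The rule is simply that an attack is credited as a detected breach if any stage of it is caught, which is why the 94.5% (329 of 348) figure differs from the 201-of-348 number FireEye quotes.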
In the grand scheme of things, FireEye’s results were not that bad. The real issue here is that FireEye now has credible competition in the BDS marketplace, and the data from this NSS test shows it.