Network IPS Criteria

Criteria and Methodology for Intrusion Prevention Systems

 
Download Network IPS Test Methodology v5.22

 

Detection Accuracy & Breadth

This group of tests verifies that the NIPS will not block legitimate traffic (Accuracy) and is capable of detecting and blocking a wide range of common exploits (Breadth). Although breadth is extremely important, accuracy is critical because a NIPS that blocks legitimate traffic will not remain in-line for long.

NSS has a huge library of trace files of recent exploits, including multiple variations of each exploit with different payloads, different attack vectors, and so on. In-the-wild exploits and common attack tools, such as Metasploit and Core Impact, are also used to run real-time test cases against live vulnerable servers.

NSS carefully selects test cases to fall into different categories of severity based on: whether the exploit will provide root/administrator access on a widely deployed operating system or application; whether it imposes a DOS condition with no risk of system compromise; whether it is against a system which is not widely deployed or is not Internet-facing; whether it is purely a reconnaissance technique designed to gather information for a subsequent attack attempt; and so on.

To test false negative performance, a number of common exploits are changed in subtle ways to alter the superficial appearance on the wire, whilst ensuring that the exploit still performs the intended system compromise.

To test false positive performance, NSS uses its huge library of trace files of normal network traffic - some including “suspicious” content which is not malicious - together with a number of “neutered” exploits that have been rendered completely ineffective.

Whilst it is not possible to validate completely the entire signature set of any IPS, these tests demonstrate how accurately the IPS detects and blocks a wide range of common exploits and their variants, port scans, and Denial of Service attempts, whilst remaining resistant to false positive alerts.

All detection tests are run twice. The first run is with the sensor deployed in-line in blocking mode using the default policy/recommended settings provided out of the box by the vendor. This is the way most of these devices will be deployed initially, and the number of test cases detected and blocked in each category is recorded. The second run is performed after the policy has been tuned to enable any low priority or audit-only signatures which may be disabled by default. No product or signature updates are allowed during the tests.

Naturally, Rate-Based IPS devices will not respond to the same attack traffic as Content-Based devices. For those devices, therefore, the Detection Accuracy tests involve detecting and mitigating a wide range of rate-based attacks such as port scans, SYN floods, connection floods, and so on. We note which of these are mitigated completely, which are mitigated partially, and which require the use of built-in firewall capabilities.

Resistance To Evasion Techniques

These tests verify that the IPS is capable of detecting and blocking basic exploits when subjected to various common evasion techniques. An IPS that cannot detect attacks subjected to these “script kiddie” evasion techniques is easily bypassed.

The tests consist of eight parts (only the final section is applicable to Rate-Based devices):

1. Baselines

This establishes that the IPS is capable of detecting and blocking a number of common basic attacks (the baseline suite) in their normal state, with no evasion techniques applied.

2. Packet Fragmentation

The baseline attacks are repeated, running them through fragroute using IP fragmentation evasion techniques.

3. Packet Fragmentation and Stream Segmentation

The baseline attacks are repeated, running them through fragroute using TCP segmentation evasion techniques.

4. RPC Fragmentation

The baseline RPC attacks are repeated using a wide variety of RPC fragmentation techniques, and the device is also subjected to all levels of the Canvas Reference Implementation (CRI) test tool.

5. URL Obfuscation

The baseline HTTP attacks are repeated, this time applying various URL obfuscation techniques (applied at various levels of obfuscation).

6. HTML Obfuscation

The baseline server-to-client attacks are repeated, this time applying a range of techniques used to obfuscate the HTML document returned to the client, such that it evades the IPS whilst still being capable of being rendered by the client browser.

7. FTP Evasion

The baseline FTP exploits are repeated, this time with spaces and telnet control characters inserted in the FTP commands.

8. Miscellaneous Evasion Techniques

Certain baseline attacks are repeated, and are subjected to several protocol- or exploit-specific evasion techniques, including altering default ports, polymorphic mutation of shell code, and so on.

For each of the evasion techniques, we note whether (i) the attempted attack is blocked successfully (the primary aim of any IPS device), (ii) the attempted attack is detected and an alert raised in any form, and (iii) the exploit is successfully “decoded” to provide an accurate alert relating to the original exploit, rather than alerting purely on anomalous traffic detected as a result of the evasion technique itself.
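
To illustrate why URL obfuscation (part 5 above) can defeat naive pattern matching, the following is a minimal Python sketch of the kind of normalisation a device would need to apply before comparing a request path against a signature. It is not taken from the NSS methodology or from any product: the function name `normalise_url_path`, the `max_rounds` limit, and the example path `/cgi-bin/test.cgi` are all hypothetical, and real devices handle many more encodings than the three shown here.

```python
import posixpath
from urllib.parse import unquote


def normalise_url_path(path: str, max_rounds: int = 3) -> str:
    """Reduce an obfuscated URL path to a canonical form.

    Assumption: only percent-encoding (including nested encoding),
    repeated slashes, and "./" or "../" segments are handled.
    """
    # Decode repeatedly so that double-encoded input such as "%2574"
    # (which decodes to "%74", then to "t") is fully unwrapped.
    for _ in range(max_rounds):
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded
    # Collapse "//", "./" and "dir/../" sequences.
    return posixpath.normpath(path)


# Hypothetical variants of one request path, each obfuscated differently.
variants = [
    "/cgi-bin/test.cgi",
    "/cgi-bin/%74%65%73%74.cgi",   # hex-encoded characters
    "/cgi-bin/./test.cgi",         # self-referencing directory
    "/cgi-bin/foo/../test.cgi",    # directory traversal that cancels out
    "/cgi-bin//test.cgi",          # duplicated slash
    "/cgi-bin/%2574est.cgi",       # double percent-encoding
]

for v in variants:
    print(f"{v:32} -> {normalise_url_path(v)}")
```

A signature written against the plain path matches every variant only after this kind of decoding, which is the difference between alerting on the original exploit and alerting (or not) on the obfuscated form.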

Stateful Operation

If the IPS is tracking TCP session state, then it has the potential to introduce denial of service when the session table becomes full (too many connections) or if it can’t keep up with the creation of new sessions (too many connections per second). As with latency and bandwidth, the number of connections supported by the IPS and its connections-per-second rate should be matched to the network.

For example, a fully saturated Gigabit Ethernet link can handle roughly 22,000 5 KByte transfers per second. Assuming each connection lasts 20 seconds, the IPS should be able to handle roughly 440,000 simultaneous connections (22,000 connections per second × 20 seconds). These numbers scale proportionately for slower networks. An IPS that cannot sustain these figures will degrade the performance of Web or e-commerce servers.
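
The sizing argument above is a rate multiplied by a lifetime, and can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 10 per cent protocol overhead is an assumption chosen because it brings the raw figure close to the 22,000 transfers per second quoted in the text, not a value given by NSS.

```python
def transfers_per_second(link_bps: float, transfer_bytes: int,
                         overhead_fraction: float = 0.0) -> float:
    """Transfers per second a saturated link can carry.

    overhead_fraction is an assumed share of the link consumed by
    headers, handshakes and acknowledgements (hypothetical parameter).
    """
    usable_bps = link_bps * (1.0 - overhead_fraction)
    return usable_bps / (transfer_bytes * 8)


def concurrent_connections(connections_per_second: float,
                           lifetime_seconds: float) -> float:
    """Steady-state open connections: arrival rate times lifetime."""
    return connections_per_second * lifetime_seconds


GIGABIT = 1e9            # bits per second
TRANSFER = 5 * 1024      # 5 KByte transfer, in bytes

raw_rate = transfers_per_second(GIGABIT, TRANSFER)
adjusted_rate = transfers_per_second(GIGABIT, TRANSFER, overhead_fraction=0.10)

print(f"no overhead:        {raw_rate:,.0f} transfers/s")
print(f"10% overhead:       {adjusted_rate:,.0f} transfers/s")
print(f"open at 20 s each:  {concurrent_connections(adjusted_rate, 20):,.0f}")
```

Running the same functions with a 100 Mbps link rate shows the proportional scaling for slower networks mentioned above.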

The aim of this section is to be able to determine whether the IPS is capable of monitoring stateful sessions established through the device at various traffic loads without either losing state or incorrectly inferring state.

An IPS that does not maintain TCP session state can flood the management console with false-positive alerts. Although this should not directly impact the IPS blocking function, it can make it very hard to perform forensic analysis of the attacks. In addition, if the default condition of the sensor is to block all traffic for which it does not believe there is a current connection in place, then an inability to maintain state under extreme conditions could result in the sensor blocking legitimate traffic by mistake.

In the first part of this test, we determine the theoretical maximum number of concurrent TCP connections that can be supported using various HTTP response sizes.

We then test whether the sensor is capable of preserving state across increasing numbers of open connections up to, and exceeding, the maximum. The tests also ensure that the device continues to detect and block new exploits while not blocking legitimate traffic when the state tables are filled. Needless to say, the passing of any malicious traffic at any point in the tests results in an automatic “FAIL”.

In the final tests, we transmit a number of packets taken from capture files of valid exploits, but without first establishing a valid session with the target server. This determines resistance to stateless attack tools such as Stick and Snot. In order to receive a “PASS” in this test, no alerts should be raised for any of the actual exploits. However, each packet should be blocked if possible since it represents a “broken” or “incomplete” session.

Performance

Any IPS is expected to be reliable (not crash), to never block legitimate traffic, and to not unduly affect network or host system performance.

The latency and throughput of a Network IPS (NIPS) or Attack Mitigation device must be on a par with other equipment in the network on which it is deployed, and in this respect, an in-line NIPS must strive to perform much more like a switch than a typical passive security device, especially when it is necessary to install more than one NIPS in the same data path.

Detection/Blocking Performance Under Load

This group of tests verifies that the IPS does not adversely impact legitimate traffic, even when new TCP connections are being created rapidly. We also verify that the sensor is capable of detecting and blocking exploits when subjected to increasing loads of background traffic up to the maximum bandwidth supported as claimed by the vendor, using a range of HTTP response sizes and packet sizes.

An IPS that misses attacks under load can be evaded. An IPS that adversely affects legitimate background traffic will not stay in-line for long.

A fixed number of exploits are launched with zero background traffic to ensure the sensor is capable of detecting our baseline attacks. Once that has been established, increasing levels of varying types of background traffic are generated through the IPS device in order to determine the point at which the sensor begins to miss attacks.

All tests are repeated with 25 per cent, 50 per cent, 75 per cent and 100 per cent loads of background traffic up to the maximum rated throughput of the device. The tests are conducted with UDP, HTTP, and mixed-protocol traffic and include very high packet rates and TCP connection rates designed to stress the device, as well as determine its likely performance on a “typical” live network.

Latency & User Response Times

In any network environment latency is important. Latency may impose an upper bound on throughput and it also has an impact on interactive applications, thus affecting user response time. As such, it is important to understand the impact of latency introduced by a NIPS and to determine the maximum acceptable delay, which will be different for each network.

There is a direct relationship between latency introduced by a networking device and the maximum throughput allowed by that device on a single TCP connection. There is a critical value for the round trip time (RTT) of a packet in each network, and if the latency is below this critical value, TCP throughput will be unaffected - instead, it is the line speed of the underlying network which becomes the bottleneck. Above this critical value, however, TCP throughput is negatively impacted. To be specific, the maximum throughput achievable for any given TCP connection in a zero loss network is expressed as:

throughput = window / RTT

where window is the maximum TCP window size (64 Kbytes by default) and RTT is the round trip time in the network.

This equation tells us that the throughput of a TCP connection is inversely proportional to network latency (note that this is TCP throughput for one connection - the aggregate bandwidth is not affected by latency). In other words, if you double latency, you halve throughput.
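
The relationship can be checked numerically with a short Python sketch. The function name and the choice of sample RTT values are illustrative assumptions; the window is taken as 64 × 1024 bytes, and the result is capped at the line rate because the window limit only matters once it falls below the speed of the underlying network.

```python
WINDOW_BYTES = 64 * 1024  # default maximum TCP window, per the text


def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float,
                           line_rate_bps: float) -> float:
    """Upper bound on single-connection TCP throughput in a zero-loss network."""
    window_limited = window_bytes * 8 / rtt_seconds
    # Below the critical RTT the line rate, not the window, is the bottleneck.
    return min(window_limited, line_rate_bps)


LINE_RATE = 1e9  # Gigabit Ethernet

for rtt_us in (200, 500, 1000, 2000, 4000):
    bps = max_tcp_throughput_bps(WINDOW_BYTES, rtt_us / 1e6, LINE_RATE)
    print(f"RTT {rtt_us:>5} us -> {bps / 1e6:8.1f} Mbps")
```

In the window-limited region each doubling of RTT (1000 to 2000 to 4000 microseconds) halves the achievable throughput, while at 200 and 500 microseconds the connection is still limited by the line rate.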

Consider adding a NIPS in an internal Gigabit network where the RTT is 200 microseconds. The critical value for RTT in a Gigabit network is 500 microseconds (below which it may no longer be possible to achieve 1Gbps of throughput), which means the NIPS can add a maximum of 300 microseconds to the RTT without affecting the network. In this particular case, therefore, for an internal, high speed deployment, the administrator may determine that his chosen IPS device needs to be capable of sub-300 microsecond latency under normal traffic loads.

Of course, the latency of an IPS device may vary significantly based on packet size, complexity of the protocol, presence of attack traffic, or simply the makeup of the normal traffic passing through it. For example, Gigabit segments will rarely carry only a single TCP connection. Rather, a saturated Gigabit segment could be supporting hundreds, if not thousands, of TCP connections, and this multiplexing eases the impact of latency on the overall throughput on the segment.

Although each of these connections carries only a fraction of the total throughput, a few connections tend to dominate. The maximum latency for a NIPS is then determined by the utilisation of the fastest connection. For example, in a Gigabit Ethernet segment carrying 10,000 TCP connections, the fastest connection might have a throughput of 250Mbps. In this case, the critical value for round trip latency is as high as 2 milliseconds.

Assuming the latency without the NIPS is 300 microseconds, an administrator may therefore determine that his chosen NIPS device must be capable of 1700 microsecond round trip latency (850 microseconds in each direction).
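
Both latency budgets discussed above follow from rearranging the throughput equation to RTT = window / throughput. The Python sketch below reproduces the arithmetic; the function names are hypothetical and the window is assumed to be 64 × 1024 bytes. The exact results come out slightly higher than the figures in the text, which appear to round the critical values down to 500 microseconds and 2 milliseconds.

```python
WINDOW_BYTES = 64 * 1024  # default maximum TCP window, per the text


def critical_rtt_s(target_bps: float, window_bytes: int = WINDOW_BYTES) -> float:
    """Largest RTT at which one connection can still reach target_bps."""
    return window_bytes * 8 / target_bps


def nips_latency_budget_s(target_bps: float, existing_rtt_s: float,
                          window_bytes: int = WINDOW_BYTES) -> float:
    """Round trip latency a NIPS may add before throughput is affected."""
    return critical_rtt_s(target_bps, window_bytes) - existing_rtt_s


cases = (
    ("single 1 Gbps connection", 1e9, 200e-6),
    ("fastest connection at 250 Mbps", 250e6, 300e-6),
)

for label, bps, existing_rtt in cases:
    critical = critical_rtt_s(bps)
    budget = nips_latency_budget_s(bps, existing_rtt)
    print(f"{label}: critical RTT {critical * 1e6:.0f} us, "
          f"round trip budget {budget * 1e6:.0f} us, "
          f"per direction {budget / 2 * 1e6:.0f} us")
```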

Such critical value calculations are important when TCP connections achieve maximum throughput, which is true for large data transfers. For smaller data transfers, and non-TCP applications like NFS, latency has a more direct impact on user experience - response time is directly proportional to latency. That is, doubling latency doubles response time. In these situations, the latency of the network in which a NIPS is deployed determines the acceptable latency of the NIPS.

Consider deploying a hypothetical NIPS with 1 millisecond one-way latency in the following scenarios:

  • In internal corporate LANs, the round trip latency could be in the 200-300 microsecond range. Deploying our hypothetical NIPS would add 2 milliseconds to the round trip, increasing the maximum round trip latency to 2.3 milliseconds, more than seven times its original value. The time to copy a large group of files, for example, would increase by a factor of more than seven.
  • In inter-campus corporate networks connected over a MAN, the latency could be in the 500-1000 microsecond range (or less). Deploying our hypothetical NIPS would increase the maximum round trip latency to 3 milliseconds, at least three times its original value (a minimum increase of 200 per cent). The time to copy a large group of files, for example, would increase by at least a factor of three.
  • Internet facing connections experience round-trip latency from 10-100 milliseconds. Deploying our hypothetical NIPS would increase the round trip latency by 2-20 per cent, which would have a comparatively minor impact on the user experience.
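
The three scenarios can be recomputed with a small Python sketch. It assumes that a NIPS with 1 millisecond one-way latency adds that delay in each direction, so 2 milliseconds per round trip; the function name and the scenario labels are illustrative only.

```python
def rtt_with_nips_ms(base_rtt_ms: float, nips_one_way_ms: float) -> float:
    """Round trip time once the NIPS delay is added in both directions."""
    return base_rtt_ms + 2 * nips_one_way_ms


NIPS_ONE_WAY_MS = 1.0

# (low, high) round trip latency without the NIPS, in milliseconds.
scenarios = {
    "internal LAN": (0.2, 0.3),
    "MAN": (0.5, 1.0),
    "Internet": (10.0, 100.0),
}

for name, bounds in scenarios.items():
    for base in bounds:
        new = rtt_with_nips_ms(base, NIPS_ONE_WAY_MS)
        factor = new / base
        print(f"{name:13} base {base:6.1f} ms -> {new:6.1f} ms "
              f"(x{factor:.2f}, +{(factor - 1) * 100:.0f}%)")
```

Printing both the multiplication factor and the percentage increase keeps the two measures distinct, since a threefold value corresponds to a 200 per cent increase.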

The latency of the NIPS must therefore be evaluated in the context of the network in which it is deployed. For example, to protect networks that are accessed over the public Internet, one-way NIPS latencies in the 1-2 millisecond range would be acceptable, whereas for NIPS deployments on MAN/WAN links, NIPS latencies of well under 1 millisecond would be essential. As we have already mentioned, for deployments on internal networks where latencies are a few hundred microseconds, NIPS latencies of less than 300 microseconds would be more appropriate.

Network administrators have laboured long and hard to reduce latency within the corporate network to an absolute minimum. Core network devices such as switches are frequently chosen as much on their performance - packet loss and latency under all load conditions - as any other feature. Given that Network IPS devices are operating in-line, it is not surprising that they will be evaluated in a similar way.

For this reason, part of the NSS Labs methodology uses very similar testing techniques to those we would normally employ when testing switches (in order to determine packet latency), in addition to measuring application latency. This group of tests determines the effect the IPS sensor has on the traffic passing through it under various load conditions. High packet latency will lower TCP throughput. High application latency will create a negative user experience.

Bi-directional network latency of a range of differently-sized UDP packets is measured under three test conditions: with no background load (latency measurement traffic only), with varying loads of HTTP traffic (from 25 to 100 per cent of the maximum rated load of the device), and while the device is under a heavy DOS/DDOS attack (up to 10 per cent of the rated throughput of the sensor).

Spirent Avalanche and Reflector devices are also used to generate HTTP sessions through the device in order to gauge how any increases in latency will impact the user experience in terms of failed connections and increased Web response times. This “application latency” is measured both with no background load and while the device is under attack.

Stability & Reliability

These tests verify the stability of the IPS device under various extreme conditions. Long-term stability is critical for an in-line IPS device, where failure can produce network outages.

In the first part of this test, we expose the external interface of the sensor to a constant stream of attacks over an extended period of time. The device is configured to block and alert, and thus this test provides an indication of the effectiveness of both the blocking and alert handling mechanisms. A continuous stream of exploits mixed with some legitimate sessions is transmitted through the sensor at a maximum rate of 90 per cent of the claimed throughput of the device for eight hours with no additional background traffic.

The device is expected to remain operational and stable throughout this test, blocking 100 per cent of recognisable exploits, raising an alert for each, and passing as close to 100 per cent of legitimate traffic as possible. If any recognisable exploits are passed - caused by either the volume of traffic or the IPS device failing open for any reason - this will result in a FAIL. If an excessive amount of legitimate traffic is blocked - caused by either the volume of traffic or the IPS device failing closed for any reason - this will also result in a FAIL.
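
The pass criteria for this part of the test can be expressed as a small decision function. The Python sketch below is one possible reading rather than NSS's actual scoring logic: the class and field names are hypothetical, and because the text does not quantify an “excessive” amount of blocked legitimate traffic, the `min_legit_pass_ratio` threshold of 99 per cent is purely an assumption. Alert counts are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class StabilityResult:
    exploits_sent: int       # recognisable exploits transmitted
    exploits_blocked: int    # of those, how many the device blocked
    legit_sent: int          # legitimate sessions transmitted
    legit_passed: int        # of those, how many got through
    stayed_operational: bool


def verdict(result: StabilityResult, min_legit_pass_ratio: float = 0.99) -> str:
    """Return "PASS" or "FAIL" for the extended attack test."""
    if not result.stayed_operational:
        return "FAIL"
    # Any recognisable exploit that gets through is a failure.
    if result.exploits_blocked < result.exploits_sent:
        return "FAIL"
    # Assumed threshold for "excessive" blocking of legitimate traffic.
    if result.legit_sent and result.legit_passed / result.legit_sent < min_legit_pass_ratio:
        return "FAIL"
    return "PASS"


clean_run = StabilityResult(1000, 1000, 500, 500, True)
leaky_run = StabilityResult(1000, 999, 500, 500, True)
overblocking_run = StabilityResult(1000, 1000, 500, 400, True)

for run in (clean_run, leaky_run, overblocking_run):
    print(run, "->", verdict(run))
```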

In the second part of the test we stress the protocol stack of the device under test by exposing it to malformed traffic from the ISIC test tool for eight hours. The device is expected to remain operational and capable of detecting and blocking exploits throughout the test to attain a PASS.

We scan the management interface for open ports and active services and report on known vulnerabilities. We also stress the protocol stack of the management interface of the NIPS by exposing it to malformed traffic from the ISIC test tool. The device is expected to remain (a) operational and capable of detecting and blocking exploits, and (b) capable of communicating in both directions with the management server/console throughout the test to attain a PASS. We also note whether the sensor detects the ISIC attacks even though targeted at the management port.

Usability

After quantitatively evaluating the network performance and security effectiveness of the IPS, NSS qualitatively evaluates the features and usability of the product.

This evaluation provides the reader with valuable insight into product features and into how easy it is to configure the IPS device and perform common, day-to-day operations with the management console.

Areas evaluated include configuration, policy management, alert handling, and reporting and analysis.

Key test criteria in each of the above areas are specified in the test methodology, and these are used as the basis for this evaluation. This ensures reporting is consistent across multiple tests, and thus makes it easier to compare identical features from product to product.
 

Copyright ©2008 by NSS Labs. All Rights Reserved.