From Regular Expressions to AI

Three generations of attack detection methodology

The oldest and most thoroughly studied approach is based on signatures and heuristics.

Since before the internet era, this approach has been implemented in most kinds of detection systems, from firewalls to antivirus software. The second generation improves on regular expression-based signatures by replacing them with dedicated parsers or tokenizers.
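To make the difference between the first two generations concrete, here is a minimal Python sketch. The signature, the lexer, and the function names (`regex_detect`, `tokenize`, `token_detect`) are hypothetical illustrations, not rules from any real product, and the lexer is far simpler than a real SQL tokenizer. It only shows why matching on tokens resists a trivial evasion that defeats matching on raw text.

```python
import re

# Hypothetical first-generation signature (illustrative only).
SIGNATURE = re.compile(r"union\s+select", re.IGNORECASE)


def regex_detect(value: str) -> bool:
    """Flag the input if the raw text matches the signature."""
    return SIGNATURE.search(value) is not None


# Hypothetical, heavily simplified SQL lexer: comments, words, punctuation.
LEXER = re.compile(
    r"(?P<comment>/\*.*?\*/|--[^\n]*)"
    r"|(?P<word>\w+)"
    r"|(?P<punct>[^\w\s])",
    re.DOTALL,
)


def tokenize(value: str) -> list[str]:
    """Split the input into upper-cased tokens, dropping SQL comments."""
    tokens = []
    for match in LEXER.finditer(value):
        if match.lastgroup == "comment":
            continue  # a SQL parser ignores comments, so the detector does too
        tokens.append(match.group().upper())
    return tokens


def token_detect(value: str) -> bool:
    """Flag the input if a UNION token is followed by [ALL] SELECT."""
    tokens = tokenize(value)
    for i, token in enumerate(tokens):
        if token != "UNION":
            continue
        rest = tokens[i + 1:]
        if rest[:1] == ["ALL"]:
            rest = rest[1:]
        if rest[:1] == ["SELECT"]:
            return True
    return False


payload = "1 UNION/**/SELECT password FROM users"
print(regex_detect(payload))  # False: the comment replaces the expected whitespace
print(token_detect(payload))  # True: the comment is dropped before matching
```

Neither function knows where in the application the value ends up, so both can still misjudge inputs whose meaning depends on context.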

As a provider of a third-generation solution, we decided to look more closely at examples of attacks that can and cannot be detected by the methods of each generation. While simple cases (simple true positives) are detectable even by legacy first-generation solutions, eliminating false positives and dealing with multiple layers of encoding requires understanding the application context. See the table below for some common examples.
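The encoding problem can be sketched in a few lines of Python. This is an illustration under stated assumptions: `decode_layers` and `looks_like_xss` are hypothetical helpers, the substring check stands in for a real detector, and the `layers` argument represents how many times the protected application URL-decodes a value, which is exactly the piece of context a detector needs.

```python
from urllib.parse import unquote


def decode_layers(value: str, layers: int) -> str:
    """URL-decode the value as many times as the application would."""
    for _ in range(layers):
        value = unquote(value)
    return value


def looks_like_xss(value: str) -> bool:
    """Hypothetical, deliberately naive check for a script tag."""
    return "<script" in value.lower()


# "<script>alert(1)</script>" URL-encoded twice.
payload = "%253Cscript%253Ealert(1)%253C/script%253E"

print(looks_like_xss(decode_layers(payload, 1)))  # False: still "%3Cscript..."
print(looks_like_xss(decode_layers(payload, 2)))  # True: "<script>" is visible
```

A detector that decodes fewer times than the application misses the payload, while one that decodes more times may flag harmless data, so the right number of layers has to come from the application itself.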

The innovation in the third generation of detection logic is to apply machine learning techniques to bring the detection grammar as close as possible to the real SQL/HTML/JS grammar of the protected system.

We recently published a whitepaper, Evolution of Detection Logic, in which we take a closer look at the grammar models that underlie the different detection approaches.

Third-generation detection logic should be able to approximate a Turing machine in order to cover recursively enumerable grammars.
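Why the class of grammar matters can be shown with a small example from a much lower rung of the hierarchy than recursively enumerable grammars. The Python sketch below is only an analogy, and both `REGULAR_APPROX` and `balanced` are hypothetical names: a regular pattern cannot keep track of nesting depth, while a recognizer with even a single counter can.

```python
import re

# A regular pattern for "some opening parentheses, a body, some closing ones".
# It has no memory of how many parentheses it has seen.
REGULAR_APPROX = re.compile(r"\(*[^()]*\)*")


def balanced(text: str) -> bool:
    """Check nesting with a counter, which a finite automaton does not have."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a closing parenthesis with nothing to close
    return depth == 0


sample = "((1)"  # one closing parenthesis is missing
print(REGULAR_APPROX.fullmatch(sample) is not None)  # True: wrongly accepted
print(balanced(sample))                              # False: imbalance noticed
```

Each step up in grammar power closes a gap of this kind between what the detector accepts and what the protected parser actually does.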

The task of creating an adaptable Turing machine remained unsolved until the 2010s, when the first research on neural Turing machines was published.

Read the full whitepaper for a better understanding of the machine learning algorithms, models, and approaches involved, which apply to detection logic not only in WAFs but also in other security solutions, from IPS and DAST to compliance signatures.