Automatically Detect + Parse and Set Your Own Rules

A unique Wallarm AI feature is its ability to automatically detect and parse complicated API protocols and then set up security rules based on specific data or parameters deep inside the API.

Once the protocol is parsed, the system creates rules based both on where in the HTTP request a specific set of data or a parameter is found and on the actual data within the parameter.

The API parameters within the HTTP request are used in a number of security controls, from clusterization (used to determine the application context and logic) and deep learning to rules specific to API endpoints that are triggered by explicit data matches.

The technology can protect against lax input validation, behavioral attacks, and various other threats.

That said, in some situations, it makes sense to input API definitions by hand. This approach can help if there is insufficient traffic to serve as a learning set or if you want to compare the rules that you enter manually and their business logic to what was set up automatically. In this article, we’ll show a step-by-step example of how it can be done.

SugarCRM and Its API

To illustrate how the tandem of parsing and rule-setting works, let’s look at a popular contact management application, SugarCRM. SugarCRM has no formally defined API description, such as a Swagger file. So, the only way to understand the API parameters is to parse them manually from the HTML docs available here.


We parse that HTML-formatted doc with a simple curl + egrep combination, shown below:

$ curl -s -o - | gunzip | egrep -o '>/[^GET|POST]+ (GET|POST)' | tr '>' ' ' >

This command gives us a list of all 149 REST endpoints described there. The output looks like this:

/Accounts/:record/link/:link_name/filter GET
/Activities GET
/Activities/filter GET
/<module>/temp/file/:field POST
/<module>/temp/file/:field/:temp_id GET
/mostactiveusers GET
/oauth2/bwc/login POST
/oauth2/logout POST
/oauth2/sudo/:user_name POST
/theme GET
/theme POST
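
For readers who would rather stay in one language, the same extraction can be sketched in Python. The HTML fragment below is made up for illustration; only the ">/path METHOD" shape that the egrep pattern relies on is taken from the command above, and the real help page markup may well differ. The names SAMPLE_HTML, ENDPOINT_RE, and extract_endpoints are hypothetical.

```python
import re

# Hypothetical fragment that mimics the ">/path METHOD" shape matched by egrep.
SAMPLE_HTML = (
    "<h2>/Activities GET</h2>"
    "<h2>/Activities/filter GET</h2>"
    "<h2>/oauth2/logout POST</h2>"
    "<p>Not an endpoint</p>"
)

# A tag-closing ">" followed by a path, one space, and the HTTP method.
ENDPOINT_RE = re.compile(r">(/[^\s<]*) (GET|POST)")


def extract_endpoints(html):
    """Return a list of (path, method) tuples found in the HTML text."""
    return [(m.group(1), m.group(2)) for m in ENDPOINT_RE.finditer(html)]


for path, method in extract_endpoints(SAMPLE_HTML):
    print(path, method)
```

Unlike the character class in the shell one-liner, this pattern does not exclude uppercase letters such as G, E, or T from the path, so it may behave differently on paths that contain them.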

As you can see, variable parameters are represented in two ways here: :name and <module>. The second stands for one item from a list, and the first stands for a value of one of the data types. To simplify, I want to avoid any conditions for those kinds of parameters and define the endpoints by static REST URL parts only.

Understanding Wallarm API Endpoint Rules

Now, it’s time to parse the list we got and create the related endpoints in the Wallarm API.

First, we need to parse each URL from the SugarCRM docs to extract the parts of its path. So, it’s basically a split on the / delimiter.
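
As a minimal sketch of that split, the snippet below takes one line from the endpoint list, separates the path from the HTTP method, and marks which parts are static. The helper names split_endpoint and is_static are hypothetical, and the rule for spotting variable parts (a leading ":" or "<") is an assumption based on the two notations seen in the list.

```python
def split_endpoint(line):
    """Split a 'PATH METHOD' line into (list of path parts, HTTP method)."""
    path, method = line.split()
    parts = [p for p in path.split("/") if p]  # drop the empty leading piece
    return parts, method


def is_static(part):
    """Assumption: ':name' and '<module>' mark variable parts, the rest is static."""
    return not (part.startswith(":") or part.startswith("<"))


parts, method = split_endpoint("/Accounts/:record/link/:link_name/filter GET")
static = [(i, p) for i, p in enumerate(parts) if is_static(p)]

print(parts)   # ['Accounts', ':record', 'link', ':link_name', 'filter']
print(static)  # [(0, 'Accounts'), (2, 'link'), (4, 'filter')]
```

The indexes kept next to the static parts matter later: they tell us which position in the path each condition should check.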

At this point, it helps to understand a little about how Wallarm represents requests. Wallarm processes requests by parsing them into a serialized format that covers everything, including data encodings like JSON, XML, Base64, Multipart, etc. Inside this serialized representation of the HTTP request, every single parameter is accessible by a unique key that consists of an ordered sequence of parsers and parameter names. For example, to identify the data of the ?aaa=xxx URL parameter, we can write the key as GET_aaa_value (in string form) or as ["GET", "aaa", "value"] (in array form). These keys act as coordinates of request data inside HTTP requests and are called points in the Wallarm API.
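
To make the two forms of a point concrete, here is a toy sketch. It only covers the ["GET", name, "value"] example from the paragraph above, it uses Python's standard URL parser rather than anything from Wallarm, and the function names point_to_key and resolve_get_point are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def point_to_key(point):
    """Join the array form of a point into its string form."""
    return "_".join(str(item) for item in point)


def resolve_get_point(url, point):
    """Toy lookup: return the query-string value addressed by ['GET', name, 'value']."""
    parser, name, field = point
    if parser != "GET" or field != "value":
        raise ValueError("this sketch only handles ['GET', <name>, 'value'] points")
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else None


point = ["GET", "aaa", "value"]
print(point_to_key(point))                                        # GET_aaa_value
print(resolve_get_point("http://example.com/?aaa=xxx", point))    # xxx
```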

To describe an API endpoint, we need to define some points and conditions for the data there. In practice, different parts of the URL should match certain strings for us to decide that a particular request targets a particular API endpoint. These decisions are called actions in the Wallarm API.

So, to describe the API endpoint for /rest/v10/<param>/link, we need to give Wallarm a hint as to how these requests could look. To do this, we can send a JSON request that looks like this:

POST /v1/objects/hint/create
{ "type":"tag", "name":"API-type", "value":"REST",
  "action": [
    { "type":"equal", "value":"rest", "point": ["path",0] },
    { "type":"equal", "value":"v10",  "point": ["path",1] },
    { "type":"equal", "value":"link", "point": ["path",3] }
  ]
}

As you can see above, the action consists of three different conditions that should all match, and we have no condition for the third part of the URL (the ["path", 2] point, because 0 is the first index in arrays). The type, name, and value parameters relate to the type of hint we want to give Wallarm AI (here it’s just a “tag” hint) and serve nothing more than to mark this kind of traffic for later use.
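
To see why leaving ["path", 2] without a condition lets any value through in that position, consider the toy matcher below. It is an illustration of the idea only, not Wallarm's matching logic: it checks just the listed positions with "equal" conditions, and the names ACTION and matches are hypothetical.

```python
# The three conditions from the hint above.
ACTION = [
    {"type": "equal", "value": "rest", "point": ["path", 0]},
    {"type": "equal", "value": "v10", "point": ["path", 1]},
    {"type": "equal", "value": "link", "point": ["path", 3]},
]


def matches(action, url_path):
    """Return True if every 'equal' condition on a path point holds for url_path."""
    parts = [p for p in url_path.split("/") if p]
    for cond in action:
        kind, index = cond["point"]
        if kind != "path" or cond["type"] != "equal":
            raise ValueError("this sketch only handles 'equal' conditions on path points")
        if index >= len(parts) or parts[index] != cond["value"]:
            return False
    return True


print(matches(ACTION, "/rest/v10/Accounts/link"))    # True
print(matches(ACTION, "/rest/v10/Contacts/link"))    # True
print(matches(ACTION, "/rest/v10/Accounts/filter"))  # False
```

Both Accounts and Contacts match because nothing constrains the third path part, which is exactly where the variable parameter sits.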

The easiest way to generate these hints automatically is to feed the whole list exported from the SugarCRM documentation to a short script, like this:

import sys
import re
import json
import requests

UUID = ""  # if you don’t know your own, please ask support team or check
secret = ""
client_id = 31337
API_URL = ""

for line in sys.stdin:
    fields = line.split()  # "PATH METHOD" -> [path, method]
    if not fields:
        continue  # skip blank lines
    # for some reason, all the SugarCRM endpoints are provided in their
    # documentation without the API URL prefix, so we add it here
    parts = ["rest", "v10"] + [p for p in fields[0].split("/") if p]
    points = []
    for i, part in enumerate(parts):
        # check if the REST URL part is static or variable (parameter):
        # variable parts such as :name or <module> start with a non-word character
        if not re.match("[^0-9a-zA-Z_]", part):
            cur_point = {"point": ["path", i], "type": "equal"}
            cur_point["value"] = part
            points.append(cur_point)
    api_request = {"type": "tag", "clientid": client_id, "validated": False, "name": "API-type", "value": "REST"}
    api_request["action"] = points
    print(json.dumps(api_request, separators=(',', ':')))
    # send one hint per endpoint
    r = requests.post(
        url=API_URL,
        headers={
            "X-WallarmAPI-UUID": UUID,
            "X-WallarmAPI-Secret": secret,
        },
        json=api_request)

Using this script is a great shortcut. We can check the results in the Wallarm interface, where we can make sure we really helped the machine learning engine to understand our REST API better:

[Screenshot: Wallarm interface]

API Security is Critical for DevSecOps

APIs represent the core set of functionality for modular applications today. Their growth in significance, impact, and sheer volume is unstoppable and practically unimaginable in scale. Because most legacy WAFs can only create security rules based on the source and destination URLs, they miss most of the potential API attacks, leaving you significantly exposed.

We have shown you an example of how to work with strong API endpoint rules — whether generated manually or automatically with the help of machine learning — to create deeper protection. Security has to be able to reach deeper into APIs and truly understand their parameters to effectively protect against threats and vulnerabilities. Security solutions have to be able to truly understand the data and, also, the context and logic. All are subject to change at higher and higher rates, so this mission is only increasing in volume and complexity. Humans need helpful tools for intelligent automation.