HTTP fingerprinting is an invaluable way to discover the underlying technologies powering a web application.
From analysing HTTP headers to performing malformed HTTP requests, these reconnaissance techniques help offensive security professionals pinpoint a target’s hidden weaknesses.
The upshot: more targeted attacks and an increased likelihood of uncovering vulnerabilities – as well as lucrative Bug Bounty rewards.
In this article, we’ll explore how HTTP fingerprinting reveals hidden components that may be vulnerable due to misconfigurations or outdated software. Vulnerabilities, especially those tracked as Common Vulnerabilities and Exposures (CVE) records, can be overlooked without robust fingerprinting – making this an essential skill for bug bounty hunters and security researchers.
Outline
- What is HTTP fingerprinting?
- HTTP header analysis
- Banner-grabbing and reverse proxy identification
- Inferring server banner based on header order
- Default error pages
- Malformed HTTP requests
- Identifying default file and directory structures
- Analysing cookie parameters
- Passive fingerprinting via third-party services
- Defences against HTTP fingerprinting
- Conclusion: a more targeted approach to bug hunting
- References
What is HTTP fingerprinting?
HTTP fingerprinting is a recon technique for gathering information about HTTP servers and their related components, such as programming languages, frameworks, proxies, content delivery networks (CDNs) and web application firewalls (WAFs). Armed with this information, security researchers can learn how a target’s infrastructure is configured, which in turn sheds light on potential weaknesses.
Understanding the underlying technology stack often gives a direct path to known vulnerabilities or misconfigurations, since each technology comes with its own security history. By ascertaining which software version is in use, you can identify any related CVEs and check whether publicly available exploits exist for them.
HTTP fingerprinting involves analysing different aspects of HTTP traffic to pinpoint unique attributes. Many technologies leave unique ‘fingerprints’ – such as special headers or default error pages – which can reveal clues about the environment. Knowing how to interpret these signs enables you to make your attack strategy more targeted and discover flaws more rapidly.
HTTP header analysis
You can learn a lot about a target’s technologies by analysing the HTTP response headers. Common headers that typically expose server or proxy details include Server, X-Powered-By, X-Powered-CMS and Via. Many technologies also use their own custom headers that expose the same details. For instance, with a header structure like X-<technology-name> you might get headers such as X-Drupal-Cache, X-Shopify-Stage, X-Varnish or X-Amz-Cf-Id.
With tools like Burp Suite, you can easily collect these types of headers through extensions or custom-written BChecks. This type of HTTP header analysis is invaluable, because even a single header can reveal a misconfiguration or an outdated and potentially exploitable software version.
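As a rough illustration of this kind of header analysis, the Python sketch below filters a set of response headers down to the ones that tend to reveal technology details. The function name and the prefix list are assumptions made for this example, built only from the header names mentioned above; it is not an exhaustive signature set.

```python
# Header names listed above as commonly exposing server or proxy details.
KNOWN_HEADERS = {"server", "x-powered-by", "x-powered-cms", "via"}

# Custom X-<technology-name> prefixes taken from the examples above.
KNOWN_PREFIXES = ("x-drupal-", "x-shopify-", "x-varnish", "x-amz-cf-")


def extract_tech_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return only the headers that hint at the underlying technology."""
    found = {}
    for name, value in headers.items():
        lowered = name.lower()  # header names are case-insensitive
        if lowered in KNOWN_HEADERS or lowered.startswith(KNOWN_PREFIXES):
            found[lowered] = value
    return found
```

Passing in the headers of any captured response returns a smaller dictionary keyed by lower-cased header name, which can then be reviewed by hand or fed into further checks.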
Banner-grabbing and reverse proxy identification
Banner-grabbing, where servers are induced into revealing their software banners, provides direct evidence of installed software and its version numbers – which helps security researchers assess which known vulnerabilities might be present in an environment. To grab the banner of a web server or reverse proxy, you can send an HTTP GET request to the web server and then analyse the HTTP response.
Here are some HTTP responses generated by real Bug Bounty targets that expose their reverse proxy banner:
Apache HTTP responses
HTTP/1.1 200 OK
Date: Thu, 18 Jan 2025 18:21:12 GMT
Server: Apache/2.4.57 (Unix)
Last-Modified: Thu, 18 Jan 2025 19:21:12 GMT
Accept-Ranges: bytes
Content-Length: 1215
Connection: close
Content-Type: text/html
HTTP/2 200
date: Thu, 16 Jan 2025 18:28:54 GMT
server: Apache
content-length: 14162
cache-control: no-cache,no-store,must-revalidate
x-frame-options: SAMEORIGIN
accept-ranges: bytes
content-security-policy: frame-ancestors 'self';
x-content-type-options: nosniff
vary: Accept-Encoding
content-type: text/html;charset=utf-8
strict-transport-security: max-age=31536000; includeSubDomains;
Nginx HTTP responses
HTTP/2 200
date: Thu, 16 Jan 2025 18:32:21 GMT
content-type: text/html; charset=UTF-8
content-length: 722
server: nginx/1.21.6
last-modified: Wed, 12 Aug 2025 18:14:21 GMT
x-xss-protection: 1; mode=block
x-frame-options: SAMEORIGIN
access-control-expose-headers: OPTIONS, GET, POST
referrer-policy: strict-origin-when-cross-origin
accept-ranges: bytes
HTTP/2 200
cache-control: no-store, no-cache, no-transform, must-revalidate
content-type: text/html; charset=UTF-8
cross-origin-embedder-policy: require-corp
cross-origin-resource-policy: same-origin
date: Thu, 16 Jan 2025 18:38:34 GMT
expires: Thu, 16 Jan 2025 18:38:34 GMT
last-modified: Thu, 16 Jan 2025 18:38:34 GMT
referrer-policy: no-referrer
server: nginx
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block
In these scenarios, identifying the reverse proxy in front of a specific web server (and, where the banner included it, its version) was easy.
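Once a banner has been captured, splitting it into product and version makes it easier to look up known vulnerabilities. The following Python sketch shows one way this could be done; the regular expression and function name are assumptions for illustration, and real banners may use formats it does not cover.

```python
import re
from typing import Optional, Tuple

# Matches "product" or "product/version" at the start of a Server header value.
BANNER_RE = re.compile(r"^(?P<product>[^/\s]+)(?:/(?P<version>\S+))?")


def parse_server_banner(value: str) -> Tuple[str, Optional[str]]:
    """Split a Server header value into (product, version or None)."""
    match = BANNER_RE.match(value.strip())
    if not match:
        return ("", None)
    return (match.group("product"), match.group("version"))
```

A version of None simply means the banner names the product without disclosing a version, as in the second Apache and Nginx responses above.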
But what if this fingerprinting method fails to explicitly expose these details? Fortunately, subtle implementation details can give you additional clues about the proxy’s identity.
Inferring server banner based on header order
It might be possible to make an educated guess at the reverse proxy banner based on what order the HTTP response headers are presented in, although this depends on the proxy using default configurations.
Look again at the HTTP response examples given above – can you discern a pattern based on the order of the headers?
If we look closely, we can see that the Apache responses put the date header first, followed directly by the server header, with the content-length and content-type headers appearing further down.
As for the Nginx responses, we can see that the date, content-type and content-length headers appear above the server header. Nginx is less likely than the Apache reverse proxy to maintain a strict header order, but often follows the header order: date, content-type and server.
Armed with this knowledge, we might be able to infer which reverse proxy is being used by a particular web server based on the response below – any ideas which one it could be?
HTTP/1.1 200 OK
Date: Thu, 16 Jan 2025 18:56:06 GMT
Server: *
Content-Length: 2325
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
However, since header order can also be affected by different configurations, software versions or security mechanisms that obscure server information, inferring banners based on header order should be combined with other fingerprinting techniques to obtain a clear picture.
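The header-order heuristic can be sketched in Python as follows. This is a deliberately simple illustration that encodes only the two patterns visible in the example responses in this article; the function name and the "apache-like" and "nginx-like" labels are assumptions, and the result should be treated as a hint rather than a verdict.

```python
def guess_proxy_from_order(header_names: list[str]) -> str:
    """Guess the proxy family from the order of response header names."""
    names = [name.lower() for name in header_names]
    if "date" not in names or "server" not in names:
        return "unknown"
    date_idx = names.index("date")
    server_idx = names.index("server")
    # Pattern in the Apache examples: date first, server directly after it.
    if date_idx == 0 and server_idx == 1:
        return "apache-like"
    # Pattern in the Nginx examples: date and content-type both precede server.
    if "content-type" in names:
        content_type_idx = names.index("content-type")
        if date_idx < server_idx and content_type_idx < server_idx:
            return "nginx-like"
    return "unknown"
```

The function expects header names in the order they arrived on the wire, so it is only useful with a client or proxy log that preserves that order.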
Default error pages
We can also fingerprint a server’s proxy or web server by analysing its default error pages. These are revealed when you trigger HTTP status codes like 404 (not found), 500 (server error) or 403 (forbidden). Default error responses often expose distinctive elements from original templates used by the technology. For instance, Apache Tomcat, Spring Boot, Apache, Nginx and IIS all display default pages for specific HTTP status codes that reveal telltale signs about the technology stack.
Even with modest customisation, it is possible to fingerprint the technology used by analysing HTML tag structures, embedded comments or specific phrasing used within the page.
Examples of default pages:
Apache Tomcat (404 - not found)
Spring Boot (404 - not found)
IIS (500 - Server error)
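Matching default error pages against known phrases can be automated. The Python sketch below checks a response body against a handful of strings; these signatures are assumptions chosen for illustration (default pages vary by version, locale and configuration), so treat the table as a starting point rather than a reliable signature database.

```python
import re

# Illustrative signatures only; real default pages differ between versions.
ERROR_PAGE_SIGNATURES = {
    "Apache Tomcat": re.compile(r"Apache Tomcat", re.IGNORECASE),
    "Spring Boot": re.compile(r"Whitelabel Error Page", re.IGNORECASE),
    "Nginx": re.compile(r"<center>nginx", re.IGNORECASE),
    "IIS": re.compile(r"Internet Information Services|Microsoft-IIS", re.IGNORECASE),
}


def identify_error_page(body: str) -> list[str]:
    """Return the technologies whose signature appears in an error page body."""
    return [
        tech
        for tech, pattern in ERROR_PAGE_SIGNATURES.items()
        if pattern.search(body)
    ]
```

An empty list means none of the sample signatures matched, which may indicate a customised error page rather than the absence of these technologies.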
Malformed HTTP requests
Sending malformed HTTP requests to a web server typically triggers error responses that expose identifying details about the underlying technology. This tactic effectively turns an inadvertent server misconfiguration into a golden opportunity to discover vulnerabilities within that environment.
For example, a malformed HTTP request might involve sending a request with an invalid HTTP version or method, as shown in the examples below:
GET / HTTP/4.4
Host: example.com
XGET / HTTP/1.1
Host: example.com
If the web server is not configured to handle such requests, they can trigger an error in the technology used by the web server that potentially exposes the banner. As such, this technique can be an effective way to gather intelligence on the server’s configuration, despite its attempts to hide its identity.
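Most HTTP client libraries refuse to send an invalid method or version, so malformed requests usually have to be written to a raw socket. The following Python sketch builds the two requests shown above and includes a helper for sending them over plain TCP. The function names are assumptions for this example, the Connection: close header is an addition so the server ends the response, and TLS is not handled. Only send such requests to targets you are authorised to test.

```python
import socket


def build_raw_request(method: str, path: str, version: str, host: str) -> bytes:
    """Build a raw request without validating the method or HTTP version."""
    lines = [
        f"{method} {path} {version}",
        f"Host: {host}",
        "Connection: close",  # assumption: ask the server to close after replying
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_raw(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send raw bytes over plain TCP and return everything the server sends back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# The two malformed requests from the text: invalid version, then invalid method.
PROBES = [
    build_raw_request("GET", "/", "HTTP/4.4", "example.com"),
    build_raw_request("XGET", "/", "HTTP/1.1", "example.com"),
]
```

Calling send_raw("example.com", 80, PROBES[0]) would return the raw error response, whose status line, headers and body can then be inspected for banners.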
Identifying default file and directory structures
Default file and directory structures also play a major role in HTTP fingerprinting. Analysing directory naming conventions is a great way to discover which framework the web server is using, while file extensions can reveal the underlying programming language.
Files like favicon.ico, package.json, changelog.txt, README.txt and LICENSE.txt often contain version details or other identifying content that confirms elements of the technology stack. Once these structures are mapped, you can strategically focus your HTTP recon efforts on known vulnerabilities tied to the identified framework or language.
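Once one of these files has been retrieved, pulling out the version is straightforward. The Python sketch below reads the version field of a package.json document and finds the first dotted version number in free text such as a changelog. The function names and the version regular expression are assumptions for illustration; changelog formats vary widely.

```python
import json
import re
from typing import Optional

# Matches dotted version numbers such as 7.98 or 1.2.3.
VERSION_RE = re.compile(r"\b(\d+\.\d+(?:\.\d+)?)\b")


def version_from_package_json(text: str) -> Optional[str]:
    """Return the top-level "version" field of a package.json document, if any."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    version = data.get("version") if isinstance(data, dict) else None
    return version if isinstance(version, str) else None


def first_version_in_text(text: str) -> Optional[str]:
    """Return the first dotted version number found in free text."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None
```

Both helpers return None when nothing usable is found, so they can be applied to every candidate file without special-casing missing or customised ones.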
Analysing cookie parameters
Web servers and frameworks implement unique session cookie parameters that reliably indicate which programming language or framework is being used, even when hidden behind a reverse proxy. The complexity of configuring and masking these identifiers across diverse environments makes analysing cookie parameters a consistently reliable way to fingerprint a server.
By pinpointing the back-end architecture in place, you can focus your pentesting strategy on technology-specific vulnerabilities rather than wasting time on irrelevant exploits.
Examples of cookie parameter names used by various technologies:
- PHPSESSID (PHP)
- JSESSIONID (Java)
- ASP.NET_SessionId (ASP.NET)
- CFTOKEN/CFID (Adobe ColdFusion)
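The cookie names listed above can be turned into a small lookup table. This Python sketch maps the names found in Set-Cookie header values to the technology they suggest; the function name is an assumption for this example, and the table covers only the names mentioned in this article.

```python
# Session cookie names from the list above, mapped to the technology they suggest.
COOKIE_SIGNATURES = {
    "phpsessid": "PHP",
    "jsessionid": "Java",
    "asp.net_sessionid": "ASP.NET",
    "cftoken": "Adobe ColdFusion",
    "cfid": "Adobe ColdFusion",
}


def technologies_from_set_cookie(set_cookie_values: list[str]) -> set[str]:
    """Map cookie names in Set-Cookie header values to likely technologies."""
    found = set()
    for value in set_cookie_values:
        # The cookie name is the text before the first "=" of the first pair.
        name = value.split(";", 1)[0].split("=", 1)[0].strip().lower()
        if name in COOKIE_SIGNATURES:
            found.add(COOKIE_SIGNATURES[name])
    return found
```

Names are compared case-insensitively here as a convenience; a renamed session cookie will simply not match, so an empty result does not rule a technology out.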
Passive fingerprinting via third-party services
Third-party services Wappalyzer and Shodan are excellent fingerprinting resources, in part thanks to their large databases of technology signatures collected from a wide range of web applications. They can be used to efficiently fingerprint your target's web server by aggregating historical data and patterns that might not otherwise be obviously visible, saving you significant time and effort.
Shodan provides useful insights about a web application’s open ports and running services. Because it also stores HTTP responses from the hosts it scans, you can apply the header analysis techniques described earlier to that stored data without sending any traffic to the target yourself.
Wappalyzer, on the other hand, detects the underlying technology stack of a web application. This browser extension reveals frameworks, programming languages and third-party services powering websites. Lookups might confirm suspicions you had about the presence of certain technologies, or reveal components that would have stayed entirely hidden if you relied on manual inspection.
By combining this passively-collected data with your active findings, you gain a full-spectrum view of the target’s architecture and give yourself the best chance of spotting inconsistencies that signal potential misconfigurations.
Defences against HTTP fingerprinting
A crucial first step for protecting against HTTP fingerprinting is to suppress or customise server headers. Removing or rewriting version information in Server or X-Powered-By headers should hide the real server banner. Creating misleading values forces adversaries to expend additional effort guessing or probing for system details, and complicates any automated exploits that rely on easily accessible version data.
Default responses supplied by technologies such as Tomcat or Nginx often include logos or text that reveals a server’s identity. Replacing these default pages – or at least deleting the exposed version – makes HTTP fingerprinting significantly more difficult for the attacker.
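To give some specific scenarios, with Apache, administrators can modify the httpd.conf or .htaccess files to limit what the server reveals, using directives such as ServerSignature Off, which removes the version footer from server-generated pages, and ServerTokens Prod, which trims the Server header to the bare product name. Note that ServerTokens is only valid in the main server configuration, not in .htaccess, and that it shortens the header rather than removing it. In Nginx, similar results can be achieved by setting the server_tokens directive to off in the nginx.conf file, which hides the version number. For Tomcat, achieving this involves editing the server.xml file to remove the server banner.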
Another defensive layer can come in the form of web application firewalls that intercept and modify outbound HTTP headers.
Removing unnecessary files and directories is also advisable. Items such as changelog.txt, README.txt or LICENSE.txt can contain explicit technology references or version details. If these files are no longer needed, they should be deleted; if they are needed, then an access control should be configured to prevent unauthorised users from viewing their contents.
Regularly reviewing server configurations and conducting periodic vulnerability assessments can ensure that header customisations and file restrictions remain effective against evolving fingerprinting techniques. Automated monitoring tools like intrusion detection systems and continuous compliance scanners can also alert administrators to any configuration drifts or the resurfacing of sensitive default responses.
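A periodic check of this kind can be as simple as inspecting your own responses for version numbers that have crept back in. The Python sketch below flags Server and X-Powered-By values that still contain something resembling a version. The function name and the digit-dot-digit pattern are assumptions for illustration, not a complete compliance check.

```python
import re

# Headers the guidance above recommends suppressing or rewriting.
BANNER_HEADERS = ("server", "x-powered-by")

# A digit-dot-digit sequence is treated as a leaked version number.
VERSION_RE = re.compile(r"\d+\.\d+")


def find_version_leaks(headers: dict[str, str]) -> dict[str, str]:
    """Return banner headers whose values still contain a version number."""
    leaks = {}
    for name, value in headers.items():
        if name.lower() in BANNER_HEADERS and VERSION_RE.search(value):
            leaks[name.lower()] = value
    return leaks
```

Running this against responses from each environment after a deployment gives an early signal of configuration drift: an empty dictionary means no version was detected in the checked headers.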
Conclusion: A more targeted approach to bug hunting
So now you’re familiarised with an essential recon technique in the arsenal of security researchers and Bug Bounty hunters.
You’ve learned about various methods and tools for performing HTTP fingerprinting on web applications, in order to extract valuable information about the underlying technology stack.
Now you have some foundational knowledge about how different technologies respond to various techniques and what information they might expose.
Some methods work better than others in certain contexts. Being able to deploy them all, in relevant combinations for the environment, can help you build a profile of your target’s attack surface.
By mastering HTTP fingerprinting, you therefore gain a strategic advantage in identifying vulnerabilities, since you can tailor your exploitation techniques to the attack vectors you’ve uncovered. In other words, it means less wasted time and a quicker path to real vulnerabilities – and real Bug Bounty rewards.
*The Apache reverse proxy is the most likely answer.