HTTP fingerprinting: reconning for web apps’ hidden flaws

HTTP fingerprinting is an invaluable way to discover the underlying technologies powering a web application.

From analysing HTTP headers to performing malformed HTTP requests, these reconnaissance techniques help offensive security professionals pinpoint a target’s hidden weaknesses.

The upshot: more targeted attacks and an increased likelihood of uncovering vulnerabilities – as well as lucrative Bug Bounty rewards.

In this article, we’ll explore how HTTP fingerprinting reveals hidden components that may be vulnerable due to misconfigurations or outdated software. Vulnerabilities, especially those tracked as Common Vulnerabilities and Exposures (CVE) records, can be overlooked without robust fingerprinting – making this an essential skill for bug bounty hunters and security researchers.

What is HTTP fingerprinting?

HTTP fingerprinting is a recon technique for gathering information about HTTP servers and their related components, such as programming languages, frameworks, proxies, content delivery networks (CDNs) and web application firewalls (WAFs). Armed with this information, security researchers can learn how a target’s infrastructure is configured, which in turn sheds light on potential weaknesses.

Understanding the underlying technology stack often gives a direct path to known vulnerabilities or misconfigurations, since each technology comes with its own security history. By ascertaining which software version is in use, you can identify any related CVEs and publicly available exploits.

HTTP fingerprinting involves analysing different aspects of HTTP traffic to pinpoint unique attributes. Many technologies leave unique ‘fingerprints’ – such as special headers or default error pages – which can reveal clues about the environment. Knowing how to interpret these signs enables you to make your attack strategy more targeted and discover flaws more rapidly.

HTTP header analysis

You can learn a lot about a target’s technologies by analysing the HTTP response headers. Common headers that typically expose server or proxy details include Server, X-Powered-By, X-Powered-CMS and Via. Many technologies also use their own custom headers that expose the same details. For instance, with a header structure like X-<technology-name> you might get headers such as X-Drupal-Cache, X-Shopify-Stage, X-Varnish or X-Amz-Cf-Id.

Using tools like Burp Suite, you can easily collect these types of headers using extensions or custom-written BChecks. This type of HTTP header analysis is invaluable, because even a single header can reveal a misconfiguration or an outdated and potentially exploitable software version.
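
As a rough illustration of this kind of header analysis outside Burp Suite, the sketch below filters a response's headers down to the ones named above. The function name, the two lookup tables and the mapping of each custom prefix to a technology are assumptions made for this example, not an exhaustive signature set.

```python
# Header names mentioned in the article; illustrative, not exhaustive.
KNOWN_HEADERS = {"server", "x-powered-by", "x-powered-cms", "via"}

# Custom header prefixes from the article's examples, mapped to a technology.
# The mapping is an assumption for illustration.
CUSTOM_PREFIXES = {
    "x-drupal-": "Drupal",
    "x-shopify-": "Shopify",
    "x-varnish": "Varnish",
    "x-amz-cf-": "Amazon CloudFront",
}


def extract_fingerprint_headers(headers):
    """Return the subset of response headers that hint at the tech stack."""
    findings = {}
    for name, value in headers.items():
        lower = name.lower()
        if lower in KNOWN_HEADERS:
            findings[name] = value
            continue
        for prefix, tech in CUSTOM_PREFIXES.items():
            if lower.startswith(prefix):
                findings[name] = f"{value} (suggests {tech})"
                break
    return findings


sample = {
    "Server": "nginx/1.21.6",
    "X-Drupal-Cache": "HIT",
    "Content-Type": "text/html",
}
print(extract_fingerprint_headers(sample))
```

Headers that match neither table, such as Content-Type here, are simply ignored.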

Banner-grabbing and reverse proxy identification

Banner-grabbing, where servers are induced into revealing their software banners, provides direct evidence of installed software and its version numbers – which helps security researchers assess which known vulnerabilities might be present in an environment. To grab the banner of a web server or reverse proxy, you can send an HTTP GET request to the web server and then analyse the HTTP response.
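
A minimal sketch of that request-and-analyse step, using only Python's standard library, might look like the following. The function names are hypothetical, and the sketch assumes the target speaks HTTPS on port 443 and that the Server header follows the common product/version shape.

```python
import http.client
import re


def parse_banner(server_value):
    """Split a Server header value into (product, version); version may be None."""
    match = re.match(r"\s*([^/\s]+)(?:/(\S+))?", server_value or "")
    if not match:
        return None, None
    return match.group(1), match.group(2)


def grab_banner(host, port=443, timeout=5):
    """Send GET / over HTTPS and return the parsed Server header."""
    conn = http.client.HTTPSConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        return parse_banner(response.getheader("Server"))
    finally:
        conn.close()


print(parse_banner("Apache/2.4.57 (Unix)"))
print(parse_banner("nginx"))
```

Keeping the parsing separate from the network call means the same parser can be run over responses saved from a proxy or another tool.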

Here are some HTTP responses generated by real Bug Bounty targets that expose their reverse proxy banner:

Apache HTTP responses

HTTP/1.1 200 OK
Date: Thu, 18 Jan 2025 18:21:12 GMT
Server: Apache/2.4.57 (Unix)
Last-Modified: Thu, 18 Jan 2025 19:21:12 GMT
Accept-Ranges: bytes
Content-Length: 1215
Connection: close
Content-Type: text/html

HTTP/2 200
date: Thu, 16 Jan 2025 18:28:54 GMT
server: Apache
content-length: 14162
cache-control: no-cache,no-store,must-revalidate
x-frame-options: SAMEORIGIN
accept-ranges: bytes
content-security-policy: frame-ancestors 'self';
x-content-type-options: nosniff
vary: Accept-Encoding
content-type: text/html;charset=utf-8
strict-transport-security: max-age=31536000; includeSubDomains;

Nginx HTTP responses

HTTP/2 200
date: Thu, 16 Jan 2025 18:32:21 GMT
content-type: text/html; charset=UTF-8
content-length: 722
server: nginx/1.21.6
last-modified: Wed, 12 Aug 2025 18:14:21 GMT
x-xss-protection: 1; mode=block
x-frame-options: SAMEORIGIN
access-control-expose-headers: OPTIONS, GET, POST
referrer-policy: strict-origin-when-cross-origin
accept-ranges: bytes

HTTP/2 200
cache-control: no-store, no-cache, no-transform, must-revalidate
content-type: text/html; charset=UTF-8
cross-origin-embedder-policy: require-corp
cross-origin-resource-policy: same-origin
date: Thu, 16 Jan 2025 18:38:34 GMT
expires: Thu, 16 Jan 2025 18:38:34 GMT
last-modified: Thu, 16 Jan 2025 18:38:34 GMT
referrer-policy: no-referrer
server: nginx
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block

In these scenarios, detecting the version of the reverse proxy being used for a specific web server was easy.

But what if this fingerprinting method fails to explicitly expose these details? Fortunately, subtle implementation details can give you additional clues about the proxy’s identity.

Inferring server banner based on header order

It might be possible to make an educated guess at the reverse proxy banner based on what order the HTTP response headers are presented in, although this depends on the proxy using default configurations.

Look again at the HTTP response examples given above – can you discern a pattern based on the order of the headers?

If we look closely, we can see that the Apache responses put the date header first, immediately followed by the server header, with the content-length and content-type headers appearing further down.

As for the Nginx responses, we can see that the content-type header appears above the server header in both examples, as do the date header and, where present, content-length. Nginx is less likely than the Apache reverse proxy to maintain a strict header order, but often follows the header order: date, content-type and server.

Armed with this knowledge, we might be able to infer which reverse proxy is being used by a particular web server based on the response below – any ideas which one it could be?

HTTP/1.1 200 OK
Date: Thu, 16 Jan 2025 18:56:06 GMT
Server: *
Content-Length: 2325
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive

However, since header order can also be affected by different configurations, software versions or security mechanisms that obscure server information, inferring banners based on header order should be combined with other fingerprinting techniques to obtain a clear picture.
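
The ordering patterns observed in the sample responses can be turned into a simple heuristic. The sketch below is deliberately naive: the function name and the two rules are assumptions derived only from the examples in this article, so treat its output as a hint rather than an identification.

```python
def guess_proxy_from_order(header_names):
    """Guess 'apache-like', 'nginx-like' or 'unknown' from response header order."""
    names = [name.lower() for name in header_names]
    if "server" not in names or "date" not in names:
        return "unknown"
    server_at = names.index("server")
    # Pattern from the Apache examples: date first, server immediately after.
    if names.index("date") == 0 and server_at == 1:
        return "apache-like"
    # Pattern from the Nginx examples: content-type appears before server.
    if "content-type" in names and names.index("content-type") < server_at:
        return "nginx-like"
    return "unknown"


# Header order taken from the unlabelled response shown earlier.
mystery = ["Date", "Server", "Content-Length", "Keep-Alive", "Connection"]
print(guess_proxy_from_order(mystery))
```

Only header names are compared, so the check still works when the Server value itself has been masked.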

Default error pages

We can also fingerprint a server’s proxy or web server by analysing its default error pages. These are revealed when you trigger HTTP status codes like 404 (not found), 500 (server error) or 403 (forbidden). Default error responses often expose distinctive elements from original templates used by the technology. For instance, Apache Tomcat, Spring Boot, Apache, Nginx and IIS all display default pages for specific HTTP status codes that reveal telltale signs about the technology stack.

Even with modest customisation, it is possible to fingerprint the technology used by analysing HTML tag structures, embedded comments or specific phrasing used within the page.
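
To make this concrete, the sketch below checks an error page body against a handful of strings that commonly appear in default templates. The signature list is an assumption based on typical default pages and is far from complete; real templates vary between versions, and IIS is left out because its default pages differ widely by configuration.

```python
import re

# Strings commonly seen in default error pages. These are assumptions based on
# typical templates and may differ between versions or distributions.
ERROR_PAGE_SIGNATURES = [
    ("Spring Boot", re.compile(r"Whitelabel Error Page")),
    ("Apache Tomcat", re.compile(r"Apache Tomcat/[\d.]+")),
    ("Nginx", re.compile(r"<center>nginx(?:/[\d.]+)?</center>")),
    ("Apache", re.compile(r"<address>Apache")),
]


def identify_error_page(body):
    """Return (technology, matched_text) for the first signature found, else None."""
    for technology, pattern in ERROR_PAGE_SIGNATURES:
        match = pattern.search(body)
        if match:
            return technology, match.group(0)
    return None


page = "<center><h1>404 Not Found</h1></center><hr><center>nginx/1.21.6</center>"
print(identify_error_page(page))
```

Returning the matched text alongside the technology name keeps any version string embedded in the page available for later CVE lookups.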

Malformed HTTP requests

Sending malformed HTTP requests to a web server typically triggers error responses that expose identifying details about the underlying technology. This tactic effectively turns an inadvertent server misconfiguration into a golden opportunity to discover vulnerabilities within that environment.

For example, a malformed HTTP request might involve sending a request with an invalid HTTP version or method, as shown in the examples below:

GET / HTTP/4.4
Host: example.com

XGET / HTTP/1.1
Host: example.com

If the web server is not configured to handle such requests, they can trigger an error in the technology used by the web server that potentially exposes the banner. As such, this technique can be an effective way to gather intelligence on the server’s configuration, despite its attempts to hide its identity.
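
Most HTTP client libraries refuse to send an invalid method or version, so malformed requests like the ones above usually have to be written to a raw socket. The following is a sketch under a few assumptions: the helper names are hypothetical, the target listens on plain HTTP port 80, and a Connection: close header (not part of the original examples) is added so the server ends the response.

```python
import socket


def build_request(host, method="GET", version="HTTP/1.1", path="/"):
    """Build raw request bytes without validating the method or version."""
    lines = [f"{method} {path} {version}", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def send_raw(host, payload, port=80, timeout=5):
    """Send raw bytes over plain TCP and return whatever the server replies."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # Keep what was received if the server holds the connection open.
    return b"".join(chunks)


print(build_request("example.com", version="HTTP/4.4"))
print(build_request("example.com", method="XGET"))
```

The raw bytes returned by send_raw can then be inspected for a Server header or an error page body that the normal responses did not expose.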

Identifying default file and directory structures

Default file and directory structures also play a major role in HTTP fingerprinting. Analysing directory naming conventions is a great way to discover which framework the web server is using, while file extensions can reveal the underlying programming language.

Files like package.json, changelog.txt, README.txt and LICENSE.txt often contain version details that confirm elements of the technology stack, while an unchanged default favicon.ico can identify a product by its hash. Once these structures are mapped, you can strategically focus your HTTP recon efforts on known vulnerabilities tied to the identified framework or language.
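
The file extension check in particular is easy to automate. The sketch below maps a few well-known extensions to the language or platform they usually indicate; the table and function name are assumptions for illustration, and extensions can of course be rewritten or hidden by URL routing.

```python
import posixpath
from urllib.parse import urlsplit

# A small, illustrative extension table; not exhaustive.
EXTENSION_HINTS = {
    ".php": "PHP",
    ".jsp": "Java (JSP)",
    ".aspx": "ASP.NET",
    ".asp": "Classic ASP",
    ".cfm": "Adobe ColdFusion",
}


def language_from_url(url):
    """Return the language hinted at by the URL's file extension, or None."""
    path = urlsplit(url).path
    _, extension = posixpath.splitext(path)
    return EXTENSION_HINTS.get(extension.lower())


print(language_from_url("https://example.com/login.php?next=/"))
print(language_from_url("https://example.com/app/Default.ASPX"))
```

Parsing the URL first ensures that query strings and fragments do not interfere with the extension lookup.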

Analysing cookie parameters

Web servers and frameworks implement unique session cookie parameters that reliably indicate which programming language or framework is being used, even when hidden behind a reverse proxy. The complexity of configuring and masking these identifiers across diverse environments makes analysing cookie parameters a consistently reliable way to fingerprint a server.

By pinpointing the back-end architecture in place, you can focus your pentesting strategy on technology-specific vulnerabilities rather than wasting time on irrelevant exploits.

Examples of cookie parameter names used by various technologies:

PHPSESSID (PHP)
JSESSIONID (Java)
ASP.NET_SessionId (ASP.NET)
CFTOKEN/CFID (Adobe ColdFusion)
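
A lookup over the cookie names listed above can be sketched as follows. The function name is hypothetical, and the input is assumed to be the raw values of a response's Set-Cookie headers.

```python
# Default session cookie names from the list above, lower-cased for matching.
COOKIE_HINTS = {
    "phpsessid": "PHP",
    "jsessionid": "Java",
    "asp.net_sessionid": "ASP.NET",
    "cftoken": "Adobe ColdFusion",
    "cfid": "Adobe ColdFusion",
}


def technologies_from_set_cookie(set_cookie_values):
    """Return a sorted list of technologies hinted at by Set-Cookie header values."""
    found = set()
    for raw in set_cookie_values:
        # The cookie name is everything before the first '='.
        name = raw.split("=", 1)[0].strip().lower()
        technology = COOKIE_HINTS.get(name)
        if technology:
            found.add(technology)
    return sorted(found)


print(technologies_from_set_cookie(["PHPSESSID=abc123; path=/", "theme=dark"]))
```

Applications can rename their session cookies, so a miss here does not rule a technology out.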

Passive fingerprinting via third-party services

Third-party services such as Wappalyzer and Shodan are excellent fingerprinting resources, in part thanks to their large databases of technology signatures collected from a wide range of web applications. They can be used to efficiently fingerprint your target's web server by aggregating historical data and patterns that might not otherwise be visible, saving you significant time and effort.

Shodan provides useful insights about a web application’s open ports and running services. Because it also stores HTTP responses from the web application, you can apply the header analysis techniques described earlier to its data without sending a request to the target yourself.

Wappalyzer, on the other hand, detects the underlying technology stack of a web application. This browser extension reveals frameworks, programming languages and third-party services powering websites. Lookups might confirm suspicions you had about the presence of certain technologies, or reveal components that would have stayed entirely hidden if you relied on manual inspection.

By combining this passively-collected data with your active findings, you gain a full-spectrum view of the target’s architecture and give yourself the best chance of spotting inconsistencies that signal potential misconfigurations.

Defences against HTTP fingerprinting

A crucial first step for protecting against HTTP fingerprinting is to suppress or customise server headers. Removing or rewriting version information in Server or X-Powered-By headers should hide the real server banner. Creating misleading values forces adversaries to expend additional effort guessing or probing for system details, and complicates any automated exploits that rely on easily accessible version data.

Default responses supplied by technologies such as Tomcat or Nginx often include logos or text that reveals a server’s identity. Replacing these default pages – or at least deleting the exposed version – makes HTTP fingerprinting significantly more difficult for the attacker.

To give some specific scenarios: with Apache, administrators can edit httpd.conf to set ServerTokens Prod, which trims the Server header to the bare product name, and ServerSignature Off, which removes the version footer from server-generated pages. (ServerTokens is only valid in the main server configuration, whereas ServerSignature can also be set in .htaccess.) In Nginx, similar results can be achieved by setting the server_tokens directive to off in the nginx.conf file. For Tomcat, achieving this involves editing the server.xml file to remove the server banner.

Another defensive layer can come in the form of web application firewalls that intercept and modify outbound HTTP headers.

Removing unnecessary files and directories is also advisable. Items such as changelog.txt, README.txt or LICENSE.txt can contain explicit technology references or version details. If these files are no longer needed, they should be deleted; if they are needed, then an access control should be configured to prevent unauthorised users from viewing their contents.

Regularly reviewing server configurations and conducting periodic vulnerability assessments can ensure that header customisations and file restrictions remain effective against evolving fingerprinting techniques. Automated monitoring tools like intrusion detection systems and continuous compliance scanners can also alert administrators to any configuration drifts or the resurfacing of sensitive default responses.
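
Part of that periodic review can be scripted. The sketch below flags two of the leaks discussed in this section: a Server header that still carries a version number, and any X-Powered-By header. The function name and the wording of the findings are assumptions for illustration, and a real review would cover far more than these two checks.

```python
import re

# Matches dotted version numbers such as 2.4.57 or 1.21.
VERSION_PATTERN = re.compile(r"\d+(?:\.\d+)+")


def audit_headers(headers):
    """Return human-readable findings for response headers that leak stack details."""
    findings = []
    for name, value in headers.items():
        lower = name.lower()
        if lower == "server" and VERSION_PATTERN.search(value):
            findings.append(f"Server header exposes a version: {value}")
        elif lower == "x-powered-by":
            findings.append(f"X-Powered-By header is present: {value}")
    return findings


print(audit_headers({"Server": "Apache/2.4.57 (Unix)", "X-Powered-By": "PHP/8.2.1"}))
print(audit_headers({"server": "nginx"}))
```

Run against each public endpoint after a deployment, a check like this can catch a configuration change that quietly restores a default banner.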

Conclusion: A more targeted approach to bug hunting

You are now familiar with an essential recon technique in the arsenal of security researchers and Bug Bounty hunters. You’ve learned about various methods and tools for performing HTTP fingerprinting on web applications in order to extract valuable information about the underlying technology stack, and you have some foundational knowledge of how different technologies respond to each technique and what information they might expose.

Some methods work better than others in certain contexts. Being able to deploy them all, in relevant combinations for the environment, can help you build a profile of your target’s attack surface.

By mastering HTTP fingerprinting, you therefore gain a strategic advantage in identifying vulnerabilities, since you can tailor your exploitation techniques to the attack vectors you’ve uncovered. In other words, it means less wasted time and a quicker path to real vulnerabilities – and real Bug Bounty rewards.

* The Apache reverse proxy is the most likely answer.

Recon series #3: HTTP fingerprinting – sleuthing for a web application’s hidden vulnerabilities
