Recon series #5: A hacker’s guide to Google dorking

May 27, 2025

Bug Bounty recon series on information gathering with search engines – aka Google Dorking – which is illustrated with a magnifying glass zooming in on a web browser.

Google dorking is a comparatively simple yet invaluable reconnaissance technique for ethical hackers to learn.

Suitably customised ‘dorks’ (Google searches that reveal sensitive information about targets) can uncover hidden admin panels, misconfigured subdomains and exposed credentials within minutes – all without sending a single direct scan. Master common Google dorking operators like site:, inurl: and filetype: and this passive recon technique can provide intel that paves the way to finding hitherto overlooked vulnerabilities.

The power of Google dorking (aka Google hacking) was creatively demonstrated, for instance, by Suraj Khetani of Unit 42, whose research on leveraging dorks to find zero-days was a contender for PortSwigger’s web hacking techniques awards for 2017.

This article will elaborate on the value of dorking to Bug Bounty hunters, and explain how to conduct dorking queries in a multitude of effective ways.

Outline

  • What is Google dorking?
  • Why Google dorking is an essential skill for Bug Bounty hunters
  • Core Google dorking operators for passive reconnaissance
  • Basic Google dorking queries: hands-on examples
    • Subdomain discovery techniques
    • Hunting for hidden files
    • Identifying login portals
    • Harvesting exposed credentials
  • Tailoring Google dorks to your targets
  • Best practices for preventing Google dorking attacks
  • Next steps: from passive recon to exploit development
  • References

What is Google dorking?

Google dorking (aka Google hacking) leverages specialised search operators to uncover publicly indexed resources such as files, directories and login pages that organisations never intended to expose. Unlike active scanning techniques like web crawling or DNS brute-forcing, dorking is entirely passive: your queries go to Google rather than to the target, so no trace of your activities is left on your target’s systems until you actually visit a result.

Why Google dorking is an essential skill for Bug Bounty hunters

For Bug Bounty hunters, mastering Google dorking means covering more ground with less effort and identifying vulnerabilities other hunters might miss.

This recon method empowers ethical hackers to quickly gather high-quality, publicly available intelligence about target applications. Mastering Google search operators can usefully augment recon techniques like fingerprinting, and helps uncover exposed assets such as forgotten subdomains, misconfigured directories and sensitive files.

ExploitDB’s community-driven Google dorks database serves as a particularly valuable resource for staying up to date with evolving search techniques and attack vectors.

Core Google dorking operators for passive reconnaissance

Below is a list of Google search operators commonly used in advanced dorking:

  • define - Shows definition of a word
  • cache - Displays Google's cached version of a page
  • filetype - Finds files of a specific type (e.g., PDF)
  • ext - Same as filetype: (alternate form)
  • site - Searches within a specific website or domain
  • related - Finds sites similar to a given URL
  • intitle - Finds pages with the term in the title
  • allintitle - All terms must appear in the title
  • inurl - Finds pages with the term in the URL
  • allinurl - All terms must appear in the URL
  • intext - Finds pages with the term in the body text
  • allintext - All terms must appear in the body text
  • weather - Shows weather for a location
  • stocks - Displays stock info for a ticker
  • map - Shows map for a location
  • movie - Finds info about a movie or showtimes
  • source - Filters news by source (used in Google News)
  • before - Finds results published before a date
  • after - Finds results published after a date
  • inanchor - Finds pages with the term in anchor text
  • allinanchor - All terms must be in anchor text
  • loc - Limits results to a specific location
  • location - Similar to loc:, for news location filtering
  • daterange - Filters by Julian date range
  • AROUND(X) - Finds words near each other, within X words
  • OR - Searches for either term
  • AND - Implies both terms must appear
  • | - Same as OR
  • - - Excludes results containing the term
  • * - Acts as a wildcard for one or more unknown words
  • () - Groups terms or operators
  • "" - Searches for the exact phrase inside quotes
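
Operators are combined simply by separating them with spaces, with a leading minus sign for exclusions. As a rough sketch of how that composition works, the Python helper below assembles a query string from operator/value pairs. The build_dork function, its parameters and the example.com target are hypothetical and only illustrate the idea; they are not part of any Google tooling.

```python
def build_dork(site=None, include=None, exclude=None, phrases=None):
    """Assemble a Google dork string.

    include / exclude are lists of (operator, value) tuples, e.g. ("inurl", "login").
    phrases are exact-match strings that get wrapped in double quotes.
    """
    parts = []
    if site:
        parts.append(f"site:{site}")
    for op, value in include or []:
        parts.append(f"{op}:{value}")
    for phrase in phrases or []:
        parts.append(f'"{phrase}"')
    for op, value in exclude or []:
        # A leading minus sign excludes results matching the term.
        parts.append(f"-{op}:{value}")
    return " ".join(parts)


query = build_dork(
    site="*.example.com",  # placeholder target, not a real scope
    include=[("inurl", "login"), ("ext", "jsp")],
    exclude=[("inurl", "assets")],
)
print(query)  # site:*.example.com inurl:login ext:jsp -inurl:assets
```

The resulting string can be pasted into the search box as-is.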

Basic Google dorking queries: hands-on examples

In this section, we'll dive into popular Google dorking techniques that can uncover valuable assets within your Bug Bounty target:

Subdomain discovery techniques

Subdomain enumeration is a key reconnaissance step for mapping an organisation's digital footprint. These searches for related subdomains should cover as much ground as possible in order to get a comprehensive view of your target’s infrastructure.

Basic subdomain discovery

The simplest query for discovering subdomains is:

site:*.google.com
  • Function: Returns pages on subdomains of the target domain that Google has crawled and indexed, from which the subdomain names can be collected.
  • Recon value: Expands the attack surface by revealing additional entry points that may have been overlooked during initial scoping.
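
Search results are page URLs rather than a ready-made subdomain list, so the hostnames still need to be pulled out and de-duplicated. A minimal sketch, assuming you have already copied the result URLs into a list by hand (the extract_subdomains helper and the sample URLs are illustrative, not real findings):

```python
from urllib.parse import urlparse


def extract_subdomains(urls, root_domain):
    """Return the sorted, de-duplicated hostnames under root_domain found in urls."""
    root = root_domain.lower().strip(".")
    hosts = set()
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Keep the root itself and anything ending in ".<root>".
        if host == root or host.endswith("." + root):
            hosts.add(host)
    return sorted(hosts)


results = [
    "https://developers.example.com/docs",
    "https://mail.example.com/inbox?x=1",
    "https://developers.example.com/api",
    "https://example.org/unrelated",
]
print(extract_subdomains(results, "example.com"))
# ['developers.example.com', 'mail.example.com']
```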

Targeted subdomain discovery

You can also combine multiple operators, wrapping keywords in double quotation marks, to discover subdomains with a specific keyword in the URL:

site:*.google.com inurl:"developer"
  • Function: Searches for pages on all subdomains of google.com that include a specified keyword (‘developer’ in the above example) in the URL. The keyword can appear either in the hostname itself or in the endpoint.
  • Recon value: Uncovers development and staging environments, which typically have weaker security controls, making them prime targets for initial compromise. These environments may contain debugging information, default credentials or unpatched vulnerabilities.
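
The same pattern can be repeated for several keywords at once. The sketch below generates one targeted dork per keyword; the keyword list and the keyword_dorks helper are assumptions chosen for illustration, and should be replaced with naming conventions you actually observe on your target.

```python
# Hypothetical keywords that often hint at non-production environments.
KEYWORDS = ["dev", "staging", "test", "uat"]


def keyword_dorks(domain, keywords=KEYWORDS):
    """Build one targeted subdomain dork per keyword."""
    return [f'site:*.{domain} inurl:"{kw}"' for kw in keywords]


for dork in keyword_dorks("example.com"):
    print(dork)
```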

Hunting for hidden files

Discovering hidden files within domains can not only expose sensitive file content but also help fingerprint the programming language, framework or running services behind an application.

Programming language fingerprinting

The following dork discovers filenames with a specified extension in order to determine the programming language:

site:*.google.com ext:php
  • Function: Identifies files with specific extensions across all subdomains, revealing the programming languages and frameworks in use.
  • Recon value: Understanding the technology stack helps researchers focus efforts on language-specific vulnerabilities and common misconfigurations.
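
Once a few extension-based dorks have returned results, tallying the extensions gives a quick picture of the stack. This is a minimal sketch assuming a hand-collected list of result URLs; the EXTENSION_HINTS mapping is a deliberately small, hypothetical example, and a file extension is only a hint, not proof of the underlying technology.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical, deliberately small mapping of extensions to likely technologies.
EXTENSION_HINTS = {
    ".php": "PHP",
    ".jsp": "Java (JSP)",
    ".aspx": "ASP.NET",
}


def tally_technologies(urls):
    """Count likely technologies based on the file extension in each URL path."""
    counts = Counter()
    for url in urls:
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        hint = EXTENSION_HINTS.get(suffix)
        if hint:
            counts[hint] += 1
    return counts


urls = [
    "https://a.example.com/index.php",
    "https://a.example.com/login.jsp?next=/home",
    "https://b.example.com/admin/view.PHP",
    "https://b.example.com/",
]
print(tally_technologies(urls).most_common())
# [('PHP', 2), ('Java (JSP)', 1)]
```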

Framework and content management system (CMS) detection

We can also look for files that reveal the use of specific frameworks and CMS platforms. The following Google dork identifies WordPress installs by finding pages on google.com subdomains with a php file extension and the ‘wp-’ keyword in the URL:

site:*.google.com ext:php inurl:"wp-"
  • Function: Searches for content management system indicators within PHP file URLs, here the ‘wp-’ prefix used by WordPress paths such as wp-login.php and wp-admin. Other platforms, such as Drupal and Joomla, need their own keywords.
  • Recon value: CMS platforms have well-documented vulnerabilities. Dorks can identify which versions are running and help researchers determine if known CVEs apply to the target system.

Identifying login portals

Login interfaces represent critical security boundaries and are frequent targets for various attack vectors: weak credentials invite credential stuffing and brute-force attempts, while vulnerable input handling can enable hackers to mount SQL, NoSQL, LDAP and other injection attacks.

Basic login portal discovery

A basic query for discovering logins:

site:*.google.com inurl:login
  • Function: Locates authentication-related URLs across the target domain.
  • Recon value: Each login portal represents a potential attack vector for credential stuffing or injection attacks. Authentication weaknesses can be leveraged to obtain direct access to your target applications.

Advanced login portal identification

A more targeted query might for instance search for all google.com domains with the keyword ‘login’, the extension ‘jsp’, and the keywords ‘username’ and ‘password’ within the webpage, while ignoring URLs containing the keyword ‘assets’, which typically indicates template login files rather than real login pages:

site:*.google.com inurl:login ext:jsp intext:"username" AND intext:"password" -inurl:assets
  • Function: Identifies Java-based login forms containing username and password fields while filtering out static assets and template files.
  • Recon value: Login portals are prime targets for compromising a system, and this query is a more precise, effective way to surface active or legacy login pages.

Harvesting exposed credentials

Exposed credentials, which can lead to immediate system compromise and privilege escalation, are always a high-impact Bug Bounty find. But remember: be careful not to leak or misuse any credentials uncovered.

Text file credential hunting

The following dork searches for all google.com-related domains with the txt file extension and the password or credentials keyword in the URL, excluding results containing the readme.txt filename:

site:*.google.com ext:txt inurl:password OR inurl:credentials -inurl:readme.txt
  • Function: Searches for text files with credential-related keywords in the URL while excluding common documentation files.
  • Recon value: Exposed text files with credentials can improve our chances of gaining access to applications within our targets.

Environment file discovery

Another effective way to find exposed credentials is to target common filenames that might include passwords or sensitive credentials, such as the .env file:

site:*.google.com filetype:env intext:password
  • Function: Targets files that often store environment variables, including passwords, API keys and encryption secrets.
  • Recon value: Environment files often contain production credentials and sensitive configuration data that can facilitate lateral movement and privilege escalation.
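
Because uncovered credentials must not be leaked or misused, it helps to mask secret values before pasting file contents into notes or a report. The following is one possible sketch: the redact_env helper and the SENSITIVE pattern are hypothetical, and the pattern is a simple heuristic that will miss some secrets and should be reviewed manually.

```python
import re

# Assumed (hypothetical) key-name fragments treated as sensitive.
SENSITIVE = re.compile(r"(PASSWORD|SECRET|TOKEN|KEY)", re.IGNORECASE)


def redact_env(text):
    """Mask the values of sensitive-looking KEY=VALUE lines, keeping key names."""
    out = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and value and SENSITIVE.search(key):
            line = f"{key}=<redacted, {len(value)} chars>"
        out.append(line)
    return "\n".join(out)


sample = "DB_HOST=db.internal\nDB_PASSWORD=hunter2\n# comment"
print(redact_env(sample))
```

Keeping the key name and the value length is usually enough to demonstrate impact in a report without reproducing the secret itself.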

Tailoring Google dorks to your targets

Customising Google dorks to suit your Bug Bounty target will give you the most actionable results possible. Generic searches will typically only get you so far.

An effective workflow lies in iterative refinement: starting with broad searches; analysing your results to identify patterns, technologies or naming conventions specific to your target; and then progressively crafting more precise queries to improve the accuracy and relevance of your findings.

So keep track of your dorks and their results, then modify your searches accordingly, and you can achieve a more focused methodology that consistently delivers better results than a one-size-fits-all approach.
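
One way to keep track of dorks and refine them is a small log of queries, result counts and notes. The sketch below is an assumption about how such a log could look; DorkLog, DorkRecord, refine and the 100-result threshold are hypothetical names and values, and the result counts are entered by hand after running each search.

```python
from dataclasses import dataclass, field


@dataclass
class DorkRecord:
    query: str
    result_count: int
    notes: str = ""


@dataclass
class DorkLog:
    records: list = field(default_factory=list)

    def add(self, query, result_count, notes=""):
        self.records.append(DorkRecord(query, result_count, notes))

    def too_broad(self, max_results=100):
        """Queries with more hits than can realistically be reviewed by hand."""
        return [r.query for r in self.records if r.result_count > max_results]


def refine(query, extra_terms):
    """Narrow a dork by appending further operators or exclusions."""
    return " ".join([query, *extra_terms])


log = DorkLog()
log.add("site:*.example.com", 12000, "far too broad")
log.add('site:*.example.com inurl:"staging"', 40, "naming pattern spotted")

for query in log.too_broad():
    print(refine(query, ['inurl:"staging"', "-inurl:blog"]))
# site:*.example.com inurl:"staging" -inurl:blog
```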

Best practices for preventing Google dorking attacks

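Organisations can reduce the risk of unwanted exposure of sensitive information by shifting from reactive to proactive indexing control. Rather than waiting for a sensitive endpoint to appear in search results before acting, defenders should prevent indexing from the outset by embedding noindex directives in robots meta tags and, at the protocol level, in the X-Robots-Tag header, while using robots.txt to steer crawlers away from areas that should not be crawled. Note that robots.txt only controls crawling: it does not support noindex, a disallowed URL can still be indexed if other sites link to it, and a crawler that is blocked from a page never sees that page’s noindex directive. These proactive indexing control measures help keep confidential resources such as admin panels, backups and configuration files out of search results in the first place, but they complement rather than replace proper access controls.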
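
Defenders can audit their own responses for these directives. As a minimal sketch, the function below reports whether a response asks not to be indexed, given its headers and HTML body. The is_noindexed helper is hypothetical, it only looks for the noindex keyword in the X-Robots-Tag header and the robots meta tag, and it ignores variants such as crawler-specific meta names.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects the content of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives.append(attr.get("content", "").lower())


def is_noindexed(headers, html=""):
    """Return True if the headers or the HTML carry a noindex directive."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    parser = _RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in directive for directive in parser.directives)


print(is_noindexed({"X-Robots-Tag": "noindex, nofollow"}))  # True
print(is_noindexed({}, '<meta name="robots" content="noindex">'))  # True
print(is_noindexed({"Content-Type": "text/html"}, "<title>Admin</title>"))  # False
```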

Next steps: from passive recon to exploit development

After completing passive reconnaissance with Google dorking, the next crucial step is organising your data and automating workflows. Consolidate all discovered URLs, credentials and subdomains into a structured format, such as CSV files or a database. This centralised data store can then form the foundation for your active scanning, subdomain enumeration and exploit development.
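
Consolidating findings into CSV could look something like the sketch below, which writes de-duplicated rows with Python's standard csv module. The column names, the write_findings helper and the sample rows are assumptions for illustration; adapt the fields to whatever your workflow needs.

```python
import csv
import io

# Hypothetical column layout for a findings file.
FIELDS = ["dork", "url", "category"]


def write_findings(findings, fileobj):
    """Write findings as CSV, skipping duplicate URLs. Returns the rows written."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    seen = set()
    for finding in findings:
        if finding["url"] in seen:
            continue
        seen.add(finding["url"])
        writer.writerow(finding)
    return len(seen)


findings = [
    {"dork": "site:*.example.com inurl:login", "url": "https://sso.example.com/login", "category": "login portal"},
    {"dork": "site:*.example.com ext:php", "url": "https://blog.example.com/index.php", "category": "fingerprint"},
    {"dork": "site:*.example.com inurl:sso", "url": "https://sso.example.com/login", "category": "login portal"},
]

buffer = io.StringIO()
print(write_findings(findings, buffer))  # 2
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open("findings.csv", "w", newline="").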

Google dorking is an underrated reconnaissance skill, perhaps because it requires less technical expertise than other recon methods. Potentially uncovering forgotten subdomains, exposed login portals or even sensitive files without directly interacting with the target, it can give you a serious edge as a Bug Bounty hunter. It's passive, powerful, beginner-friendly and, honestly, kind of fun once you get the hang of it.

References