OSINT Methods: Tools and Techniques for Open Source Intelligence

OSINT (Open Source Intelligence) refers to the systematic collection and analysis of publicly available information for security and reconnaissance purposes. This article explains OSINT methods for corporate research: DNS enumeration (dnsx, amass, subfinder), Google Dorking, Shodan/Censys, Certificate Transparency, social media OSINT, WHOIS analysis, and passive reconnaissance frameworks such as Maltego and SpiderFoot.


OSINT (Open Source Intelligence) is the foundation of every professional penetration test. Before attackers actively probe a system, they often spend hours or days gathering publicly available information. The goal is a complete picture of the attack surface, ideally without sending a single packet to the target network. What attackers can see, defenders need to know as well.

OSINT Framework and Phases

OSINT Reconnaissance Framework:

Passive OSINT (without direct interaction with the target):
  → DNS information from public sources
  → WHOIS data
  → Certificate Transparency logs
  → Google Dorking
  → Shodan/Censys (cached scans)
  → Social media and company websites
  → Job postings (technology stack identifiable!)
  → Pastebin / Dark Web leaks
  → Code repositories (GitHub, GitLab)

Semi-Passive OSINT (not directly identifiable):
  → DNS resolution of public records (A, MX, SPF, DMARC)
  → Certificate Transparency log queries
  → Wayback Machine (archive.org)
  → Web crawling with passive fingerprinting

Active reconnaissance (direct, identifiable):
  → Port scanning (nmap)
  → Banner grabbing
  → Web technology fingerprinting
  → WAF detection
  → Subdomain brute force (active, sending DNS requests)

OSINT Framework Categories:
  osintframework.com: Categorized overview of all tools
  Categories: Username, Email, Domain, IP/Network, Social Media,
              Dark Web, Documents, Images, Phone, Business
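
The phase model above can be encoded as a small lookup so that a recon plan can be checked against the agreed scope. This is a minimal sketch: the technique labels, the `Phase` enum, and `allowed_techniques` are illustrative names, not part of any tool mentioned here.

```python
from enum import IntEnum
from typing import Dict, List


class Phase(IntEnum):
    PASSIVE = 1
    SEMI_PASSIVE = 2
    ACTIVE = 3


# Technique -> phase, following the framework listed above (labels are illustrative).
TECHNIQUES: Dict[str, Phase] = {
    "whois": Phase.PASSIVE,
    "certificate_transparency": Phase.PASSIVE,
    "google_dorking": Phase.PASSIVE,
    "shodan_cached": Phase.PASSIVE,
    "dns_public_records": Phase.SEMI_PASSIVE,
    "wayback_machine": Phase.SEMI_PASSIVE,
    "port_scan": Phase.ACTIVE,
    "banner_grabbing": Phase.ACTIVE,
    "subdomain_bruteforce": Phase.ACTIVE,
}


def allowed_techniques(max_phase: Phase) -> List[str]:
    """Return the techniques whose phase does not exceed the permitted one."""
    return sorted(name for name, phase in TECHNIQUES.items() if phase <= max_phase)
```

Calling `allowed_techniques(Phase.SEMI_PASSIVE)` lists everything that may run before active testing has been approved.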

DNS Enumeration

DNS Enumeration - Discovering subdomains and infrastructure:

Passive Subdomain Enumeration (without sending DNS queries to the target):

subfinder (Project Discovery):
  subfinder -d example.com -o subdomains.txt
  subfinder -d example.com -all -recursive -o subdomains.txt
  # Sources: Certificate Transparency, VirusTotal, Shodan, etc.

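amass (OWASP):
  amass enum -passive -d example.com -o amass-passive.txt
  # Active mode contacts the target's infrastructure and no longer counts as passive:
  amass enum -active -d example.com -o amass-active.txt
  amass db -names -d example.com  # Names already stored in amass's own database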

Certificate Transparency Logs:
  # crt.sh - Query all certificates for a domain:
  curl -s "https://crt.sh/?q=%.example.com&output=json" | \
    jq -r '.[].name_value' | \
    sort -u | grep -vF '*'

  # crt.sh for deeper levels (sub-subdomains, including wildcard entries):
  curl -s "https://crt.sh/?q=%.%.example.com&output=json" | \
    jq -r '.[].name_value' | sort -u
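
The same post-processing can be done in Python once the crt.sh response has been saved. The sketch below assumes the JSON layout used by the commands above, where each entry carries a `name_value` field that may hold several newline-separated names. The function name is hypothetical, and unlike the `grep` filter above it folds wildcard entries into their base name instead of dropping them.

```python
import json
from typing import List


def subdomains_from_crtsh(raw_json: str, domain: str) -> List[str]:
    """Extract unique hostnames under `domain` from a crt.sh JSON response."""
    domain = domain.lower()
    suffix = "." + domain
    names = set()
    for entry in json.loads(raw_json):
        # name_value may hold several SAN entries separated by newlines
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # fold wildcard entries into their base name
            if name == domain or name.endswith(suffix):
                names.add(name)
    return sorted(names)
```

Fetching the response is deliberately left out, so the function can be run against a saved file without touching the network.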

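DNS Resolution and Brute Force (active):
  # dnsx: resolve the collected names and keep those with A records
  dnsx -l subdomains.txt -a -resp -o resolved.txt
  # puredns: brute force with a wordlist and a list of resolvers
  puredns bruteforce wordlist.txt example.com -r resolvers.txt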
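
To illustrate what a wordlist brute force does, here is a minimal sketch using only the standard library. It is not how dnsx or puredns work internally: it uses the system resolver, runs sequentially, and does not filter wildcard DNS. `resolve_a` and `bruteforce` are hypothetical helper names.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional


def resolve_a(hostname: str) -> Optional[List[str]]:
    """Resolve via the system resolver; return None if the name does not resolve."""
    try:
        return sorted(set(socket.gethostbyname_ex(hostname)[2]))
    except (OSError, UnicodeError):
        return None


def bruteforce(
    domain: str,
    words: Iterable[str],
    resolver: Callable[[str], Optional[List[str]]] = resolve_a,
) -> Dict[str, List[str]]:
    """Try every wordlist entry as a subdomain label and keep the names that resolve."""
    found: Dict[str, List[str]] = {}
    for word in words:
        label = word.strip().lower()
        if not label or label.startswith("#"):
            continue  # skip blank lines and comments in the wordlist
        host = f"{label}.{domain}"
        addresses = resolver(host)
        if addresses:
            found[host] = addresses
    return found
```

The resolver is passed in as a parameter so the logic can be exercised offline with a stub; with the default resolver, every call sends real DNS queries and therefore counts as active reconnaissance.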

Analyze DNS record types:
  # MX records (email infrastructure):
  dig MX example.com
  # → Google Workspace? Microsoft 365? Own mail server?

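  # SPF, DMARC, DKIM:
  dig +short TXT example.com | grep -i spf
  dig +short TXT _dmarc.example.com
  # DKIM needs the selector name (SELECTOR is a placeholder):
  dig +short TXT SELECTOR._domainkey.example.com
  # → Shows email security configuration (or lack thereof!)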

  # NS records (nameservers):
  dig NS example.com
  # → Cloudflare? AWS Route 53? Own NS?

  # Attempt an AXFR zone transfer (usually refused, but worth testing; this queries the target's nameserver directly):
  dig @ns1.example.com example.com AXFR
  # If successful: ALL DNS records for the domain!
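
The DMARC record returned by the `_dmarc` query above can be checked programmatically. The following is a rough sketch that only looks at the `v`, `p`, and `rua` tags; the function names and the wording of the findings are illustrative, and a full assessment would also cover SPF and DKIM.

```python
from typing import Dict, List


def parse_dmarc(record: str) -> Dict[str, str]:
    """Split a DMARC TXT record into lowercase tag names and their values."""
    tags: Dict[str, str] = {}
    for part in record.strip().strip('"').split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_findings(record: str) -> List[str]:
    """Return a list of weaknesses; an empty list means nothing obvious was found."""
    tags = parse_dmarc(record)
    if tags.get("v", "").upper() != "DMARC1":
        return ["no valid DMARC record"]
    findings = []
    policy = tags.get("p", "").lower()
    if policy == "none":
        findings.append("policy is p=none (monitoring only, no enforcement)")
    elif policy not in ("quarantine", "reject"):
        findings.append("missing or unknown policy tag")
    if "rua" not in tags:
        findings.append("no aggregate report address (rua)")
    return findings
```

An empty or missing record is reported as a finding as well, since the absence of DMARC is itself relevant for the report.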

Google Dorking

Google Dorking - Sensitive information via search engines:

Basic operators:
  site:example.com           → Only this domain
  filetype:pdf               → Only PDFs
  inurl:/admin               → URL contains /admin
  intitle:"Index of /"       → Directory listings
  intext:"confidential"      → Text in the document
  -site:www.example.com      → Exclude this subdomain

Practical dorking combinations:

  # Find subdomains:
  site:*.example.com -site:www.example.com

  # Login pages:
  site:example.com inurl:login OR inurl:signin OR inurl:auth

  # Configuration files:
  site:example.com filetype:env OR filetype:config OR filetype:cfg

  # Error messages with stack traces:
  site:example.com "stack trace" OR "exception" OR "debug"

  # Open redirects:
  site:example.com inurl:redirect= OR inurl:url= OR inurl:return=

  # Passwords (often in old files):
  site:example.com filetype:txt password OR username

  # Backup files:
  site:example.com filetype:bak OR filetype:backup OR filetype:sql

  # Exposed .git:
  site:example.com inurl:/.git/config

  # phpinfo():
  site:example.com inurl:phpinfo.php

  # Jenkins/CI-CD:
  site:example.com inurl:jenkins OR inurl:gitlab OR inurl:bitbucket
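
The combinations above lend themselves to templating, so the same set of dorks can be generated for any domain in scope. A minimal sketch follows; the template names and both helper functions are hypothetical, and the URL helper only builds a search link, it does not query Google.

```python
from typing import Dict
from urllib.parse import quote_plus

# A few of the combinations from above, with the domain left as a placeholder.
DORK_TEMPLATES: Dict[str, str] = {
    "login_pages": "site:{domain} inurl:login OR inurl:signin OR inurl:auth",
    "config_files": "site:{domain} filetype:env OR filetype:config OR filetype:cfg",
    "backup_files": "site:{domain} filetype:bak OR filetype:backup OR filetype:sql",
    "exposed_git": "site:{domain} inurl:/.git/config",
}


def build_dorks(domain: str) -> Dict[str, str]:
    """Fill the domain into every template."""
    return {name: tpl.format(domain=domain) for name, tpl in DORK_TEMPLATES.items()}


def search_url(query: str) -> str:
    """Return a URL that opens the query in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)
```

Opening the generated links manually keeps the request volume low, which matters because search engines tend to throttle automated dorking.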

Google Dorks Automation:
  # Script names and flags differ between releases; check each tool's --help first.
  # ghdb-scraper (Google Hacking Database):
  python3 ghdb.py -q "site:example.com" -d "vulnerability"
  # GoogD0rker:
  python3 googd0rker.py -q "site:example.com" -t web

Shodan and Censys

Shodan - Search engine for connected devices:

Basic Shodan searches:
  # Direct IP address:
  net:203.0.113.1

  # By organization or ASN:
  org:"Example GmbH"
  asn:AS12345

  # By technology:
  product:"Microsoft IIS"
  product:"Apache httpd" version:"2.4.51"

  # By port/service:
  port:3389 org:"Example GmbH"    # RDP to org
  port:22 country:DE               # SSH in Germany

  # Certificate information:
  ssl.cert.subject.cn:"*.example.com"
  ssl.cert.expired:true org:"Example GmbH"  # Expired certificates!

Shodan CLI:
  # Installation:
  pip install shodan
  shodan init YOUR_API_KEY

  # Domain info:
  shodan domain example.com

  # IP info:
  shodan host 203.0.113.1

  # Save search results to a file:
  shodan search --fields ip_str,port,org --limit 1000 \
    'org:"Example GmbH"' > shodan_results.txt

Censys (alternative with a stronger SSL/TLS focus):
  search.censys.io or censys.io/api

  # Python API (censys package). Field and class names depend on the
  # installed version and API generation, so verify them against the
  # current Censys documentation before use:
  from censys.search import CensysHosts
  h = CensysHosts()
  results = h.search("ip_addresses.reverse_dns.reverse_dns:'example.com'")

  # All IPs of an organization:
  results = h.search("autonomous_system.organization_id:'NNNNNN'")

  # Certificates:
  from censys.search import CensysCertificates
  c = CensysCertificates()
  certs = c.search("parsed.subject.common_name:'*.example.com'")

What you'll find:
  → Forgotten test servers (example-test.example.com:8080)
  → Outdated TLS/SSL versions and expired certificates (known CVEs)
  → Exposed admin panels (Grafana, Jenkins, Kibana)
  → Default credentials on network devices
  → Exposed databases (MongoDB, Elasticsearch, Redis)
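
Findings like these can be pulled out of saved host results automatically. The sketch below assumes a dictionary shaped like a Shodan host lookup, with an `ip_str` field and a `data` list whose entries carry a `port`. That shape is an assumption to verify against your own export, and the port table covers default ports only.

```python
from typing import Dict, List, Tuple

# Default ports of services that should rarely be reachable from the internet.
RISKY_PORTS: Dict[int, str] = {
    3389: "RDP",
    27017: "MongoDB",
    9200: "Elasticsearch",
    6379: "Redis",
    5601: "Kibana",
    3000: "Grafana",
}


def flag_exposed_services(host: dict) -> List[Tuple[str, int, str]]:
    """Return (ip, port, service) for every banner on a port listed in RISKY_PORTS."""
    ip = host.get("ip_str", "unknown")
    findings = set()
    for banner in host.get("data", []):
        port = banner.get("port")
        if port in RISKY_PORTS:
            findings.add((ip, port, RISKY_PORTS[port]))
    return sorted(findings)
```

A match on the port number is only a hint. The banner itself still has to confirm which service is actually listening.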

GitHub and Code Repositories

GitHub OSINT - Finding source code leaks:

GitHub search:
  # Search for organization:
  org:example-gmbh

  # Sensitive files:
  org:example-gmbh filename:.env
  org:example-gmbh filename:config.json password
  org:example-gmbh filename:docker-compose.yml

  # API keys in the code:
  org:example-gmbh "AWS_SECRET_ACCESS_KEY"
  org:example-gmbh "AKIA"               # AWS Key Prefix
  org:example-gmbh "-----BEGIN RSA PRIVATE KEY-----"
  org:example-gmbh "ghp_"               # GitHub Personal Access Token

Automated GitHub search:

  trufflehog (Secrets scanner):
    trufflehog github --org=example-gmbh
    trufflehog git https://github.com/example-gmbh/repo

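  gitleaks:
    gitleaks detect --source . --report-path leaks.json
    # Remote repositories must be cloned first; --source expects a local path:
    git clone https://github.com/example-gmbh/repo && \
      gitleaks detect --source repo --report-path leaks.json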

  gitrob:
    gitrob analyze --github-access-token TOKEN \
      --organization example-gmbh

Analyze commits:
  # Search git log (locally):
  git log --all -p | grep -i "password\|secret\|key\|token"

  # git-secrets (prevents secrets from being committed):
  git secrets --install
  git secrets --register-aws

Common Leaks:
  → AWS Access Keys (AKIA... prefix)
  → Private SSH/TLS keys
  → Database passwords in .env files
  → API keys (Stripe, Twilio, SendGrid, etc.)
  → JWT secrets (HMAC key in plain text)
  → Hardcoded production credentials in tests
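
Several of these leak types have recognizable formats, which is what scanners such as trufflehog and gitleaks build on. The sketch below is a heavily simplified illustration with three patterns (AWS access key IDs, classic GitHub personal access tokens, PEM private key headers). The pattern set and function name are illustrative and nowhere near the coverage of the real tools.

```python
import re
from typing import List, Tuple

SECRET_PATTERNS = {
    # "AKIA" followed by 16 uppercase letters or digits
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # classic GitHub personal access token: "ghp_" plus 36 alphanumeric characters
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    # PEM header of an unencrypted private key
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line number, pattern name) for every line that matches a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

Piping `git log --all -p` output into such a function also covers secrets that were committed once and deleted later.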

Social Engineering Reconnaissance

Social Media and Corporate OSINT:

LinkedIn:
  → List of employees at the target organization
  → Technology stack from job postings!
    "Seeking Python developers with Django and AWS experience"
    → Reveals technology stack
  → Derive email format (firstname.lastname@example.com?)
  → Organizational chart / decision-making structures
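
Once a list of employee names is available, candidate addresses for the usual formats can be generated and compared with addresses that are already known. A minimal sketch with hypothetical format names: it strips diacritics via Unicode normalization, so "ü" becomes "u" and not "ue", which may not match every company's convention.

```python
import unicodedata
from typing import Dict

EMAIL_FORMATS: Dict[str, str] = {
    "first.last": "{first}.{last}",
    "flast": "{f}{last}",
    "firstlast": "{first}{last}",
    "first": "{first}",
}


def _ascii(text: str) -> str:
    """Lowercase, strip diacritics, and keep only letters, digits, and hyphens."""
    decomposed = unicodedata.normalize("NFKD", text)
    plain = decomposed.encode("ascii", "ignore").decode("ascii").lower()
    return "".join(ch for ch in plain if ch.isalnum() or ch == "-")


def candidate_emails(full_name: str, domain: str) -> Dict[str, str]:
    """Build one candidate address per format from the first and last name part."""
    parts = [p for p in (_ascii(p) for p in full_name.split()) if p]
    if len(parts) < 2:
        return {}
    values = {"first": parts[0], "last": parts[-1], "f": parts[0][0]}
    return {name: tpl.format(**values) + "@" + domain for name, tpl in EMAIL_FORMATS.items()}
```

The format that reproduces a confirmed address is most likely the one used across the organization.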

LinkedIn OSINT Tools:
  linkedin2username: Employee lists → Generate email list
  ScrapedIn / ProspectIn: Automated LinkedIn scraping

Email Enumeration:
  hunter.io:
    # API:
    curl "https://api.hunter.io/v2/domain-search?domain=example.com&api_key=KEY"
    → Email format + known email addresses

  emailhippo / verifalia:
    → Check email validity without sending

WHOIS / Domain Registration:
  whois example.com
  # → Registrant (often anonymized), registrar, creation date
  # → For older registrations: real contact information!

  # Reverse WHOIS (same owner → other domains):
  viewdns.info/reversewhois/?q=admin@example.com

  # Domain history:
  domaintools.com / whoisology.com

Wayback Machine:
  # Old versions of the website:
  web.archive.org/web/*/example.com

  # Older versions often contain:
  → Old employee pages with real email addresses
  → Previous technology versions (older CMS, etc.)
  → Deleted sensitive pages (internal, documentation)
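
Archived URLs can also be listed in bulk through the Wayback Machine's CDX endpoint. The sketch below builds the query URL and parses a saved JSON response, assuming the usual layout in which the first row holds the field names. The helper names and the chosen parameters are assumptions for illustration.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode


def cdx_query_url(domain: str, limit: int = 500) -> str:
    """Build a CDX query for all archived URLs below the domain."""
    params = {
        "url": domain + "/*",
        "output": "json",
        "fl": "timestamp,original",
        "collapse": "urlkey",  # one row per distinct URL
        "limit": str(limit),
    }
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params)


def parse_cdx(raw_json: str) -> List[Dict[str, str]]:
    """Turn the header-plus-rows response into a list of dictionaries."""
    rows = json.loads(raw_json) if raw_json.strip() else []
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]
```

Filtering the parsed rows for paths such as `/team`, `/intern`, or `/docs` quickly surfaces pages that no longer exist on the live site.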

Analyze job postings:
  # Information from job postings:
  "Experience with Fortinet FortiGate preferred"  → Firewall vendor known!
  "AWS Certified Solutions Architect"            → AWS as cloud provider
  "SIEM experience with Splunk"                   → SIEM system known
  "Knowledge of SAP ERP"                       → Business software
  → Attacker knows the technology stack before the first attack!
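
This kind of analysis scales well with a simple keyword table. The sketch below matches whole words, case-insensitively, against a small illustrative dictionary; both the keyword list and the categories are examples and would need to be extended for real use.

```python
import re
from typing import Dict

# Keyword -> category (illustrative selection based on the examples above)
TECH_KEYWORDS: Dict[str, str] = {
    "Fortinet FortiGate": "firewall",
    "AWS": "cloud provider",
    "Splunk": "SIEM",
    "SAP": "business software",
    "Django": "web framework",
}


def technologies_in(posting: str) -> Dict[str, str]:
    """Return every known keyword that appears as a whole word in the posting."""
    found = {}
    for keyword, category in TECH_KEYWORDS.items():
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, posting, flags=re.IGNORECASE):
            found[keyword] = category
    return found
```

Defenders can run the same check on their own job postings before publication to see how much of the stack they reveal.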

OSINT Frameworks and Automation

OSINT Platforms and Automation:

Maltego:
  → Graphical OSINT visualization and correlation
  → Transforms: automated data queries (Shodan, Censys, etc.)
  → Community Edition: free (limited transforms)
  → Ideal for: visualizing relationships, identifying attack paths

SpiderFoot (Open Source):
  # Installation (from the Git repository):
  git clone https://github.com/smicallef/spiderfoot.git
  cd spiderfoot && pip3 install -r requirements.txt

  # Start the web UI:
  python3 sf.py -l 127.0.0.1:5001

  # CLI scan:
  python3 sf.py -s example.com -u all -o json
  # → Automatically: subdomains, IPs, emails, social profiles, leaks

  Over 200 modules integrated:
  → Shodan, Censys, VirusTotal
  → HaveIBeenPwned, Dehashed
  → Certificate Transparency
  → Google, Bing, DuckDuckGo
  → GitHub, GitLab
  → LinkedIn, Twitter (limited)

Recon-ng:
  # Modular recon framework:
  recon-ng
  [recon-ng]> marketplace install all
  [recon-ng]> workspaces create example-gmbh
  [recon-ng]> db insert domains
  Domain: example.com
  [recon-ng]> modules load recon/domains-hosts/hackertarget
  [recon-ng]> run
  # → Subdomains found

theHarvester:
  # Quick email/subdomain enumeration:
  theHarvester -d example.com -l 500 -b google,bing,linkedin,shodan

OSINT Report Template:
  → Target and Scope
  → Passive Findings (without target system contact)
  → Subdomain List (+ active IPs)
  → Email addresses (for phishing simulation scope)
  → Technology stack analysis
  → Exposed services (Shodan)
  → Credential leaks (HIBP, DeHashed)
  → Recommendations: What should be addressed immediately?
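
The template can be kept as a small data structure so that tool output is collected in one place and rendered consistently. A minimal sketch follows; the class, its field names, and the text layout are illustrative choices, not a prescribed report format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OsintReport:
    target: str
    scope: str
    subdomains: List[str] = field(default_factory=list)
    emails: List[str] = field(default_factory=list)
    technologies: List[str] = field(default_factory=list)
    exposed_services: List[str] = field(default_factory=list)
    credential_leaks: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render all sections as plain text, with items sorted per section."""
        sections = [
            ("Subdomains", self.subdomains),
            ("Email addresses", self.emails),
            ("Technology stack", self.technologies),
            ("Exposed services", self.exposed_services),
            ("Credential leaks", self.credential_leaks),
            ("Recommendations", self.recommendations),
        ]
        lines = [f"Target: {self.target}", f"Scope: {self.scope}"]
        for title, items in sections:
            lines.append(f"{title} ({len(items)}):")
            lines.extend(f"  → {item}" for item in sorted(items))
        return "\n".join(lines)
```

Empty sections are still printed with a count of zero, which makes it visible that a category was checked and nothing was found.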


About the Author

Vincent Heinen

Head of Offensive Services

M.Sc. in IT Security with more than 5 years of experience in offensive security analysis. Leads penetration testing engagements, specializing in web applications, network infrastructure, reverse engineering, and hardware security. Responsible for several responsible disclosures.

OSCP+ OSCP OSWP OSWA
This article was last edited on 04.03.2026. Responsible: Vincent Heinen, Head of Offensive Services at AWARE7 GmbH. License: CC BY 4.0 - free use with attribution: "AWARE7 GmbH, https://a7.de"
