Advanced SQL Injection vulnerability scanner using Google dorking techniques, powered by Serper.dev (free, no credit card required).
- Serper.dev Search Integration: Find vulnerable URLs using Google dorks — e.g., `inurl:product?id=`
- Three-Stage SQLi Testing: Comprehensive checks for error-based, boolean-based, and time-based blind SQLi
- Per-Parameter Dynamic Gate: Each parameter is dynamism-checked independently — static params are skipped, active params are fully tested
- Baseline-Aware Time Detection: Measures baseline response time before injecting sleep payloads; avoids false positives on slow networks
- robots.txt Enforcement: Automatically fetches and caches `/robots.txt` for every target; disallowed URLs are skipped
- Scanned URL Memory with Auto-Trim: Remembers previously scanned URLs across runs; trims to the last 10 000 entries to prevent unbounded growth
- Concurrent Scanning: Multi-threaded architecture for efficient scanning (configurable `MAX_WORKERS`)
- Broad Parameter Coverage: 50+ commonly injectable parameter names recognised (id, q, search, sort, type, keyword, …)
- Secure API Key Input: Key hidden while typing via `getpass`; never stored or logged
- Verbose / Quiet CLI Flags: `-v` for per-URL status + debug logging; `-q` for silent mode
- CSV Reporting: Export results for further analysis
- Stealth Mode: Randomised delays and user-agent rotation
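The baseline-aware time check can be sketched as a small decision rule. This is a minimal sketch, not the scanner's actual code: `measure_response_time` and `is_time_based_hit` are hypothetical names, and the comparison assumes the `baseline + TIME_BASED_EXTRA_MARGIN` threshold described in the changelog.

```python
import time
import urllib.request

TIME_BASED_DELAY = 5         # seconds of sleep injected into the payload
TIME_BASED_EXTRA_MARGIN = 2  # seconds above baseline required to confirm

def measure_response_time(url, timeout=15):
    """Return wall-clock seconds taken to fetch `url` (used for the baseline)."""
    start = time.monotonic()
    urllib.request.urlopen(url, timeout=timeout).read()
    return time.monotonic() - start

def is_time_based_hit(baseline_seconds, injected_seconds):
    """Confirm time-based SQLi only when the injected request exceeds
    the measured baseline by the configured margin, rather than a fixed cutoff."""
    return injected_seconds >= baseline_seconds + TIME_BASED_EXTRA_MARGIN
```

Measuring the baseline first is what prevents a slow network (say, a 4-second clean response) from being misread as a successful sleep injection.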
- Sign up at https://serper.dev — no credit card required
- You get 2,500 free queries on signup
- Copy your API key from the dashboard
```bash
git clone https://github.com/xfnx-17/DorkHunter.git
cd DorkHunter
```

```bash
# Linux/macOS
python3 -m venv venv
source venv/bin/activate

# Windows
python -m venv venv
venv\Scripts\activate
```

```bash
pip install -r requirements.txt
```

```bash
python DorkHunter.py [OPTIONS]
```

| Flag | Description |
|---|---|
| `-v, --verbose` | Show per-URL status (CLEAN / SKIPPED / robots.txt blocks) and enable DEBUG-level file logging |
| `-q, --quiet` | Suppress all output except VULNERABLE findings |
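The two flags could be wired with `argparse` roughly as follows. This is a sketch only; treating `-v` and `-q` as mutually exclusive is an assumption, not something the source confirms.

```python
import argparse

def build_cli():
    """Sketch of the -v / -q command-line interface."""
    parser = argparse.ArgumentParser(prog="DorkHunter.py")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-v", "--verbose", action="store_true",
                       help="show per-URL status and enable DEBUG file logging")
    group.add_argument("-q", "--quiet", action="store_true",
                       help="suppress all output except VULNERABLE findings")
    return parser
```

For example, `build_cli().parse_args(["-v"]).verbose` evaluates to `True`, and passing both flags at once produces a usage error.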
- Run the script (optionally with `-v` or `-q`)
- Enter your Serper.dev API key (input is hidden)
- Input a search dork (e.g., `inurl:login.php?id=`)
- Set the maximum number of vulnerable URLs to find
- Choose whether to save results to a CSV report
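To illustrate the broad parameter coverage, here is a hypothetical helper showing how a discovered URL's query parameters might be matched against the recognised names. `candidate_params` and the parameter subset shown are illustrative; the real list has 50+ entries.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative subset of the 50+ recognised injectable parameter names
INJECTABLE_PARAMS = {"id", "q", "search", "sort", "type", "keyword", "query", "lang", "ref"}

def candidate_params(url):
    """Return the query parameters of `url` worth testing for SQLi."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    return sorted(name for name in query if name.lower() in INJECTABLE_PARAMS)
```

A URL such as `http://example.com/product.php?id=3&page=2` would yield only `id` as a candidate, since `page` is not on the list.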
```bash
# Standard run
python DorkHunter.py

# Verbose — see every URL result + debug log entries
python DorkHunter.py -v

# Quiet — only print VULNERABLE hits
python DorkHunter.py -q
```

```
📂 DorkHunter/
├── 📄 DorkHunter.py     — Main scanner script
├── 📄 payloads.txt      — SQL injection payload database
├── 📄 user_agents.txt   — Browser User-Agent strings for rotation
├── 📄 requirements.txt  — Pinned Python dependencies
├── 📄 README.md         — This file
├── 📄 scanner.log       — Auto-created: warnings, errors, debug entries
├── 📄 scanned_urls.txt  — Auto-created: deduplication log (capped at 10 000 entries)
└── 📄 report.csv        — Auto-created when you choose to save results
```
- 🔒 API key input is hidden — never visible in terminal or shell history
- 🔒 API keys are never stored on disk or written to logs
- 🤖 robots.txt is enforced automatically — URLs disallowed by the target's robots.txt are skipped
- ⚖️ Use only on systems you own or have explicit written permission to test
- 📉 Polite random delays between requests minimise Serper.dev quota usage and reduce server load
- 📌 This project is for educational purposes only — use responsibly
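The robots.txt enforcement can be illustrated with the standard library's `urllib.robotparser`. `allowed_by_robots` is a hypothetical helper that checks a URL against already-fetched rules; the actual scanner also fetches and caches `/robots.txt` per host.

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(rules_text, url, user_agent="*"):
    """Check `url` against the robots.txt rules in `rules_text`.
    Returns False for URLs the target has disallowed."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)
```

With rules like `User-agent: *` / `Disallow: /admin/`, any URL under `/admin/` is skipped while the rest of the site remains scannable.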
Key tunable constants at the top of `DorkHunter.py`:
| Constant | Default | Description |
|---|---|---|
| `MAX_WORKERS` | `5` | Concurrent scanning threads |
| `MAX_API_PAGES` | `10` | Maximum Serper.dev result pages to consume per dork |
| `DEFAULT_MAX_VULNERABLE_URLS` | `10` | Default cap on vulnerable URLs to find |
| `MAX_SCANNED_URLS` | `10 000` | Maximum entries kept in `scanned_urls.txt` |
| `BOOLEAN_RATIO_THRESHOLD` | `0.15` | SequenceMatcher ratio delta to flag boolean SQLi |
| `BOOLEAN_LENGTH_THRESHOLD` | `50` | Byte-length delta to flag boolean SQLi |
| `TIME_BASED_DELAY` | `5` | Sleep seconds injected into time-based payloads |
| `TIME_BASED_EXTRA_MARGIN` | `2` | Seconds above baseline required to confirm time-based SQLi |
| `DELAY_BETWEEN_REQUESTS` | `(1, 3)` | Random polite delay range between API pages |
Found a bug? Have an improvement?
- Fork the repository
- Create your feature branch
- Submit a pull request
- 🔒 robots.txt enforcement — each target's `/robots.txt` is fetched, cached, and respected; disallowed URLs are skipped
- 🔒 Scanned URL auto-trim — `scanned_urls.txt` is automatically capped at 10 000 entries on startup to prevent unbounded file growth
- 🐛 Fixed per-parameter dynamic gate — `_is_parameter_dynamic()` now runs inside `_test_payloads()` for each parameter independently, so a static first param no longer causes an entire URL to be skipped
- 🐛 Fixed time-based false positives — baseline response time is measured before injecting sleep payloads; the threshold is `baseline + TIME_BASED_EXTRA_MARGIN` rather than a fixed 4 s
- ⚙️ Wired verbose / quiet mode — `-v/--verbose` and `-q/--quiet` CLI flags control output granularity and DEBUG logging
- ⚙️ argparse CLI — proper flag parsing replaces ad-hoc `sys.argv` usage
- ⚙️ Widened injectable parameter set — 50+ params now recognised (q, query, keyword, sort, type, lang, ref, …)
- ⚙️ Configurable detection thresholds — `BOOLEAN_RATIO_THRESHOLD`, `BOOLEAN_LENGTH_THRESHOLD`, `TIME_BASED_DELAY`, `TIME_BASED_EXTRA_MARGIN` are top-level constants
- ⚙️ Pinned requirements — all dependencies now have `>=` version bounds
- 🧹 Cleaned payloads.txt — removed irrelevant NoSQL/JSON/LDAP/DOM entries; added stacked-query, PostgreSQL, SQLite, more WAF-bypass, and boolean-blind payloads
- 🐛 Fixed: `payloads.txt` comment/section-header lines were being sent as live SQL payloads
- 🐛 Fixed: SSL fallback used invalid `ssl_context` kwarg — now uses `verify=False` correctly
- 🐛 Fixed: `_check_boolean_based()` and `_check_time_based()` were called inside the payload loop, causing massive redundant requests — now run once per parameter
- 🐛 Fixed: Boolean/time-based payload construction used brittle `split("=", 1)` — now uses `_with_param()` throughout
- 🔄 Switched search backend from Google Custom Search API (paid) to Serper.dev (free, no credit card)
- 🔒 Secure API key input via `getpass`
- ⚙️ Added `MAX_WORKERS` constant for configurable thread pool size
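The boolean-blind decision described above (a SequenceMatcher ratio delta combined with a byte-length delta) can be sketched as follows. `looks_boolean_vulnerable` is a hypothetical name; the thresholds come from the configuration table.

```python
from difflib import SequenceMatcher

BOOLEAN_RATIO_THRESHOLD = 0.15   # similarity-ratio delta to flag boolean SQLi
BOOLEAN_LENGTH_THRESHOLD = 50    # byte-length delta to flag boolean SQLi

def looks_boolean_vulnerable(true_body, false_body):
    """Compare responses to a TRUE payload (e.g. AND 1=1) and a FALSE
    payload (e.g. AND 1=2); a large divergence suggests boolean-blind SQLi."""
    ratio = SequenceMatcher(None, true_body, false_body).ratio()
    ratio_delta = 1.0 - ratio
    length_delta = abs(len(true_body) - len(false_body))
    return (ratio_delta > BOOLEAN_RATIO_THRESHOLD
            or length_delta > BOOLEAN_LENGTH_THRESHOLD)
```

Using two independent signals (content similarity and raw length) makes the check less sensitive to pages with dynamic boilerplate such as timestamps or ads.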