Open
Labels
enhancement (New feature or request)
Description
Problem
Sherlock currently uses requests-futures with a ThreadPoolExecutor capped at 20 workers. When scanning 400+ sites, at most 20 requests are in flight at once, with added overhead from thread context switching and the GIL. A full scan typically takes 45-90 seconds.
Proposal
A new async_engine.py module using asyncio + aiohttp as a drop-in replacement for the synchronous sherlock() function:
- `aiohttp.ClientSession` with `TCPConnector` for connection pooling
- `asyncio.Semaphore` for configurable concurrency (default 100)
- `limit_per_host=3` to stay polite and avoid rate-limiting
- DNS caching to reduce lookup overhead on repeated scans
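To make the proposal concrete, here is a minimal sketch of the bounded-concurrency pattern described above. It is not the implementation mentioned in this issue: the names `check_sites` and `scan_with_aiohttp`, the 60-second timeout, and the 300-second DNS cache TTL are all assumptions chosen for illustration. Error handling for failed requests is omitted.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def check_sites(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[int]],
    workers: int = 100,
) -> Dict[str, int]:
    """Call fetch(url) for every URL with at most `workers` calls in flight."""
    semaphore = asyncio.Semaphore(workers)

    async def bounded(url: str):
        # The semaphore caps how many fetches run concurrently.
        async with semaphore:
            return url, await fetch(url)

    pairs = await asyncio.gather(*(bounded(url) for url in urls))
    return dict(pairs)


async def scan_with_aiohttp(urls: Iterable[str], workers: int = 100) -> Dict[str, int]:
    """Hypothetical wiring of check_sites to aiohttp; returns HTTP status per URL."""
    # Imported here so the rest of the sketch runs without aiohttp installed.
    import aiohttp

    connector = aiohttp.TCPConnector(
        limit=workers,       # total pooled connections
        limit_per_host=3,    # stay polite toward any single host
        ttl_dns_cache=300,   # assumed DNS cache lifetime in seconds
    )
    timeout = aiohttp.ClientTimeout(total=60)  # assumed per-request timeout
    async with aiohttp.ClientSession(connector=connector, timeout=timeout) as session:

        async def fetch(url: str) -> int:
            async with session.get(url) as response:
                return response.status

        return await check_sites(urls, fetch, workers)
```

Passing `fetch` in as a parameter keeps the concurrency logic testable without network access, e.g. `asyncio.run(check_sites(urls, fake_fetch, workers=5))` with a stub coroutine.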
New CLI flags
- `--workers N`: max concurrent requests (default: 100)
- `--sync`: fall back to the legacy synchronous engine
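As a rough illustration of how these two flags could be declared, assuming the CLI is built on `argparse`; the `build_parser` and `positive_int` helpers and the help strings are hypothetical and not taken from the Sherlock codebase:

```python
import argparse


def positive_int(value: str) -> int:
    """Parse a strictly positive integer for --workers."""
    number = int(value)
    if number < 1:
        raise argparse.ArgumentTypeError("must be at least 1")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="sherlock")
    parser.add_argument("username", nargs="+", help="username(s) to look up")
    parser.add_argument(
        "--workers",
        type=positive_int,
        default=100,
        metavar="N",
        help="max concurrent requests (default: 100)",
    )
    parser.add_argument(
        "--sync",
        action="store_true",
        help="fall back to the legacy synchronous engine",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["--workers", "50", "alice"])` yields `workers=50` and `sync=False`.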
Backwards compatibility
- Return value is identical (same dict structure, same QueryResult objects)
- All existing CLI flags work unchanged
- Default behavior switches to async, with `--sync` to opt out
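One possible shape for the opt-out, sketched with placeholder engines. `scan_async`, `scan_sync`, and `run_scan` are hypothetical names, and the result dicts are stand-ins for the real `QueryResult` structure. The point is only that both paths return the same structure and the caller stays synchronous:

```python
import asyncio
from typing import Dict


async def scan_async(username: str, site_data: Dict[str, dict], workers: int = 100) -> Dict[str, dict]:
    """Placeholder async engine: one result entry per site (workers unused here)."""

    async def probe(name: str):
        await asyncio.sleep(0)  # stands in for the real HTTP request
        return name, {"username": username, "status": "UNKNOWN"}

    return dict(await asyncio.gather(*(probe(name) for name in site_data)))


def scan_sync(username: str, site_data: Dict[str, dict]) -> Dict[str, dict]:
    """Placeholder for the legacy synchronous engine."""
    return {name: {"username": username, "status": "UNKNOWN"} for name in site_data}


def run_scan(username: str, site_data: Dict[str, dict],
             use_sync: bool = False, workers: int = 100) -> Dict[str, dict]:
    """Synchronous entry point: async by default, legacy engine when use_sync is set."""
    if use_sync:
        return scan_sync(username, site_data)
    # asyncio.run lets existing synchronous callers use the async engine unchanged.
    return asyncio.run(scan_async(username, site_data, workers))
```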
Expected performance
| Scan type | Current (20 threads) | Async (100 concurrent) |
|---|---|---|
| Full scan (478 sites) | ~45-90s | ~15-25s |
| Targeted (50 sites) | ~15-20s | ~5-8s |
New dependency
aiohttp ^3.9.0
I have a working implementation ready to PR if there's interest. Happy to adjust the approach based on feedback.