HybridHashScanner
HybridHashScanner Wiki
HybridHashScanner is an advanced, modular command-line utility designed specifically for cybersecurity professionals, including threat intelligence analysts, incident responders, and security researchers. The tool excels in analyzing file hashes by querying multiple threat intelligence platforms—MISP (Malware Information Sharing Platform), CIRCL Hashlookup (a public hash database), AlienVault OTX (Open Threat Exchange), Kaspersky OpenTIP (a threat intelligence portal), and VirusTotal (a comprehensive file analysis service). It supports MD5, SHA1, SHA256, and SHA512 hash types, with automatic detection based on string length and hexadecimal validation.
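As a rough illustration of detection by string length and hexadecimal validation, the sketch below maps digest length to hash type. The function name follows the detect_hash_type utility listed in the code structure, but the body is an assumption for illustration, not the tool's actual implementation.

```python
import string

# Hex digest lengths for the four supported hash types.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}
HEX_DIGITS = set(string.hexdigits)


def detect_hash_type(value):
    """Return the hash type name, or None if value is not a supported hash."""
    candidate = value.strip()
    # Reject empty strings and anything containing a non-hex character.
    if not candidate or not all(ch in HEX_DIGITS for ch in candidate):
        return None
    # Unknown lengths fall through to None.
    return HASH_LENGTHS.get(len(candidate))


print(detect_hash_type("d41d8cd98f00b204e9800998ecf8427e"))
print(detect_hash_type("not-a-hash"))
```

Length alone cannot distinguish hash families that share a digest size, so a result such as "sha256" only means "a 64-character hexadecimal string".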
The tool's architecture emphasizes efficiency, privacy, and scalability: it uses an SQLite cache to store results and minimize API overhead, multithreaded workers to handle concurrent queries while respecting rate limits, and optional Tor integration for anonymous requests to VirusTotal. Operational modes (normal, quick, extra) provide flexibility for various scenarios, from rapid scans to interactive, phased investigations. Outputs include colored terminal summaries with tables (using tabulate for readability), detailed logging in verbose mode, and text file exports for records.
This Wiki provides an in-depth exploration of the tool, covering installation, configuration, usage modes, code structure, and advanced topics. It is intended as a technical reference to help users maximize the tool's potential.
To run HybridHashScanner, you need:
- Python 3.12.3 or Higher: The tool leverages Python's standard libraries and external modules. Ensure Python is installed and added to your PATH.
- Tor (Optional): Required only for anonymous VirusTotal queries in quick mode. Download and install Tor from the official Tor Project website (https://www.torproject.org/). On Linux, use a package manager such as apt (sudo apt install tor); on Windows, install the Tor Browser bundle and note the tor.exe path.
HybridHashScanner requires several external Python modules for its functionality: pymisp (for MISP integration), requests (for HTTP API calls), tabulate (for tabular output), pyfiglet (for ASCII logos), colorama (for colored terminal text), stem (for Tor control), chardet (for file encoding detection), and pysocks (for SOCKS proxies with Tor).
To simplify setup, the tool includes an automated installation process. It checks for missing modules at runtime and installs them via pip if the --install switch is used. This eliminates the need for manual pip commands and handles any ImportErrors by attempting installations one-by-one.
Run the following command to install dependencies:
python3 /opt/HybridHashScanner/HybridHashScanner.py --install
- How It Works: The script iterates through a list of required modules (required_modules = ['pymisp', 'requests', 'tabulate', 'pyfiglet', 'colorama', 'stem', 'chardet', 'pysocks']). For each missing module, it executes subprocess.check_call([sys.executable, "-m", "pip", "install", module]). If the installation succeeds, it prints a confirmation; if it fails, the script exits with an error.
- Output on First Run: If modules are missing, you'll see installation logs from pip, followed by "All modules installed successfully."
- Output if Already Installed: "All required modules are already installed."
- Important Notes: This requires internet access and pip permissions. Run with elevated privileges if needed (e.g., sudo on Linux). The --install switch installs only the modules listed above; it does not install arbitrary additional packages.
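The install loop described above can be sketched as follows. How the tool decides that a module is missing is not spelled out beyond ImportError handling, so the check is passed in as a hypothetical is_installed callable; install_missing and pip_install are also illustrative names. Note that pip package names and import names can differ (pysocks is imported as socks), which is one reason to keep the check separate.

```python
import subprocess
import sys

required_modules = ['pymisp', 'requests', 'tabulate', 'pyfiglet',
                    'colorama', 'stem', 'chardet', 'pysocks']


def pip_install(module):
    """Install one module with the interpreter that is running this script."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", module])


def install_missing(modules, is_installed, installer=pip_install):
    """Install each module that is_installed() reports as missing.

    Returns the list of modules that were installed and exits with
    status 1 on the first failed installation.
    """
    installed = []
    for module in modules:
        if is_installed(module):
            continue
        try:
            installer(module)
        except subprocess.CalledProcessError as exc:
            print(f"Failed to install {module}: {exc}")
            sys.exit(1)
        installed.append(module)
    if installed:
        print("All modules installed successfully.")
    else:
        print("All required modules are already installed.")
    return installed


# Dry run: pretend everything is present, so pip is never invoked.
install_missing(required_modules, is_installed=lambda name: True)
```

Using sys.executable rather than a bare pip command ensures packages land in the environment of the interpreter that runs the tool.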
If you have not cloned the repository yet, do so before running the install command, and adjust the script path in that command to match your clone location:
git clone https://github.com/AAtashGar/HybridHashScanner.git
cd HybridHashScanner
Test the setup by running the tool without arguments to see the logo and info box.
Configuration is handled exclusively through a config.json file in the project root. This file must be created manually before using the tool—there are no runtime prompts to generate it. If the file is missing or malformed, the tool will exit with an error: "[+] Configuration file 'config.json' not found. Please create and configure it."
Before creating config.json, you must prepare accounts and instances for each service:
- MISP (Malware Information Sharing Platform):
  - Step 1: Set Up a Local MISP Instance: MISP requires a self-hosted server. Download and install MISP from the official repository (https://github.com/MISP/MISP). Follow the installation guide: install dependencies (e.g., Apache, MySQL, PHP), configure the database, and start the server. Access it via a web browser to create an admin account.
  - Step 2: Generate API Key: Log in to your MISP instance, navigate to "Automation" in the user menu, and generate an API key. Note the key and your instance URL (e.g., https://localhost:8443/).
  - Why Local?: MISP is designed for private or community sharing; public instances may restrict access. A local setup ensures control and privacy.
- AlienVault OTX (Open Threat Exchange):
  - Create a free account on https://otx.alienvault.com/. After verification, go to your profile settings and generate an API key. OTX provides threat data like pulses and indicators.
- VirusTotal:
  - Sign up for a free or paid account on https://www.virustotal.com/. Free accounts have rate limits (4 requests/minute); paid ones offer more. Generate an API key from the API section. You can add multiple keys to vt_keys for rotation and higher throughput.
- Kaspersky OpenTIP:
  - Register on Kaspersky's threat intelligence portal (https://opentip.kaspersky.com/). After approval, generate an API key from your account dashboard. OpenTIP focuses on file reputation and threat classification.
- CIRCL Hashlookup: No account or key needed; it's a public service. However, ensure your network allows access to https://hashlookup.circl.lu/.
- Tor Path (Optional): If using Tor, specify the full path to the tor executable (e.g., /usr/bin/tor on Linux, C:\Tor\tor.exe on Windows).
Use a text editor to create config.json with the following structure. All fields are strings except vt_keys (a list). Leave unused fields empty or omit them—the tool will skip those services with a yellow warning (e.g., "[+] OTX is not configured in config.json. Skipping OTX search.").
Example config.json:
{
    "misp_url": "https://your-misp-instance.com",
    "misp_key": "your_misp_api_key_here_long_string",
    "otx_key": "your_otx_api_key_here",
    "vt_keys": ["vt_key1_here", "vt_key2_here_for_rotation"],
    "kaspersky_key": "your_kaspersky_api_key_here",
    "cache_db": "cache.db",
    "tor_path": "/usr/bin/tor"
}

- misp_url: Full URL of your MISP server, including protocol and port if non-standard (e.g., "https://192.168.1.100:8443"). Required for MISP; the tool disables SSL verification to allow self-signed certificates.
- misp_key: 40-character authentication key from MISP. Without it, MISP queries are skipped.
- otx_key: 64-character key from OTX. Used in headers for API authentication.
- vt_keys: Array of strings; the tool randomly selects one key per query to balance load. One or more keys are supported.
- kaspersky_key: Key from OpenTIP; sent in the x-api-key request header.
- cache_db: Path to SQLite file (relative or absolute). Defaults to "cache.db" in the current directory. The tool auto-creates the table if missing.
- tor_path: Full path to tor binary. If empty or invalid, Tor is skipped with a warning.
Validation and Best Practices:
- Test keys individually using curl or Postman to ensure they work.
- Store config.json securely (e.g., chmod 600 on Linux) as it contains sensitive API keys.
- If a service changes its API (e.g., endpoint updates), update the tool's code accordingly.
- For high-volume use, consider paid plans on VirusTotal/OTX to avoid rate limits.
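To make the configuration rules concrete, here is a minimal sketch of loading config.json and deciding which services are usable. load_config is the name listed in the code structure, but its signature and body here are assumptions; configured_services is a hypothetical helper introduced only for illustration.

```python
import json
import sys


def load_config(path="config.json"):
    """Parse the config file, exiting with the documented message on failure."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, json.JSONDecodeError):
        # Missing and malformed files are treated the same way here.
        print(f"[+] Configuration file '{path}' not found. "
              "Please create and configure it.")
        sys.exit(1)


def configured_services(config):
    """Map each service name to True if its settings are present and non-empty."""
    return {
        "misp": bool(config.get("misp_url")) and bool(config.get("misp_key")),
        "hashlookup": True,  # public service, no key required
        "otx": bool(config.get("otx_key")),
        "kaspersky": bool(config.get("kaspersky_key")),
        "virustotal": bool(config.get("vt_keys")),
    }


# Example with an in-memory config: OTX left empty, one VirusTotal key set.
print(configured_services({"otx_key": "", "vt_keys": ["vt_key1_here"]}))
```

Because empty strings, empty lists, and omitted fields are all falsy, leaving a field blank and omitting it behave identically in this sketch.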
HybridHashScanner's usage revolves around command-line arguments for input, mode, and output customization. Inputs can be a single hash (-hash) or a directory of files (-directory with -file_type). Outputs include terminal tables and a text file (-output, default: results.txt).
The tool supports three modes, selected via flags (-q for quick, -e for extra; default is normal). Modes cannot be combined (e.g., -q and -e together cause an error). Each mode calculates an estimated completion time based on hash count, services, and rate limits (e.g., 10s per OTX query).
- Normal Mode (Default):
  - Description: Performs comprehensive, parallel checks across selected services (-service, default: all). Uses cache first; if missing, queries APIs. OTX and VirusTotal use dedicated worker threads with queues for concurrency. No user prompts; the run is fully automated.
  - Workflow:
    - Load hashes and check cache.
    - For each hash/service: If not cached, queue/query (with verbose logging if -v).
    - Wait for queues to join (complete).
    - Compile results from cache.
  - Use Case: Ideal for batch processing where all services are needed without interaction. Supports --hash_type filtering.
  - Rate Limits: OTX: ~0.36s delay; VT: 15s delay.
  - Example Command: See README example 2.
  - Pros/Cons: Fast for cached data; can hit rate limits on large batches.
- Quick Mode (-q):
  - Description: Optimized for speed and efficiency. For each hash, it checks sources sequentially in a fixed order (cache > MISP > Hashlookup > OTX > Kaspersky) and stops at the first "found" result. If found, the hash is confirmed in VirusTotal (optionally via Tor with -tor). Only unique hashes are processed, after duplicate removal.
  - Workflow:
    - Extract/filter hashes.
    - For each hash: Check cache/services sequentially; if found, add to found_hashes and break.
    - If found_hashes exist and VT is configured: Start Tor (if -tor), queue VT confirmations, join queue.
    - Display found counts per service and not found total.
    - In output: Skip VT details if no malicious/suspicious detections.
  - Use Case: Rapid triage of hashes to identify known threats quickly, with anonymity for sensitive confirmations. Great for initial scans.
  - Tor Details: Launches Tor with SOCKS5 (9150) and ControlPort (9151), waits for 100% bootstrap, routes VT requests via proxies, then terminates.
  - Example Command: See README example 3.
  - Pros/Cons: Minimizes API calls; VT only on findings. Slower if Tor bootstrapping fails.
- Extra Mode (-e):
  - Description: Interactive, multi-phase mode for thorough analysis. Phase 1 checks MISP, Hashlookup, and OTX. If unfound hashes remain, the tool prompts the user to proceed to Phase 2 (Kaspersky) and Phase 3 (VirusTotal). A --vt_confirm option for confirming found hashes in VirusTotal is implied but not yet present in the code.
  - Workflow:
    - Phase 1: Query initial services, count found per service.
    - Display Phase 1 stats (e.g., found in MISP: X).
    - If unfound: Prompt for Kaspersky (yes/no); if yes, query and update stats.
    - If still unfound: Prompt for VT (yes/no); if yes, queue VT.
    - Compile final results.
  - Use Case: Deep dives where the user decides escalation, e.g., avoiding VT costs unless necessary. Supports verbose for phase logging.
  - Example Command: See README example 4.
  - Pros/Cons: User-controlled; interactive prompts may interrupt automation.
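The quick-mode idea of stopping at the first "found" result can be sketched as below. This is an illustration under assumptions: quick_lookup and quick_scan are hypothetical names, the cache is a plain dict rather than SQLite, and each service is represented by a stub function that returns a result or None.

```python
def quick_lookup(file_hash, services, cache):
    """Return (source_name, result) for the first hit, or (None, None).

    `services` is an ordered list of (name, search_function) pairs.
    """
    if file_hash in cache:
        return "cache", cache[file_hash]
    for name, search in services:
        result = search(file_hash)
        if result is not None:
            cache[file_hash] = result  # remember the hit for later runs
            return name, result        # stop at the first "found" result
    return None, None


def quick_scan(hashes, services, cache):
    """Scan unique hashes; return ({hash: source}, [hashes not found])."""
    found, not_found = {}, []
    for file_hash in dict.fromkeys(hashes):  # drop duplicates, keep order
        source, _ = quick_lookup(file_hash, services, cache)
        if source is None:
            not_found.append(file_hash)
        else:
            found[file_hash] = source
    return found, not_found


# Stub services standing in for the real API calls.
stub_services = [
    ("misp", lambda h: None),
    ("hashlookup", lambda h: {"known": True} if h == "aaa" else None),
    ("otx", lambda h: {"pulses": 1}),
]
print(quick_scan(["aaa", "bbb", "aaa"], stub_services, cache={}))
```

In the tool itself, the hashes collected in found would then be queued for VirusTotal confirmation; that step is omitted here.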
Additional Flags:
- -v: Logs every query (e.g., "[+] Searching in OTX for hash...").
- --view <hash>: Displays full cached JSON for all services.
- --vt_view <hash>: Tabulated VT stats only (malicious, votes, etc.).
- --hash_type: Filters extraction to one type (e.g., sha256).
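For orientation, the flags named in this section could be declared with argparse roughly as follows. This is a sketch, not the tool's actual parser: the dest names, help strings, and the use of a mutually exclusive group to reject -q together with -e are assumptions (the tool may validate this combination manually).

```python
import argparse


def build_parser():
    """Build a parser covering the flags described in this Wiki."""
    parser = argparse.ArgumentParser(prog="HybridHashScanner.py")
    parser.add_argument("-hash", help="single hash to look up")
    parser.add_argument("-directory", help="directory of files containing hashes")
    parser.add_argument("-file_type", help="file type to read from the directory")
    parser.add_argument("-output", default="results.txt", help="text file for results")
    parser.add_argument("-service", default="all", help="service to query")
    parser.add_argument("-v", dest="verbose", action="store_true", help="log every query")
    parser.add_argument("-tor", action="store_true", help="route VirusTotal via Tor")
    parser.add_argument("--hash_type", help="only extract this hash type")
    parser.add_argument("--view", metavar="HASH", help="show cached JSON for a hash")
    parser.add_argument("--vt_view", metavar="HASH", help="show cached VT stats for a hash")
    parser.add_argument("--install", action="store_true", help="install dependencies")

    # Quick and extra modes cannot be combined; normal mode is the default.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("-q", dest="quick", action="store_true", help="quick mode")
    mode.add_argument("-e", dest="extra", action="store_true", help="extra mode")
    return parser


args = build_parser().parse_args(["-hash", "d41d8cd98f00b204e9800998ecf8427e", "-q", "-tor"])
print(args.quick, args.extra, args.tor, args.output)
```

With this layout, passing -q and -e together makes argparse print a usage error and exit, which matches the documented behaviour that modes cannot be combined.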
The code is structured for modularity and readability:
- Imports and Setup: Handles module checks/installs, imports (e.g., pymisp, requests), disables warnings, initializes queues/locks.
- Utility Functions: load_config (JSON parse), detect_hash_type (regex/length), detect_encoding (chardet), extract_hashes_from_directory (glob, CSV/txt parsing, stats).
- Cache Functions: check_cache, save_to_cache (SQLite ops).
- Service Functions: Individual search_<service> functions with verbose logging, timeouts, and error handling (e.g., search_misp uses PyMISP.search).
- Workers: otx_worker, vt_worker (threaded queues with delays), start_<worker> (daemon threads).
- Tor Functions: start_tor (launch_tor_with_config, bootstrap check), stop_tor.
- Core Logic: process_hashes (mode-specific workflows), calculate_estimated_time (service-based math), get_summary_table (tabulate), get_vt_results_table.
- Main Block: Argparse, validation, MISP init, cache setup, mode dispatch, output.
Hash Extraction:
- Extraction: From CSV (skip header, read rows) or TXT (regex for tagged/untagged hashes). Removes duplicates via sets, filters by --hash_type.
- Stats: Prints total, duplicates, unique.
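A minimal sketch of regex-based extraction with duplicate removal and statistics is shown below. The pattern, the extract_hashes name, and the choice to lowercase matches before deduplicating are assumptions for illustration; the tool's own regex for tagged and untagged hashes is not reproduced here.

```python
import re

# Match standalone hex strings of exactly 128, 64, 40, or 32 characters.
HASH_PATTERN = re.compile(
    r"\b(?:[0-9a-fA-F]{128}|[0-9a-fA-F]{64}|[0-9a-fA-F]{40}|[0-9a-fA-F]{32})\b"
)
LENGTH_TO_TYPE = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}


def extract_hashes(text, hash_type=None):
    """Return (unique_hashes, stats) for hex digests found in text."""
    matches = [m.lower() for m in HASH_PATTERN.findall(text)]
    if hash_type is not None:
        # Keep only digests whose length corresponds to the requested type.
        matches = [m for m in matches if LENGTH_TO_TYPE[len(m)] == hash_type]
    unique = list(dict.fromkeys(matches))  # drop duplicates, keep first-seen order
    stats = {
        "total": len(matches),
        "duplicates": len(matches) - len(unique),
        "unique": len(unique),
    }
    return unique, stats


sample = "md5: D41D8CD98F00B204E9800998ECF8427E\nd41d8cd98f00b204e9800998ecf8427e\n"
print(extract_hashes(sample))
```

dict.fromkeys is used instead of a set so the output order is stable; a set gives the same unique count.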
Cache:
- Table: hash (TEXT), hash_type (TEXT), result (JSON TEXT).
- Ops: INSERT OR REPLACE for updates.
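The cache layout above can be sketched with the standard sqlite3 module. The table name cache, the PRIMARY KEY on hash, and the function signatures are assumptions; only the three columns, the check_cache and save_to_cache names, and the use of INSERT OR REPLACE come from this Wiki.

```python
import json
import sqlite3


def open_cache(path="cache.db"):
    """Open the SQLite cache, creating the table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "hash TEXT PRIMARY KEY, hash_type TEXT, result TEXT)"
    )
    conn.commit()
    return conn


def save_to_cache(conn, file_hash, hash_type, result):
    """Store a result as JSON text, replacing any earlier row for this hash."""
    conn.execute(
        "INSERT OR REPLACE INTO cache (hash, hash_type, result) VALUES (?, ?, ?)",
        (file_hash, hash_type, json.dumps(result)),
    )
    conn.commit()


def check_cache(conn, file_hash):
    """Return the cached result as a Python object, or None if absent."""
    row = conn.execute(
        "SELECT result FROM cache WHERE hash = ?", (file_hash,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_cache(":memory:")  # in-memory database for the demo
save_to_cache(conn, "d41d8cd98f00b204e9800998ecf8427e", "md5", {"otx": None})
print(check_cache(conn, "d41d8cd98f00b204e9800998ecf8427e"))
```

INSERT OR REPLACE only overwrites when a uniqueness constraint is violated, which is why the sketch declares hash as the primary key.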
Service Functions:
- Common: Check config, verbose log, timeout=10s, return None on no result/error.
- MISP: Search attributes by value/type.
- Hashlookup: GET /lookup//, JSON if 200.
- OTX: GET /indicators/file/ with key, check for relevant data (pulses, AV, IDS, YARA).
- Kaspersky: GET /search/hash with key, return if not 'Clean'.
- VT: GET /files/ with random key, proxies if Tor.
Threading:
- Queues: otx_queue, vt_queue (FIFO).
- Workers: Infinite loop: get item, query, save cache, sleep for rate limit.
- Locks: results_lock for thread-safe dict updates.
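The queue, worker, and lock pattern described above might look like the following. start_worker and its parameters are hypothetical; the query function stands in for a real API call, and the cache write is omitted to keep the sketch short.

```python
import queue
import threading
import time


def start_worker(work_queue, query, results, results_lock, delay):
    """Start a daemon thread that processes hashes from work_queue forever."""
    def worker():
        while True:
            file_hash = work_queue.get()
            try:
                result = query(file_hash)
            except Exception:
                result = None  # treat errors as "no result"
            with results_lock:  # thread-safe update of the shared dict
                results[file_hash] = result
            work_queue.task_done()
            time.sleep(delay)  # pause between queries to respect the rate limit

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


otx_queue = queue.Queue()  # FIFO by default
results, results_lock = {}, threading.Lock()
start_worker(otx_queue, lambda h: {"hash": h}, results, results_lock, delay=0.01)

for item in ["aaa", "bbb", "ccc"]:
    otx_queue.put(item)
otx_queue.join()  # blocks until every queued hash has been processed
print(sorted(results))
```

Calling task_done before the sleep means join returns as soon as the last result is stored, without waiting out the final delay.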
Output:
- Logo: Pyfiglet 'small' font, centered info box.
- Tables: Tabulate with 'Yes/No' for presence.
- File: UTF-8 write, append VT if conditions met.

Extending the Tool:
- Add Service: New search function, update process_hashes/services_to_search.
- Extend Cache: Add timestamps for expiration.

Troubleshooting:
- Encoding Errors: Fallback to latin1.
- API Timeouts: Increase timeout in requests.
- Tor Failures: Check path, network; debug with stem logs.

Performance:
- Estimated Time: Mode-specific (e.g., quick: sequential + VT confirm).
- Scaling: For 1000+ hashes, use extra mode to phase queries.
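The tool's exact formula for estimated time is not given in this Wiki, so the following is only a back-of-the-envelope sketch under stated assumptions: in normal mode the rate-limited workers run in parallel, so the slowest service dominates; in quick mode services run one after another, followed by one VirusTotal confirmation per found hash. The function names and the delay values passed in are illustrative.

```python
def estimate_normal_seconds(hash_count, delays):
    """Parallel workers: the service with the longest delay sets the pace."""
    return hash_count * max(delays.values(), default=0.0)


def estimate_quick_seconds(hash_count, delays, expected_found, vt_delay):
    """Sequential checks per hash, plus one VT confirmation per found hash."""
    return hash_count * sum(delays.values()) + expected_found * vt_delay


# Per-query delays in seconds, taken from the Rate Limits note for normal mode.
normal = estimate_normal_seconds(100, {"otx": 0.36, "virustotal": 15.0})
quick = estimate_quick_seconds(100, {"otx": 0.36}, expected_found=10, vt_delay=15.0)
print(f"normal: {normal:.0f}s, quick: {quick:.0f}s")
```

Cached hashes cost no API time, so real runs on previously seen data should finish well below these figures.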
HybridHashScanner combines power and flexibility for hash analysis, with a focus on real-world usability. By following this Wiki, users can fully leverage its features for effective threat intelligence. Contributions welcome on GitHub!