Canopy is a Python-based OSINT framework designed to help individuals discover, organize, and analyze publicly available information about their own digital footprint.
Note: Canopy was developed and tested exclusively on my own publicly available data as a learning and portfolio project. It was made as part of my IT and cybersecurity journey and may contain bugs.
- Understand how publicly available information is indexed and exposed online
- Practice structured OSINT methodology using search engines
- Correlate results from multiple sources into meaningful categories
- Demonstrate ethical boundaries and legal awareness in OSINT work
- Multi-platform username enumeration across social media, coding sites, gaming platforms, and more.
- Reduces false positives using fingerprint-based validation.
- Supports local caching of platform fingerprints to speed up scans.
- Multi-threaded, high-performance scanning with optional rate-limiting and delays.
- Generates reports in JSON, CSV, HTML, or TXT formats.
- CLI interface for easy integration into scripts or automation workflows.
- Categorized results for better organization (e.g., social, professional, gaming).
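
To illustrate the rate-limiting feature listed above, here is a minimal sketch of a thread-safe limiter that spaces requests evenly across worker threads. It is not taken from Canopy's source: the class name `RateLimiter` and the injectable `clock` and `sleep` parameters are hypothetical, chosen so the logic can be exercised without real waiting.

```python
import threading
import time


class RateLimiter:
    """Hypothetical helper: allow at most `rate` calls per second across threads."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        if rate <= 0:
            raise ValueError("rate must be positive")
        self._interval = 1.0 / rate  # minimum spacing between calls
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = None  # earliest time the next call may start

    def wait(self):
        """Reserve the next free time slot, then sleep until it arrives."""
        with self._lock:
            now = self._clock()
            # If the limiter has been idle, the next slot is "right now".
            if self._next_slot is None or self._next_slot < now:
                self._next_slot = now
            slot = self._next_slot
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            # Sleep outside the lock so other threads can reserve slots.
            self._sleep(delay)
```

In a scanner like this one, each worker thread would call `limiter.wait()` immediately before sending a request, so the combined request rate stays under the limit no matter how many threads are running.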
pip install canopy-scanner
-----------------------------------
git clone https://github.com/guyvolvo/Canopy.git
cd Canopy
usage: canopy [-h] [-u USERNAME] [-U USERNAMES] [-t THREADS] [--timeout TIMEOUT] [--delay DELAY]
[--rate-limit RATE_LIMIT] [-c CATEGORIES] [-p PLATFORMS] [--exclude EXCLUDE] [--only-found]
[--list-categories] [-o OUTPUT] [-f {json,csv,html,txt}] [-v] [-q] [--print-found]
Canopy - Username Enumeration Tool
options:
-h, --help show this help message and exit
Target Options:
-u, --username USERNAME
Username to search for
-U, --usernames USERNAMES
File containing list of usernames (one per line)
Performance Options:
-t, --threads THREADS
Number of concurrent threads (default: 10)
--timeout TIMEOUT Request timeout in seconds (default: 10)
--delay DELAY Delay between requests in seconds (default: 0)
--rate-limit RATE_LIMIT
Max requests per second (default: unlimited)
Filtering Options:
-c, --categories CATEGORIES
Comma-separated categories to check (e.g., social,gaming)
-p, --platforms PLATFORMS
Comma-separated specific platforms to check
--exclude EXCLUDE Comma-separated platforms to exclude
--only-found Only show found accounts
--list-categories Show all available platform categories and exit
Output Options:
-o, --output OUTPUT Output file path
-f, --format {json,csv,html,txt}
Output format: json, csv, html, txt (default: json)
-v, --verbose Verbose output
-q, --quiet Minimal output (only results)
--print-found Print found accounts in real-time
Examples:
canopy -u johndoe
canopy -u johndoe -t 50 --timeout 15
canopy -u johndoe -o report.json --format json
canopy -u johndoe --categories social,gaming
canopy --list-categories

This framework is intended strictly for self-OSINT, educational use, or explicit consent-based research.
- Your own digital footprint
- Accounts, domains, and identifiers you own
- Targets for which you have explicit written permission
- Using this tool against private individuals without consent may violate privacy laws and platform Terms of Service.
- I accept no responsibility for misuse of this software.
Canopy collects metadata only, such as:
- Page titles
- URLs
- Search snippets
- Source domain
It does not:
- Bypass CAPTCHAs
- Scrape authenticated content
- Harvest private data
- Enumerate personal contact lists
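
As a rough illustration of what "metadata only" can mean in practice, the sketch below extracts just a page title and source domain from a public page's HTML. The names `PageMetadata` and `extract_metadata` are hypothetical and not part of Canopy; search snippets are left out because they would come from a search engine rather than from the page itself.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urlparse


class _TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


@dataclass
class PageMetadata:
    """Hypothetical record holding only public, non-authenticated metadata."""
    url: str
    source_domain: str
    title: str


def extract_metadata(url, html):
    """Return the URL, its domain, and the page title; nothing else is kept."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return PageMetadata(
        url=url,
        source_domain=urlparse(url).netloc,
        title=parser.title.strip(),
    )
```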
Theoretical project structure (generated by ChatGPT and Claude, kept for reference so I can follow along and add or remove things as I see fit):
GPT workflow:
canopy/
├── README.md
├── DISCLAIMER.md
├── methodology/
│   └── osint_methodology.md
├── canopy/
│   ├── query_generator.py
│   ├── collector.py
│   ├── parser.py
│   └── correlator.py
├── output/
│   └── sample_report.md
└── lessons_learned.md
Claude workflow:
Canopy/
├── main.py              # Entry point, CLI interface
├── platforms.json       # Platform database
├── query_generator.py   # Generate queries from usernames
├── username_checker.py  # Check if username exists on platforms
├── data_collector.py    # Collect and aggregate data
├── report_generator.py  # Format and export results
├── config.py            # Configuration settings
├── utils.py             # Helper functions
└── requirements.txt     # Dependencies
platforms.json inspired by the Sherlock OSINT project :)
Canopy uses a structured OSINT approach:
- Generate a list of platforms to query (social, professional, gaming).
- Create URL patterns for a target username.
- Validate account existence using HTTP responses, redirects, error messages, and HTML fingerprints.
- Aggregate results into structured reports.
- Optionally store fingerprints locally to avoid redundant requests.
This approach aims for high accuracy while reducing false positives.
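
The validation steps above can be sketched as a pure function that classifies one HTTP response. This is an assumption-laden illustration, not Canopy's actual checker: the platform entry keys (`url`, `not_found_status`, `not_found_text`) are a hypothetical schema rather than the real `platforms.json` layout, and the caller is expected to supply the status code, final URL, and body from whatever HTTP client it uses.

```python
from urllib.parse import quote


def build_profile_url(pattern, username):
    """Fill a URL pattern such as 'https://example.com/{username}'."""
    return pattern.format(username=quote(username, safe=""))


def classify_response(platform, username, status_code, final_url, body):
    """Return 'found', 'not_found', or 'unknown' for one platform response.

    `platform` uses a hypothetical schema:
      url              - URL pattern containing {username}
      not_found_status - status codes that mean "no such account"
      not_found_text   - fingerprint strings seen on "soft 404" pages
    """
    expected_url = build_profile_url(platform["url"], username)

    # 1. HTTP status: an explicit "not found" code settles it.
    if status_code in platform.get("not_found_status", (404,)):
        return "not_found"
    # Rate limiting, blocks, or server errors tell us nothing either way.
    if not 200 <= status_code < 300:
        return "unknown"

    # 2. Redirects: landing somewhere other than the profile URL (for
    #    example a login or home page) is treated as "no account".
    #    A real tool would likely normalize URLs more carefully.
    if final_url.rstrip("/") != expected_url.rstrip("/"):
        return "not_found"

    # 3. Fingerprints: some sites answer 200 with an error page.
    lowered = body.lower()
    for marker in platform.get("not_found_text", ()):
        if marker.lower() in lowered:
            return "not_found"

    return "found"
```

Returning a third `unknown` state is a design choice in this sketch: treating a 429 or 403 as "not found" would hide accounts, while treating it as "found" would create exactly the false positives that fingerprinting is meant to avoid.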
- Only scan accounts you own or have explicit permission to analyze.
- Use --threads and --rate-limit responsibly to avoid being blocked by platforms.
- Review your JSON/CSV/HTML reports for patterns before taking any action.
- Update platforms.json regularly to include new platforms.
- Periodically refresh fingerprints for platforms that change their 404 pages.
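
For scripted use that follows the practices above, a conservative invocation might be wrapped as in the sketch below. It uses only flags shown in the help text; the function names, the default thread count, and the default rate limit are illustrative assumptions, and it assumes `canopy` is on the `PATH` and exits with a non-zero code on failure.

```python
import subprocess


def build_canopy_command(username, output_path, threads=5, rate_limit=2, fmt="json"):
    """Assemble an argument list using only flags from Canopy's help text."""
    return [
        "canopy",
        "-u", username,
        "-t", str(threads),               # modest concurrency
        "--rate-limit", str(rate_limit),  # max requests per second
        "-o", output_path,
        "-f", fmt,
        "--only-found",
    ]


def run_scan(username, output_path):
    """Run one scan; raises subprocess.CalledProcessError on a non-zero exit."""
    cmd = build_canopy_command(username, output_path)
    subprocess.run(cmd, check=True)
    return output_path
```

Passing the arguments as a list rather than a single shell string avoids shell quoting problems if a username contains unusual characters.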