jtgsystems/free-sitemap-generator


🗺️ Free Sitemap Generator - SOTA 2026 Edition


🚀 The most advanced free sitemap generator - Now with async crawling, XML export, and professional SEO features!


✨ What's New in SOTA 2026 Edition

🏎️ Performance

  • Async Concurrent Crawling - Crawl up to 50 pages simultaneously
  • Connection Pooling - Reuse connections for faster crawling
  • Smart Rate Limiting - Respects server resources while maximizing speed
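The bounded concurrency described above can be sketched with an `asyncio.Semaphore`. This is a minimal illustration, not the project's crawler: `fake_fetch` and `crawl_batch` are hypothetical names, and the fetch is simulated with a short sleep so the example runs without network access.

```python
import asyncio

MAX_CONCURRENCY = 50  # mirrors the "up to 50 pages" figure above


async def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP request (hypothetical; no network I/O)."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


async def crawl_batch(urls, limit=MAX_CONCURRENCY, delay=0.0):
    """Fetch every URL with at most `limit` requests in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:
            if delay:
                # Crude politeness delay applied before each request.
                await asyncio.sleep(delay)
            return url, await fake_fetch(url)

    pairs = await asyncio.gather(*(bounded(u) for u in urls))
    return dict(pairs)


if __name__ == "__main__":
    urls = [f"https://example.com/p{i}" for i in range(120)]
    pages = asyncio.run(crawl_batch(urls))
    print(len(pages))
```

A real crawler would replace `fake_fetch` with an HTTP client that reuses one session, which is where connection pooling comes in.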

📄 Export Formats

  • XML Sitemap - Full sitemaps.org compliance
  • XML Sitemap Index - For large sites (more than 50,000 URLs)
  • GZip Compression - Typically reduces file sizes by 80% or more
  • Text, CSV, JSON - Multiple export options

🎯 SEO Features

  • Automatic Priority Calculation - Based on page depth and importance
  • Change Frequency Detection - Smart changefreq assignment
  • Last Modified Extraction - From HTTP headers
  • Image Sitemap Support - Include page images
  • Robots.txt Compliance - Optional respect for crawl rules

📋 Overview

The Free Sitemap Generator is a professional-grade desktop application for creating XML sitemaps. It crawls your website, discovers all pages, and generates standards-compliant sitemaps for search engine submission.

Perfect For:

  • 🌐 Website Owners - Improve SEO and search indexing
  • 🏢 SEO Professionals - Client site audits and optimization
  • 👨‍💻 Developers - Automated sitemap generation
  • 🎨 Agencies - Batch site processing

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/jtgsystems/free-sitemap-generator.git
cd free-sitemap-generator

# Install dependencies
pip install -r requirements.txt

# Run the application
python main.py

Usage

  1. Enter your website URL (e.g., https://example.com)
  2. Configure crawl settings (optional)
  3. Click "Start Crawl"
  4. Export your sitemap in your preferred format

📦 Features

🔍 Crawling

| Feature | Description |
| --- | --- |
| Async Concurrent | Crawl up to 50 pages simultaneously |
| Depth Control | Limit crawl depth (1-10 levels) |
| URL Limits | Configurable max URLs (100-50,000) |
| Smart Filtering | Excludes non-HTML, external links |
| Robots.txt | Optional compliance with crawl rules |
| Rate Limiting | Configurable delay between requests |
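The filtering and robots.txt rows can be sketched with the standard library alone. The helper names and the suffix list below are illustrative assumptions, not the project's actual rules.

```python
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.robotparser import RobotFileParser

# Illustrative list of non-HTML suffixes; a real crawler may check Content-Type instead.
SKIPPED_SUFFIXES = (".jpg", ".png", ".gif", ".pdf", ".zip", ".css", ".js")


def normalize_link(base_url, href):
    """Resolve href against base_url; return None if the link should be skipped."""
    absolute, _fragment = urldefrag(urljoin(base_url, href))
    target, base = urlparse(absolute), urlparse(base_url)
    if target.scheme not in ("http", "https"):
        return None  # mailto:, javascript:, etc.
    if target.netloc != base.netloc:
        return None  # external link
    if target.path.lower().endswith(SKIPPED_SUFFIXES):
        return None  # probably not an HTML page
    return absolute


def robots_from_lines(lines):
    """Build a robots.txt checker from already-downloaded lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


if __name__ == "__main__":
    robots = robots_from_lines(["User-agent: *", "Disallow: /private/"])
    link = normalize_link("https://example.com/blog/", "../about#team")
    print(link, robots.can_fetch("ExampleBot", link))
```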

📤 Export Options

| Format | Description | Best For |
| --- | --- | --- |
| XML Sitemap | Standard sitemaps.org format | Google, Bing submission |
| XML + GZip | Compressed XML | Large sites, bandwidth saving |
| Sitemap Index | Multiple sitemap files | 50,000+ URLs |
| Text | One URL per line | Quick reviews |
| CSV | Spreadsheet format | Analysis in Excel |
| JSON | Structured data | API integration |
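To make the non-XML formats concrete, here is a small sketch of text, CSV, JSON, and gzip output using only the standard library. The entry field names (`loc`, `lastmod`, `changefreq`, `priority`) and the function names are assumptions for illustration; the application's own exporter may structure its data differently.

```python
import csv
import gzip
import io
import json

FIELDS = ["loc", "lastmod", "changefreq", "priority"]  # assumed column set


def to_text(entries):
    """One URL per line."""
    return "\n".join(entry["loc"] for entry in entries) + "\n"


def to_csv(entries):
    """Spreadsheet-friendly output with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def to_json(entries):
    """Structured output for API integration."""
    return json.dumps(entries, indent=2)


def gzip_bytes(data: bytes) -> bytes:
    """Compress an already-rendered export (for example sitemap.xml)."""
    return gzip.compress(data)


if __name__ == "__main__":
    sample = [{"loc": "https://example.com/", "lastmod": "2026-02-05",
               "changefreq": "daily", "priority": "1.0"}]
    print(to_csv(sample))
```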

🎛️ Configuration

  • Max Depth - How many link levels to follow
  • Max URLs - Limit total discovered URLs
  • Concurrency - Number of simultaneous requests
  • Crawl Delay - Seconds between requests
  • Respect robots.txt - Follow crawl rules
  • Include Images - Add image sitemap entries
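One way to picture these options is as a validated settings object. The class below is a hypothetical sketch: the field names and default values are assumptions, while the ranges (depth 1-10, 100-50,000 URLs, up to 50 concurrent requests) come from the feature list in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlSettings:
    """Hypothetical container for the crawl options listed above."""

    max_depth: int = 3            # assumed default
    max_urls: int = 1000          # assumed default
    concurrency: int = 10         # assumed default
    crawl_delay: float = 0.5      # seconds between requests, assumed default
    respect_robots: bool = True
    include_images: bool = False

    def __post_init__(self):
        if not 1 <= self.max_depth <= 10:
            raise ValueError("max_depth must be between 1 and 10")
        if not 100 <= self.max_urls <= 50_000:
            raise ValueError("max_urls must be between 100 and 50,000")
        if not 1 <= self.concurrency <= 50:
            raise ValueError("concurrency must be between 1 and 50")
        if self.crawl_delay < 0:
            raise ValueError("crawl_delay cannot be negative")


if __name__ == "__main__":
    print(CrawlSettings(max_depth=5, concurrency=20))
```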

📊 Sitemap Features

Automatic SEO Optimization

<url>
  <loc>https://example.com/</loc>
  <lastmod>2026-02-05T10:30:00+00:00</lastmod>
  <changefreq>daily</changefreq>
  <priority>1.0</priority>
</url>

  • Priority Calculation - Homepage gets 1.0, deeper pages get lower priority
  • Change Frequency - Detected from URL patterns (blog=weekly, about=monthly)
  • Last Modified - Extracted from HTTP Last-Modified headers
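The rules above can be sketched as two small functions plus a `<url>` builder. The step of 0.2 per depth level, the 0.1 floor, and the `monthly` fallback are assumptions chosen for illustration; only "homepage gets 1.0", "blog=weekly", and "about=monthly" come from this README.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def priority_for_depth(depth, step=0.2, floor=0.1):
    """Homepage (depth 0) gets 1.0; each level deeper loses `step` (assumed)."""
    return round(max(floor, 1.0 - step * depth), 1)


def changefreq_for_url(url):
    """Guess changefreq from the URL path."""
    path = urlparse(url).path.lower()
    if path in ("", "/"):
        return "daily"
    if "/blog" in path:
        return "weekly"
    if "/about" in path:
        return "monthly"
    return "monthly"  # assumed fallback for unmatched paths


def url_element(url, depth, lastmod=None):
    """Build one <url> entry in the order loc, lastmod, changefreq, priority."""
    element = ET.Element("url")
    ET.SubElement(element, "loc").text = url
    if lastmod:
        ET.SubElement(element, "lastmod").text = lastmod
    ET.SubElement(element, "changefreq").text = changefreq_for_url(url)
    ET.SubElement(element, "priority").text = f"{priority_for_depth(depth):.1f}"
    return element


if __name__ == "__main__":
    entry = url_element("https://example.com/blog/post", depth=2)
    print(ET.tostring(entry, encoding="unicode"))
```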

Image Sitemaps

<url>
  <loc>https://example.com/gallery</loc>
  <image:image>
    <image:loc>https://example.com/photo1.jpg</image:loc>
  </image:image>
</url>
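For the `image:` prefix to be valid, the enclosing `<urlset>` has to declare the image namespace. The sketch below shows one way to produce such a document with `xml.etree.ElementTree`; `build_image_sitemap` is a hypothetical helper, and the namespace URIs are the ones published for the sitemap protocol and Google's image extension.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Register prefixes so the output uses xmlns="..." and xmlns:image="...".
# Note that this registration is global to ElementTree in the process.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs found on that page."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_image_sitemap(
        {"https://example.com/gallery": ["https://example.com/photo1.jpg"]}
    ))
```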

🏗️ Building from Source

Build Executable

# Install PyInstaller
pip install pyinstaller

# Build
python setup.py

# Find executable in dist/ folder

📁 Project Structure

free-sitemap-generator/
├── sitemap_generator/      # Main package
│   ├── __init__.py
│   ├── crawler.py          # Async web crawler
│   ├── exporter.py         # XML/export formats
│   └── gui.py              # PyQt6 interface
├── main.py                 # Entry point
├── requirements.txt        # Dependencies
├── setup.py                # Build script
├── README.md               # This file
├── LICENSE                 # MIT License
└── .github/                # GitHub templates

⚙️ Technical Specifications

Sitemap Compliance

  • ✅ sitemaps.org protocol
  • ✅ Google Search Console compatible
  • ✅ Bing Webmaster Tools compatible
  • ✅ 50,000 URL limit per file
  • ✅ 50MB file size limit (uncompressed)
  • ✅ UTF-8 encoding
  • ✅ Proper XML escaping
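The per-file limits and the escaping requirement fit together roughly as follows. This is a simplified sketch with hypothetical function names: it emits only `<loc>` entries, splits at 50,000 URLs, checks the uncompressed size, and builds a sitemap index that points at the resulting files.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000
MAX_BYTES_PER_FILE = 50 * 1024 * 1024  # uncompressed size limit
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'
QUOTES = {"'": "&apos;", '"': "&quot;"}  # escape() covers &, <, > by default


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a URL list into groups that each fit in one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_urlset(urls):
    """Render one sitemap file with entity-escaped URLs."""
    body = "".join(f"  <url><loc>{escape(u, QUOTES)}</loc></url>\n" for u in urls)
    xml = f'{XML_HEADER}<urlset xmlns="{SITEMAP_NS}">\n{body}</urlset>\n'
    if len(xml.encode("utf-8")) > MAX_BYTES_PER_FILE:
        raise ValueError("sitemap exceeds 50MB uncompressed; use smaller chunks")
    return xml


def render_index(sitemap_urls):
    """Render a sitemap index listing the individual sitemap files."""
    body = "".join(
        f"  <sitemap><loc>{escape(u, QUOTES)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return f'{XML_HEADER}<sitemapindex xmlns="{SITEMAP_NS}">\n{body}</sitemapindex>\n'


if __name__ == "__main__":
    urls = [f"https://example.com/page?id={i}&lang=en" for i in range(120_001)]
    print([len(chunk) for chunk in chunk_urls(urls)])
```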

Performance

  • Crawling: Up to 500 pages/minute (depending on site)
  • Memory: Efficient streaming for large sites
  • CPU: Multi-threaded processing
  • Network: Connection pooling, keep-alive

🐛 Troubleshooting

Common Issues

"Timeout fetching URL"

  • Check your internet connection
  • Increase timeout in settings
  • Website may be blocking crawlers

"No URLs found"

  • Ensure URL includes http:// or https://
  • Check that website has crawlable links
  • Verify robots.txt allows crawling
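The first check in that list, a missing scheme, is easy to catch before a crawl starts. The function below is a hypothetical pre-flight check, not part of the application.

```python
from urllib.parse import urlparse


def validate_start_url(raw):
    """Return the trimmed URL, or raise ValueError if it cannot be crawled."""
    candidate = raw.strip()
    parsed = urlparse(candidate)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("URL must start with http:// or https://")
    if not parsed.netloc:
        raise ValueError("URL is missing a host name")
    return candidate


if __name__ == "__main__":
    for value in ("https://example.com", "example.com"):
        try:
            print("ok:", validate_start_url(value))
        except ValueError as error:
            print("rejected:", value, "-", error)
```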

"Application freezes"

  • Reduce concurrency setting
  • Increase crawl delay
  • Check system resources

Error Codes

| Code | Meaning | Solution |
| --- | --- | --- |
| 403 | Access Forbidden | Check robots.txt, add User-Agent |
| 404 | Page Not Found | Check URL spelling |
| 500 | Server Error | Try again later |
| Timeout | Request Timeout | Increase timeout setting |
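To reproduce one of these errors outside the GUI, a short standard-library probe can help. The sketch below is an assumption-laden diagnostic, not the application's code: `ExampleSitemapBot/1.0` is a made-up User-Agent, and `check_url` performs a real network request when called.

```python
import socket
import urllib.error
import urllib.request

USER_AGENT = "ExampleSitemapBot/1.0"  # hypothetical identifier

HINTS = {
    403: "Check robots.txt, add User-Agent",
    404: "Check URL spelling",
    500: "Try again later",
}
TIMEOUT_HINT = "Increase timeout setting"


def build_request(url, user_agent=USER_AGENT):
    """Create a request that identifies the crawler explicitly."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def hint_for(status):
    """Map an HTTP status code to the suggestion from the table above."""
    return HINTS.get(status, "No specific hint; inspect the response")


def check_url(url, timeout=10.0):
    """Return (status, hint) for a single URL. Performs network I/O."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
            return response.status, "OK"
    except urllib.error.HTTPError as error:
        return error.code, hint_for(error.code)
    except (socket.timeout, TimeoutError):
        return "Timeout", TIMEOUT_HINT
    except urllib.error.URLError as error:
        if isinstance(error.reason, (socket.timeout, TimeoutError)):
            return "Timeout", TIMEOUT_HINT
        return "Error", str(error.reason)
```

Calling `check_url("https://example.com/")` returns a pair such as a status code and its hint, which can be compared against the table.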

🀝 Contributing

Contributions welcome! Please read CONTRIBUTING.md for guidelines.

Development Setup

# Clone repo
git clone https://github.com/jtgsystems/free-sitemap-generator.git

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dev dependencies
pip install -r requirements.txt
pip install pytest black flake8

# Run tests
pytest

# Format code
black sitemap_generator/

📄 License

This project is licensed under the MIT License - see the LICENSE file.


🙏 Acknowledgments


🔗 Links


Made with ❤️ by JTGSYSTEMS

SEO Keyword Cloud

sitemap generator crawler python pyqt6 gui web seo indexing discovery urls links analytics architecture structure navigation automation crawling spider performance lxml requests beautifulsoup multithreading progress reporting visualization export xml website audit diagnostics optimization accessibility compliance marketing content developers agencies freelancers startup enterprise batch queue resilience reliability uptime monitoring insights roadmap async concurrent gzip sitemap-index image-sitemap google bing search-console
