sarperavci/CloudflareBypassForScraping

Cloudflare Bypass for Scraping

⭐ Thank you for 1,800+ stars! Introducing Version 2.0 with enhanced request mirroring, improved caching and better reliability for bypassing Cloudflare protection.

Bypass Cloudflare protection with ease. Supports cookie generation and request mirroring for any HTTP method.

Sponsors

Scrapeless

If you are looking for a solution focused on browser automation and anti-detection mechanisms, I recommend Scrapeless Browser.
It is a cloud-based, Chromium-powered headless browser cluster that enables developers to run large-scale concurrent browser instances and handle complex interactions on protected pages. Perfect for AI infrastructure, web automation, data scraping, page rendering, and automated testing.

The Scrapeless Browser provides a secure, isolated browser environment that allows you to interact with web applications while minimizing potential risks to your system.

If you're looking for powerful and scalable browser automation and real-time data acquisition capabilities, Scrapeless offers a high-performance, scalable, and cost-efficient cloud browser infrastructure as well as a global enterprise-grade proxy network, addressing the core needs of automated execution and stable IP access.

Scrapeless Browser – Enterprise Cloud Browser Infrastructure

  • Out-of-the-Box Ready: Natively compatible with Puppeteer and Playwright, supporting CDP connections. Migrate your projects with just one line of code.
  • Bulk Isolated Environment Creation: Each profile corresponds to an exclusive browser environment, enabling persistent login and identity isolation.
  • Unlimited Concurrent Scaling: A single task supports second-level launch of 50 to 1000+ browser instances. Auto-scaling is available with no server resource limits.
  • Real-time Signaling (MFA): Supports event-driven handling of asynchronous workflows, including SMS/Email/TOTP verification, ensuring stable sessions and uninterrupted automation.
  • Edge Node Service (ENS) – Multiple nodes worldwide, offering 2–3× faster launch speed and higher stability than other cloud browsers.
  • Flexible Fingerprint Customization: Generate random fingerprints or customize fingerprint parameters as needed.
  • Visual Debugging: Perform interactive debugging and real-time monitoring of proxy traffic through Live View, and quickly pinpoint issues and optimize actions by replaying sessions page by page with Session Recordings.
  • Enterprise Customization: Undertake customization of enterprise-level automation projects and AI Agent customization.

👉 Learn more: Scrapeless Scraping Browser Playground | Scrapeless Browser | Documentation

Scrapeless Proxy Network – Unblockable, Large-Scale Data Extraction

  • 90+ million residential IPs worldwide, covering 195+ countries, starting at $1.80/GB, pay-per-GB with no traffic expiration.

  • Flexible Proxy Types: Choose residential, IPv6, static ISP, or datacenter proxies based on workload requirements.

  • Enterprise-Grade Reliability: 99.9% uptime with ultra-low latency (<0.5s).

  • Advanced targeting: City-level geolocation targeting with automatic IP rotation.

  • High-Performance Scraping: Ideal for AI training data collection, web automation, and large-scale real-time extraction tasks.

  • 👉 Learn more: Scrapeless Proxies | Documentation
    👉 Get it Now!


ThorData

ThorData Web Scraper provides unblockable proxy infrastructure and scraping solutions for reliable, real-time web data extraction at scale. Perfect for AI training data collection, web automation, and large-scale scraping operations that require high performance and stability.
Key Advantages of ThorData:

  • Massive proxy network: Access to 60M+ ethically sourced residential, mobile, ISP, and datacenter IPs across 190+ countries.
  • Enterprise-grade reliability: 99.9% uptime with ultra-low latency (<0.5s response time) for uninterrupted data collection.
  • Flexible proxy types: Choose from residential, mobile (4G/5G), static ISP, or datacenter proxies based on your needs.
  • Cost-effective pricing: Starting from $1.80/GB for residential proxies with no traffic expiration and pay-as-you-go model.
  • Advanced targeting: City-level geolocation targeting with automatic IP rotation and unlimited bandwidth options.
  • Ready-to-use APIs: 120+ scraper APIs and comprehensive datasets purpose-built for AI and data science workflows.

ThorData is SOC2, GDPR, and CCPA compliant, trusted by 4,000+ enterprises for secure web data extraction.
👉 Learn more: ThorData Web Scraper | Get Started


IPOasis

IPOasis is a trusted provider of high-quality proxy services, with over 90 million nodes distributed globally across more than 200 countries.

Our proxies are fresh, clean, fast, and have a high success rate.

Supporting both HTTP and SOCKS5 protocols, with session control and unlimited concurrency.

Our products are ideal for a variety of use cases including data monitoring, survey research, web scraping, SEO/ASO optimization, app simulation, gaming, business measurement, marketing, and more.

May IPOasis, this unique online 'oasis,' empower every user seeking high-quality residential proxies. 🩵

🚀 Quick Start

Docker (Recommended)

Using Docker Compose

git clone https://github.com/sarperavci/CloudflareBypassForScraping.git
cd CloudflareBypassForScraping
docker compose pull && docker compose up -d

Using Docker directly

# Pull and run the latest image
docker run -p 8000:8000 ghcr.io/sarperavci/cloudflarebypassforscraping:latest

Manual Installation

pip install -r requirements.txt
python server.py

Usage

Request Mirroring (Any HTTP Method)

Request mirroring is a new technique that forwards any HTTP request through the Cloudflare bypass server. It handles both clearance cookie generation and SSL/TLS fingerprinting challenges for you.

To use it, point your API base URL at the local server and add the x-hostname header with the target hostname. Add any other headers or a request body as needed.

# GET request
curl "http://localhost:8000/api/data" -H "x-hostname: example-site-protected-with-cf.com"

# POST request  
curl -X POST "http://localhost:8000/api/submit" \
  -H "x-hostname: cf-protected-website.com" \
  -H "Content-Type: application/json" \
  -d '{"key": "value"}'

The first request generates and caches Cloudflare cookies; subsequent requests reuse the cached cookies automatically.
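The same pattern can be sketched in Python with only the standard library. The helper name `build_mirrored_request` and its parameters are illustrative assumptions and not part of this project; only the `x-hostname` header and the `http://localhost:8000` address are taken from the curl examples above.

```python
import json
import urllib.request


def build_mirrored_request(path, hostname, method="GET", payload=None,
                           base_url="http://localhost:8000"):
    """Build (but do not send) a request routed through the bypass server.

    Hypothetical helper: only the x-hostname header comes from the README.
    """
    headers = {"x-hostname": hostname}
    data = None
    if payload is not None:
        # JSON-encode the body, as in the curl POST example.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    # Join base URL and path without doubling or dropping the slash.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` sends it, which requires the bypass server to be running locally.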

Miscellaneous Headers

  • x-hostname: Target hostname (required)
  • x-proxy: Proxy URL (optional)
  • x-bypass-cache: Force fresh cookies (optional)

These three headers control the bypass behavior on a per-request basis.

curl "http://localhost:8000/api/data" \
  -H "x-hostname: protected-site.com" \
  -H "x-proxy: http://user:pass@proxyserver:port" \
  -H "x-bypass-cache: true"
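If you call the server from Python, the three headers could be assembled with a small helper like the one below. This is a minimal sketch: the function name `bypass_headers` is hypothetical, and sending the literal string "true" for `x-bypass-cache` simply copies the curl example above.

```python
def bypass_headers(hostname, proxy=None, bypass_cache=False):
    """Return the control headers for one mirrored request.

    Hypothetical helper; the header names come from the list above.
    """
    if not hostname:
        # x-hostname is the only required header.
        raise ValueError("x-hostname is required")
    headers = {"x-hostname": hostname}
    if proxy:
        headers["x-proxy"] = proxy
    if bypass_cache:
        # Assumption: the value "true" matches the curl example above.
        headers["x-bypass-cache"] = "true"
    return headers
```

Leaving out `proxy` and `bypass_cache` yields only the required `x-hostname` header, so the optional headers are sent only when you ask for them.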

Basic Cookie Extraction

The /cookies endpoint allows you to get Cloudflare cookies for a specific URL without mirroring a request. A random Firefox version on a random OS is used as the user agent.

$ curl "http://localhost:8000/cookies?url=https://nopecha.com/demo/cloudflare"
{
  "cookies": {
    "cf_clearance": "SJHuYhHrTZpXDUe8iMuzEUpJxocmOW8ougQVS0.aK5g-1723665177-1.0.1.1-5_NOoP19LQZw4TQ4BLwJmtrXBoX8JbKF5ZqsAOxRNOnW2rmDUwv4hQ7BztnsOfB9DQ06xR5hR_hsg3n8xteUCw"
  },
  "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:145.0) Gecko/20100101 Firefox/145.0"
}
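The response above can be turned into headers for your own HTTP client. The sketch below assumes the JSON shape shown in the example; the function names are hypothetical. It keeps the returned user agent alongside the cookies, since Cloudflare clearance cookies are generally only accepted when sent with the user agent that obtained them.

```python
import json
import urllib.parse


def cookies_endpoint_url(target_url, base_url="http://localhost:8000"):
    """Build the /cookies URL, percent-encoding the target URL.

    Assumption: the server decodes the query string in the usual way.
    """
    query = urllib.parse.urlencode({"url": target_url})
    return base_url.rstrip("/") + "/cookies?" + query


def headers_from_cookie_response(body):
    """Convert the JSON body of /cookies into Cookie and User-Agent headers."""
    parsed = json.loads(body)
    # Join all returned cookies into a single Cookie header value.
    cookie_header = "; ".join(
        f"{name}={value}" for name, value in parsed["cookies"].items()
    )
    return {"Cookie": cookie_header, "User-Agent": parsed["user_agent"]}
```

The resulting dictionary can be passed as request headers to whichever HTTP client you use for the follow-up requests.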

HTML Content Extraction

The /html endpoint returns the full HTML content of a page after bypassing Cloudflare protection. The HTML is returned directly (not as JSON).

$ curl "http://localhost:8000/html?url=https://nopecha.com/demo/cloudflare"

This returns the raw HTML content with additional headers containing bypass information:

  • x-cf-bypasser-cookies: Number of cookies generated
  • x-cf-bypasser-user-agent: User agent used for bypass
  • x-cf-bypasser-final-url: Final URL after redirects
  • x-processing-time-ms: Time taken to process the request
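A client could collect these headers into a small dictionary, as in the sketch below. The function name `bypass_metadata` is hypothetical, and the numeric formats of the cookie count and processing time are assumed rather than documented, so values that fail to parse are returned as None.

```python
def bypass_metadata(headers):
    """Pull the bypass headers listed above out of a response header mapping.

    Hypothetical helper; numeric formats are assumed, so unparsable values
    come back as None instead of raising.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def to_number(name, cast):
        try:
            return cast(lowered[name])
        except (KeyError, ValueError):
            return None

    return {
        "cookie_count": to_number("x-cf-bypasser-cookies", int),
        "user_agent": lowered.get("x-cf-bypasser-user-agent"),
        "final_url": lowered.get("x-cf-bypasser-final-url"),
        "processing_time_ms": to_number("x-processing-time-ms", float),
    }
```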

Build from Source

# Build the image
docker build -t cloudflare-bypass .

# Run the container
docker run -p 8000:8000 cloudflare-bypass

Backward Compatibility

Existing integrations continue to work unchanged:

# Legacy endpoint still works
curl "http://localhost:8000/cookies?url=https://example.com"

# Old bypass server, kept as an alternative method
pip install -r old_server_requirements.txt
python old_server.py

Example Projects

Contributing

Contributions welcome! Submit PRs against the main codebase.
