Implement curl_cffi #1845

@Luis-manzur

Problem

Some court websites (e.g. Massachusetts Superior Court / socialaw.com) block requests from standard HTTP clients and return 403 responses. These sites use TLS fingerprinting or similar bot-detection techniques that the requests library cannot bypass.

Solution

Add curl_cffi as a dependency to enable browser TLS fingerprint impersonation. Scrapers that need it can set self.impersonate = True (or a specific browser string like "safari") to route requests through curl_cffi instead of requests.
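A minimal sketch of how that routing might look, not the actual implementation. The helper names `resolve_impersonation` and `fetch`, the `DEFAULT_BROWSER` value, and the timeout are assumptions made for illustration; only the `impersonate` attribute and its `True` / browser-string values come from this issue. The `curl_cffi.requests.get(..., impersonate=...)` call is the library's documented entry point.

```python
from typing import Optional, Union

# Assumption: browser target used when a scraper sets impersonate = True.
DEFAULT_BROWSER = "chrome"


def resolve_impersonation(impersonate: Union[bool, str, None]) -> Optional[str]:
    """Map a scraper's `impersonate` value to a browser target.

    Returns None when the plain `requests` path should be used.
    """
    if impersonate is True:
        return DEFAULT_BROWSER
    if isinstance(impersonate, str) and impersonate:
        return impersonate
    return None


def fetch(url: str, impersonate: Union[bool, str, None] = False, timeout: int = 60):
    """Hypothetical helper: fetch `url` with requests or curl_cffi."""
    browser = resolve_impersonation(impersonate)
    if browser is None:
        # Default path: the standard client, unchanged.
        import requests

        return requests.get(url, timeout=timeout)

    # Impersonation path: curl_cffi presents a browser-like TLS fingerprint.
    from curl_cffi import requests as curl_requests

    return curl_requests.get(url, impersonate=browser, timeout=timeout)
```

Importing each client inside its own branch keeps curl_cffi out of the way for scrapers that never enable impersonation, so only the affected courts depend on it at request time.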

This will allow us to continue scraping opinions from nh, masssuperct, and lactapp_3.

Status: Backlog
