Commit 1536563

feat: stealth improvements, signing port, scraper examples

- stygian-browser: patch Navigator.prototype.webdriver to defeat prototype-level bot checks (pixelscan, Akamai)
- stygian-browser: spoof navigator.connection (Network Information API) — was null in headless
- stygian-browser: spoof navigator.getBattery() — was null in headless
- stygian-browser: fix outerWidth/outerHeight to match spoofed screen resolution
- stygian-browser: spoof navigator.plugins with realistic 5-entry PluginArray
- stygian-browser: add scraper_cli example (generic stealth scraper, NetworkIdle wait)
- stygian-browser: add pixelscan_check example (targeted fingerprint scan checker)
- stygian-graph: add SigningPort trait + NoopSigningAdapter + HttpSigningAdapter
- book: update stealth guide with prototype webdriver patch, Network Info API, Battery API sections
1 parent 113f9cb commit 1536563

17 files changed: +1618 -6 lines changed

CHANGELOG.md

Lines changed: 38 additions & 0 deletions
@@ -7,6 +7,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.2.1] - 2026-03-17
+
+### Added
+
+- `stygian-browser`: `Navigator.prototype.webdriver` prototype-level patch — previously only the
+  instance property was overridden; scanners such as pixelscan.net and Akamai probe
+  `Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver')` directly,
+  so the prototype getter is now also patched on every new document context
+- `stygian-browser`: Network Information API spoofing — `navigator.connection` (previously
+  `null` in headless, an immediate detection signal) is replaced with a realistic
+  `NetworkInformation`-like object (`effectiveType: "4g"`, `type: "wifi"`, seeded
+  `downlink`/`rtt` values stable within a session)
+- `stygian-browser`: Battery Status API spoofing — `navigator.getBattery()` (previously
+  `null` in headless) now resolves with a plausible disconnected-battery state; `level`,
+  `dischargingTime` are seeded from `performance.timeOrigin` to vary across sessions
+- `stygian-browser`: `examples/scraper_cli.rs` — generic CLI scraper using `StealthLevel::Advanced`,
+  `WaitUntil::NetworkIdle`; emits structured JSON (title, description, headings, links,
+  text excerpt, timing); successfully scrapes Cloudflare-protected sites (CNN.com, etc.)
+- `stygian-browser`: `examples/pixelscan_check.rs` — targeted pixelscan.net fingerprint scan
+  example; polls until client-side result cards settle; extracts verdict, per-card pass/fail
+  status, hardware/font/UA detail sections, and live `nav_signals` for stealth regression testing
+- `stygian-graph`: `SigningPort` trait — request-signing abstraction for attaching HMAC tokens,
+  AWS Signature V4, OAuth 1.0a, device attestation tokens, or any per-request auth material
+  without coupling adapters to signing scheme
+- `stygian-graph`: `NoopSigningAdapter` — passthrough signer for testing and optional-signer defaults
+- `stygian-graph`: `HttpSigningAdapter` — delegates signing to any external sidecar over HTTP POST
+  (e.g. a Frida RPC bridge exposing a `/sign` endpoint); configurable timeout and retries
+- `book`: stealth guide updated — prototype-level webdriver patch, Network Information API
+  spoofing, and Battery Status API spoofing sections added
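The seeded `downlink`, `rtt`, `level`, and `dischargingTime` values in the entries above can be illustrated with a small deterministic derivation. This is only a sketch under stated assumptions: the commit does not show the real algorithm, so the SplitMix64 mixer, the `SpoofedSignals` struct, the `signals_from_seed` function, and the 8 to 12 Mbps `downlink` band are all hypothetical. The `rtt`, `level`, and `dischargingTime` ranges are the ones listed in the stealth guide in this commit.

```rust
/// SplitMix64 step, used here as a stand-in mixer (an assumption, not the crate's code).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Maps the next mixer output into the range starting at `lo` and ending at `hi`.
fn next_in_range(state: &mut u64, lo: f64, hi: f64) -> f64 {
    // Keep the top 53 bits so the fraction lies in [0, 1).
    let unit = (splitmix64(state) >> 11) as f64 / (1u64 << 53) as f64;
    lo + unit * (hi - lo)
}

/// Hypothetical container for the spoofed values.
#[derive(Debug, Clone, PartialEq)]
pub struct SpoofedSignals {
    pub downlink_mbps: f64,
    pub rtt_ms: f64,
    pub battery_level: f64,
    pub discharging_time_s: f64,
}

/// Derives every value from one seed, so the same seed always yields the same signals.
pub fn signals_from_seed(time_origin_ms: f64) -> SpoofedSignals {
    let mut state = time_origin_ms.to_bits();
    SpoofedSignals {
        downlink_mbps: next_in_range(&mut state, 8.0, 12.0), // assumed band around 10 Mbps
        rtt_ms: next_in_range(&mut state, 50.0, 100.0),
        battery_level: next_in_range(&mut state, 0.65, 0.95),
        discharging_time_s: next_in_range(&mut state, 3600.0, 7200.0),
    }
}
```

Calling `signals_from_seed` twice with the same `performance.timeOrigin` value returns identical signals, which is the stability property the entries describe; a different seed is expected to produce different values.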
+
+### Fixed
+
+- `stygian-browser`: `outerWidth`/`outerHeight` now set via `screen_script` injection to match
+  the spoofed screen resolution (headless Chrome returns `0` without this)
+- `stygian-browser`: `navigator.plugins` spoofed with a realistic 5-entry `PluginArray`
+  (PDF Viewer entries + `navigator.mimeTypes` with 2 entries), eliminating the
+  empty-plugins headless signal
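As a rough illustration of the `outerWidth`/`outerHeight` fix above, the following sketch renders a JavaScript snippet that pins both properties to a spoofed resolution. The helper name `outer_size_script` is hypothetical and this is not the crate's actual `screen_script`; it only shows the general shape such an injected script might take.

```rust
/// Hypothetical helper: builds a JS snippet that makes window.outerWidth and
/// window.outerHeight report the spoofed screen size.
pub fn outer_size_script(width: u32, height: u32) -> String {
    format!(
        "Object.defineProperty(window, 'outerWidth', {{ get: () => {width}, configurable: true }});\n\
         Object.defineProperty(window, 'outerHeight', {{ get: () => {height}, configurable: true }});"
    )
}
```

For example, `outer_size_script(1920, 1080)` yields two `Object.defineProperty` calls whose getters return `1920` and `1080`.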
+
 ## [0.2.0] - 2026-03-16
 
 ### Added

Cargo.lock

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ members = [
 ]
 
 [workspace.package]
-version = "0.2.0"
+version = "0.2.1"
 edition = "2024"
 rust-version = "1.94.0"
 authors = ["Nick Campbell <s0ma@protonmail.com>"]

book/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@
 - [Architecture](./graph/architecture.md)
 - [Building Pipelines](./graph/pipelines.md)
 - [Built-in Adapters](./graph/adapters.md)
+- [Request Signing](./graph/signing.md)
 - [GraphQL Plugins](./graph/graphql-plugins.md)
 - [Custom Adapters](./graph/custom-adapters.md)
 - [Distributed Execution](./graph/distributed.md)

book/src/browser/stealth.md

Lines changed: 43 additions & 0 deletions
@@ -67,6 +67,13 @@ Executed on every new document context before any page script runs.
 - Aligns `navigator.hardwareConcurrency` and `navigator.deviceMemory` with the
   chosen device fingerprint
 
+Two layers of protection prevent `webdriver` detection:
+
+1. **Instance patch**: `Object.defineProperty(navigator, 'webdriver', { get: () => undefined })` hides the flag from direct access (`navigator.webdriver === undefined`).
+2. **Prototype patch**: `Object.defineProperty(Navigator.prototype, 'webdriver', ...)` hides the underlying getter from `Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver')`, which some scanners (e.g. pixelscan.net, Akamai) probe directly.
+
+Both patches are injected into every new document context before any page script runs.
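The two layers described above can be sketched as a script held in a Rust constant and handed to whatever injection mechanism the browser driver offers (for instance the DevTools Protocol method `Page.addScriptToEvaluateOnNewDocument`; whether `stygian-browser` uses it is not stated in this commit). The names `WEBDRIVER_PATCH_JS` and `injection_script` are hypothetical, and the JavaScript is a minimal illustration rather than the crate's actual patch.

```rust
/// Hypothetical script body covering both layers of the webdriver patch.
pub const WEBDRIVER_PATCH_JS: &str = r#"
(() => {
  const hidden = { get: () => undefined, configurable: true, enumerable: true };
  // Prototype patch: seen by Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').
  Object.defineProperty(Navigator.prototype, 'webdriver', hidden);
  // Instance patch: seen by a direct navigator.webdriver read.
  Object.defineProperty(navigator, 'webdriver', hidden);
})();
"#;

/// Wraps the patch in try/catch so a failure cannot break the page's own scripts.
pub fn injection_script() -> String {
    format!("try {{ {} }} catch (_) {{}}", WEBDRIVER_PATCH_JS.trim())
}
```

Keeping both `defineProperty` calls in one script means the instance and prototype views are patched in the same document context before any page script can observe them.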
+
 The fingerprint is drawn from statistically-weighted device profiles:
 
 ```rust
@@ -168,6 +175,42 @@ typer.type_into(&page, "#search-input", "rust async web scraping").await?;
 
 ---
 
+## Network Information API spoofing
+
+`navigator.connection` (Network Information API) reveals connection quality and type.
+Headless browsers return `null` here, which is an immediate headless signal on connection-aware scanners.
+
+`Advanced` stealth injects a realistic `NetworkInformation`-like object:
+
+| Property | Spoofed value |
+| --- | --- |
+| `effectiveType` | `"4g"` |
+| `type` | `"wifi"` |
+| `downlink` | Seeded from `performance.timeOrigin` (stable per session, ≈ 10 Mbps range) |
+| `rtt` | Seeded jitter (50–100 ms range) |
+| `saveData` | `false` |
+
+---
+
+## Battery Status API spoofing
+
+`navigator.getBattery()` returns `null` in headless Chrome — a clear automation signal
+for scanners that enumerate battery state.
+
+`Advanced` stealth overrides `getBattery()` to resolve with a plausible disconnected-battery state:
+
+| Property | Spoofed value |
+| --- | --- |
+| `charging` | `false` |
+| `chargingTime` | `Infinity` |
+| `dischargingTime` | Seeded (≈ 3600–7200 s) |
+| `level` | Seeded (0.65–0.95) |
+
+The seed values are derived from `performance.timeOrigin` so they are stable within a page
+load but differ across sessions, preventing replay detection.
+
+---
+
 ## Fingerprint consistency
 
 All spoofed signals are derived from a single `DeviceProfile` generated at browser

book/src/graph/adapters.md

Lines changed: 28 additions & 0 deletions
@@ -658,3 +658,31 @@ target = "https://docs.example.com"
 | Cloudflare API non-2xx | `ServiceError::Unavailable` (with CF error code) |
 | Job still pending after `job_timeout` | `ServiceError::Timeout` |
 | Unexpected response shape | `ServiceError::InvalidResponse` |
+
+---
+
+## Request signing adapters
+
+The `SigningPort` trait lets any adapter attach signatures, HMAC tokens,
+device attestation headers, or OAuth material to outbound requests without
+coupling the adapter to the scheme.
+
+| Adapter | Use case |
+| --- | --- |
+| `NoopSigningAdapter` | Passthrough — no headers added; useful as a default or in unit tests |
+| `HttpSigningAdapter` | Delegate to any external sidecar (Frida RPC bridge, AWS SigV4 server, OAuth 1.0a service, …) |
+
+```rust
+use std::sync::Arc;
+use stygian_graph::adapters::signing::{HttpSigningAdapter, HttpSigningConfig};
+use stygian_graph::ports::signing::ErasedSigningPort;
+
+let signer: Arc<dyn ErasedSigningPort> = Arc::new(
+    HttpSigningAdapter::new(HttpSigningConfig {
+        endpoint: "http://localhost:27042/sign".to_string(),
+        ..Default::default()
+    })
+);
+```
+
+See [Request Signing](./signing.md) for the full sidecar wire format, Frida RPC bridge example, and guide to implementing a pure-Rust `SigningPort`.

book/src/graph/custom-adapters.md

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ traits defined in `src/ports.rs`. The domain never imports adapters — only por
 | `ScrapingService` | Fetching or processing content in a new way |
 | `AIProvider` | Adding a new LLM or language model API |
 | `CachePort` | Adding a new cache backend (Redis, Memcached, …) |
+| `SigningPort` | Attaching signatures, HMAC tokens, or authentication material to outgoing requests |
 
 `PlaywrightService` fetches rendered HTML, so it implements `ScrapingService`.
 