Commit 5f35081

fix: address PR #16 copilot review — body_raw precedence, 429 retry-after, token-bucket panic guard, TLS init, docs
1 parent d5ac491 commit 5f35081

8 files changed

+60
-40
lines changed

CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -7,6 +7,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.1.18] - 2026-03-15
+
+### Fixed
+
+- `stygian-graph`: `RestApiAdapter` now checks `body_raw` before `body` when both are present, matching the documented precedence contract
+- `stygian-graph`: `RestApiAdapter` 429 responses now return `ServiceError::RateLimited` with the parsed `Retry-After` value; `send_one` honours the server-specified delay instead of blind exponential backoff
+- `stygian-graph`: token-bucket rate limiter guards against `max_requests = 0` or zero-duration window configs that previously caused a division-by-zero panic via `Duration::from_secs_f64(inf)`
+- `stygian-graph`: `CloudflareCrawlAdapter::with_config` now panics with a clear message on TLS init failure instead of silently falling back to a misconfigured default `reqwest::Client`
+- `book`: Cloudflare crawl adapter metadata example corrected to `job_id`, `pages_crawled`, `output_format` (was `pages`, `url_count`)
+- `book`: `HeadlessMode::Legacy` docs across configuration and env-vars pages corrected to "classic `--headless` for Chromium < 112" (was incorrectly referencing `chrome-headless-shell` and Chrome 132 removal)
+
 ## [0.1.17] - 2026-03-14
 
 ### Added
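
The `body_raw` precedence rule from the first changelog entry can be illustrated with a minimal, std-only sketch. `RequestBody` and `select_body` below are hypothetical stand-ins, not the adapter's actual types: the real code reads both fields from a JSON params value, which this sketch replaces with `Option<&str>` to stay dependency-free.

```rust
/// Hypothetical stand-in for the adapter's request body type.
#[derive(Debug, PartialEq)]
enum RequestBody {
    /// Raw string body, sent as-is.
    Raw(String),
    /// Structured body; the real adapter holds a JSON value, a String keeps this std-only.
    Json(String),
}

/// Picks the request body: a non-empty `body_raw` wins, then `body`, otherwise none.
fn select_body(body_raw: Option<&str>, body: Option<&str>) -> Option<RequestBody> {
    if let Some(raw) = body_raw.filter(|s| !s.is_empty()) {
        Some(RequestBody::Raw(raw.to_owned()))
    } else {
        body.map(|json| RequestBody::Json(json.to_owned()))
    }
}
```

Under these assumptions, `select_body(Some("a=1"), Some("{}"))` yields the raw body, while an empty `body_raw` falls through to `body`, matching the `.filter(|s| !s.is_empty())` check in the `rest_api.rs` change.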

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ members = [
 ]
 
 [workspace.package]
-version = "0.1.17"
+version = "0.1.18"
 edition = "2024"
 rust-version = "1.93.1"
 authors = ["Nick Campbell <s0ma@protonmail.com>"]

book/src/browser/configuration.md

Lines changed: 5 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,7 @@ let config = BrowserConfig::builder()
5050
| Field | Type | Default | Description |
5151
| --- | --- | --- | --- |
5252
| `headless` | `bool` | `true` | Run without visible window |
53-
| `headless_mode` | `HeadlessMode` | `New` | `New` = `--headless=new` (full Chromium rendering, default since Chrome 112, **only mode since Chrome 132**); `Legacy` = `chrome-headless-shell` / pre-112 `--headless` |
53+
| `headless_mode` | `HeadlessMode` | `New` | `New` = `--headless=new` (full Chromium rendering, default since Chrome 112); `Legacy` = classic `--headless` flag for Chromium < 112 |
5454
| `window_size` | `Option<(u32, u32)>` | `(1920, 1080)` | Browser viewport dimensions |
5555
| `chrome_path` | `Option<PathBuf>` | auto-detect | Path to Chrome/Chromium binary |
5656
| `stealth_level` | `StealthLevel` | `Advanced` | Anti-detection level |
@@ -92,7 +92,7 @@ All config values can be overridden without touching source code:
9292
| --- | --- | --- |
9393
| `STYGIAN_CHROME_PATH` | auto-detect | Path to Chrome/Chromium binary |
9494
| `STYGIAN_HEADLESS` | `true` | Set `false` for headed mode |
95-
| `STYGIAN_HEADLESS_MODE` | `new` | `new` (`--headless=new`) or `legacy` (`chrome-headless-shell`; old `--headless` removed in Chrome 132) |
95+
| `STYGIAN_HEADLESS_MODE` | `new` | `new` (`--headless=new`) or `legacy` (classic `--headless` for Chromium < 112) |
9696
| `STYGIAN_STEALTH_LEVEL` | `advanced` | `none`, `basic`, `advanced` |
9797
| `STYGIAN_POOL_MIN` | `2` | Minimum warm browsers |
9898
| `STYGIAN_POOL_MAX` | `10` | Maximum concurrent browsers |
@@ -161,16 +161,11 @@ let config = BrowserConfig::builder()
161161
```
162162

163163
For Chromium ≥ 112 (all modern Chrome / Chromium builds), `New` is the right
164-
choice. `Legacy` targets are rare: pre-112 Chromium or the separately distributed
165-
`chrome-headless-shell` binary for lightweight CI workloads where full rendering
166-
fidelity is not required.
167-
168-
> **Note:** As of Chrome 132 the old `--headless` flag is removed entirely.
169-
> `HeadlessMode::Legacy` now maps to `chrome-headless-shell` semantics — avoid it
170-
> unless you are explicitly targeting that binary.
164+
choice. `Legacy` falls back to the classic `--headless` flag which uses an older
165+
rendering pipeline — use it only when targeting Chromium < 112.
171166

172167
```rust,no_run
173-
// Only needed for Chromium < 112 or chrome-headless-shell
168+
// Only needed for Chromium < 112
174169
let config = BrowserConfig::builder()
175170
.headless_mode(HeadlessMode::Legacy)
176171
.build();

book/src/graph/adapters.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -457,9 +457,9 @@ Cloudflare API.
457457

458458
```json
459459
{
460-
"job_id": "some-uuid",
461-
"pages": 12,
462-
"url_count": 12
460+
"job_id": "some-uuid",
461+
"pages_crawled": 12,
462+
"output_format": "markdown"
463463
}
464464
```
465465

book/src/reference/env-vars.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ variables. No recompilation required.
1111
| --- | --- | --- |
1212
| `STYGIAN_CHROME_PATH` | auto-detect | Absolute path to Chrome or Chromium binary |
1313
| `STYGIAN_HEADLESS` | `true` | `false` for headed mode (displays browser window) |
14-
| `STYGIAN_HEADLESS_MODE` | `new` | `new` (`--headless=new`) or `legacy` (`chrome-headless-shell`; old `--headless` removed in Chrome 132) |
14+
| `STYGIAN_HEADLESS_MODE` | `new` | `new` (`--headless=new`) or `legacy` (classic `--headless` for Chromium < 112) |
1515
| `STYGIAN_STEALTH_LEVEL` | `advanced` | `none`, `basic`, or `advanced` |
1616
| `STYGIAN_POOL_MIN` | `2` | Minimum warm browser instances |
1717
| `STYGIAN_POOL_MAX` | `10` | Maximum concurrent browser instances |

crates/stygian-graph/src/adapters/cloudflare_crawl.rs

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -161,8 +161,7 @@ impl CloudflareCrawlAdapter {
161161
let client = Client::builder()
162162
.timeout(Duration::from_secs(60))
163163
.build()
164-
// reqwest::ClientBuilder::build only fails on TLS init; that is not recoverable.
165-
.unwrap_or_default();
164+
.expect("reqwest TLS backend failed to initialize");
166165
Self { client, config }
167166
}
168167

crates/stygian-graph/src/adapters/graphql_rate_limit.rs

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -120,6 +120,11 @@ impl RequestWindow {
120120
tokens,
121121
last_refill,
122122
} => {
123+
// Guard: zero max_requests or zero window would produce rate=0 → inf wait.
124+
if self.config.max_requests == 0 || self.config.window.is_zero() {
125+
return Some(max_delay);
126+
}
127+
123128
// Refill tokens proportional to elapsed time.
124129
let elapsed = now.duration_since(*last_refill);
125130
let rate = f64::from(self.config.max_requests) / self.config.window.as_secs_f64();
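
Why the guard matters can be shown with a simplified, std-only sketch. `RateLimitConfig` and `wait_for_token` are hypothetical names for illustration; only the `max_requests`, `window`, and `max_delay` names are taken from the diff, and the real limiter tracks more state than this. Without the early return, a zero limit or zero window makes the refill rate `0`, `inf`, or `NaN`, and `Duration::from_secs_f64` panics on non-finite input.

```rust
use std::time::Duration;

/// Hypothetical, simplified config; field names mirror the diff.
struct RateLimitConfig {
    max_requests: u32,
    window: Duration,
}

/// Time to wait until one token is available, capped at `max_delay`.
/// Assumes `tokens` is non-negative.
fn wait_for_token(config: &RateLimitConfig, tokens: f64, max_delay: Duration) -> Duration {
    // Guard first: degenerate configs would otherwise yield a non-finite wait.
    if config.max_requests == 0 || config.window.is_zero() {
        return max_delay;
    }
    if tokens >= 1.0 {
        return Duration::ZERO;
    }
    // Refill rate in tokens per second; finite and positive past the guard.
    let rate = f64::from(config.max_requests) / config.window.as_secs_f64();
    let wait_secs = (1.0 - tokens) / rate;
    Duration::from_secs_f64(wait_secs).min(max_delay)
}
```

With 4 requests per second and an empty bucket, this sketch waits 250 ms for the next token; with `max_requests = 0` or a zero window it returns `max_delay` instead of panicking.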

crates/stygian-graph/src/adapters/rest_api.rs

Lines changed: 33 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -330,12 +330,13 @@ impl RestApiAdapter {
330330
})
331331
.unwrap_or_default();
332332

333-
let body = if params["body"].is_null() {
334-
params["body_raw"]
335-
.as_str()
336-
.map(|raw| RequestBody::Raw(raw.to_owned()))
337-
} else {
333+
// body_raw takes precedence over body (raw string vs structured JSON).
334+
let body = if let Some(raw) = params["body_raw"].as_str().filter(|s| !s.is_empty()) {
335+
Some(RequestBody::Raw(raw.to_owned()))
336+
} else if !params["body"].is_null() {
338337
Some(RequestBody::Json(params["body"].clone()))
338+
} else {
339+
None
339340
};
340341

341342
let accept = params["accept"]
@@ -438,9 +439,15 @@ impl RestApiAdapter {
438439

439440
for attempt in 0..=self.config.max_retries {
440441
if attempt > 0 {
441-
let delay = self.config.retry_base_delay * 2u32.saturating_pow(attempt - 1);
442+
// Honour server Retry-After when available; otherwise exponential backoff.
443+
let delay = match &last_err {
444+
Some(StygianError::Service(ServiceError::RateLimited { retry_after_ms })) => {
445+
Duration::from_millis(*retry_after_ms)
446+
}
447+
_ => self.config.retry_base_delay * 2u32.saturating_pow(attempt - 1),
448+
};
442449
tokio::time::sleep(delay).await;
443-
debug!(url, attempt, "REST API retry");
450+
debug!(url, attempt, ?delay, "REST API retry");
444451
}
445452

446453
match self.do_send(url, spec, extra_query).await {
@@ -516,18 +523,18 @@ impl RestApiAdapter {
516523
.and_then(|v| v.to_str().ok())
517524
.map(ToOwned::to_owned);
518525

519-
// 429 — log retry-after hint
526+
// 429 — honour server Retry-After hint via dedicated error variant.
520527
if status.as_u16() == 429 {
521-
let retry_after = response
528+
let retry_after_secs = response
522529
.headers()
523530
.get("retry-after")
524531
.and_then(|v| v.to_str().ok())
525532
.and_then(|s| s.parse::<u64>().ok())
526533
.unwrap_or(5);
527-
warn!(url, retry_after, "REST API rate-limited (429)");
528-
return Err(StygianError::from(ServiceError::Unavailable(format!(
529-
"HTTP 429 rate-limited; retry-after={retry_after}s"
530-
))));
534+
warn!(url, retry_after_secs, "REST API rate-limited (429)");
535+
return Err(StygianError::from(ServiceError::RateLimited {
536+
retry_after_ms: retry_after_secs.saturating_mul(1000),
537+
}));
531538
}
532539

533540
if !status.is_success() {
@@ -565,16 +572,19 @@ impl Default for RestApiAdapter {
565572

566573
/// Returns `true` for transient errors that are worth retrying.
567574
fn is_retryable(err: &StygianError) -> bool {
568-
let StygianError::Service(ServiceError::Unavailable(msg)) = err else {
569-
return false;
570-
};
571-
msg.contains("429")
572-
|| msg.contains("500")
573-
|| msg.contains("502")
574-
|| msg.contains("503")
575-
|| msg.contains("504")
576-
|| msg.contains("connection")
577-
|| msg.contains("timed out")
575+
match err {
576+
StygianError::Service(ServiceError::RateLimited { .. }) => true,
577+
StygianError::Service(ServiceError::Unavailable(msg)) => {
578+
msg.contains("429")
579+
|| msg.contains("500")
580+
|| msg.contains("502")
581+
|| msg.contains("503")
582+
|| msg.contains("504")
583+
|| msg.contains("connection")
584+
|| msg.contains("timed out")
585+
}
586+
_ => false,
587+
}
578588
}
579589

580590
// ─── ScrapingService ──────────────────────────────────────────────────────────
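
The retry-delay selection in the `rest_api.rs` change can be sketched in isolation. `LastError`, `parse_retry_after_ms`, and `retry_delay` are hypothetical names used only for this illustration; the 5-second fallback and the `base * 2^(attempt - 1)` backoff are taken from the diff. Note that only the integer-seconds form of `Retry-After` is parsed, so an HTTP-date value falls back to the default.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the error remembered between attempts.
enum LastError {
    RateLimited { retry_after_ms: u64 },
    Other,
}

/// Converts a `Retry-After` header given in whole seconds to milliseconds.
/// Missing or unparsable values fall back to 5 seconds, as in the diff.
fn parse_retry_after_ms(header: Option<&str>) -> u64 {
    header
        .and_then(|s| s.parse::<u64>().ok())
        .unwrap_or(5)
        .saturating_mul(1000)
}

/// Delay before retry number `attempt`; callers must pass `attempt >= 1`.
fn retry_delay(last_err: Option<&LastError>, base: Duration, attempt: u32) -> Duration {
    match last_err {
        // The server-specified delay takes priority over backoff.
        Some(LastError::RateLimited { retry_after_ms }) => Duration::from_millis(*retry_after_ms),
        // Otherwise exponential backoff: base, 2*base, 4*base, ...
        _ => base * 2u32.saturating_pow(attempt - 1),
    }
}
```

With a 100 ms base, the third retry after a non-rate-limit error waits 400 ms, whereas a preceding 429 carrying `Retry-After: 7` waits 7 seconds regardless of the attempt number.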
