Commit 26ce1c7

jonmmease and claude authored
Parallel Worker Pool for VlConverter + V8 Platform Init Safety (#242)
* feat: Implement parallel worker pool in VlConverter

  Replace the single-threaded VlConverterRuntime with a WorkerPool that supports configurable numbers of Deno workers. Each worker gets its own tokio LocalSet and JsRuntime, providing true parallelism for concurrent conversion requests.

  Key changes:
  - WorkerPool struct with round-robin sender selection via AtomicUsize
  - spawn_worker_pool(num_workers) creates N independent worker threads
  - VlConverter is now Clone (wraps Arc<VlConverterInner>)
  - with_num_workers() constructor and num_workers() accessor
  - V8 platform initialization is one-time via Once/call_once
  - handle_command() helper centralizes command dispatch in worker loop
  - Worker startup uses sync channel to propagate initialization errors

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: Remove unnecessary `mut` from VlConverter usage

  VlConverter is now Clone and all conversion methods take `&self` since internal state is managed via Arc<VlConverterInner>. Drop the `mut` binding in tests and CLI callers to resolve unused_mut warnings.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Add worker pool support to Python bindings

  Refactor the Python bindings to use a thread-safe, parallel VlConverter:
  - Switch VL_CONVERTER from Mutex<VlConverterRs> to RwLock<Arc<VlConverterRs>> so multiple conversion futures can run concurrently
  - Add converter_read_handle() and run_converter_future() helpers to centralize the read-lock + block_on pattern across all conversion functions
  - Allow Python's GIL to be released during blocking Deno calls via py.allow_threads() inside run_converter_future
  - Add set_num_workers(n) / get_num_workers() to configure and inspect the worker pool at runtime; set_num_workers replaces the global converter with a freshly-spawned one
  - Expose the new functions in vl_convert.pyi with NumPy-style docstrings

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: Add Python worker pool tests

  Verify set/get_num_workers behavior:
  - Default is 1 worker
  - set_num_workers rejects zero
  - Workers can be reconfigured while conversions are running
  - Parallel conversions succeed with multiple configured workers

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: Document parallel worker configuration in Python README

  Add a "Parallel Workers" section showing how to use set_num_workers() and get_num_workers() to scale throughput for batch workloads.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* converter: move JSON/msgpack transfer state to per-worker OpState

* feat: add opt-in worker warmup APIs

* test: save failed PNG comparison images and upload as CI artifact

  On failure, check_png() now writes the actual and expected PNGs to vl-convert-python/tests/failed/ so rendering differences can be inspected. The macOS/Windows Python CI job uploads that directory as an artifact when tests fail.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add exponential backoff retry for remote image loading

  Wraps the reqwest fetch in `custom_string_resolver` with `backon` exponential backoff (500ms–10s, up to 4 retries). Retries on network errors and transient HTTP failures (429, 5xx); permanent errors (404, 403, etc.) short-circuit immediately without retrying.

  Fixes intermittent CI failures where Wikimedia rate-limits a second image fetch within the same test run.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: update bundled licenses for backon + gloo-timers

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: disable backon default features to drop wasm-only gloo-timers dep

  We only need tokio-sleep + std; gloo-timers is the WASM timer backend and has no use in this crate.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: remove gloo-timers from bundled licenses

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: replace round-robin dispatch with least-outstanding worker selection

  Replaces the AtomicBool dispatch_pending approach with a proper outstanding-request counter per worker. The OutstandingTicket RAII guard increments on selection and decrements on drop, so the counter accurately covers three lifecycle phases: waiting on a full channel, queued in the channel, and actively executing on the worker thread.

  Key design points:
  - QueuedCommand wraps VlConvertCommand + OutstandingTicket so the ticket travels with the command through the MPSC channel; the worker drops _ticket after handle_command completes, not when it dequeues
  - dispatch_cursor rotates the scan start to break ties without index-0 bias
  - Early-exit when outstanding == 0 avoids scanning remaining workers
  - Cancellation-safe: if a send future is dropped, the ticket drops with it

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
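
The least-outstanding selection described in the last refactor can be modelled in a few lines. The sketch below is a toy in Python, not the Rust code from this commit: the class and method names (`LeastOutstandingPool`, `ticket`, `outstanding`) are hypothetical, a lock stands in for atomics, and a context manager stands in for the `OutstandingTicket` RAII guard. It only illustrates the policy the message names: a per-worker outstanding counter, a rotating scan start to break ties, and an early exit on an idle worker.

```python
import threading
from contextlib import contextmanager
from typing import Iterator, List


class LeastOutstandingPool:
    """Toy model of least-outstanding worker selection (hypothetical names)."""

    def __init__(self, num_workers: int) -> None:
        if num_workers < 1:
            raise ValueError("num_workers must be at least 1")
        self._outstanding = [0] * num_workers  # in-flight requests per worker
        self._cursor = 0  # rotating scan start, so ties do not favour index 0
        self._lock = threading.Lock()

    def _select(self) -> int:
        with self._lock:
            n = len(self._outstanding)
            start = self._cursor
            self._cursor = (self._cursor + 1) % n
            best = start
            for offset in range(n):
                idx = (start + offset) % n
                if self._outstanding[idx] == 0:
                    best = idx
                    break  # early exit: an idle worker cannot be beaten
                if self._outstanding[idx] < self._outstanding[best]:
                    best = idx
            # Count the request from the moment of selection.
            self._outstanding[best] += 1
            return best

    @contextmanager
    def ticket(self) -> Iterator[int]:
        """Hold a slot on the chosen worker until the block exits."""
        idx = self._select()
        try:
            yield idx
        finally:
            # Released on normal exit and on exceptions, like a guard's drop.
            with self._lock:
                self._outstanding[idx] -= 1

    def outstanding(self) -> List[int]:
        with self._lock:
            return list(self._outstanding)
```

A caller would wrap the whole request in `with pool.ticket() as worker_index:` so the counter stays raised while the request is waiting, queued, and executing, which is the property the commit message attributes to carrying the ticket alongside the command.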
1 parent 142e6fd commit 26ce1c7

18 files changed: +2331 -790 lines changed

.github/workflows/CI.yml

Lines changed: 6 additions & 0 deletions
@@ -485,6 +485,12 @@ jobs:
         run: pixi run dev-py
       - name: Run tests
         run: pixi run test-py
+      - name: Upload failed images
+        uses: actions/upload-artifact@v6
+        if: failure()
+        with:
+          name: failed-images-python-${{ matrix.options[0] }}
+          path: vl-convert-python/tests/failed/
 
   # Build and test aarch64 Linux wheel using manylinux container
   vl-convert-python-tests-linux-aarch64:

Cargo.lock

Lines changed: 11 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ codegen-units = 1
 [workspace.dependencies]
 anyhow = "1.0"
 assert_cmd = "2.0"
+backon = { version = "1", default-features = false, features = ["tokio-sleep", "std"] }
 clap = { version = "4.5", features = ["derive"] }
 
 # Deno dependencies - versions correspond to Deno v2.6.6
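
The `backon` dependency added here supports the retry behaviour described in the commit message: exponential backoff between 500ms and 10s, up to 4 retries, retrying on network errors and on HTTP 429 or 5xx, and failing immediately on permanent errors such as 404 or 403. The following is a rough Python model of that policy, not the Rust code in this commit. The function and exception names are hypothetical, the doubling factor is an assumption (the message gives only the bounds and the retry count), and treating every status other than 2xx, 429, and 5xx as permanent is also an assumption.

```python
import time
from typing import Callable, Iterator, Tuple


class PermanentError(Exception):
    """Non-retryable failure, e.g. HTTP 404 or 403."""


class RetriesExhausted(Exception):
    """Transient failures persisted through every allowed retry."""


def classify_status(status: int) -> str:
    """Map an HTTP status to 'ok', 'retry', or 'fail'."""
    if 200 <= status < 300:
        return "ok"
    if status == 429 or 500 <= status < 600:
        return "retry"  # rate limiting and server errors are transient
    return "fail"  # assumption: everything else is treated as permanent


def backoff_delays(
    min_delay: float = 0.5,
    max_delay: float = 10.0,
    factor: float = 2.0,  # assumption: doubling; not stated in the commit
    max_retries: int = 4,
) -> Iterator[float]:
    """Yield one capped delay (in seconds) per allowed retry."""
    delay = min_delay
    for _ in range(max_retries):
        yield min(delay, max_delay)
        delay *= factor


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, bytes]],
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it succeeds, fails permanently, or retries run out.

    fetch() returns (status, body) and may raise OSError on network errors.
    """
    delays = backoff_delays()
    while True:
        try:
            status, body = fetch()
        except OSError as err:
            outcome, detail = "retry", f"network error: {err}"
        else:
            outcome, detail = classify_status(status), f"HTTP {status}"
        if outcome == "ok":
            return body
        if outcome == "fail":
            raise PermanentError(detail)  # short-circuit without sleeping
        delay = next(delays, None)
        if delay is None:
            raise RetriesExhausted(detail)
        sleep(delay)
```

Passing `sleep` as a parameter keeps the policy testable without real waiting; with the defaults assumed here, the four retries wait 0.5s, 1s, 2s, and 4s.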

thirdparty_rust.yaml

Lines changed: 208 additions & 0 deletions
Some generated files are not rendered by default.

vl-convert-python/README.md

Lines changed: 16 additions & 0 deletions
@@ -91,6 +91,22 @@ vg_spec = vlc.vegalite_to_vega(chart.to_json(), vl_version="4.17")
 with open("altair_chart.vg.json", "wt") as f:
     json.dump(vg_spec, f)
 ```
+
+## Configure Worker Parallelism
+By default, `vl-convert-python` uses `1` converter worker. You can configure this globally:
+
+```python
+import vl_convert as vlc
+
+vlc.get_num_workers() # 1
+vlc.set_num_workers(4) # enable parallel worker pool
+vlc.warm_up_workers() # optional: pre-initialize workers before first conversion
+```
+
+This setting applies to subsequent conversions and enables parallel work across Python threads.
+Calling `warm_up_workers()` is optional and only needed if you want to avoid first-request
+worker startup latency.
+
 # How it works
 This crate uses [PyO3](https://pyo3.rs/) to wrap the [`vl-convert-rs`](https://crates.io/crates/vl-convert-rs) Rust crate as a Python library. The `vl-convert-rs` crate is a self-contained Rust library for converting [Vega-Lite](https://vega.github.io/vega-lite/) visualization specifications into various formats. The conversions are performed using the Vega-Lite and Vega JavaScript libraries running in a v8 JavaScript runtime provided by the [`deno_runtime`](https://crates.io/crates/deno_runtime) crate. Font metrics and SVG-to-PNG conversions are provided by the [`resvg`](https://crates.io/crates/resvg) crate.
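
The README section added in this hunk says the setting "enables parallel work across Python threads", and the commit message notes that the GIL is released during blocking conversion calls. One way a caller might drive that from Python is a plain thread pool, sketched below. `convert_all` and `convert_with_vl_convert` are hypothetical helpers, not part of `vl_convert`; the only library calls used are `set_num_workers`, `warm_up_workers`, and `vegalite_to_vega`, which all appear in this diff, and calling `vegalite_to_vega` without `vl_version` assumes that argument is optional.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")
R = TypeVar("R")


def convert_all(
    specs: Sequence[S],
    convert: Callable[[S], R],
    max_threads: int = 4,
) -> List[R]:
    """Run convert() over specs on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(convert, specs))


def convert_with_vl_convert(specs: Sequence[dict], num_workers: int = 4) -> list:
    """Hypothetical helper: compile Vega-Lite specs to Vega in parallel."""
    import vl_convert as vlc  # imported lazily so convert_all has no dependency

    vlc.set_num_workers(num_workers)  # size the converter's worker pool
    vlc.warm_up_workers()  # optional: avoid first-request startup latency
    # Assumption: vl_version may be omitted to use the library's default.
    return convert_all(specs, vlc.vegalite_to_vega, max_threads=num_workers)
```

Matching the number of Python threads to the number of converter workers is a simple starting point rather than a recommendation from the source; the right ratio for a given workload is best found by measurement.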
