Async parallel branches degrade to serial silently on non-MySQL Action Scheduler stores
Summary
The Action Scheduler branch executor (#392) enqueues N branch actions correctly, but whether they actually run concurrently depends entirely on the Action Scheduler data store. On a store that does not honor stake_claim's LIMIT or lacks SKIP LOCKED (notably the WordPress SQLite integration), the branches degrade to serial execution with zero error or warning — the run still SUCCEEDS and produces correct output, it's just not parallel. This is a silent footgun for anyone deploying the async parallel primitive on a non-MySQL store.
Evidence (measured on a native-PHP + SQLite Studio runtime)
Deterministic A/B, 4 branches × sleep(3) + aggregator, cap raised to 6, 4 concurrent OS processes launched within 0.8ms of each other, --batch-size=1 --batches=1 --force:
| run | total wall-clock | branch PIDs | branch starts |
|---|---|---|---|
| sync (selector→null) | 12.71s | all 56442 | staggered 3s apart |
| async-parallel (4 procs, cap=6) | 12.67s | all 75810 (one PID) | staggered 3s apart |
Speedup: 1.00×. Despite 4 processes racing to claim, exactly one runner claimed the whole due queue and ran all 4 branches serially; the other three reported "0 batches executed."
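For reference, the arithmetic behind the 1.00× figure can be sketched as below. The wall-clock numbers come from the table; the "ideal" figure is only an upper bound computed from the sleep durations, under the assumption that four truly concurrent branches would overlap their 3-second sleeps completely. It is not a measurement.

```php
<?php
// Measured wall-clock totals from the table above.
$sync_seconds  = 12.71;
$async_seconds = 12.67;

// Workload shape from the A/B description: 4 branches, each sleep(3).
$branches      = 4;
$sleep_seconds = 3;

// Observed speedup of the async-parallel run over the sync run.
$observed_speedup = $sync_seconds / $async_seconds;

// Assumption: with full overlap, the sleeps cost 3s instead of 4 * 3s = 12s.
$ideal_speedup = ($branches * $sleep_seconds) / $sleep_seconds;

printf("observed speedup: %.2fx\n", $observed_speedup);          // observed speedup: 1.00x
printf("upper bound from sleeps alone: %.2fx\n", $ideal_speedup); // upper bound from sleeps alone: 4.00x
```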
Root cause (two mechanisms, both tested directly)
The runtime's AS store is the WordPress SQLite integration (sqlite-database-integration 3.0.0-rc.4; db_server_info() → 3.45.2, though SELECT VERSION() spoofs 8.0.38).
stake_claim($limit) ignores its LIMIT. Direct probe: stake_claim(1) returned 29 actions; stake_claim(2) also returned 29. AS's claim uses an UPDATE ... JOIN (SELECT ... LIMIT %d FOR UPDATE) pattern that the SQLite translation layer does not honor, so the first claimant swallows the entire due queue — leaving concurrent workers nothing to run. This alone forces serial regardless of process count.
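The direct probe described above can be sketched as a small check. This is a minimal illustration, not code from Action Scheduler or agents-api: `claim_limit_honored()` and the two stand-in closures are hypothetical names, and on a real site the callable would presumably wrap the store's `stake_claim()` call and release the claim afterwards.

```php
<?php
/**
 * Hypothetical probe: true when a claim returns no more actions than requested.
 *
 * $stake_claim is any callable that takes a limit and returns the list of
 * claimed action IDs. Wiring it to a real store is left out of this sketch.
 */
function claim_limit_honored(callable $stake_claim, int $limit = 1): bool {
    $claimed = $stake_claim($limit);
    return count($claimed) <= $limit;
}

// Stand-in queue sized like the one in the probe above (29 due actions).
$due = range(1, 29);

// Simulated store that ignores LIMIT and hands back the whole due queue.
$ignores_limit = fn(int $limit): array => $due;

// Simulated store that honors LIMIT.
$honors_limit = fn(int $limit): array => array_slice($due, 0, $limit);

var_dump(claim_limit_honored($ignores_limit, 1)); // bool(false)
var_dump(claim_limit_honored($honors_limit, 1));  // bool(true)
```

A probe like this has a side effect on a live queue (it claims actions), so a real implementation would need to release whatever it claimed.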
SKIP LOCKED is disabled. ActionScheduler_DBStore::db_supports_skip_locked() gates on MySQL ≥ 8.0.1 / MariaDB ≥ 10.6 via $wpdb->db_version(), which returns a truncated 8.0 here → version_compare('8.0','8.0.1','>=') is false. Without SKIP LOCKED, concurrent claims would contend on row locks anyway — and SQLite serializes writes on a single DB-level write lock regardless.
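The effect of the truncated version string can be reproduced with plain PHP. The helper below is a hypothetical stand-in that only mirrors the thresholds quoted above; the real `db_supports_skip_locked()` may differ in detail.

```php
<?php
/**
 * Hypothetical mirror of the version gate described above
 * (MySQL >= 8.0.1, MariaDB >= 10.6). Not the Action Scheduler implementation.
 */
function supports_skip_locked_sketch(string $db_version, bool $is_mariadb = false): bool {
    $minimum = $is_mariadb ? '10.6' : '8.0.1';
    return version_compare($db_version, $minimum, '>=');
}

// Truncated version, as returned by $wpdb->db_version() on this runtime.
var_dump(supports_skip_locked_sketch('8.0'));    // bool(false)

// Full version string, as spoofed by SELECT VERSION().
var_dump(supports_skip_locked_sketch('8.0.38')); // bool(true)
```

The same server identity passes or fails the gate depending only on whether the patch component survives, which is why the truncation matters here.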
The agents-api branch executor itself is correct — it enqueues 4 independent async actions and relies on "AS's cross-process atomic claim" (per class-wp-agent-workflow-action-scheduler-branch-executor.php header) for concurrency. That guarantee holds on MySQL 8.0.1+/MariaDB 10.6+ and collapses to serial on stores that don't honor it (SQLite, older MySQL).
Why this matters
agents-api is headed for wpcom/WP core, where the store may vary. A consumer that adopts async parallel steps expecting concurrency will get correct-but-serial behavior on the wrong store with no signal that anything is degraded. The failure is invisible: no error, correct output, just no speedup.
Proposed fix (design-level, not prescribing implementation)
- Detect + warn. At AS-executor selection time, probe the store's concurrency capability (e.g. db_supports_skip_locked() and/or a stake_claim LIMIT sanity check) and surface a clear diagnostic when the store cannot parallelize: a _doing_it_wrong, an admin notice, a run-metadata flag, or a filterable warning. Silent degradation is the actual bug.
- Document the store requirement loudly in the branch-executor docblock and any consumer-facing docs: async parallel branches run concurrently only on MySQL 8.0.1+ / MariaDB 10.6+; on other stores (including the SQLite integration) they run correctly but serially.
- (Optional) Expose the detected capability so a consumer can choose to warn its user ("install a MySQL-backed store for concurrent generation") or fall back deliberately.
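One possible shape for the detect, warn, and expose steps is sketched below. Everything here is illustrative: `detect_parallel_capability()` and `warn_if_serial()` are hypothetical names, the two booleans are assumed to come from probes like the ones discussed in the root-cause section, and the warning sink is injected so the sketch runs outside WordPress. In a plugin the sink could be `_doing_it_wrong()`, an admin notice, or a filter.

```php
<?php
/**
 * Hypothetical capability report. Either failed signal marks the store as
 * likely unable to run branches concurrently.
 */
function detect_parallel_capability(bool $skip_locked, bool $limit_honored): array {
    $reasons = [];
    if (!$skip_locked) {
        $reasons[] = 'no SKIP LOCKED support';
    }
    if (!$limit_honored) {
        $reasons[] = 'stake_claim() ignores its LIMIT';
    }
    return ['parallel' => $reasons === [], 'reasons' => $reasons];
}

/** Pass a diagnostic to $warn only when the store looks serial-only. */
function warn_if_serial(array $capability, callable $warn): void {
    if (!$capability['parallel']) {
        $warn(
            'Async parallel branches will likely run serially on this store: '
            . implode('; ', $capability['reasons'])
        );
    }
}

// Example: both signals failed, as on the SQLite runtime in this report.
$messages   = [];
$capability = detect_parallel_capability(false, false);
warn_if_serial($capability, function (string $message) use (&$messages): void {
    $messages[] = $message;
});

echo $messages[0], "\n";
```

Returning the report as data rather than warning directly keeps the optional third bullet cheap: a consumer can read `$capability['parallel']` and decide for itself whether to warn its user or fall back.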
Not in scope
This is not asking agents-api to make SQLite parallel (it can't — SQLite serializes writes). It's asking the substrate to not degrade silently: detect, warn, and document, so the concurrency guarantee is honest about its store dependency.
Repro
Native-PHP Studio site on the SQLite integration, with AS 4.0.0. Run any async parallel-roles workflow with the AS branch executor, a raised concurrency cap, and several concurrent action-scheduler run processes, then observe that all branches execute in one PID (see table above). The same specs parallelize on a MySQL 8.0.1+ store.
AI assistance
- AI assistance: Yes
- Tool(s): Claude Code (Claude Opus 4.8)
- Used for: Diagnosing the store-dependent concurrency degradation with PID/timestamp evidence and drafting this report.