
Async parallel branches degrade to serial silently on non-MySQL Action Scheduler stores (SQLite) #393

Description

@chubes4

Async parallel branches degrade to serial silently on non-MySQL Action Scheduler stores

Summary

The Action Scheduler branch executor (#392) enqueues N branch actions correctly, but whether they actually run concurrently depends entirely on the Action Scheduler data store. On a store that does not honor stake_claim's LIMIT or lacks SKIP LOCKED (notably the WordPress SQLite integration), the branches degrade to serial execution with zero error or warning — the run still SUCCEEDS and produces correct output, it's just not parallel. This is a silent footgun for anyone deploying the async parallel primitive on a non-MySQL store.

Evidence (measured on a native-PHP + SQLite Studio runtime)

Deterministic A/B, 4 branches × sleep(3) + aggregator, cap raised to 6, 4 concurrent OS processes launched within 0.8ms of each other, --batch-size=1 --batches=1 --force:

| run | total wall-clock | branch PIDs | branch starts |
| --- | --- | --- | --- |
| sync (selector→null) | 12.71s | all 56442 | staggered 3s apart |
| async-parallel (4 procs, cap=6) | 12.67s | all 75810 (one PID) | staggered 3s apart |

Speedup: 1.00×. Despite 4 processes racing to claim, exactly one runner claimed the whole due queue and ran all 4 branches serially; the other three reported "0 batches executed."
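For reference, the arithmetic behind that figure can be checked with a few lines of Python. The two wall-clock values are copied from the table above; the "floor" values are a simple model (4 branches of sleep(3), ignoring aggregator and scheduling overhead), not measurements.

```python
# Wall-clock figures copied from the A/B table; the floors are a model, not data.
BRANCHES = 4
SLEEP_SECONDS = 3.0
sync_wall = 12.71    # sync run, seconds
async_wall = 12.67   # async-parallel run, seconds

serial_floor = BRANCHES * SLEEP_SECONDS   # branches run back to back
parallel_floor = SLEEP_SECONDS            # all branches fully overlap

measured_speedup = sync_wall / async_wall
ideal_speedup = serial_floor / parallel_floor

print(f"measured speedup: {measured_speedup:.2f}x")
print(f"ideal speedup:    {ideal_speedup:.2f}x")
```

Under this model a truly concurrent run would land near 3s plus overhead, so a measured total that stays above the 12s serial floor is consistent with the one-PID observation.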

Root cause (two mechanisms, both tested directly)

The runtime's AS store is the WordPress SQLite integration (sqlite-database-integration 3.0.0-rc.4; db_server_info() reports 3.45.2, though SELECT VERSION() spoofs 8.0.38).

  1. stake_claim($limit) ignores its LIMIT. Direct probe: stake_claim(1) returned 29 actions; stake_claim(2) also returned 29. AS's claim uses an UPDATE ... JOIN (SELECT ... LIMIT %d FOR UPDATE) pattern that the SQLite translation layer does not honor, so the first claimant swallows the entire due queue — leaving concurrent workers nothing to run. This alone forces serial regardless of process count.
  2. SKIP LOCKED is disabled. ActionScheduler_DBStore::db_supports_skip_locked() gates on MySQL ≥ 8.0.1 / MariaDB ≥ 10.6 via $wpdb->db_version(), which returns a truncated 8.0 here → version_compare('8.0','8.0.1','>=') is false. Without SKIP LOCKED, concurrent claims would contend on row locks anyway — and SQLite serializes writes on a single DB-level write lock regardless.
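The effect of the first mechanism can be illustrated with a toy model. This is a Python sketch, not Action Scheduler code: the `stake_claim` function below is a hypothetical stand-in that only models "take up to `limit` items from a shared queue", and the `honors_limit` flag is an assumption used to switch between the two behaviors.

```python
from typing import List


def stake_claim(queue: List[int], limit: int, honors_limit: bool) -> List[int]:
    """Toy claim: remove and return due actions from the shared queue.

    With honors_limit=False the limit is ignored and the whole queue is taken,
    which models the probe result described above.
    """
    take = limit if honors_limit else len(queue)
    claimed = queue[:take]
    del queue[:take]
    return claimed


def run_workers(due_actions: int, workers: int, honors_limit: bool) -> List[int]:
    """Each worker claims once with limit=1 (like --batch-size=1 --batches=1).

    Returns how many actions each worker ended up holding.
    """
    queue = list(range(due_actions))
    return [len(stake_claim(queue, 1, honors_limit)) for _ in range(workers)]


print(run_workers(4, 4, honors_limit=True))   # [1, 1, 1, 1]
print(run_workers(4, 4, honors_limit=False))  # [4, 0, 0, 0]
```

The second line of output mirrors the measured run: one claimant holds all four branches and the remaining workers find nothing to execute.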

The agents-api branch executor itself is correct — it enqueues 4 independent async actions and relies on "AS's cross-process atomic claim" (per class-wp-agent-workflow-action-scheduler-branch-executor.php header) for concurrency. That guarantee holds on MySQL 8.0.1+/MariaDB 10.6+ and collapses to serial on stores that don't honor it (SQLite, older MySQL).
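The version threshold behind that guarantee is sensitive to how the version string is reported. The Python sketch below is a simplified, numeric-only stand-in for PHP's version_compare (it does not handle suffixes such as "-rc1") and models only the MySQL 8.0.1 threshold, not the MariaDB branch; the function names are hypothetical.

```python
from typing import Tuple


def numeric_version(version: str) -> Tuple[int, ...]:
    """Split a dotted numeric version into a tuple of ints, e.g. '8.0.38' -> (8, 0, 38)."""
    return tuple(int(part) for part in version.split("."))


def passes_skip_locked_gate(reported: str, minimum: str = "8.0.1") -> bool:
    """Return True when the reported version is at least the minimum.

    Tuple comparison treats a shorter prefix as smaller, so (8, 0) < (8, 0, 1).
    """
    return numeric_version(reported) >= numeric_version(minimum)


print(passes_skip_locked_gate("8.0"))     # False: the truncated string fails the gate
print(passes_skip_locked_gate("8.0.38"))  # True: the full spoofed string would pass
```

Passing the gate would not by itself make the SQLite store concurrent, since the LIMIT behavior and SQLite's single write lock still apply; the sketch only shows why the check evaluates to false on this runtime.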

Why this matters

agents-api is headed for wpcom/WP core, where the store may vary. A consumer that adopts async parallel steps expecting concurrency will get correct-but-serial behavior on the wrong store with no signal that anything is degraded. The failure is invisible: no error, correct output, just no speedup.

Proposed fix (design-level, not prescribing implementation)

  1. Detect + warn. At AS-executor selection time, probe the store's concurrency capability (e.g. db_supports_skip_locked() and/or a stake_claim LIMIT sanity check) and surface a clear diagnostic when the store cannot parallelize — a _doing_it_wrong, an admin notice, a run-metadata flag, or a filterable warning. Silent degradation is the actual bug.
  2. Document the store requirement loudly in the branch-executor docblock and any consumer-facing docs: async parallel branches run concurrently only on MySQL 8.0.1+ / MariaDB 10.6+; on other stores (including the SQLite integration) they run correctly but serially.
  3. (Optional) Expose the detected capability so a consumer can choose to warn its user ("install a MySQL-backed store for concurrent generation") or fall back deliberately.
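One possible shape for the stake_claim LIMIT sanity check in item 1 is sketched below in Python. It is illustrative only: `ClaimStore`, `GreedyStore`, `HonestStore`, and `claim_limit_is_honored` are hypothetical names, and the two-method store interface is an assumption that simplifies the real Action Scheduler store API.

```python
from typing import Iterable, List, Protocol


class ClaimStore(Protocol):
    """Hypothetical minimal store interface assumed by the probe."""

    def stake_claim(self, limit: int) -> List[int]: ...
    def release_claim(self, action_ids: List[int]) -> None: ...


def claim_limit_is_honored(store: ClaimStore, limit: int = 1) -> bool:
    """Claim with a small limit, check the claim size, then release the claim.

    The result is only meaningful when more than `limit` actions are due;
    with a shorter queue the probe trivially returns True.
    """
    claimed = store.stake_claim(limit)
    try:
        return len(claimed) <= limit
    finally:
        store.release_claim(claimed)  # always hand the actions back


class GreedyStore:
    """Fake store that ignores the limit, like the probed SQLite-backed store."""

    def __init__(self, due: Iterable[int]) -> None:
        self.due = list(due)

    def stake_claim(self, limit: int) -> List[int]:
        claimed, self.due = self.due, []
        return claimed

    def release_claim(self, action_ids: List[int]) -> None:
        self.due = list(action_ids) + self.due


class HonestStore(GreedyStore):
    """Fake store that respects the limit."""

    def stake_claim(self, limit: int) -> List[int]:
        claimed, self.due = self.due[:limit], self.due[limit:]
        return claimed


print(claim_limit_is_honored(GreedyStore(range(29))))  # False
print(claim_limit_is_honored(HonestStore(range(29))))  # True
```

A real probe would also need to avoid disturbing in-flight work, for example by claiming only against a dedicated group or hook; that choice is left open here, in line with the design-level scope of the proposal.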

Not in scope

This is not asking agents-api to make SQLite parallel (it can't — SQLite serializes writes). It's asking the substrate to not degrade silently: detect, warn, and document, so the concurrency guarantee is honest about its store dependency.

Repro

Native-PHP Studio site on the SQLite integration; AS 4.0.0; run any async parallel-roles workflow with the AS branch executor + raised concurrency cap + concurrent action-scheduler run processes; observe all branches execute in one PID (see table above). The same specs parallelize on a MySQL 8.0.1+ store.

AI assistance

  • AI assistance: Yes
  • Tool(s): Claude Code (Claude Opus 4.8)
  • Used for: Diagnosing the store-dependent concurrency degradation with PID/timestamp evidence and drafting this report.
