Skip to content

test(producer): add parallel capture regression test#1088

Merged
miguel-heygen merged 5 commits into
mainfrom
test/parallel-capture-regression
May 27, 2026
Merged

test(producer): add parallel capture regression test#1088
miguel-heygen merged 5 commits into
mainfrom
test/parallel-capture-regression

Conversation

@miguel-heygen
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen commented May 27, 2026

Summary

Adds a regression test that exercises the multi-worker parallel capture path (workers: 4), which previously had zero CI coverage. Every existing test either pinned workers: 1 or used compositions too short for parallel to activate.

This test guards against regressions in:

  • executeParallelCapture frame distribution across 4 workers
  • Frame reorder buffer (out-of-order → sequential for FFmpeg stdin)
  • Per-worker browser lifecycle (acquire/release/cleanup)
  • PSNR correctness of parallel-captured output

Scope note: The GPU-specific BeginFrame compositor race (#1087) only reproduces on hardware-GPU hosts. SwiftShader CI exercises the parallel code path but cannot trigger the compositor contention. A hardware-GPU CI runner would be needed to guard that specific crash — tracked separately.

Changes

  • packages/producer/tests/parallel-capture-regression/ — new fixture with 5s CSS animation composition (150 frames at 30fps), workers: 4
  • .github/workflows/regression.yml — fixture added to shard-5
  • Golden baseline generated in Dockerfile.test on amd64 Linux

Test plan

  • Golden baseline generated via Docker on amd64 (devbox)
  • Fixture added to regression shard-5
  • CI shard-5 passes with the new fixture

Add a regression fixture that forces workers: 2, ensuring the parallel
capture code path (browser-per-worker in BeginFrame mode) is exercised
in CI. All existing fixtures pin workers: 1, so this is the first test
that would catch a regression in the multi-worker pool isolation fix
from PR #1087.

The composition is 5s @ 30fps (150 frames), which exceeds both
MIN_FRAMES_PER_WORKER * 2 (60) and minParallelFrames (120), so the
parallel coordinator will always split work across workers.

Baseline output/output.mp4 must be generated inside Dockerfile.test
before the fixture can run in CI.
Matches realistic auto-mode worker counts (4-6 on typical machines),
not just the minimum (2) that triggers the bug.
Generated inside Dockerfile.test on amd64 Linux (Docker image
hyperframes-producer:test) to match the CI rendering environment.
Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested status: Request changes.

Findings:

  • Blocking: the new fixture is never run by CI. The regression workflow invokes the harness with explicit shard args (.github/workflows/regression.yml:67-82), and parallel-capture-regression is not listed in any shard. Because regression-harness.ts treats CLI names as a filter, this PR currently adds files but does not exercise the multi-worker path in CI, which is the stated purpose of the change. Please add the fixture to one shard (or change the workflow to discover all fixtures) and keep an eye on shard balance since this is a 150-frame, 4-worker render.

I did static review plus git diff --check; I did not run the Docker regression locally.

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good instinct closing the multi-worker coverage gap, and the fixture is clean. But I have one load-bearing concern about whether this test actually catches the #1087 bug — surfaced by the prod investigation happening in parallel.

The effectiveness problem: CPU CI can't reproduce a GPU-contention crash

Dockerfile.test runs on node:22-bookworm-slim with no GPU → SwiftShader (software) rendering. It installs chrome-headless-shell@148... so BeginFrame mode is active, and the fixture's workers: 4 will drive 4 capture workers sharing the pooled browser. So far so good — the multi-worker BeginFrame path is exercised.

But the #1087 crash appears to be GPU-compositor-contention-specific. Evidence gathered today:

  • Prod (service:producer) runs multi-worker + shared-pool + BeginFrame on CPU-only m6a.12xlarge nodes, and has zero Target closed / compositor crashes over 30 days.
  • The repro that motivated #1087 was on a --browser-gpu (hardware) setup; the PR root-cause itself notes the failure is worst "on hosts with hardware GPU + EGL."
  • Confirmed in the review thread: prod doesn't hit it "because we are using only cpu."

So on SwiftShader/CPU — which is exactly what this Docker test runs — the compositor race doesn't fire. This fixture will very likely render fine at 4 workers even without #1087's per-worker isolation, which means it would pass on un-fixed code and provide no regression protection for the bug it's named after.

The decisive check (free, already running): this PR is branched off main, which does not contain #1087 yet. If the new parallel-capture-regression shard goes green in this PR's own CI, that's proof the test does not catch the crash (it passed on un-fixed code). I'd hold the PR until that shard reports — if green, the fixture needs rework before it's meaningful. (Shards were still in-progress when I looked.)

What would make it actually guard the regression:

  1. A GPU CI runner for this fixture — the only environment where the race reproduces. May not exist in the hyperframes fleet; if not, that's a real constraint to call out rather than ship a test that can't fail.
  2. Failing that, be honest about scope: rename/reframe it as "multi-worker parallel-capture path coverage" (it does guard frame-ordering / reorder-buffer / PSNR regressions in executeParallelCapture, which is genuinely useful) — but it does not guard the BeginFrame compositor crash from #1087. Two different things.

Smaller notes

  • Doc drift: PR body says "forces workers: 2" and "assign work to both workers," but meta.json has "workers": 4. Harmless but reconcile so the next reader isn't misled.
  • Dead CSS: the @keyframes count-up { content: "1" ... } block doesn't do anything — content doesn't apply to a regular <div>, only ::before/::after. The actual counter is the rAF script (el.textContent = ...), which is fine. Drop the dead keyframes or move the counter to a pseudo-element.
  • Determinism: the rAF counter uses (ts - startTime)/1000. In BeginFrame mode ts is virtual so it's deterministic; just noting it relies on that. minPsnr: 30 is appropriately lenient for a crash/smoke fixture.

Net: the fixture is well-built and worth having for parallel-path coverage, but as currently wired (CPU CI) it almost certainly does not catch the #1087 crash. Confirm via the shard result; if green, it needs a GPU runner or an honest rescope.

— Rames Jusso (hyperframes)

- Add fixture to shard-5 in regression.yml so CI actually runs it
- Reframe description: multi-worker path coverage (frame distribution,
  reorder buffer, per-worker browser lifecycle), not GPU-specific crash
  guard — SwiftShader CI can't reproduce the hardware compositor race
- Remove dead @Keyframes count-up (content doesn't apply to div)
- Remove unused CSS animation reference on .counter
- Regenerate golden baseline with cleaned-up HTML
@miguel-heygen
Copy link
Copy Markdown
Collaborator Author

Addressed both reviews in bd93ae9:

Vance — added parallel-capture-regression to shard-5 in regression.yml. The fixture is 150 frames at 4 workers, should add ~15-20s to the shard.

Russo — spot-on analysis on the SwiftShader/CPU limitation. Updated:

  1. Reframed scope: description now says "multi-worker parallel capture path coverage" — frame distribution, reorder buffer, per-worker browser lifecycle. Explicitly notes that the GPU-specific BeginFrame compositor race (fix(engine): disable browser pool for parallel capture workers #1087) only reproduces on hardware-GPU hosts, not in SwiftShader CI.

  2. Removed dead CSS: dropped the @keyframes count-up block (content doesn't apply to a div, only ::before/::after) and the unused animation reference on .counter. Regenerated the golden baseline with the cleaned HTML.

  3. PR body: will reconcile the workers count reference (it's 4, not 2).

On the broader question: this test guards against regressions in the executeParallelCapturecaptureFrameRange → reorder buffer → streaming encoder pipeline at workers > 1. That's genuinely useful coverage we had zero of before. The GPU compositor race guard would need a hardware-GPU CI runner — worth tracking separately but shouldn't block this coverage.

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed bd93ae91 (delta since my last pass on 8f888aa6). Every prior finding is resolved — clean.

Prior findings — all addressed:

  1. Critical — rescope from "crash guard" to "path coverage" → ✅ resolved in bd93ae91. meta.json description now reads "Multi-worker parallel capture path coverage… Note: the GPU-specific BeginFrame compositor race (#1087) only reproduces on hardware-GPU hosts, not in SwiftShader CI." PR body has the matching Scope note. This is exactly the honest framing — the fixture guards executeParallelCapture frame-distribution / reorder-buffer / per-worker-lifecycle / PSNR, and explicitly does not claim to catch the GPU crash.

  2. Fixture wasn't wired into a shard → ✅ resolved. regression.yml shard-5 args now include parallel-capture-regression (name matches the fixture dir). Without this it wouldn't have run in CI at all.

  3. Doc drift (workers 2 vs 4) → ✅ resolved. meta.json workers: 4 and the PR body both say 4 consistently now.

  4. Dead @keyframes count-up + unused .counter animation → ✅ resolved. Both removed from src/index.html and output/compiled.html; the rAF script remains as the actual counter driver. Baseline regenerated (output.mp4 oid updated) to match the cleaned HTML — correct, since the CSS removal changes rendered output.

Nothing re-raised; no new concerns in the delta (it's purely the reframe + shard wiring + dead-CSS removal + baseline regen).

One note on what "green" means now: this PR is branched off main without #1087, and the bug is GPU-only — so shard-5 passing on CPU CI is the intended success signal (it proves the 4-worker parallel path renders correctly and the baseline matches), not a sign of ineffectiveness. That's the correct outcome for a path-coverage test. The test plan's last unchecked box (CI shard-5 passes) is the only remaining gate — CI is re-running on this SHA now.

Determinism holds: the rAF counter uses virtual ts under BeginFrame → deterministic, and minPsnr: 30 gives margin.

Net: all feedback incorporated, scope is now honest, shard wiring correct. Good to merge once shard-5 reports green.

— Rames Jusso (hyperframes)

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving — all review feedback from the prior passes is addressed in bd93ae91: rescoped to multi-worker path coverage (with the explicit SwiftShader-can't-repro-the-GPU-crash note), wired into regression shard-5, workers:4 doc drift fixed, dead @keyframes count-up removed, baseline regenerated to match. Scope is now honest and the fixture meaningfully covers the executeParallelCapture path. Last gate is shard-5 reporting green on this SHA before merge.

Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested status: Approve from code review; wait for regression shards to finish, especially shard-5.

The previous blocker is fixed: parallel-capture-regression is now included in .github/workflows/regression.yml shard-5, so CI actually runs the new fixture.

No blocking issues found in the PR-scoped diff.

Non-blocking notes:

  • The fixture uses raw requestAnimationFrame, so the compiler routes it through screenshot capture mode. That is fine for multi-worker capture coverage, and the PR body now correctly scopes out the hardware-GPU BeginFrame race from #1087.
  • The PR body/meta still overstate streaming-encoder coverage: shouldUseStreamingEncode(..., workerCount=4) returns false, so this fixture covers disk-based parallel capture, not the FFmpeg stdin reorder buffer / streaming ordered-write path. I would tighten that wording before merge to avoid future confusion.
  • git diff --check reports trailing whitespace in generated output/compiled.html line 114; CI format is green, so I am treating this as generated-file noise rather than a blocker.

…ure test

The composition used requestAnimationFrame for a frame counter and CSS
@Keyframes for animations, which triggered screenshot capture mode
(non-deterministic across workers) and caused 29 PSNR failures in CI.
All animations now use the GSAP timeline, keeping the render in
deterministic BeginFrame mode. Baseline regenerated in Docker.
@miguel-heygen miguel-heygen merged commit 3cd6cd6 into main May 27, 2026
28 checks passed
@miguel-heygen miguel-heygen deleted the test/parallel-capture-regression branch May 27, 2026 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants