Skip to content

Conversation

benceruleanlu
Copy link
Member

@benceruleanlu benceruleanlu commented Oct 7, 2025

Note for reviewers: Contains a lot of bash scripting

Summary

Before, our snapshot update would run all playwright tests, even the ones that did not fail.

With this PR, we now only run the failed tests for snapshot updates.

Changes

  1. Generates a custom text manifest containing failed test_file:line, which playwright supports as a way to run that test in particular
  2. Update workflow finds this latest text manifest from the GH API, using it to update snapshots

┆Issue is synchronized with this Notion page by Unito

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Oct 7, 2025
Copy link

github-actions bot commented Oct 7, 2025

🎨 Storybook Build Status

Build completed successfully!

⏰ Completed at: 10/08/2025, 02:34:24 AM UTC

🔗 Links


🎉 Your Storybook is ready for review!

Copy link

github-actions bot commented Oct 7, 2025

🎭 Playwright Test Results

⚠️ Tests passed with flaky tests

⏰ Completed at: 10/08/2025, 02:44:39 AM UTC

📈 Summary

  • Total Tests: 488
  • Passed: 455 ✅
  • Failed: 0
  • Flaky: 3 ⚠️
  • Skipped: 30 ⏭️

📊 Test Reports by Browser

  • chromium: View Report • ✅ 446 / ❌ 0 / ⚠️ 3 / ⏭️ 30
  • chromium-2x: View Report • ✅ 2 / ❌ 0 / ⚠️ 0 / ⏭️ 0
  • chromium-0.5x: View Report • ✅ 1 / ❌ 0 / ⚠️ 0 / ⏭️ 0
  • mobile-chrome: View Report • ✅ 6 / ❌ 0 / ⚠️ 0 / ⏭️ 0

🎉 Click on the links above to view detailed test results for each browser configuration.

@benceruleanlu
Copy link
Member Author

@christian-byrne I manually wrote this PR description to be as helpful as possible, LMKWYT if you have time

@benceruleanlu benceruleanlu requested a review from snomiao October 7, 2025 02:14
Copy link
Contributor

@DrJKL DrJKL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline, but for the record:
I can't block this, but I would strongly oppose adding such complexity before even trying simpler solutions like filtering the update runs using --last-failed with the results we're already storing.
I'm also worried about the mix in here of having: an inlined github-script, an inlined bash script, and a separate typescript script that collectively depend on one another by loose contract.

@benceruleanlu
Copy link
Member Author

Discussed offline, but for the record: I can't block this, but I would strongly oppose adding such complexity before even trying simpler solutions like filtering the update runs using --last-failed with the results we're already storing. I'm also worried about the mix in here of having: an inlined github-script, an inlined bash script, and a separate typescript script that collectively depend on one another by loose contract.

  • We considered --last-failed, but it relies on Playwright’s local state from the
    previous run. Our update job runs in a fresh workflow and is sharded, so we’d have to
    persist and merge Playwright’s internal “last failed” state, which isn’t exposed via
    the reporters we already store. That ends up being more brittle and doesn’t filter to
    screenshot-only failures.
  • The current approach uses artifacts we already produce (merged JSON) and targets only
    tests that have screenshot attachments, which reduces unnecessary re-runs.
  • I agree on keeping things simple. If helpful, I can collapse the bash loop and
    manifest handoff into a single Node step that reads the merged JSON and invokes
    Playwright per project directly. That removes the extra artifact/loop while keeping
    the same precision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants