Add performance regression tests to CI #2548
Conversation
This actions workflow suffers from something I call "doing too much"
Actions philosophy usually goes something like this:
If you want to change how the action behaves, just use git to change what the underlying repository looks like. The action should follow the "open-closed" principle, so most of the logic should live in your `perf-regression.ts`, rather than out here.
Adding a case to `Build-and-Test-Server` is a bit redundant if you introduce a separate workflow, but makes sense if you want to keep them under the same umbrella.
The pattern I expect to see here usually goes something like:
- Add a test case to `Build-and-Test-Server` and `run-ci-tests.sh`
- `run-ci-tests.sh` calls your test or runs a bash script (check the `cache-regression` case for an example)
Your bash script should:
- do the download, and/or be responsible for the artifact location (looks like it's `tests/perf-regression/perf-regression.json`).
- run the specific tests.
This way we don't have an action responsible for "Dump or Check", we have two locations where we simply "call" your script to allow it to be dumped properly.
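Purely as an illustration of that pattern (the step and case names below are assumptions, not the repo's actual files), the workflow side stays thin and the perf-specific logic lives in the checked-in script:

```yaml
# Illustrative workflow step only; the real wiring may differ.
# The workflow just calls the shared test script; downloading/locating
# tests/perf-regression/perf-regression.json and running the specific
# tests would live inside run-ci-tests.sh (or a script it calls),
# mirroring the existing cache-regression case.
- name: Perf regression tests
  shell: bash
  run: ./run-ci-tests.sh perf-regression
```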
suggestions:
- drop: this workflow. do not look for other JSON blobs that may or may not exist. only this copy of the repo exists as far as this script is concerned
- drop: a workflow that runs the perf-regression dumping (we have it in npm already). You might add one that runs the `dump` without actually dumping anything, just to make sure the flow works, but not required.
- procedure: if you want to update `perf-regression`, just run the script locally and check it into git
- procedure: run-ci-tests should just run the perf regression test, without having to introduce another git action step

question: how large are the perf regression JSONs? we might be able to upload them to GCP if they're quite large?
The final size is 1826 bytes.
well, never mind then, that is quite small
agree, with the exception of the point above, because I think we will have to dump the data on the runners where we will also check the tests, otherwise we might get too big of a variance
I also agree with Florian, that's why I took the approach to do both dump and check in CI to eliminate notable variance from having different performance results run from different machines.
ah of course, this makes sense
Closes #2545.
How it works
The performance regression workflow in CI ensures that compile, prove, and verify times remain consistent across commits. It uses a single composite action, `./.github/actions/perf-regression`, which handles both creating new performance baselines and checking current results against previous runs.

When a pull request is labeled with `dump-performance`, the workflow enters dump mode, generating a new `perf-regression.json` file that contains updated performance metrics for all registered benchmarks. This file is then uploaded as an artifact and stored for inspection (not as the official baseline). If the label is absent, the action automatically switches to check mode. In this case, it locates the most recent completed run from the branch `main`, downloads the latest baseline artifact, normalizes its path, and compares it against freshly collected results to detect regressions.

This process allows the baseline to evolve only when explicitly requested by maintainers, while ensuring that every other PR run validates its performance against an established reference from `main`. All of this logic is handled internally by the composite action, which determines whether to dump or check based on the PR label, making it transparent for contributors.
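As a rough sketch of that label-driven switch (the input, step names, and exact invocation below are assumptions, not the actual action definition), the composite action could look something like this:

```yaml
# Hypothetical .github/actions/perf-regression/action.yml (sketch only).
name: perf-regression
description: Dump a new performance baseline or check against the latest one
inputs:
  mode:
    description: dump, check, or auto (derive the mode from the PR label)
    default: auto
runs:
  using: composite
  steps:
    - name: Select mode from the dump-performance label
      shell: bash
      run: |
        if [ "${{ contains(github.event.pull_request.labels.*.name, 'dump-performance') }}" = "true" ]; then
          echo "PERF_MODE=--dump" >> "$GITHUB_ENV"
        else
          echo "PERF_MODE=--check" >> "$GITHUB_ENV"
        fi
    - name: Run the performance regression suite
      shell: bash
      run: ./run-ci-tests.sh perf-regression   # assumed invocation
```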
Within the main `checks.yml` workflow, the performance regression suite is simply one entry in the test matrix. All standard tests are run as usual, while the performance regression entry invokes the composite action with `mode: auto`. The action runs the same `run-ci-tests.sh` script used by other tests but injects the correct `PERF_MODE` environment variable (`--dump` or `--check`) based on the mode. This keeps the testing interface unified while allowing the underlying script to branch internally.
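For orientation, the matrix wiring in `checks.yml` could look roughly like the excerpt below; the job name and matrix entries are assumptions, while `mode: auto` and the action path come from the description above:

```yaml
# Illustrative excerpt; only mode: auto and the action path are taken from this PR.
on: pull_request
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        test: [unit, integration, perf-regression]   # assumed matrix entries
    steps:
      - uses: actions/checkout@v4
      - name: Standard tests
        if: matrix.test != 'perf-regression'
        run: ./run-ci-tests.sh "${{ matrix.test }}"
      - name: Performance regression
        if: matrix.test == 'perf-regression'
        uses: ./.github/actions/perf-regression
        with:
          mode: auto
```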
The artifact name (`perf-regression-json`) and storage path (`tests/perf-regression/perf-regression.json`) are consistent across runs. When in dump mode, the artifact is uploaded with a 30-day retention window; when in check mode, it is fetched from the last available successful run from `main`.
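Assuming the standard artifact actions are used (the actual steps may differ), the two modes could be wired roughly like this:

```yaml
# Dump mode: publish the fresh baseline with a 30-day retention window.
- uses: actions/upload-artifact@v4
  if: env.PERF_MODE == '--dump'
  with:
    name: perf-regression-json
    path: tests/perf-regression/perf-regression.json
    retention-days: 30

# Check mode: fetch the baseline produced by the latest successful run on main.
# actions/download-artifact@v4 can read from another run via run-id + github-token;
# how BASELINE_RUN_ID is resolved is assumed to happen in an earlier step.
- uses: actions/download-artifact@v4
  if: env.PERF_MODE == '--check'
  with:
    name: perf-regression-json
    path: tests/perf-regression
    run-id: ${{ env.BASELINE_RUN_ID }}
    github-token: ${{ github.token }}
```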
How to use
1) Open a PR without the `dump-performance` label
- CI downloads the latest `perf-regression-json` artifact from `main` to `tests/perf-regression/perf-regression.json`, and the performance tests run in `--check` mode.

2) Open a PR with the `dump-performance` label
- CI runs in `--dump` mode and uploads a new `perf-regression.json` artifact attached to this PR run (not to `main`).

3) Merge the PR into `main`
- The workflow runs with `mode: auto`: with the `dump-performance` label it dumps a new baseline on `main`; otherwise it checks against the existing baseline from `main`.

Notes