-
Notifications
You must be signed in to change notification settings - Fork 162
Add performance regression tests to CI #2548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
206f511
to
c0aeb4a
Compare
c0aeb4a
to
b0568fc
Compare
955e6f5
to
ee33eaa
Compare
51b3fdb
to
f5f1909
Compare
17774b6
to
c83ae06
Compare
c83ae06
to
e512d76
Compare
b2ff575
to
437f588
Compare
437f588
to
4702ace
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actions workflow suffers from something I call "doing too much"
Actions philosophy usually goes something like this:
If you want to change how the action behaves, just use git
to change what the underlying repository looks like. The action should follow "open-closed" principle, so most of the logic should live in your perf-regression.ts
, rather than out here.
Adding a case to Build-and-Test-Server
is a bit redundant if you introduce a separate workflow, but makes sense if you want to keep them under the same umbrella.
The pattern I expect to see here usually goes something like:
- Add test case to
Build-and-Test-Server
andrun-ci-tests.sh
run-ci-tests.sh
calls your test or runs a bash script (checkcache-regression
case for an example
Your bash script should:
- do the download, and / or be responsible for the artifact location (looks like its
tests/perf-regression/perf-regression.json
). - run the specific tests.
This way we don't have an action responsible for "Dump or Check", we have two locations where we simply "call" your script to allow it to be dumped properly.
suggestions:
- drop: this workflow. do not look for other JSON blobs that may or may not exist. only this copy of the repo exists as far as this script is concerned
- drop: a workflow that runs the perf-regression dumping (we have it in npm already). You might add one that runs the
dump
without actually dumping anything, just to make sure the flow works, but not required. - procedure: if you want to update
perf-regression
, just run the script locally and check it into git - procedure: run-ci-tests should just run the perf regression test, without having to introduce another git action step
question: how large are the perf regression JSONs? we might be able to upload them to GCP
if they're quite large?
The final size is 1826 bytes. |
well, never mind then, that is quite small |
agree with the exception of
because I think we will have to dump the data on the runners where we will also check the tests otherwise we might get too big of a variance |
I also agree with Florian, that's why I took the approach to do both dump and check in CI to eliminate notable variance from having different performance results run from different machines. |
ah of course, this makes sense |
8b45c6d
to
666c0c8
Compare
666c0c8
to
cad4af9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this much better! "It is CURRENT_REF" type 💩
I dislike gh
as a dependency, but since it's already here I guess we can let it hang out a bit longer.
Small nit about a script that only calls one command, at that point do we really need a script?
otherwise, 🚢 it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is interesting, but is there a reason we need it?
My understanding of GH actions is like this:
We trigger the action -> it calls the bash script
not:
we run the bash code -> it calls the gh workflow
CI is meant to make sure that the scripts we can run locally stays running, hence "Continuous" of "Continuous Integration"
request: delete this file, since all it does it call gh workflow run
(with preset flags, but man gh
should take care of that if necessary).
if it's meant to be manual, just direct people to the web interface and have them call the action manually (since workflow_dispatch is available as a trigger)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an anecdotal aside just to make sure I'm thinking about it the right way:
given the question: would you write a script to encode how we call git
on our repository so everyone can do it the same way?
my answer would be: no, unless you need to chain multiple brittle git
invocations in a row.
so in this case, I would probably only write a script for it if you need to run multiple git
commands in a row.
I think this PR is fine without this script at all 📦
Closes #2545.
Depends on #2586.
How it works
The performance regression workflow in CI ensures that compile, prove, and verify times remain consistent across commits. It is driven by two workflows with a single source-of-truth baseline stored in the repo at
tests/perf-regression/perf-regression.json
.PR checks (
checks.yml
): the matrix includes “Performance Regression”. The shared scriptrun-ci-tests.sh
runs in check mode (defaults toPERF_MODE=--check
) and reads the baseline file directly from the repository. It then runs the benchmarks and compares current results to the stored values to detect regressions. No artifacts are downloaded, and no labels are consulted.Baseline updates (
dump-perf-baseline.yml
): this is a separate, manual workflow (workflow_dispatch
) that runs on the same branch. It executes the same performance suite in dump mode (PERF_MODE=--dump
), regeneratesperf-regression.json
using GitHub runners, and **commits the updated file to the same branch. This becomes the canonical baseline for subsequent PR checks.How to use
1) Open a PR (check mode)
run-ci-tests.sh
executes in--check
, readingtests/perf-regression/perf-regression.json
directly from the repo.2) Update the baseline (dump mode)
npm run regression:ci:dump-perf
).--dump
on stable GitHub runners and **commits the refreshedperf-regression.json
to the same branch.Notes
tests/perf-regression/perf-regression.json
is the canonical baseline; PR checks read it directly.