[CI][Bench] Create summary reports for benchmarking CI run results #19733
base: sycl
Conversation
Test runs are available; feel free to get an idea of what the summaries look like here:
yaml lgtm
@intel/llvm-reviewers-benchmarking Friendly ping
devops/scripts/benchmarks/compare.py (outdated)
if args.produce_github_summary:
    gh_summary.println("### Regressions")
    gh_summary.println(
        f"<details><summary>{len(regressions_ignored)} non CI-failing regressions:</summary>"
Sorry, could you clarify what you meant by "non CI-failing regressions"?
Hey, thanks for taking a look, Udit.
The benchmark CI now also runs UR benchmarks, L0 benchmarks, etc.; regressions in, e.g., L0 should not cause the nightly benchmarking CI for SYCL to fail, so they are filtered out and categorized differently.
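For illustration, the split described above might look roughly like this — a minimal sketch where the field names, function name, and suite labels are hypothetical, not the actual compare.py code:

```python
# Hypothetical sketch of partitioning regressions by suite so that only
# SYCL regressions fail the nightly CI; names here are assumptions.
def partition_regressions(regressions, failing_suites=("SYCL",)):
    """Split regressions into those that should fail CI and those that
    are only reported (e.g. L0 or UR regressions)."""
    ci_failing, ignored = [], []
    for reg in regressions:
        # Each regression record is assumed to carry its source suite.
        if reg["suite"] in failing_suites:
            ci_failing.append(reg)
        else:
            ignored.append(reg)
    return ci_failing, ignored
```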
I see. In that case, should we rename it to "non-SYCL regressions"?
Also, we don't list "regressions" in the summary where the delta is less than the noise threshold, correct?
> I see. In that case, should we rename it to "non-SYCL regressions"?

I feel like that would be less confusing, but I'm also aware that other projects (e.g. UMF) use this series of benchmarking scripts as well, so I was hesitant to hardcode "non-SYCL" into the titles/descriptions. In hindsight, this should perhaps be a customizable option.

> Also, we don't list "regressions" in the summary where the delta is less than the noise threshold, correct?

That is correct: noise is ignored.
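In other words, something along these lines — a sketch where the threshold value, names, and the assumption that higher values are worse (e.g. time-based benchmarks) are all illustrative, not the merged logic:

```python
# Sketch of noise filtering: only report a slowdown as a regression if
# the relative delta exceeds the noise threshold. Names are assumptions.
NOISE_THRESHOLD = 0.03  # e.g. ignore deltas under 3%

def is_real_regression(baseline: float, current: float,
                       threshold: float = NOISE_THRESHOLD) -> bool:
    """Assumes higher values are worse, as for time-based benchmarks."""
    delta = (current - baseline) / baseline
    return delta > threshold
```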
Amended PR based on reviewer comments. New test runs:
Pull Request Overview
Adds summary reports for benchmarking CI runs to improve visibility into benchmark failures. The change creates a GitHub CI summary file that provides a formatted overview of benchmark results, including improvements, filtered regressions, and concerning regressions.
- Adds new command-line arguments for producing GitHub summary reports
- Enhances the comparison script to generate markdown summaries alongside console output
- Updates the benchmark action to create and display the summary in GitHub CI
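For context, a minimal sketch of what the new CLI surface might look like — the flag name matches `args.produce_github_summary` from the diff above, but the second flag and its default are assumptions, not the merged interface:

```python
import argparse

# Sketch of the CLI additions described in the overview; only
# --produce-github-summary is visible in the diff, the rest is assumed.
parser = argparse.ArgumentParser(description="Compare benchmark results")
parser.add_argument(
    "--produce-github-summary",
    action="store_true",
    help="Also emit a markdown summary for the GitHub CI step summary",
)
parser.add_argument(
    "--summary-output",
    default="benchmark-summary.md",  # hypothetical default path
    help="Path of the generated markdown summary file",
)
args = parser.parse_args()
```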
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| devops/scripts/benchmarks/compare.py | Adds GitHub summary generation functionality with new CLI arguments and markdown formatting for benchmark results |
| devops/actions/run-tests/benchmark/action.yml | Updates benchmark action to generate summary file and display it in GitHub CI step summary |
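Displaying the summary in the GitHub CI step summary typically boils down to appending the generated file to the path in `$GITHUB_STEP_SUMMARY`. The action presumably does this in a shell step; here is the same idea sketched in Python, with an assumed input file name:

```python
import os
import shutil

# GITHUB_STEP_SUMMARY is only set when running inside GitHub Actions.
summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
if summary_path:
    # "benchmark-summary.md" is a hypothetical name for the file the
    # comparison script generated earlier in the job.
    with open("benchmark-summary.md") as src, open(summary_path, "a") as dst:
        shutil.copyfileobj(src, dst)
```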
The summary looks good now. 👍🏻
LGTM. Thanks.
It's much easier to figure out what's wrong with the benchmarking CI runs when the summary tells you immediately: https://github.com/intel/llvm/actions/runs/16789472825