Skip to content

Conversation

@ianayl
Copy link
Contributor

@ianayl ianayl commented Aug 6, 2025

Much easier to figure out what's wrong with the benchmarking CI runs when it tells you what's wrong immediately: https://github.com/intel/llvm/actions/runs/16789472825

@ianayl
Copy link
Contributor Author

ianayl commented Aug 11, 2025

Test runs, feel free to get an idea of what the summaries look like here:

@ianayl ianayl marked this pull request as ready for review August 11, 2025 18:56
@ianayl ianayl requested review from a team as code owners August 11, 2025 18:56
Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yaml lgtm

@ianayl
Copy link
Contributor Author

ianayl commented Aug 13, 2025

@intel/llvm-reviewers-benchmarking Friendly ping

if args.produce_github_summary:
gh_summary.println("### Regressions")
gh_summary.println(
f"<details><summary>{len(regressions_ignored)} non CI-failing regressions:</summary>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, could you clarify what you meant by non CI-failing regressions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey thanks for taking a look Udit,
The benchmark CI now also runs UR benchmarks, L0 benchmarks, etc.; regressions in e.g. L0 should not cause the nightly benchmarking CI for SYCL to fail, thus they are filtered out and categorized differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. In that case, should we rename it to "non-SYCL regressions"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we don't list "regressions" in the summary where delta is less than the noise threshold, correct?

Copy link
Contributor Author

@ianayl ianayl Aug 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. In that case, should we rename it to "non-SYCL regressions"?

I feel like that would be less confusing, but I am also aware that other projects use these series of benchmarking scripts as well (i.e. UMF), so I was hesitant to hardcode "non-SYCL" into the titles/descriptions. In hindsight this should perhaps be a customizable option.

Also, we don't list "regressions" in the summary where delta is less than the noise threshold, correct?

That is correct. Noise is ignored.

@ianayl
Copy link
Contributor Author

ianayl commented Aug 14, 2025

Amended PR based on reviewer comments. New test runs:

@uditagarwal97 uditagarwal97 requested a review from Copilot August 15, 2025 16:02
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds summary reports for benchmarking CI runs to improve visibility into benchmark failures. The change creates a GitHub CI summary file that provides a formatted overview of benchmark results, including improvements, filtered regressions, and concerning regressions.

  • Adds new command-line arguments for producing GitHub summary reports
  • Enhances the comparison script to generate markdown summaries alongside console output
  • Updates the benchmark action to create and display the summary in GitHub CI

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
devops/scripts/benchmarks/compare.py Adds GitHub summary generation functionality with new CLI arguments and markdown formatting for benchmark results
devops/actions/run-tests/benchmark/action.yml Updates benchmark action to generate summary file and display it in GitHub CI step summary

@uditagarwal97
Copy link
Contributor

Amended PR based on reviewer comments. New test runs:

The summary looks good now. 👍🏻

Copy link
Contributor

@uditagarwal97 uditagarwal97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

@ianayl
Copy link
Contributor Author

ianayl commented Aug 19, 2025

@intel/llvm-gatekeepers PR is ready for merge, thanks!

CI failure is unrelated and tracked in #19354 (Currently looking into this right now)

@uditagarwal97 uditagarwal97 merged commit 6c7bc5c into sycl Aug 19, 2025
31 of 33 checks passed
@uditagarwal97 uditagarwal97 deleted the ianayl/benchmark-ci-summaries branch August 19, 2025 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants