Add regression CI #206
Merged
Changes from 1 commit
7 commits
b631c73 Add regression CI (mawad-amd)
1d00333 Fix bad command line option (mawad-amd)
900932a Add `set -e` and fix test (mawad-amd)
9b350e3 Modify the threshold (mawad-amd)
25fef4c Merge branch 'main' into muhaawad/regression-ci-1 (mawad-amd)
bbee75c Run all gemm + all scatters (mawad-amd)
6bf1f2f Add missing script (mawad-amd)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,93 @@ | ||
| name: Iris Performance Regression Test | ||
| | ||
| on: | ||
| push: | ||
| branches: [ main ] | ||
| pull_request: | ||
| branches: [ main ] | ||
| workflow_dispatch: | ||
| | ||
| concurrency: | ||
| group: ${{ github.workflow }}-${{ github.head_ref || github.ref }} | ||
| cancel-in-progress: ${{ github.ref != 'refs/heads/main' }} | ||
| | ||
| jobs: | ||
| performance-test: | ||
| name: GEMM All-Scatter Performance Test | ||
| runs-on: [self-hosted, mi3008x] | ||
| timeout-minutes: 30 | ||
| | ||
| steps: | ||
| - name: Checkout repository | ||
| uses: actions/checkout@v4 | ||
| | ||
| - name: Setup Apptainer | ||
| run: | | ||
| apt-get update && apt-get install -y software-properties-common | ||
| add-apt-repository -y ppa:apptainer/ppa | ||
| apt-get update && apt-get install -y apptainer | ||
| | ||
| - name: Build Iris Apptainer container | ||
| run: | | ||
| # Create persistent Apptainer directory | ||
| mkdir -p ~/apptainer | ||
| | ||
| # Build Apptainer image from definition file (only if it doesn't exist) | ||
| if [ ! -f ~/apptainer/iris-dev.sif ]; then | ||
| echo "Building new Apptainer image..." | ||
| apptainer build ~/apptainer/iris-dev.sif apptainer/iris.def | ||
| else | ||
| echo "Using existing Apptainer image" | ||
| fi | ||
| | ||
| - name: Run GEMM All-Scatter WG Specialization Benchmark (8 ranks) | ||
| run: | | ||
| # Create overlay image in workspace (will be auto-cleaned by GitHub Actions) | ||
| OVERLAY="iris_overlay_perf.img" | ||
| | ||
| echo "::group::Creating overlay image" | ||
| apptainer overlay create --size 1024 --create-dir /var/cache/iris "${OVERLAY}" | ||
| echo "::endgroup::" | ||
| | ||
| echo "::group::Running performance benchmark" | ||
| apptainer exec --overlay "${OVERLAY}" --no-home --cleanenv --env HIP_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" \ | ||
| --bind "${PWD}:/iris_workspace" --cwd /iris_workspace \ | ||
| ~/apptainer/iris-dev.sif bash -c " | ||
| pip install -e . | ||
| python examples/10_gemm_all_scatter_wg_specialization/benchmark.py \ | ||
| --benchmark \ | ||
| -m 16384 \ | ||
| -n 16384 \ | ||
| -k 16384 \ | ||
| --BLK_M 128 \ | ||
| --BLK_N 128 \ | ||
| --BLK_K 64 \ | ||
| --gsize_m 6 \ | ||
| --gemm_sms 256 \ | ||
| --validate \ | ||
| -r 8 \ | ||
| -o perf_result.json | ||
| " | ||
| echo "::endgroup::" | ||
| | ||
| # Parse JSON and check performance | ||
| echo "::group::Validating performance" | ||
| TFLOPS=$(jq -r '.tflops' perf_result.json) | ||
| | ||
| if [ -z "$TFLOPS" ] || [ "$TFLOPS" = "null" ]; then | ||
| echo "::error::Failed to extract tflops from benchmark output" | ||
| jq '.' perf_result.json | ||
| exit 1 | ||
| fi | ||
| | ||
| echo "::notice::Achieved TFLOPs: $TFLOPS" | ||
| | ||
| if (( $(echo "$TFLOPS < 2000" | bc -l) )); then | ||
| echo "::error::Performance regression detected! TFLOPs ($TFLOPS) is below threshold (2000)" | ||
| jq '.' perf_result.json | ||
| exit 1 | ||
| fi | ||
| | ||
| echo "✅ Performance test passed! TFLOPs: $TFLOPS (threshold: >2000)" | ||
| echo "::endgroup::" | ||
| | ||
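The threshold check in the last step shells out to `bc` and assumes `$TFLOPS` is a plain number. As a rough sketch only, and not part of this PR, the same comparison could be wrapped in a small helper that validates the value first and lets `awk` do the floating-point comparison. The function name `meets_threshold` and its exit codes are hypothetical.

```shell
# Hypothetical helper, not part of the PR.
# Exit status: 0 if value >= threshold, 1 if below, 2 if value is not a plain decimal.
meets_threshold() {
  awk -v v="$1" -v t="$2" 'BEGIN {
    # Reject non-numeric input such as "", "null" or "nan" before comparing.
    if (v !~ /^[0-9]+(\.[0-9]+)?$/) exit 2
    # "+ 0" forces a numeric (floating-point) comparison.
    exit !((v + 0) >= (t + 0))
  }'
}
```

In the workflow step, `if ! meets_threshold "$TFLOPS" 2000; then ... exit 1; fi` could stand in for the `bc` test, with 2000 being the threshold used in this commit. Keeping the call inside an `if` means a non-zero status would not abort the script early if `set -e` is enabled.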