Add regression CI #206

Merged
mawad-amd merged 7 commits into main from muhaawad/regression-ci-1
Oct 9, 2025

@mawad-amd commented Oct 9, 2025

Motivation

Add a regression CI workflow to catch performance regressions in examples with known high performance.

Technical Details

The workflow checks that GEMM All-Scatter performance on 8 GPUs stays above a fixed threshold.

Test Plan

Test Result

Submission Checklist

Copilot AI review requested due to automatic review settings October 9, 2025 06:03
@mawad-amd requested review from BKP and neoblizz as code owners October 9, 2025 06:03
@github-actions bot added the in-progress (We are working on it) and iris (Iris project issue) labels Oct 9, 2025

Copilot AI left a comment

Pull Request Overview

This PR adds a performance regression CI workflow to prevent performance degradation in the Iris multi-GPU framework. The workflow specifically tests GEMM All-Scatter performance using 8 GPUs to ensure it maintains at least 2000 TFLOPs.

Key changes:

  • Introduces automated performance testing for critical GPU operations
  • Sets up Apptainer-based containerized testing environment
  • Implements threshold-based validation with detailed error reporting
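
The threshold-based validation described above could look roughly like the sketch below. It is not the code from this PR: the function name `check_regression`, the JSON report layout, and the keys `"tflops"` and `"world_size"` are all hypothetical, and only the 2000 TFLOPs threshold and the 8-GPU setup come from the overview.

```python
import json

# Threshold and GPU count come from the overview above; everything else
# (function name, JSON keys) is an assumption made for illustration.
THRESHOLD_TFLOPS = 2000.0
EXPECTED_GPUS = 8


def check_regression(report_json, threshold=THRESHOLD_TFLOPS, expected_gpus=EXPECTED_GPUS):
    """Return (passed, message) for one benchmark report.

    `report_json` is a JSON string with the hypothetical keys "tflops" and
    "world_size"; the real benchmark output may be shaped differently.
    """
    report = json.loads(report_json)
    tflops = float(report["tflops"])
    world_size = int(report["world_size"])

    # A result measured on a different GPU count is not comparable.
    if world_size != expected_gpus:
        return False, f"expected {expected_gpus} GPUs, benchmark ran on {world_size}"

    # "At least" the threshold passes, so only strictly lower values fail.
    if tflops < threshold:
        shortfall = threshold - tflops
        return False, (
            f"GEMM All-Scatter regression: {tflops:.1f} TFLOPs is "
            f"{shortfall:.1f} below the {threshold:.1f} threshold"
        )
    return True, f"GEMM All-Scatter OK: {tflops:.1f} TFLOPs >= {threshold:.1f}"
```

A CI step could call this on the benchmark output, print the message, and exit non-zero when `passed` is false so that the job fails with a readable reason.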

@mawad-amd mawad-amd merged commit cdc05dc into main Oct 9, 2025
17 of 18 checks passed
@mawad-amd mawad-amd deleted the muhaawad/regression-ci-1 branch October 9, 2025 17:05