
[ROCM-1846] Device Metrics Exporter amdsmi CI integration #3848

Open
yalmusaf wants to merge 4 commits into develop from users/yalmusaf/PR_ROCM-1846



yalmusaf (Contributor) commented Mar 6, 2026

Motivation

This PR integrates the Device Metrics Exporter (DME) into the amdsmi CI workflow.

Technical Details

Adds a new workflow file, dme-amdsmi-ci.yml, under rocm-systems/.github/workflows for the DME (Device Metrics Exporter) amdsmi CI integration.

JIRA ID

https://amd-hub.atlassian.net/browse/ROCM-1846

Test Plan

N/A

Test Result

N/A

Submission Checklist

N/A

yalmusaf requested a review from a team as a code owner on March 6, 2026 22:05
Copilot AI review requested due to automatic review settings on March 6, 2026 22:05
github-actions bot added the "github actions" label (Pull requests that update GitHub Actions code) on Mar 6, 2026

Copilot AI left a comment


Pull request overview

Adds a new GitHub Actions workflow to integrate Device Metrics Exporter (DME) validation into the AMDSMI CI pipeline by building AMDSMI from this super-repo, building gpu-agent + DME, and smoke-testing the Prometheus /metrics endpoint.

Changes:

  • Introduces a new .github/workflows/dme-amdsmi-ci.yml workflow triggered on projects/amdsmi/** changes (and manual dispatch).
  • Builds and installs AMDSMI into /opt/rocm, then prepares gpu-agent to consume the locally-built libamd_smi.so.
  • Builds gpu-agent and DME, starts both processes, and verifies the metrics endpoint is reachable; uploads logs as artifacts.
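The endpoint check described in the last bullet could look roughly like the sketch below. It is not taken from the workflow in this PR: the helper names `wait_for_metrics` and `metric_names`, the retry counts, and the example URL are all hypothetical, and the sketch uses only the Python standard library. It polls a Prometheus-style /metrics URL until it answers HTTP 200, then extracts the metric names from the text exposition output.

```python
import time
import urllib.error
import urllib.request


def wait_for_metrics(url, attempts=10, delay=1.0, timeout=2.0):
    """Poll `url` until it answers HTTP 200, then return the body as text.

    Raises RuntimeError if no attempt succeeds. All defaults are
    illustrative assumptions, not values from the PR's workflow.
    """
    last_error = None
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return resp.read().decode("utf-8", errors="replace")
                last_error = f"unexpected HTTP status {resp.status}"
        except (urllib.error.URLError, OSError) as exc:
            # Connection refused, timeout, or a non-2xx HTTP error.
            last_error = exc
        time.sleep(delay)
    raise RuntimeError(
        f"{url} not reachable after {attempts} attempts: {last_error}"
    )


def metric_names(body):
    """Return the set of metric names found in Prometheus text output."""
    names = set()
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split(None, 1)[0])
    return names
```

A CI step might call `wait_for_metrics("http://localhost:5000/metrics")` after starting gpu-agent and DME, then fail the job if `metric_names(...)` comes back empty. The host and port here are assumptions; use whatever address the exporter is configured to listen on.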


yalmusaf force-pushed the users/yalmusaf/PR_ROCM-1846 branch 3 times, most recently from 070a500 to 71d9b27, on March 10, 2026 14:33
yalmusaf and others added 4 commits March 16, 2026 13:32
Signed-off-by: yalmusaf <Yazen.ALMusaffar@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
yalmusaf force-pushed the users/yalmusaf/PR_ROCM-1846 branch from 71d9b27 to 775fd10 on March 16, 2026 18:32

Labels

github actions (Pull requests that update GitHub Actions code), organization: ROCm


2 participants