Add test to check that access-counter based page migration is disabled #496

Merged
jgphpc merged 5 commits into eth-cscs:main from msimberg:uvm-access-counter-migration-disabled
Jan 12, 2026

Conversation

msimberg (Contributor) commented Jan 9, 2026

Also fixes the mixin path in slurm.py. Related to https://jira.cscs.ch/browse/PA-1395.

This is currently incomplete because the test runs on all systems. I know how to check with uenv environments whether a test is running on a gh200 system, but how do I check the same with the builtin environment? Or do you have other suggestions for skipping the check if the file doesn't exist?
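
One possible way to handle the missing-file case, sketched in plain Python rather than as the actual ReFrame check: treat an absent parameter file as "not applicable" instead of a failure. The function name `check_param_disabled` and the three outcome labels are hypothetical, and the parameter path is passed in by the caller because this thread does not spell it out.

```python
from pathlib import Path


def check_param_disabled(param_file):
    """Classify a kernel module parameter file as 'skip', 'pass' or 'fail'.

    'skip': the file does not exist (e.g. a node without the module loaded)
    'pass': the file exists and contains 0
    'fail': the file exists and contains anything else
    """
    path = Path(param_file)
    if not path.is_file():
        return 'skip'

    # Parameter files usually end with a newline, so strip before comparing.
    value = path.read_text().strip()
    return 'pass' if value == '0' else 'fail'
```

In a ReFrame test the 'skip' outcome could plausibly be mapped to `skip_if` once the job output is available; how the merged check does this is not shown in this thread.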

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds a test to verify that the NVIDIA UVM (Unified Virtual Memory) performance access counter migration feature is disabled on GPU systems, as it's known to be buggy in older drivers. Additionally, it fixes the mixin import path in slurm.py.

Changes:

  • Fixed the import path for uenv_slurm_mpi_options mixin to correctly navigate from checks/system/slurm/ to checks/mixins/
  • Added new test SlurmNoUvmPerfAccessCounterMigration to check that the NVIDIA UVM parameter is set to 0
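
The path navigation described in the first bullet can be illustrated with `pathlib`. This is a sketch, not the literal diff from this PR: it assumes the check file is `slurm.py` directly inside `checks/system/slurm/`, and the `/repo` prefix is a made-up placeholder.

```python
from pathlib import PurePosixPath

# Hypothetical absolute location of the check file; only the
# checks/system/slurm/ and checks/mixins/ parts come from the PR summary.
check_file = PurePosixPath('/repo/checks/system/slurm/slurm.py')

# parents[0] -> /repo/checks/system/slurm
# parents[1] -> /repo/checks/system
# parents[2] -> /repo/checks
mixins_dir = check_file.parents[2] / 'mixins'

print(mixins_dir)  # /repo/checks/mixins
```

In a real check the starting point would be `pathlib.Path(__file__)`, and the resulting directory would typically be appended to `sys.path` before importing the mixin module.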

jgphpc (Collaborator) commented Jan 12, 2026

cscs-ci run alps-daint-uenv;MY_UENV=prgenv-gnu/25.11:v1

jgphpc (Collaborator) commented Jan 12, 2026

cscs-ci run alps-beverin-uenv;MY_UENV=prgenv-gnu/25.07-6.3.3:v11

jgphpc (Collaborator) commented Jan 12, 2026

cscs-ci run alps-eiger-uenv;MY_UENV=prgenv-gnu/25.11:v1

jgphpc (Collaborator) left a comment

Checked that SlurmNoUvmPerfAccessCounterMigration runs only on compute nodes (cn) with /sys/module/nvidia_uvm/...

jgphpc merged commit 0f99c77 into eth-cscs:main on Jan 12, 2026
2 of 4 checks passed
msimberg deleted the uvm-access-counter-migration-disabled branch on January 12, 2026 20:57
3 participants