Collaborator

@jovial jovial commented Apr 23, 2025

See docs/mig.md.

The stackhpc.openhpc role has been bumped to support the NVIDIA GPU autodetection required for MIG configuration.

NB: This role bump also means parameters can be removed from slurm.conf; see stackhpc/ansible-role-openhpc#184.

Collaborator

@sjpb sjpb left a comment

Mostly comments re. where stuff is, and some minor typos etc

@jovial jovial force-pushed the feature/mig branch 3 times, most recently from 994d8f6 to abf35e5 on April 25, 2025 20:54
@jovial jovial marked this pull request as ready for review April 28, 2025 08:34
@jovial jovial requested a review from a team as a code owner April 28, 2025 08:34

@sjpb sjpb left a comment

Realised a few other changes are required by the stackhpc.openhpc bump. We might want to factor those out, potentially, as you won't really care about them for client.

@jovial jovial force-pushed the feature/mig branch 3 times, most recently from a4823a3 to 83ec813 on May 24, 2025 14:42
sjpb previously approved these changes May 27, 2025

@sjpb sjpb left a comment

LGTM

@jovial jovial force-pushed the feature/mig branch 3 times, most recently from 9560d96 to 1c2d07d on May 28, 2025 09:08

sjpb commented May 28, 2025

Failing builds appear to be #685

@jovial jovial requested a review from sjpb June 11, 2025 09:58
@sjpb sjpb changed the title from "Adds support for configuring MIG" to "Adds support for configuring Multi-Instance GPUs (MIG)" Jun 17, 2025

@sjpb sjpb left a comment

LGTM other than a worry about the branch being out of date and hence testing applicability to main.

@jovial jovial requested a review from sjpb June 23, 2025 11:28

@sjpb sjpb left a comment

LGTM

@sjpb sjpb changed the title from "Adds support for configuring Multi-Instance GPUs (MIG)" to "Add support for configuring Multi-Instance GPUs (MIG)" Jun 24, 2025
@sjpb sjpb merged commit 7509986 into main Jun 24, 2025
4 checks passed
@sjpb sjpb deleted the feature/mig branch June 24, 2025 08:30
sjpb added a commit that referenced this pull request Jun 24, 2025
@sjpb sjpb restored the feature/mig branch June 24, 2025 08:42

sjpb commented Jun 24, 2025

Actually I'm going to do this in a new branch, the merge squash means this branch will conflict.

@sjpb sjpb deleted the feature/mig branch June 24, 2025 08:52