Conversation
Good afternoon @cgetzen,
This is correct, and is a known limitation of Slurm-bridge that the Slinky team is actively working to resolve. Setting Presently, the Slinky team is working on integrating DRA capabilities into Slurm-bridge. In doing so, we will have the capability to accurately de-conflict the resource requirements of Kubernetes and Slurm workloads for both CPUs and GPUs. This capability will enable multiple Kubernetes or Kubernetes/Slurm workloads on a node, without the degradation in the end-user experience that would be provided through the use of the OverSubscribe configuration parameter. Additionally, this should enable "group workloads" to take advantage of this capability. At the time that these integrations are complete, the ability to enable node-packing/sharing in Slurm will be exposed on a system level. However, we do not at this time intend to expose the shared policy via an annotation. Please let me know if you have any further questions on the matter. Best regards, |
|
I appreciate the detailed response. This PR had two errors; I have updated the branch accordingly.

@vivian-hafener, DRAExtendedResources are still in alpha. I agree that it's important for all code paths to support bin-packing, and I believe this PR takes an incremental step by safely enabling the non-DRA paths that are already usable in production. Does the Slinky team plan to add (Kubernetes-only) bin-packing support for the non-DRA code paths? If so, I am happy to maintain this branch until that feature is developed. Otherwise, it may be worth reconsidering this PR with the fixes I have added.

Non-DRA, Kubernetes-only bin-packing is needed for our use case: a cluster where researchers use Slurm to schedule whole-node/multi-node workloads, and SREs use Kubernetes to schedule autoscaling inference workloads consuming single GPUs.

Thanks for your work on this project!
Good afternoon @cgetzen,

I apologize for closing your PR prematurely. When the Slinky team implements node sharing with the CPU DRA driver, we intend to fully drop our existing exclusive-allocation limitation. After the integration of Slurm-bridge with DRA-Driver-CPU, I will re-evaluate this PR and discuss it with the rest of the team. I think that adding an annotation as you have done here may indeed make sense.

Best regards,
Summary
Problem
Slurm-bridge does not support colocating multiple pods on a single multi-GPU node, resulting in underutilization when workloads require fewer GPUs than the node provides.
Solution
This adds an optional workload annotation, `slurmjob.slinky.slurm.net/shared`, accepting a subset of Slurm's shared policy values (`none`, `user`) on workloads that have a 1:1 relationship between Slurm jobs and pods. This excludes PodGroup and LeaderWorkerSet resources. The admission controller ensures correctness.
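As a minimal sketch of what that admission check could look like (the annotation key and allowed values are from this PR; the function name and structure are illustrative, not slurm-bridge's actual webhook code):

```go
package main

import "fmt"

// Annotation key proposed by this PR.
const sharedAnnotation = "slurmjob.slinky.slurm.net/shared"

// validateSharedAnnotation rejects values outside the supported subset of
// Slurm's shared policy ("none", "user"). A missing annotation is valid:
// the feature is opt-in and absence keeps today's default behavior.
// (Hypothetical helper; the real admission controller does more than this.)
func validateSharedAnnotation(annotations map[string]string) error {
	val, ok := annotations[sharedAnnotation]
	if !ok {
		return nil
	}
	switch val {
	case "none", "user":
		return nil
	default:
		return fmt.Errorf("unsupported value %q for %s: must be \"none\" or \"user\"",
			val, sharedAnnotation)
	}
}

func main() {
	for _, v := range []string{"none", "user", "exclusive"} {
		err := validateSharedAnnotation(map[string]string{sharedAnnotation: v})
		fmt.Printf("%s -> valid=%v\n", v, err == nil)
	}
}
```

Rejecting unknown values at admission time, rather than at job submission, surfaces the error to the user before the pod ever reaches the scheduler.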
The scheduler then applies the `shared` setting when creating the Slurm job.
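For illustration, a single-pod workload opting in might look like the following (the annotation key and values are from this PR; the surrounding Pod spec, names, and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker                      # hypothetical name
  annotations:
    slurmjob.slinky.slurm.net/shared: "user"  # or "none"; omit to keep default behavior
spec:
  containers:
    - name: worker
      image: registry.example.com/inference:latest  # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1   # consumes one GPU; remaining GPUs stay schedulable
```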
Limitations
Allowing group workloads to use the shared annotation is out of scope.
Group workloads use a single placeholder job for multiple pods with a fixed node count and one-node-per-pod assignment. Allowing `shared` on them would require supporting Slurm packing (fewer nodes than pods), which would require changes to `PostFilter`, the `submitJob` node count, and `annotatePodsWithNodes`.
Using group workloads with DRA poses additional challenges. Slurm-bridge currently assumes one pod per node per job: PreBind is called per pod with `(pod, nodeName)`, and `GetResources(ctx, pod, nodeName)` returns the job's allocation on that node from Slurm's NodeResourceLayout. One ResourceClaim is created per pod for that full allocation. With multiple pods on the same node, each pod should only receive a portion of the job's allocation.
Breaking Changes
All existing behavior is maintained by default. Only workloads that opt in to using `slurmjob.slinky.slurm.net/shared` are affected.
Testing Notes
Unit tests have been added, and manual tests have been performed to confirm scheduling placement.
Additional Context