GPU load balancing #66
looks good on staging:
hiroTamada
left a comment
Solid implementation of GPU-aware load balancing. The TTL-based caching with double-checked locking is well done, and the VRAM-based selection heuristic is a reasonable approach. One minor nit about the config wiring, but nothing blocking.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    parentGPU := vfToParent[mdev.VFAddress]
    if parentGPU == "" {
        continue
    }
VFs without parent GPU have VRAM usage ignored
Medium Severity
In calculateGPUVRAMUsage, mdevs on VFs with an empty ParentGPU are skipped (if parentGPU == "" { continue }), so their VRAM is never counted. In selectLeastLoadedVF, however, these same VFs are still included in allGPUs and freeVFsByGPU. VFs without a physfn symlink are therefore grouped under an empty-string "GPU" that always appears to have 0 VRAM usage, so they are selected preferentially even when they already have active mdevs. This can cause load imbalance.
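One possible way to make the two code paths agree is sketched below. The `vf` and `mdev` types, `vramUsageByGPU`, and `pickVF` are simplified, hypothetical stand-ins, not the PR's actual signatures. The sketch assumes the chosen fix is to exclude parentless VFs from selection; counting their VRAM under a fallback key would be another option.

```go
package main

import "fmt"

// vf and mdev are simplified, hypothetical models of the PR's data.
type vf struct {
	Address   string
	ParentGPU string // empty when the physfn symlink could not be resolved
	Free      bool   // true when no mdev is currently bound to this VF
}

type mdev struct {
	VFAddress string
	VRAMMB    int
}

// vramUsageByGPU sums mdev VRAM per parent GPU, skipping mdevs whose
// VF has no known parent (mirrors the skip in the reviewed snippet).
func vramUsageByGPU(vfs []vf, mdevs []mdev) map[string]int {
	vfToParent := make(map[string]string, len(vfs))
	for _, v := range vfs {
		vfToParent[v.Address] = v.ParentGPU
	}
	usage := make(map[string]int)
	for _, m := range mdevs {
		parent := vfToParent[m.VFAddress]
		if parent == "" {
			continue
		}
		usage[parent] += m.VRAMMB
	}
	return usage
}

// pickVF returns a free VF on the GPU with the lowest VRAM usage.
// It applies the same parent filter as vramUsageByGPU, so an
// empty-string "GPU" can never win with an apparent usage of 0.
func pickVF(vfs []vf, usage map[string]int) (string, bool) {
	best, bestGPU, found := "", "", false
	for _, v := range vfs {
		if !v.Free || v.ParentGPU == "" {
			continue
		}
		if !found || usage[v.ParentGPU] < usage[bestGPU] {
			best, bestGPU, found = v.Address, v.ParentGPU, true
		}
	}
	return best, found
}

func main() {
	vfs := []vf{
		{"vf0", "gpu0", true},
		{"vf1", "gpu0", false},
		{"vf2", "gpu1", true},
		{"vf3", "", true}, // no parent: excluded from selection
	}
	mdevs := []mdev{{"vf1", 4096}}
	addr, ok := pickVF(vfs, vramUsageByGPU(vfs, mdevs))
	fmt.Println(addr, ok)
}
```

With this input, gpu0 carries 4096 MB and gpu1 carries none, so the free VF on gpu1 is chosen and the parentless VF is never considered.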
Note
Introduces VRAM-aware vGPU allocation with TTL-cached profile metadata and a configurable cache TTL.
- […] devices/mdev.go (SetGPUProfileCacheTTL, getCachedProfiles) and parses framebuffer sizes for profiles
- […] (calculateGPUVRAMUsage, selectLeastLoadedVF); CreateMdev now picks a VF from the least-loaded GPU
- […] available_instances
- […] GPU_PROFILE_CACHE_TTL to config and wires it in main.go via devices.SetGPUProfileCacheTTL

Written by Cursor Bugbot for commit d7e7aaa.