-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
nvidia-container-toolkit: reintroduce nvidia runtime wrappers #421088
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nvidia-container-toolkit: reintroduce nvidia runtime wrappers #421088
Conversation
68c45d3 to
96de344
Compare
96de344 to
f89700c
Compare
|
I'm not opposed to this, but from previous experience this did tend to break quite frequently. Is there some way to attach your handle to this so people know you're the code owner or maintainer? I personally don't have the experience necessary to know how to fix or debug issues with this. |
|
Hi @ConnorBaker! I don’t know if there is a way to do so. If there is, I am completely fine getting tagged with things like this. I have been maintaining the GPU support through CDI for some time now. This should not be as brittle as before, since Docker is doing some of the heavy lifting nowadays, even with the nvidia-container-cli. Things are not as brittle as they were with the docker-nvidia wrapper. |
|
And to clarify, this shouldn't conflict with using the CDI options with Docker or Podman, correct? |
No, not at all. Both |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/prs-ready-for-review/3032/5648 |
8288b53 to
49643e0
Compare
49643e0 to
60797c3
Compare
ConnorBaker
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry this took so long to get back to you!
nixos/modules/services/hardware/nvidia-container-toolkit/default.nix
Outdated
Show resolved
Hide resolved
nixos/modules/services/hardware/nvidia-container-toolkit/default.nix
Outdated
Show resolved
Hide resolved
nixos/modules/services/hardware/nvidia-container-toolkit/default.nix
Outdated
Show resolved
Hide resolved
nixos/modules/services/hardware/nvidia-container-toolkit/default.nix
Outdated
Show resolved
Hide resolved
60797c3 to
6381723
Compare
nixos/modules/services/hardware/nvidia-container-toolkit/default.nix
Outdated
Show resolved
Hide resolved
6381723 to
c643e15
Compare
This allows users to keep using `docker run --gpus`. Despite CDI is the recommended way to expose GPU's to containers nowadays, allow users to keep using the old `--gpus` method.
c643e15 to
117bbae
Compare
|
Nicely done! Thank you again for working on this -- if you're satisfied with it, let me know and I'll merge it. |
|
Slightly related @ereslibre -- working on #425862 to fix a few CVEs, I found the tests were broken. I'm not going to get the chance to fix them any time soon, and you're familiar with the module -- would you mind fixing them up? More than happy to review and merge a fix! |
Thank you! Yes, please, go ahead and let's merge this one :) |
I'm going to check it out when I have some time, it could span a couple of weeks though, I will ping you if/when I have a fix for them. |
|
@ConnorBaker I have created a couple PR against your repo to fix tests, both for master and for 25.05: |
|
Hi @ereslibre I'm on nixos-unstable d0fc308 and --gpus all does not seem to work. this works: this not: these two not as well: just wanted to know if this is supposed to work before filing an issue. thanks |
|
Hi @gregorburger!
Can you share your configuration?
This is expected, as the However, this should work: |
|
@ereslibre Sorry for the late response! ... NixCon This one works for me as well: My config is essentially this: Relevant users are in the docker group. |
Some folks really know how to take good care of themselves! Ha! Hope you enjoyed it! :) Ah, sorry, alright, I missed the
This is the one that puzzles me based on your config; are you targeting rootful docker?
Expected.
Already tracked in #441227, thanks for opening the issue! |
This allows users to keep using
docker run --gpus. Despite CDI is the recommended way to expose GPU's to containers nowadays, allow users to keep using the old--gpusmethod.Fixes: #419597
Fixes: #241316
Things done
nix.conf? (See Nix manual)sandbox = relaxedsandbox = truenix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/)Add a 👍 reaction to pull requests you find important.