Replies: 1 comment
Have you tried using CDI, as described here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#container-device-interface-cdi-support ? In general, I recommend asking NVIDIA about this; we have no idea what the hook does.
Issue Description
I installed Podman on my Ubuntu 20.04 machine, as well as nvidia-container-toolkit according to the official instructions.
I have two "types" of users on my system, namely "normal" local users (e.g. user1_local (UID 1001)) and users mapped from LDAP using SSSD (e.g. user1 (UID 246740)).
I'm trying to run the following command:
As one of the local users, everything works fine. As a "non-local" user, I'm running into this error, though:
This is the corresponding nvidia-container-toolkit.log:
Note the "driver rpc service terminated with signal 15".
What I did in advance:
- Set no-cgroups = true in /etc/nvidia-container-runtime/config.toml
- Added the subordinate ID entry user1:1000000:65536
- Set rootless_storage_path to a folder on the local file system (user home is on NFS by default)
As user1, I can run nvidia-smi on the host, or any other non-GPU container (e.g. hello-world) in Podman, without any problems.
Any ideas how to resolve this?
This may well be an issue with nvidia-container-toolkit rather than Podman, but I feel like the community is a lot more vibrant here.
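For anyone checking a similar setup: the entry user1:1000000:65536 mentioned above looks like a subordinate ID range in the owner:start:count format used by /etc/subuid and /etc/subgid (that is an assumption, since the post does not name the file). Below is a minimal Python sketch for verifying that such an entry exists for the LDAP-mapped user and does not collide with another user's range. The helper names and the user1_local range in the sample are hypothetical and only serve as illustration.

```python
from __future__ import annotations

from typing import NamedTuple


class SubIdRange(NamedTuple):
    owner: str  # login name or numeric UID, exactly as written in the file
    start: int  # first subordinate ID of the range
    count: int  # number of IDs in the range


def parse_subid(text: str) -> list[SubIdRange]:
    """Parse owner:start:count lines; lines that do not match are skipped."""
    ranges = []
    for raw in text.splitlines():
        parts = raw.strip().split(":")
        if len(parts) != 3:
            continue
        owner, start, count = parts
        try:
            ranges.append(SubIdRange(owner, int(start), int(count)))
        except ValueError:
            continue  # non-numeric start or count
    return ranges


def ranges_for(ranges: list[SubIdRange], name: str, uid: int) -> list[SubIdRange]:
    """Return the ranges owned by a user, matched by login name or numeric UID."""
    return [r for r in ranges if r.owner in (name, str(uid))]


def overlaps(a: SubIdRange, b: SubIdRange) -> bool:
    """True if the two half-open ranges [start, start + count) intersect."""
    return a.start < b.start + b.count and b.start < a.start + a.count


if __name__ == "__main__":
    # Sample content; the user1_local line is a made-up example entry.
    sample = "user1_local:100000:65536\nuser1:1000000:65536\n"
    parsed = parse_subid(sample)
    mine = ranges_for(parsed, "user1", 246740)
    print("ranges for user1:", mine)
    for other in parsed:
        if other not in mine and any(overlaps(other, r) for r in mine):
            print("overlap with", other.owner)
```

To check a real system, read the file contents (for example with pathlib.Path("/etc/subuid").read_text()) and pass them to parse_subid instead of the sample string. An empty result from ranges_for would mean the user has no range under either the login name or the UID.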
Steps to reproduce the issue
Hard to reproduce, because you'd need LDAP users for this. See my configuration details above, though.
Describe the results you received
Can't run NVIDIA containers as externally mapped users.
Describe the results you expected
Be able to run them.
podman info output
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
No