Commit ebe250a

Revert "Add documentation for the systemd nvidia-container-toolkit.service (#…" (#215)
This reverts commit 01f0197.
1 parent 01f0197 commit ebe250a

4 files changed: +7 -101 lines changed

container-toolkit/cdi-support.md

Lines changed: 4 additions & 99 deletions
@@ -1,7 +1,6 @@
 % Date: November 11 2022
 
-% Author: elezar ([email protected])
-% Author: ArangoGutierrez ([email protected])
+% Author: elezar
 
 % headings (h1/h2/h3/h4/h5) are # * = -
 
@@ -30,99 +29,7 @@ CDI also improves the compatibility of the NVIDIA container stack with certain f
 
 - You installed an NVIDIA GPU Driver.
 
-### Automatic CDI Specification Generation
-
-As of NVIDIA Container Toolkit `v1.18.0`, the CDI specification is automatically generated and updated by a systemd service called `nvidia-cdi-refresh`. This service:
-
-- Automatically generates the CDI specification at `/var/run/cdi/nvidia.yaml` when NVIDIA drivers are installed or upgraded
-- Runs automatically on system boot to ensure the specification is up to date
-
-```{note}
-The automatic CDI refresh service does not handle:
-- Driver removal (the CDI file is intentionally preserved)
-- MIG device reconfiguration
-
-For these scenarios, you may still need to manually regenerate the CDI specification. See [Manual CDI Specification Generation](#manual-cdi-specification-generation) for instructions.
-```
-
-#### Customizing the Automatic CDI Refresh Service
-
-You can customize the behavior of the `nvidia-cdi-refresh` service by adding environment variables to `/etc/nvidia-container-toolkit/cdi-refresh.env`. This file is read by the service and allows you to modify the `nvidia-ctk cdi generate` command behavior.
-
-Example configuration file:
-```bash
-# /etc/nvidia-container-toolkit/cdi-refresh.env
-NVIDIA_CTK_DEBUG=1
-# Add other nvidia-ctk environment variables as needed
-```
-
-For a complete list of available environment variables, run `nvidia-ctk cdi generate --help` to see the command's documentation.
-
-```{important}
-After modifying the environment file, you must reload the systemd daemon and restart the service for changes to take effect:
-
-```console
-$ sudo systemctl daemon-reload
-$ sudo systemctl restart nvidia-cdi-refresh.service
-```
-
-#### Managing the CDI Refresh Service
-
-The `nvidia-cdi-refresh` service consists of two systemd units:
-
-- `nvidia-cdi-refresh.path` - Monitors for changes to driver files and triggers the service
-- `nvidia-cdi-refresh.service` - Executes the CDI specification generation
-
-You can manage these services using standard systemd commands:
-
-```console
-# Check service status
-$ sudo systemctl status nvidia-cdi-refresh.path
-● nvidia-cdi-refresh.path - Trigger CDI refresh on NVIDIA driver install / uninstall events
-Loaded: loaded (/etc/systemd/system/nvidia-cdi-refresh.path; enabled; preset: enabled)
-Active: active (waiting) since Fri 2025-06-27 06:04:54 EDT; 1h 47min ago
-Triggers: ● nvidia-cdi-refresh.service
-
-$ sudo systemctl status nvidia-cdi-refresh.service
-○ nvidia-cdi-refresh.service - Refresh NVIDIA CDI specification file
-Loaded: loaded (/etc/systemd/system/nvidia-cdi-refresh.service; enabled; preset: enabled)
-Active: inactive (dead) since Fri 2025-06-27 07:17:26 EDT; 34min ago
-TriggeredBy: ● nvidia-cdi-refresh.path
-Process: 1317511 ExecStart=/usr/bin/nvidia-ctk cdi generate --output=/var/run/cdi/nvidia.yaml (code=exited, status=0/SUCCESS)
-Main PID: 1317511 (code=exited, status=0/SUCCESS)
-CPU: 562ms
-
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=info msg="Selecting /usr/bin/nvidia-smi as /usr/bin/nvidia-smi"
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=info msg="Selecting /usr/bin/nvidia-debugdump as /usr/bin/nvidia-debugdump"
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=info msg="Selecting /usr/bin/nvidia-persistenced as /usr/bin/nvidia-persistenced"
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=info msg="Selecting /usr/bin/nvidia-cuda-mps-control as /usr/bin/nvidia-cuda-mps-control"
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=info msg="Selecting /usr/bin/nvidia-cuda-mps-server as /usr/bin/nvidia-cuda-mps-server"
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=warning msg="Could not locate nvidia-imex: pattern nvidia-imex not found"
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=warning msg="Could not locate nvidia-imex-ctl: pattern nvidia-imex-ctl not found"
-Jun 27 00:04:30 ipp2-0502 nvidia-ctk[1623461]: time="2025-06-27T00:04:30-04:00" level=info msg="Generated CDI spec with version 1.0.0"
-Jun 27 00:04:30 ipp2-0502 systemd[1]: nvidia-cdi-refresh.service: Succeeded.
-Jun 27 00:04:30 ipp2-0502 systemd[1]: Started Refresh NVIDIA CDI specification file.
-```
-
-You can enable/disable the automatic CDI refresh service using the following commands:
-
-```console
-$ sudo systemctl enable --now nvidia-cdi-refresh.path
-$ sudo systemctl enable --now nvidia-cdi-refresh.service
-$ sudo systemctl disable nvidia-cdi-refresh.service
-$ sudo systemctl disable nvidia-cdi-refresh.path
-```
-
-You can also view the service logs to see the output of the CDI generation process.
-
-```console
-# View service logs
-$ sudo journalctl -u nvidia-cdi-refresh.service
-```
-
-### Manual CDI Specification Generation
-
-If you need to manually generate a CDI specification, for example, after MIG configuration changes or if you are using a Container Toolkit version before v1.18.0, follow this procedure:
+### Procedure
 
 Two common locations for CDI specifications are `/etc/cdi/` and `/var/run/cdi/`.
 The contents of the `/var/run/cdi/` directory are cleared on boot.
@@ -132,10 +39,10 @@ However, the path to create and use can depend on the container engine that you
 1. Generate the CDI specification file:
 
 ```console
-$ sudo nvidia-ctk cdi generate --output=/var/run/cdi/nvidia.yaml
+$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
 ```
 
-The sample command uses `sudo` to ensure that the file at `/var/run/cdi/nvidia.yaml` is created.
+The sample command uses `sudo` to ensure that the file at `/etc/cdi/nvidia.yaml` is created.
 You can omit the `--output` argument to print the generated specification to `STDOUT`.
 
 *Example Output*
@@ -170,8 +77,6 @@ You must generate a new CDI specification after any of the following changes:
 - You use a location such as `/var/run/cdi` that is cleared on boot.
 
 A configuration change can occur when MIG devices are created or removed, or when the driver is upgraded.
-
-**Note**: As of NVIDIA Container Toolkit v1.18.0, the automatic CDI refresh service handles most of these scenarios automatically.
 ```
 
 ## Running a Workload with CDI

container-toolkit/install-guide.md

Lines changed: 1 addition & 0 deletions
@@ -229,6 +229,7 @@ See also the [nerdctl documentation](https://github.com/containerd/nerdctl/blob/
 
 For Podman, NVIDIA recommends using [CDI](./cdi-support.md) for accessing NVIDIA devices in containers.
 
+
 ## Next Steps
 
 - [](./sample-workload.md)

container-toolkit/release-notes.md

Lines changed: 1 addition & 1 deletion
@@ -255,7 +255,7 @@ The following packages are included:
 - `libnvidia-container-tools 1.17.2`
 - `libnvidia-container1 1.17.2`
 
-The following `container-toolkit` containers are included:
+The following `container-toolkit` conatiners are included:
 
 - `nvcr.io/nvidia/k8s/container-toolkit:v1.17.2-ubi8`
 - `nvcr.io/nvidia/k8s/container-toolkit:v1.17.2-ubuntu20.04` (also as `nvcr.io/nvidia/k8s/container-toolkit:v1.17.2`)

container-toolkit/sample-workload.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ you can verify your installation by running a sample workload.
 
 ## Running a Sample Workload with Podman
 
-After you install and configure the toolkit (including [generating a CDI specification](cdi-support.md)) and install an NVIDIA GPU Driver,
+After you install and configura the toolkit (including [generating a CDI specification](cdi-support.md)) and install an NVIDIA GPU Driver,
 you can verify your installation by running a sample workload.
 
 - Run a sample CUDA container:
