
Add driver rootfs and /host to vgpu-device-manager #1474

Merged
tariq1890 merged 1 commit into NVIDIA:main from mresvanis:add-mig-backed-vgpu-support
Jul 18, 2025

Conversation

@mresvanis
Contributor

@mresvanis mresvanis commented Jun 5, 2025

This PR mounts the NVIDIA driver rootfs, the host rootfs, and the gpu-clients ConfigMap into the vgpu-device-manager container, so that MIG can be configured before the vGPU configuration is applied whenever the selected vGPU configuration includes MIG-backed vGPU types.

nvidia-mig-parted requires NVML to configure MIG and the host rootfs to start/stop systemd services.

Related vGPU device manager PR: NVIDIA/vgpu-device-manager#93
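For readers unfamiliar with the manifest side of such a change, a minimal sketch of what these mounts could look like in the daemonset follows. The volume names and host paths here are illustrative assumptions, not the exact contents of this PR:

```yaml
# Hypothetical sketch only: mounting the driver rootfs and host rootfs
# into the vgpu-device-manager container. Names/paths are assumptions.
volumeMounts:
- name: driver-install-dir
  mountPath: /driver-root
  readOnly: true
- name: host-root
  mountPath: /host
  readOnly: true
volumes:
- name: driver-install-dir
  hostPath:
    path: /run/nvidia/driver   # assumed driver-container install location
- name: host-root
  hostPath:
    path: /                    # host rootfs, needed to drive systemd units
```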

@copy-pr-bot

copy-pr-bot bot commented Jun 5, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mresvanis
Contributor Author

/cc @cdesiniotis

hostPath:
path: /run/nvidia/validations
type: DirectoryOrCreate
- name: mig-parted-config
Contributor


Instead of adding these two ConfigMap references here, we could just update the unmarshaled ConfigMap object with them when we detect that the ClusterPolicy custom resource includes additional ConfigMaps.

Contributor Author


Thank you for pointing this out; it slipped through my TODO list when drafting this PR. What do you think about implementing the same logic found in the respective MIG Manager transformation?

I also tried to reduce the duplicate checks that came up, but let me know if you think that's less readable and/or not maintainable and I'll go back to simple checks.

Contributor


Is this comment still relevant?

Contributor Author

@mresvanis mresvanis Jul 8, 2025


We have removed both the mig-parted-config and the gpu-clients ConfigMaps, as neither is currently relevant to these changes.

mig-parted-config is not needed, since the vgpu-device-manager handles the conversion of the vGPU config to the respective MIG config.

gpu-clients is not needed, since there is currently no support for pre-installed vGPU host drivers, which is the purpose of this ConfigMap in the first place.

Thank you both for the detailed explanations and suggestions!

@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch 3 times, most recently from 8e36bda to 51d9d07 Compare June 6, 2025 10:18
@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch 2 times, most recently from ae945ae to b9b9b60 Compare June 13, 2025 14:35
@mresvanis mresvanis changed the title Add driver rootfs, mig-parted, gpu-clients and LD_LIBRARY_PATH to vgpu-device-manager Add driver rootfs, mig-parted, gpu-clients and LD_PRELOAD to vgpu-device-manager Jun 13, 2025
Comment on lines +48 to +49
- name: LD_PRELOAD
value: "/driver-root/usr/lib64/libnvidia-ml.so.1"
Contributor


Ideally we wouldn't set this in the spec. Instead, we would discover the path to libnvidia-ml.so.1 in the vgpu-device-manager code and set the LD_PRELOAD envvar when invoking the mig-parted binaries. We have precedent for this in other components. For example, here is our code in the gpu-operator-validator that validates a driver container installation:

validateDriver := func(silent bool) error {
	driverLibraryPath, err := driverRoot.getDriverLibraryPath()
	if err != nil {
		return fmt.Errorf("failed to locate driver libraries: %w", err)
	}
	nvidiaSMIPath, err := driverRoot.getNvidiaSMIPath()
	if err != nil {
		return fmt.Errorf("failed to locate nvidia-smi: %w", err)
	}
	cmd := exec.Command(nvidiaSMIPath)
	// In order for nvidia-smi to run, we need to update LD_PRELOAD to include the path to libnvidia-ml.so.1.
	cmd.Env = setEnvVar(os.Environ(), "LD_PRELOAD", prependPathListEnvvar("LD_PRELOAD", driverLibraryPath))
	if !silent {
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr
	}
	return cmd.Run()
}

Contributor Author


This makes sense, thank you for pointing out the use of such code in other components. I refactored the vgpu-device-manager PR to search for the library and set up LD_PRELOAD for the mig-parted commands.
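The refactor described above can be sketched roughly as follows. The function names and candidate search paths below are assumptions for illustration, not the actual vgpu-device-manager code:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findDriverLibrary looks for a library under a mounted driver rootfs.
// The candidate directories are illustrative; the real code may probe
// other locations (e.g. via the driver's ldcache).
func findDriverLibrary(driverRoot, lib string) (string, error) {
	for _, dir := range []string{"usr/lib64", "usr/lib/x86_64-linux-gnu"} {
		candidate := filepath.Join(driverRoot, dir, lib)
		if _, err := os.Stat(candidate); err == nil {
			return candidate, nil
		}
	}
	return "", fmt.Errorf("%s not found under %s", lib, driverRoot)
}

// prependPathList prepends a path to a colon-separated list, as used
// for LD_PRELOAD-style environment variables.
func prependPathList(existing, path string) string {
	if existing == "" {
		return path
	}
	return path + ":" + existing
}

func main() {
	libPath, err := findDriverLibrary("/driver-root", "libnvidia-ml.so.1")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// The discovered path would be prepended to LD_PRELOAD before
	// invoking the mig-parted binaries.
	fmt.Println("LD_PRELOAD=" + prependPathList(os.Getenv("LD_PRELOAD"), libPath))
}
```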

@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch 2 times, most recently from 9e8e269 to ea05313 Compare June 26, 2025 09:07
@mresvanis mresvanis changed the title Add driver rootfs, mig-parted, gpu-clients and LD_PRELOAD to vgpu-device-manager Add driver rootfs, host rootfs and gpu-clients to vgpu-device-manager Jun 26, 2025
if config.MIGManager.Config != nil && config.MIGManager.Config.Name != "" && config.MIGManager.Config.Name != MigPartedDefaultConfigMapName {
logger.Info(fmt.Sprintf("Not creating resource, custom ConfigMap provided: %s", config.MIGManager.Config.Name))
if name, isCustom := gpuv1.GetConfigMapName(config.MIGManager.Config, MigPartedDefaultConfigMapName); isCustom {
logger.Info(fmt.Sprintf("Not creating resource, custom ConfigMap provided: %s", name))
Contributor


We don't need fmt.Sprintf, as logger.Info already supports structured key/value pairs.

See the method signature below

func (l Logger) Info(msg string, keysAndValues ...any) {
	if l.sink == nil {
		return
	}
	if l.sink.Enabled(l.level) { // see comment in Enabled
		if withHelper, ok := l.sink.(CallStackHelperLogSink); ok {
			withHelper.GetCallStackHelper()()
		}
		l.sink.Info(l.level, msg, keysAndValues...)
	}
}

Contributor


This is an example of a logger.Info() call

w.log.Info("Operands are disabled for node", "NodeName", w.node, "Label", commonOperandsLabelKey, "Value", "false")

Contributor Author


Thanks for pointing that out, and for the detailed explanation! I blindly followed the existing log message. I have updated these changes.


// set ConfigMap name for "gpu-clients" Volume
for i, vol := range obj.Spec.Template.Spec.Volumes {
if !strings.Contains(vol.Name, "gpu-clients") {
Contributor


I'd prefer an exact string match instead of a Contains

Contributor Author

@mresvanis mresvanis Jun 27, 2025


I can definitely update this; I just wasn't sure why Contains is used in other such cases (e.g. here) and thought I'd adhere to what is already used.

Contributor


No good reason AFAIK. Someone probably made the decision to use Contains a while ago and it just stuck throughout. We can update the conditional here to check for an exact string match.
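To illustrate the difference being discussed, a small hedged sketch follows. The helper name `isGPUClientsVolume` is hypothetical, not from the PR; it only demonstrates why an exact comparison is safer than a substring check:

```go
package main

import (
	"fmt"
	"strings"
)

// isGPUClientsVolume reports whether a volume name matches exactly,
// as suggested in the review, instead of the substring check used
// elsewhere in the codebase.
func isGPUClientsVolume(name string) bool {
	return name == "gpu-clients"
}

func main() {
	// A substring check also matches names that merely contain
	// "gpu-clients", which an exact comparison rejects.
	for _, name := range []string{"gpu-clients", "default-gpu-clients", "gpu-clients-ca"} {
		fmt.Printf("%-22s substring=%-5v exact=%v\n",
			name, strings.Contains(name, "gpu-clients"), isGPUClientsVolume(name))
	}
}
```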

@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch 3 times, most recently from b8c7421 to 205e07f Compare June 27, 2025 08:59
break
}

// set ConfigMap name for "gpu-clients" Volume
Contributor


A general question / comment - what happens if the MIG Manager component is explicitly disabled in ClusterPolicy? In this case, the default-gpu-clients ConfigMap would not have been created by the GPU Operator, and thus, the vGPU Device Manager would fail to run (due to the missing volume).

We don't have any precedent for this (so I am not sure if there are any adverse effects), but what about creating a symlink to the default-gpu-clients ConfigMap contents?

assets/state-vgpu-device-manager/0510_configmap.yaml --> assets/state-mig-manager/0410_configmap.yaml

This way, when reconciling the vGPU Device Manager component, the GPU Operator will always attempt to create the default-gpu-clients ConfigMap when the user has not specified a custom ConfigMap -- regardless of whether the MIG Manager is enabled or not.

Contributor


Thinking about this more, a symlink will not work. If the MIG Manager OR vGPU Device Manager is disabled in ClusterPolicy, the ConfigMap will get deleted here:

if !n.isStateEnabled(n.stateNames[n.idx]) {
	err := n.client.Delete(ctx, obj)
	if err != nil && !apierrors.IsNotFound(err) {
		logger.Info("Couldn't delete", "Error", err)
		return gpuv1.NotReady, err
	}
	return gpuv1.Disabled, nil
}

We might need to add a copy of this ConfigMap in assets/state-vgpu-device-manager (same contents, different name) unless one of us thinks of a more clever solution.
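Such a duplicated asset could look roughly like the sketch below. The ConfigMap name, namespace placeholder, and data contents are all assumptions for illustration, not anything proposed in this PR:

```yaml
# Hypothetical copy of the gpu-clients ConfigMap kept under
# assets/state-vgpu-device-manager -- same contents as the MIG Manager
# one, but a different name so each component owns its own lifecycle.
apiVersion: v1
kind: ConfigMap
metadata:
  name: vgpu-default-gpu-clients   # assumed name, not from the PR
data:
  clients.yaml: |-
    version: v1
    systemd-services: []
```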

Contributor


One thing to note, the gpu-clients configuration file is only relevant when drivers are pre-installed on the host (not installed via the driver container) -- the gpu-clients configuration file contains a list of systemd services that use the GPU and need to be restarted across a MIG re-configuration.

We technically don't support pre-installed vGPU manager (aka the host driver) today. This becomes apparent when reading this code snippet / TODO item: https://github.com/NVIDIA/gpu-operator/blob/main/assets/state-vgpu-device-manager/0600_daemonset.yaml#L25-L28

Because we don't support pre-installed vGPU Manager, I would be open to removing the gpu-clients configuration handling from this PR entirely. It may be more appropriate to add the gpu-clients configuration handling in the same PR that adds preinstalled vGPU Manager support (whenever this becomes a requirement and prioritized).
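As a point of reference for the discussion above, a gpu-clients configuration file is a short YAML document listing systemd services to cycle across a MIG re-configuration. The sketch below is hypothetical and the service names are examples only:

```yaml
# Hypothetical gpu-clients file: systemd services that hold the GPU
# and must be stopped/started around a MIG re-configuration.
# Service names are illustrative examples, not the default contents.
version: v1
systemd-services:
- nvsm.service
- dcgm.service
```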

Contributor Author


Thank you very much for the detailed explanation! I agree with you that we should not include the gpu-clients ConfigMap as it's not currently relevant. I have updated the PR accordingly.



@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch 2 times, most recently from 4aa0825 to cc107ac Compare July 8, 2025 09:34
@mresvanis mresvanis marked this pull request as ready for review July 8, 2025 13:50
@cdesiniotis cdesiniotis changed the title Add driver rootfs, host rootfs and gpu-clients to vgpu-device-manager Add driver rootfs and /host to vgpu-device-manager Jul 8, 2025
@cdesiniotis
Contributor

/ok to test cc107ac

@cdesiniotis cdesiniotis requested a review from tariq1890 July 8, 2025 15:48
@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch from cc107ac to 70662bb Compare July 9, 2025 08:17
@cdesiniotis
Contributor

/ok to test 70662bb

@cdesiniotis
Contributor

@tariq1890 can you take another look at this PR?

@tariq1890
Contributor

Can you rebase this PR @mresvanis ?

@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch from 70662bb to 5385337 Compare July 17, 2025 09:39
This change enables the vgpu-device-manager to run nvidia-mig-parted to configure MIG before applying the selected vGPU configuration, whenever that configuration includes MIG-backed vGPU types.

nvidia-mig-parted requires NVML to configure MIG and /host to
start/stop systemd services.

Signed-off-by: Michail Resvanis <mresvani@redhat.com>
@mresvanis mresvanis force-pushed the add-mig-backed-vgpu-support branch from 5385337 to 04bb02f Compare July 17, 2025 09:44
@tariq1890
Contributor

/ok to test 04bb02f

@tariq1890 tariq1890 enabled auto-merge July 18, 2025 02:20
@tariq1890 tariq1890 merged commit 94520bc into NVIDIA:main Jul 18, 2025
15 of 16 checks passed
@mresvanis mresvanis deleted the add-mig-backed-vgpu-support branch July 18, 2025 06:12