Add driver digest verification in confidential GPU driver installation flow #567

Merged
meetrajvala merged 2 commits into cs_cgpu_h100 from mhvcgpu2
May 2, 2025
@meetrajvala meetrajvala commented Apr 23, 2025

This PR contains the following changes:

  • Adds verification of the driver digest (the NVIDIA .run file) to the driver installation flow.
  • Reads the .run file digest from a well-known public GCS bucket and stores it under the OEM partition as the reference driver digest.
  • Adds the utility functions required for parsing and hash calculation.
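
As a rough illustration of the verification step described above, the following minimal sketch compares the SHA-256 digest of a downloaded file against a reference digest stored in a separate file. The function names `calculate_sha256` and `verify_driver_digest` are hypothetical and are not taken from this PR; the sketch also assumes the reference digest has already been fetched and written to disk, and it does not show the GCS download.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; names and layout are assumptions,
# not the functions added in this PR.

# Print the SHA-256 hex digest of a file.
calculate_sha256() {
  local file="$1"
  sha256sum "${file}" | awk '{print $1}'
}

# Compare a file's digest against a reference digest stored in another file.
# Returns 0 on match, non-zero on mismatch or when an input is unreadable.
verify_driver_digest() {
  local driver_file="$1"
  local ref_digest_file="$2"
  local expected actual

  [[ -r "${driver_file}" && -r "${ref_digest_file}" ]] || return 1

  # Accept either "<digest>" or "<digest>  <filename>" on the first line.
  expected="$(awk 'NR==1 {print $1}' "${ref_digest_file}")"
  actual="$(calculate_sha256 "${driver_file}")"

  [[ -n "${expected}" && "${actual}" == "${expected}" ]]
}
```

A caller would typically abort the installation when `verify_driver_digest` returns non-zero, so that a driver whose digest differs from the stored reference is never installed.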

Testing:

  • The existing image tests for confidential GPU ran successfully.
  • Unit tests cover the newly added functions.

Changes planned for a follow-up PR:

  • Measure the GPU CC mode status.

@meetrajvala meetrajvala force-pushed the mhvcgpu2 branch 3 times, most recently from 203eb99 to 571c5e8 on April 23, 2025 07:07
@meetrajvala
Contributor Author

/gcbrun

readonly GPU_REF_VALUES_PATH="${CS_PATH}/gpu"
readonly COS_GPU_INSTALLER_IMAGE_REF="${GPU_REF_VALUES_PATH}/cos_gpu_installer_image_ref"
readonly COS_GPU_INSTALLER_IMAGE_DIGEST="${GPU_REF_VALUES_PATH}/cos_gpu_installer_image_digest"
readonly DRIVER_DIGEST_SHA256SUM="${GPU_REF_VALUES_PATH}/driver_digest_sha256sum"
Contributor

Should we store by driver version? For example, ${GPU_REF_VALUES_PATH}/${driver_version}/driver_digest_sha256sum

Contributor Author

Currently, we support only the default driver version (for the H100 GPU), so this would just add one more directory to the path. I can accommodate this when we extend support to multiple cGPU families that require different driver versions (there is a tracking ticket for this).
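
For reference, the version-scoped layout suggested in the review could be sketched roughly as follows. This is only an illustration of the idea under discussion, not code from this PR: the function name `driver_digest_path` and the fallback value for `CS_PATH` are assumptions made so the snippet runs on its own.

```shell
#!/bin/bash
# Hypothetical sketch; the CS_PATH fallback and the function name are
# illustrative assumptions, not values from this PR.
CS_PATH="${CS_PATH:-/tmp/cs}"
GPU_REF_VALUES_PATH="${CS_PATH}/gpu"

# Print the reference digest path for a given driver version.
# Fails if no version is supplied, so callers never fall back to a shared path.
driver_digest_path() {
  local driver_version="$1"
  if [[ -z "${driver_version}" ]]; then
    echo "driver version is required" >&2
    return 1
  fi
  echo "${GPU_REF_VALUES_PATH}/${driver_version}/driver_digest_sha256sum"
}
```

With a single supported driver version, this only adds one directory level, which matches the trade-off described in the reply above.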

@meetrajvala meetrajvala force-pushed the mhvcgpu2 branch 2 times, most recently from bbc5e6d to 9b1268b on May 2, 2025 06:20
@meetrajvala meetrajvala merged commit 263a935 into cs_cgpu_h100 on May 2, 2025
11 checks passed
3 participants