Fine-tune Gemma image has issues, not detecting GPU #1743

@kenthua

Description

https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/blob/main/ai-ml/llm-finetuning-gemma/Dockerfile

The Dependabot update to the Dockerfile above causes the fine-tune job to fail because the GPU is not detected.

Environment: GKE Autopilot 1.33.x

The update causes the issue on both of the image versions tested:

  • nvidia/cuda:12.9.1-runtime-ubuntu22.04
  • nvidia/cuda:12.9.1-runtime-ubuntu24.04

The original version nvidia/cuda:12.2.0-runtime-ubuntu22.04 still works.
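
To help narrow down whether the GPU is invisible to the container itself or only to the training framework, a small framework-independent check can be run inside each image. This is a minimal sketch, not part of the sample: the function name `gpu_diagnostics` is hypothetical, and the choice of `NVIDIA_VISIBLE_DEVICES`, `LD_LIBRARY_PATH`, and `nvidia-smi -L` as signals is an assumption about what is useful to compare between the working and failing images.

```python
import os
import shutil
import subprocess


def gpu_diagnostics():
    """Collect basic hints about GPU visibility (hypothetical helper)."""
    report = {
        # Environment variables commonly involved in exposing GPUs and
        # driver libraries to containers (assumed relevant here).
        "NVIDIA_VISIBLE_DEVICES": os.environ.get("NVIDIA_VISIBLE_DEVICES"),
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        # Location of nvidia-smi on PATH, or None if it is not found.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        "nvidia_smi_output": None,
    }
    if report["nvidia_smi_path"]:
        try:
            # `nvidia-smi -L` lists the GPUs the driver can see.
            result = subprocess.run(
                [report["nvidia_smi_path"], "-L"],
                capture_output=True,
                text=True,
                timeout=30,
                check=False,
            )
            report["nvidia_smi_output"] = (result.stdout or result.stderr).strip()
        except (OSError, subprocess.TimeoutExpired) as exc:
            report["nvidia_smi_output"] = f"failed to run nvidia-smi: {exc}"
    return report


if __name__ == "__main__":
    for key, value in gpu_diagnostics().items():
        print(f"{key}: {value}")
```

Running this in the 12.2.0 image and in one of the 12.9.1 images and comparing the output may show whether the difference lies in the environment or in driver visibility.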

Metadata

Labels

🚨This issue needs some love.
