@koct9i koct9i commented Nov 3, 2025

  • test/r8r: check nvidia container runtime
  • Add nvidia container runtime for CRI-O


Copilot AI left a comment


Pull Request Overview

This PR adds support for CRI-O, an OCI-based implementation of the Kubernetes Container Runtime Interface (CRI), as an alternative to containerd for running containerized jobs in YTsaurus exec nodes. It includes configuration generation for CRI-O with NVIDIA GPU runtime support.

Key changes:

  • Implemented GetCRIOConfig() method to generate CRI-O configuration in TOML format
  • Refactored CRI service configuration handling to support both containerd and CRI-O through a unified NewJobsSidecarConfig() function
  • Added test coverage for CRI-O with and without NVIDIA container runtime
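
To illustrate the first change, here is a minimal sketch of rendering a CRI-O TOML fragment with several runtimes. It is not the PR's actual `GetCRIOConfig()` implementation: the `Runtime` type, the `RenderCRIOConfig` helper, and the binary paths are assumptions made for the example. Only the `[crio.runtime]` table, the `default_runtime` key, and the `[crio.runtime.runtimes.<name>]` tables with `runtime_path` come from CRI-O's configuration format.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Runtime describes one OCI runtime entry.
// Hypothetical type for this sketch, not the operator's own.
type Runtime struct {
	Path string // absolute path to the runtime binary
}

// RenderCRIOConfig renders a minimal CRI-O TOML fragment.
// Hypothetical helper; the PR's GetCRIOConfig() may be structured differently.
func RenderCRIOConfig(defaultRuntime string, runtimes map[string]Runtime) string {
	var b strings.Builder
	b.WriteString("[crio.runtime]\n")
	// %q yields a double-quoted string, which is a valid TOML basic string
	// for plain ASCII names and paths like the ones used here.
	fmt.Fprintf(&b, "default_runtime = %q\n", defaultRuntime)

	// Sort runtime names so the generated file is deterministic.
	names := make([]string, 0, len(runtimes))
	for name := range runtimes {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		fmt.Fprintf(&b, "\n[crio.runtime.runtimes.%s]\n", name)
		fmt.Fprintf(&b, "runtime_path = %q\n", runtimes[name].Path)
	}
	return b.String()
}

func main() {
	// The binary paths below are assumptions; real images may install them elsewhere.
	fmt.Print(RenderCRIOConfig("runc", map[string]Runtime{
		"runc":   {Path: "/usr/bin/runc"},
		"crun":   {Path: "/usr/bin/crun"},
		"nvidia": {Path: "/usr/bin/nvidia-container-runtime"},
	}))
}
```

Sorting the runtime names keeps the output stable across runs, which matters when the rendered file ends up in a ConfigMap and is compared against golden test data.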

Reviewed Changes

Copilot reviewed 64 out of 64 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pkg/ytconfig/cri.go | Added CRI-O configuration generation with support for multiple runtimes (runc, crun, nvidia) and replaced environment variable-based configuration with TOML config file |
| pkg/components/exec_node_base.go | Created unified NewJobsSidecarConfig() function to handle both containerd and CRI-O configurations, added CRI-O volume mount and command-line arguments |
| pkg/components/exec_node.go | Refactored to use new NewJobsSidecarConfig() function instead of containerd-specific logic |
| pkg/components/exec_node_remote.go | Applied same refactoring as exec_node.go and added monitoring port configuration for CRI service |
| pkg/consts/cmd.go | Added CRI-O-specific constants for config volume, mount point, and file name |
| pkg/testutil/spec_builders.go | Added WithNvidiaContainerRuntime flag to test builder for NVIDIA runtime testing |
| test/r8r/components_test.go | Added test cases for CRI-O with and without NVIDIA container runtime |

Comments suppressed due to low confidence (1)

pkg/components/exec_node_base.go:102

  • The volume is always created with consts.ContainerdConfigVolumeName, but when CRI-O is used, the volume mount in addCRIServiceConfig() expects consts.CRIOConfigVolumeName (line 116). This mismatch will cause the volume mount to fail for CRI-O. The volume name should be determined based on the CRI service type, similar to how volume mounts are handled. Consider using a switch statement or variable to select the appropriate volume name based on n.criConfig.Service.
		podSpec.Volumes = append(podSpec.Volumes, createConfigVolume(consts.ContainerdConfigVolumeName,
			n.sidecarConfig.labeller.GetSidecarConfigMapName(consts.JobsContainerName), nil))
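
One way to apply the suggestion in this suppressed comment is to pick the volume name from the CRI service type in a single helper, so the volume and its mount cannot disagree. The sketch below is an assumption-laden illustration, not code from the PR: the `CRIService` type, its values, the constant values, and the `configVolumeName` helper are all hypothetical stand-ins for the operator's real definitions.

```go
package main

import "fmt"

// CRIService is a hypothetical stand-in for the operator's CRI service type.
type CRIService string

const (
	// Hypothetical service identifiers for this sketch.
	CRIServiceContainerd CRIService = "containerd"
	CRIServiceCRIO       CRIService = "crio"

	// Hypothetical values; the real constants live in pkg/consts.
	ContainerdConfigVolumeName = "config-containerd"
	CRIOConfigVolumeName       = "config-crio"
)

// configVolumeName returns the config volume name that matches the CRI service.
// Using it for both the volume and the volume mount keeps the two names in sync.
func configVolumeName(service CRIService) (string, error) {
	switch service {
	case CRIServiceContainerd:
		return ContainerdConfigVolumeName, nil
	case CRIServiceCRIO:
		return CRIOConfigVolumeName, nil
	default:
		return "", fmt.Errorf("unsupported CRI service %q", service)
	}
}

func main() {
	for _, service := range []CRIService{CRIServiceContainerd, CRIServiceCRIO, "unknown"} {
		name, err := configVolumeName(service)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", service, name)
	}
}
```

Returning an error for an unknown service, rather than falling back to the containerd name, would surface a misconfiguration at pod spec build time instead of as a failed mount.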


Signed-off-by: Konstantin Khlebnikov <[email protected]>
Add configuration file for CRI-O, because configuration via
environment variables is not flexible enough.

Signed-off-by: Konstantin Khlebnikov <[email protected]>
@koct9i koct9i force-pushed the khlebnikov/crio-add-nvidia-runtime branch from 7ecdb64 to 8cf530a Compare November 10, 2025 10:05