arm64(cuda): try building remaining non-ROCm images for arm64 #2439
base: main
Conversation
Walkthrough

Adds arm64 build-platforms to two Tekton PipelineRun YAMLs. Updates jupyter and PyTorch runtime pylock.toml constraints (markers), bumps pynacl to 1.6.0 with new wheels, adds a torchvision x86_64 marker, and splits the torchvision dependency and sources per architecture in pyproject.toml, adding a new CPU UV index.
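The per-architecture dependency/source split described in the walkthrough can be sketched roughly as below. This is a hypothetical pyproject.toml fragment, not copied from the PR; the index names are illustrative, and the syntax follows uv's documented `[[tool.uv.index]]` and `[tool.uv.sources]` features.

```toml
[project]
name = "example-runtime"
version = "0.1.0"
dependencies = [
    # x86_64 keeps the CUDA build; aarch64 falls back to the CPU build
    'torchvision==0.21.0+cu126; platform_machine == "x86_64"',
    'torchvision==0.21.0; platform_machine == "aarch64"',
]

# explicit = true keeps uv from consulting these indexes for other packages
[[tool.uv.index]]
name = "pytorch-cuda"
url = "https://download.pytorch.org/whl/cu126"
explicit = true

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

# Route torchvision to a different index per architecture via markers
[tool.uv.sources]
torchvision = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" },
]
```

With `explicit = true`, uv only uses an index for packages explicitly routed to it in `tool.uv.sources`, which keeps the CPU and CUDA indexes from leaking into unrelated resolutions.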
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks (2 passed, 1 warning)
- ❌ Failed checks: 1 warning
- ✅ Passed checks: 2 passed
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ide-developer. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files.
Actionable comments posted: 4
♻️ Duplicate comments (1)
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1)
`14-16`: Noisy trigger paths still present (tracked separately). The on‑cel‑expression still lists unrelated Minimal/DataScience paths. This was already tracked for cleanup in a prior issue; just noting it remains unchanged here.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)

- .tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1 hunk)
- .tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1 hunk)
- jupyter/datascience/ubi9-python-3.12/pylock.toml (3 hunks)
- runtimes/pytorch/ubi9-python-3.12/pylock.toml (4 hunks)
- runtimes/pytorch/ubi9-python-3.12/pyproject.toml (2 hunks)
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-08-05T17:24:08.616Z
Learning: jiridanek requested PR review for #1521 covering s390x architecture support improvements, demonstrating continued focus on systematic multi-architecture compatibility enhancements in the opendatahub-io/notebooks repository through clean implementation with centralized configuration, proper CI integration, and architecture-aware testing patterns.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1320
File: jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda:42-52
Timestamp: 2025-07-09T12:29:56.162Z
Learning: jiridanek requested GitHub issue creation for OpenShift client architecture mapping problem affecting 29 Dockerfiles during PR #1320 review. Issue was created with comprehensive analysis covering all affected files using $(uname -m) returning 'aarch64' but OpenShift mirror expecting 'arm64', systematic solution using BuildKit TARGETARCH mapping with proper amd64→x86_64 and arm64→arm64 conversion, detailed acceptance criteria, and implementation guidance, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm:50-50
Timestamp: 2025-07-08T19:30:01.738Z
Learning: jiridanek requested GitHub issue creation for multi-architecture support in ROCm TensorFlow image during PR #1333 review. Issue #1346 was created with comprehensive problem description covering hardcoded x86_64 architecture breaking multi-arch support, detailed impact analysis, three solution options (runtime detection, BuildKit TARGETARCH integration, hybrid approach) with pros/cons analysis, comprehensive acceptance criteria covering core requirements and testing, phased implementation guidance, related files identification, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml:5-15
Timestamp: 2025-07-02T18:59:15.788Z
Learning: jiridanek creates targeted GitHub issues for specific test quality improvements identified during PR reviews in opendatahub-io/notebooks. Issue #1268 demonstrates this by converting a review comment about insufficient tf2onnx conversion test validation into a comprehensive improvement plan with clear acceptance criteria, code examples, and ROCm-specific context.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1909
File: runtimes/pytorch+llmcompressor/ubi9-python-3.11/Dockerfile.cuda:11-15
Timestamp: 2025-08-12T08:40:55.286Z
Learning: jiridanek requested GitHub issue creation for redundant CUDA upgrade optimization during PR #1909 review. Analysis revealed all 14 CUDA Dockerfiles contain redundant `yum upgrade -y` commands in cuda-base stages that execute after base stages already performed comprehensive `dnf upgrade` via pre-upgrade blocks, causing unnecessary CI latency and build inefficiency. Issue includes complete scope analysis with specific line numbers, investigation framework requiring NVIDIA upstream documentation review, multiple solution options, comprehensive acceptance criteria covering systematic testing and performance measurement, and proper context linking to PR #1909 review comment.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1909
File: runtimes/pytorch+llmcompressor/ubi9-python-3.11/Dockerfile.cuda:11-15
Timestamp: 2025-08-12T08:40:55.286Z
Learning: jiridanek requested GitHub issue creation for redundant CUDA upgrade optimization during PR #1909 review. Issue covers duplicate yum/dnf upgrade commands in cuda-base stages that execute after base stages already performed comprehensive upgrades, causing unnecessary CI latency and build inefficiency across multiple CUDA Dockerfiles. The solution requires investigating NVIDIA upstream documentation requirements before removing redundant upgrades, with systematic testing of all CUDA variants and performance measurement. Issue follows established pattern of comprehensive problem analysis, multiple solution options, detailed acceptance criteria, and proper context linking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1995
File: jupyter/rocm/tensorflow/ubi9-python-3.11/Dockerfile.rocm:0-0
Timestamp: 2025-08-19T06:31:05.424Z
Learning: jiridanek requested GitHub issue creation for ROCm device detection test during PR #1995 review. The issue covers comprehensive test requirements for validating that device_lib.list_local_devices() successfully detects ROCm/HIP devices after ROCm package optimizations, including multiple solution options (pytest, shell script, notebook-based), detailed acceptance criteria for core requirements and test coverage, implementation guidance for test framework selection and CI integration, example test code for TensorFlow device detection validation, and proper context linking to PR #1995 and review comment, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda:38-38
Timestamp: 2025-07-08T19:30:20.513Z
Learning: jiridanek requested GitHub issue creation for multi-architecture support in TensorFlow CUDA runtime image during PR #1333 review. Issue was created with comprehensive problem description covering hardcoded NVARCH limitation, multiple solution options using TARGETARCH build argument with architecture mapping, acceptance criteria for multi-architecture builds, implementation guidance with code examples, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1320
File: jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda:65-66
Timestamp: 2025-07-09T12:31:02.033Z
Learning: jiridanek requested GitHub issue creation for MSSQL repo file hardcoding problem during PR #1320 review. Issue #1363 was created and updated with comprehensive problem description covering hardcoded x86_64 MSSQL repo files breaking multi-architecture builds across 10 affected Dockerfiles (including datascience, CUDA, ROCm, and TrustyAI variants), detailed root cause analysis, three solution options with code examples, clear acceptance criteria for all image types, implementation guidance following established multi-architecture patterns, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/rocm-tensorflow/ubi9-python-3.12/pyproject.toml:75-77
Timestamp: 2025-08-27T15:33:28.871Z
Learning: jiridanek requested GitHub issue creation for multi-architecture environment exploration during PR #2145 review, specifically to investigate architecture-specific environments for ROCm TensorFlow wheel optimization across x86_64, aarch64, ppc64le, and s390x architectures, noting uncertainty about implementation benefits but wanting systematic exploration through issue tracking. Issue #2158 was created with comprehensive analysis covering current limitations, investigation areas, multiple solution options, and clear acceptance criteria.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2180
File: .tekton/odh-base-image-cuda-py312-ubi9-pull-request.yaml:36-39
Timestamp: 2025-08-28T12:42:23.404Z
Learning: Konflux build system in opendatahub-io/notebooks repository requires "linux/x86_64" platform identifier format for Tekton PipelineRun build-platforms parameter, not the OCI-standard "linux/amd64". This is consistently used across all .tekton/ pipeline files and enforced by scripts/generate_pull_request_pipelineruns.py type definitions. Docker/buildah contexts use "linux/amd64" while Konflux/Tekton contexts use "linux/x86_64".
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1396
File: runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda:124-127
Timestamp: 2025-07-20T20:47:36.509Z
Learning: jiridanek identified that ARM64 wheels for h5py 3.14.0 are available on PyPI but being ignored due to AMD64-only dependency locking with --platform=linux/amd64. This causes unnecessary hdf5-devel package installation in ARM64 TensorFlow images when the ARM64 wheel h5py-3.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl could be used instead. The Pipfile.lock only contains 2 hashes for h5py, confirming limited platform consideration during lock generation.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue #2148 was created addressing the mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. The issue includes comprehensive problem description covering affected files (runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda and pylock.toml), detailed solution with PyTorch index URL update from cu124 to cu126, lock regeneration steps using uv, clear acceptance criteria for wheel alignment verification, and proper context linking to PR #2145 review comment, assigned to jiridanek.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue addresses mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. Comprehensive issue created with detailed problem description covering affected files, current state analysis, step-by-step solution including PyTorch index URL update and lock regeneration, clear acceptance criteria for cu126 wheel verification, and proper context linking to PR #2145 review comment.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2000
File: runtimes/rocm-pytorch/ubi9-python-3.11/requirements.txt:807-809
Timestamp: 2025-08-20T12:39:54.827Z
Learning: Issue #2055 tracks the remaining runtimes/rocm-tensorflow/ubi9-python-3.12/requirements.txt file that stayed at jupyter-core==5.7.2 during PR #2000, which will be addressed separately due to specific challenges with that ROCm TensorFlow image.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:110-110
Timestamp: 2025-08-29T15:18:43.229Z
Learning: When uv compiles dependencies with platform restrictions (like `platform_machine != 'ppc64le'`), it automatically propagates these markers to transitive dependencies in the generated lock files. This is expected behavior and doesn't require manual intervention in source files.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2185
File: jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda:109-117
Timestamp: 2025-08-29T08:48:55.985Z
Learning: jiridanek prefers to implement systematic cleanup improvements through dedicated GitHub issues (like #2076) rather than applying individual point fixes during PR reviews, maintaining consistency with their established pattern of comprehensive code quality improvements.
📚 Learning: 2025-08-28T12:42:23.404Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2180
File: .tekton/odh-base-image-cuda-py312-ubi9-pull-request.yaml:36-39
Timestamp: 2025-08-28T12:42:23.404Z
Learning: Konflux build system in opendatahub-io/notebooks repository requires "linux/x86_64" platform identifier format for Tekton PipelineRun build-platforms parameter, not the OCI-standard "linux/amd64". This is consistently used across all .tekton/ pipeline files and enforced by scripts/generate_pull_request_pipelineruns.py type definitions. Docker/buildah contexts use "linux/amd64" while Konflux/Tekton contexts use "linux/x86_64".
Applied to files:
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
📚 Learning: 2025-08-27T14:49:24.112Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue #2148 was created addressing the mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. The issue includes comprehensive problem description covering affected files (runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda and pylock.toml), detailed solution with PyTorch index URL update from cu124 to cu126, lock regeneration steps using uv, clear acceptance criteria for wheel alignment verification, and proper context linking to PR #2145 review comment, assigned to jiridanek.
Applied to files:
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
runtimes/pytorch/ubi9-python-3.12/pyproject.toml
runtimes/pytorch/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-08-27T14:49:24.112Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue addresses mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. Comprehensive issue created with detailed problem description covering affected files, current state analysis, step-by-step solution including PyTorch index URL update and lock regeneration, clear acceptance criteria for cu126 wheel verification, and proper context linking to PR #2145 review comment.
Applied to files:
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
runtimes/pytorch/ubi9-python-3.12/pyproject.toml
runtimes/pytorch/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-05T12:10:28.916Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2265
File: .tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml:16-16
Timestamp: 2025-09-05T12:10:28.916Z
Learning: jiridanek requested GitHub issue creation for trigger path cleanup in pytorch+llmcompressor pipeline during PR #2265 review. The issue addresses copy-paste errors where irrelevant Minimal/DataScience trigger paths were included in the pytorch+llmcompressor pipeline on-cel-expression, causing unnecessary pipeline triggers. Solution involves removing unrelated paths and keeping only pytorch+llmcompressor-specific paths, build-args/cuda.conf, jupyter/utils, and the pipeline YAML itself, with comprehensive acceptance criteria and proper context linking.
Applied to files:
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
📚 Learning: 2025-09-12T09:51:55.382Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-09-12T09:51:55.382Z
Learning: jiridanek identified orphaned TensorFlow ROCm Python 3.12 entries in manifests/base/params-latest.env during PR #2103 review. The params file references odh-workbench-jupyter-tensorflow-rocm-py312-ubi9 and odh-pipeline-runtime-tensorflow-rocm-py312-ubi9 images with 2025a-v1.35 tags, but the corresponding source directories (jupyter/rocm/tensorflow/ubi9-python-3.12/ and runtimes/rocm-tensorflow/ubi9-python-3.12/) don't exist in the repository, creating a broken build situation where Tekton pipelines exist but have no source to build from.
Applied to files:
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
📚 Learning: 2025-08-29T15:18:43.229Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:110-110
Timestamp: 2025-08-29T15:18:43.229Z
Learning: The file `runtimes/datascience/ubi9-python-3.12/pylock.toml` is a generated file and should not be edited directly. Changes should be made to the source pyproject.toml file instead.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
runtimes/pytorch/ubi9-python-3.12/pyproject.toml
runtimes/pytorch/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-12T08:38:43.685Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2432
File: jupyter/datascience/ubi9-python-3.12/pylock.toml:105-111
Timestamp: 2025-09-12T08:38:43.685Z
Learning: The pylock.toml file in jupyter/datascience/ubi9-python-3.12/ is autogenerated by uv and should not be manually edited. Changes should be made to the source files like pyproject.toml instead.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-02T07:55:31.824Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2233
File: tests/test_main.py:46-56
Timestamp: 2025-09-02T07:55:31.824Z
Learning: jiridanek deferred a suggestion for enhanced pyproject.toml/pylock.toml validation in PR #2233 tests/test_main.py. The current test implementation works fine, but future enhancements may need: name normalization using canonicalize_name() to handle case/separator mismatches, environment markers evaluation to skip dependencies not applicable to the image's Python/OS, and handling non-PEP440 versions from VCS/URL dependencies by checking if locked_version exists before asserting specifier.contains(). The deferred implementation includes evaluating markers against image environment (_env = {"python_version": python, "sys_platform": "linux", "platform_system": "Linux"}) and using tomllib.load() with binary mode for consistency.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-05T10:05:35.575Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1513
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:180-180
Timestamp: 2025-09-05T10:05:35.575Z
Learning: In Python lock files for the datascience runtime, both bcrypt and paramiko packages are excluded from s390x platform using the marker "platform_machine != 's390x'" due to compatibility issues on IBM System z mainframe architecture.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-07-20T20:47:36.509Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1396
File: runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda:124-127
Timestamp: 2025-07-20T20:47:36.509Z
Learning: jiridanek identified that ARM64 wheels for h5py 3.14.0 are available on PyPI but being ignored due to AMD64-only dependency locking with --platform=linux/amd64. This causes unnecessary hdf5-devel package installation in ARM64 TensorFlow images when the ARM64 wheel h5py-3.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl could be used instead. The Pipfile.lock only contains 2 hashes for h5py, confirming limited platform consideration during lock generation.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-08-29T15:17:41.893Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:10-10
Timestamp: 2025-08-29T15:17:41.893Z
Learning: Files named pylock.toml in the notebooks repository are autogenerated by uv from pyproject.toml files and should not be directly modified. Changes should be made to the source pyproject.toml file instead.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
- GitHub Check: build (runtime-cuda-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
- GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
- GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request
- GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-cuda-py312-ubi9-on-pull-request
- GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
🔇 Additional comments (9)
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1)
`36-40`: Build readiness check for the aarch64 CUDA image. Since this is a CUDA runtime, double‑check that the Dockerfile and build args aren't hardcoding x86_64-only artifacts (e.g., NVARCH, MSSQL repo files, OpenShift client arch mapping). This prevents red arm64 pipeline builds.
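The arch-naming mismatch behind several of these hardcoding bugs is that Go/OCI-style names (`amd64`, `arm64`, as in BuildKit's `TARGETARCH`) differ from `uname -m` kernel-style names (`x86_64`, `aarch64`), and download mirrors may expect either. A tiny illustrative mapping (a sketch, not code from this repository):

```python
# BuildKit's TARGETARCH uses Go/OCI names; `uname -m` and many mirrors use
# kernel names. Hardcoding either convention breaks the other architecture.

GO_TO_UNAME = {"amd64": "x86_64", "arm64": "aarch64"}
UNAME_TO_GO = {v: k for k, v in GO_TO_UNAME.items()}

def uname_arch(targetarch: str) -> str:
    """Map a BuildKit TARGETARCH value to a `uname -m` style identifier."""
    return GO_TO_UNAME[targetarch]

def go_arch(machine: str) -> str:
    """Map a `uname -m` value to a Go/OCI style identifier."""
    return UNAME_TO_GO[machine]

print(uname_arch("arm64"))   # aarch64
print(go_arch("x86_64"))     # amd64
```

The same two-entry table is what a `TARGETARCH`-based Dockerfile fix would encode, typically as a shell `case` statement.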
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1)
`36-40`: Same platform identifier concern: ensure Konflux and generator alignment. Mirrors the runtime pipeline. Please verify `linux-m2xlarge/arm64` is the intended value and that the autogenerator emits it; otherwise this edit will be lost on regeneration. If your cluster expects `linux/arm64`, switch as below.

```diff
   - name: build-platforms
     value:
       - linux/x86_64
-      - linux-m2xlarge/arm64
+      - linux/arm64
```

You can reuse the script from the other comment to scan conventions and the generator.
jupyter/datascience/ubi9-python-3.12/pylock.toml (2)
`2778-2784`: No action required — dependents already exclude pydantic-core on s390x.
Verified by running the provided script against jupyter/datascience/ubi9-python-3.12/pylock.toml; output: OK: all dependents carry s390x marker.
`3804-3810`:

```shell
#!/bin/bash
set -euo pipefail

echo "PWD: $(pwd)"
echo

# Find pylock.toml files and inspect occurrences of typing-inspection
# without using process substitution
echo "Searching for pylock.toml files..."
found=0
for f in $(find . -type f -name 'pylock.toml' 2>/dev/null); do
  found=1
  echo
  echo "=== $f ==="
  if command -v rg >/dev/null 2>&1; then
    rg -n --hidden -S "typing-inspection|typing_inspection" "$f" -C4 || true
  else
    grep -nE "typing-inspection|typing_inspection" "$f" || true
  fi

  echo "--- parsed block(s) that mention typing-inspection ---"
  python3 - "$f" <<'PY'
import sys, re
fname = sys.argv[1]
txt = open(fname, 'rb').read().decode('utf-8', errors='replace')
blocks = txt.split("\n[[packages]]\n")
for b in blocks:
    if 'typing-inspection' in b or 'typing_inspection' in b:
        m = re.search(r'name\s*=\s*"([^"]+)"', b)
        name = m.group(1) if m else ''
        print("Block package:", name)
        for line in b.splitlines():
            if any(k in line for k in ('marker', 'requires', 'dependencies', 'version', 'name')):
                print("  " + line.strip())
        for line in b.splitlines():
            if 'typing-inspection' in line or 'typing_inspection' in line:
                print("  ->", line.strip())
        print("--")
PY
done

if [ $found -eq 0 ]; then
  echo "No pylock.toml files found."
  exit 0
fi

echo
echo "Collecting package names whose pylock blocks reference typing-inspection (direct dependents)..."
dependents=$(python3 - <<'PY'
import re, glob
deps = set()
for fname in glob.glob("**/pylock.toml", recursive=True):
    txt = open(fname, 'rb').read().decode('utf-8', errors='replace')
    blocks = txt.split("\n[[packages]]\n")
    for b in blocks:
        m = re.search(r'name\s*=\s*"([^"]+)"', b)
        name = m.group(1) if m else None
        if not name:
            continue
        # skip the typing-inspection package itself
        if name in ("typing-inspection", "typing_inspection"):
            continue
        # if typing-inspection appears anywhere in this block (dependencies/requires)
        if 'typing-inspection' in b or 'typing_inspection' in b:
            deps.add(name)
print("\n".join(sorted(deps)))
PY
)

if [ -z "$dependents" ]; then
  echo "No dependent packages found in pylock files."
else
  echo "Dependents found:"
  printf "%s\n" "$dependents"
  echo
  echo "Searching for those dependents in top-level files (pyproject.toml / Pipfile / requirements*):"
  IFS=$'\n'
  for name in $dependents; do
    echo
    echo "== Top-level search for: $name =="
    if command -v rg >/dev/null 2>&1; then
      rg -n --hidden -S --glob '**/pyproject.toml' --glob '**/Pipfile' --glob '**/requirements*' "\b${name}\b" -C3 || true
    else
      grep -RIn --include=pyproject.toml --include=Pipfile --include="requirements*" "\b${name}\b" || true
    fi
  done
fi

echo
echo "Repository-wide occurrences of 'platform_machine' and 's390x':"
if command -v rg >/dev/null 2>&1; then
  rg -n --hidden -S "platform_machine" -C2 || true
  rg -n --hidden -S "s390x" -C2 || true
else
  grep -RIn "platform_machine" || true
  grep -RIn "s390x" || true
fi

echo
echo "Finished checks."
```

runtimes/pytorch/ubi9-python-3.12/pyproject.toml (2)

`69-74`: Approve — ARM64 wheels for torch 2.6.0 and torchvision 0.21.0 confirmed. curl output shows torch-2.6.0+cpu (cp311/cp312 manylinux_2_28_aarch64) and torchvision-0.21.0 (cp311/cp312 linux_aarch64) wheels present on the CPU index; the index addition is fine.

`9-11`: Blocker: the torch pin will break aarch64 resolution — split it per-arch like torchvision. An unconditional `torch==2.6.0+cu126` forces the CUDA index on ARM and will fail (there are no aarch64 cu126 wheels). Mirror the per-arch approach you used for torchvision.

```diff
-    'torch==2.6.0+cu126',
+    'torch==2.6.0+cu126; platform_machine == "x86_64"',
+    'torch==2.6.0; platform_machine == "aarch64"',
```
⛔ Skipped due to learnings
Learnt from: jiridanek PR: opendatahub-io/notebooks#2145 File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159 Timestamp: 2025-08-27T14:49:24.112Z Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue #2148 was created addressing the mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. The issue includes comprehensive problem description covering affected files (runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda and pylock.toml), detailed solution with PyTorch index URL update from cu124 to cu126, lock regeneration steps using uv, clear acceptance criteria for wheel alignment verification, and proper context linking to PR #2145 review comment, assigned to jiridanek.
Learnt from: jiridanek PR: opendatahub-io/notebooks#0 File: :0-0 Timestamp: 2025-08-27T19:02:33.353Z Learning: Version alignment requested: Align TensorBoard pins across CUDA and ROCm PyTorch images (ROCm currently ~=2.18.0 vs CUDA ~=2.19.0) to reduce duplicate wheels and lock bloat; create dedicated GitHub issue with evidence and acceptance criteria.
Learnt from: jiridanek PR: opendatahub-io/notebooks#2145 File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159 Timestamp: 2025-08-27T14:49:24.112Z Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue addresses mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. Comprehensive issue created with detailed problem description covering affected files, current state analysis, step-by-step solution including PyTorch index URL update and lock regeneration, clear acceptance criteria for cu126 wheel verification, and proper context linking to PR #2145 review comment.
Learnt from: jiridanek PR: opendatahub-io/notebooks#0 File: :0-0 Timestamp: 2025-08-27T15:18:02.768Z Learning: jiridanek requested GitHub issue creation for PyTorch CUDA variant explicit pinning investigation during PR #2145 review, specifically to explore when +cu124 suffixes are desirable versus index-only configuration. Issue #2156 was created with comprehensive investigation framework covering wheel selection behavior, deployment scenarios, risk analysis, repository pattern analysis, and clear acceptance criteria for establishing best practices across all PyTorch-enabled notebook environments.
runtimes/pytorch/ubi9-python-3.12/pylock.toml (3)

`2621-2651`: PyNaCl 1.6.0 bump LGTM. Wide wheel coverage (manylinux/musllinux, aarch64/x86_64, ABI3) is good for multi-arch builds. No concerns.

`279-284`:
```shell
#!/bin/bash
set -euo pipefail
echo "PWD: $(pwd)"
echo
echo "List runtimes/pytorch tree:"
ls -la runtimes/pytorch || true
echo
echo "Find all pylock.toml files:"
rg -n --hidden --files --glob '**/pylock.toml' || true
echo
TARGET='runtimes/pytorch/ubi9-python-3.12/pylock.toml'
if [ -f "$TARGET" ]; then
  echo "$TARGET exists: size=$(stat -c%s "$TARGET") bytes"
  echo
  echo "Search for 'cffi' occurrences in the target file with context:"
  rg -n --hidden -S 'cffi' "$TARGET" -C5 || true
  echo
  LINE=$(rg -n 'cffi' "$TARGET" | head -n1 | cut -d: -f1 || true)
  if [ -n "$LINE" ]; then
    START=$(( LINE > 20 ? LINE - 20 : 1 ))
    echo "Printing lines ${START} to $((LINE + 40)) from $TARGET:"
    sed -n "${START},$((LINE + 40))p" "$TARGET"
  else
    echo "No 'cffi' in $TARGET; printing first 120 lines:"
    sed -n '1,120p' "$TARGET"
  fi
else
  echo "$TARGET not found"
fi
echo
echo "Search whole repo for explicit 'name = \"cffi\"' occurrences and wheel urls mentioning aarch64/arm64:"
rg -n --hidden -S 'name\s*=\s*"cffi"' -g '!**/node_modules/**' -C3 || true
rg -n --hidden -S 'aarch64|arm64|manylinux_aarch64|macosx_arm64|linux_aarch64' -g '**/pylock.toml' -C3 || true
echo
echo "Check runtimes pyproject for cffi mention (if present):"
if [ -f runtimes/pytorch/ubi9-python-3.12/pyproject.toml ]; then
  rg -n --hidden -S 'cffi' runtimes/pytorch/ubi9-python-3.12/pyproject.toml || true
else
  echo "pyproject.toml not found at runtimes/pytorch/ubi9-python-3.12/pyproject.toml"
fi
```

---

`2387-2391`: **Overly complex marker; simplify and avoid manual edits.**

The marker mixes redundant checks and incorrectly compares `implementation_name` to `'PyPy'` (`sys.implementation.name` is lower-case, e.g. `'pypy'`; `platform_python_implementation` returns `'PyPy'`). ([peps.python.org](https://peps.python.org/pep-0421/)) Let the lock tool re-lock; if you must edit manually, prefer either `platform_python_implementation != 'PyPy'` or `implementation_name != 'pypy'` to match the PEP semantics and reduce future diffs.

> Likely an incorrect or invalid review comment.

</blockquote></details> </blockquote></details> </details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->
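To see the casing pitfall concretely, the markers can be evaluated with `packaging.markers`. The environment dicts below are illustrative stand-ins for real interpreters, not values read from this repository:

```python
from packaging.markers import Marker

# Hypothetical environments showing the PEP 508/PEP 421 spellings: note the
# casing difference between the two implementation fields.
cpython_env = {"implementation_name": "cpython", "platform_python_implementation": "CPython"}
pypy_env = {"implementation_name": "pypy", "platform_python_implementation": "PyPy"}

# Comparing implementation_name against capitalized 'PyPy' never matches,
# so this marker is effectively always true, even on PyPy.
bad = Marker("implementation_name != 'PyPy'")
print(bad.evaluate(pypy_env))  # True (the intended exclusion silently fails)

# Either of these expresses the intent correctly:
good_a = Marker("implementation_name != 'pypy'")
good_b = Marker("platform_python_implementation != 'PyPy'")
print(good_a.evaluate(pypy_env), good_b.evaluate(pypy_env))        # False False
print(good_a.evaluate(cpython_env), good_b.evaluate(cpython_env))  # True True
```

`Marker.evaluate()` merges the supplied dict over the defaults for the running interpreter, so partial environments like these are enough for the check.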
- name: build-platforms | ||
value: | ||
- linux/x86_64 | ||
- linux-m2xlarge/arm64 | ||
- name: dockerfile |
💡 Verification agent
🧩 Analysis chain
Confirm platform key is valid in Konflux and in the generator (avoid drift).
The new entry uses `linux-m2xlarge/arm64`. Past convention here is `linux/x86_64` (not `linux/amd64`). Please verify that Konflux accepts `linux-m2xlarge/arm64` (vs a possible `linux/arm64`) and that `scripts/generate_pull_request_pipelineruns.py` has been updated so this autogen file won't be overwritten back to single-arch.

If your Konflux expects plain `linux/arm64`, apply:
```diff
 - name: build-platforms
   value:
   - linux/x86_64
-  - linux-m2xlarge/arm64
+  - linux/arm64
```
Run to confirm repo-wide conventions and generator support:
🏁 Script executed:
```shell
#!/usr/bin/env bash
set -euo pipefail
echo "Occurrences of linux-m2xlarge/arm64:"
rg -n "linux-m2xlarge/arm64" -g ".tekton/**" -S || true
echo
echo "Other arm64 platform spellings in pipelines:"
rg -n "linux.?/arm64" -g ".tekton/**" -S | sed 's/^/  /' || true
echo
echo "Check generator declares/validates build-platforms including arm64:"
rg -n "build-platforms|platforms|arm64|x86_64" scripts/generate_pull_request_pipelineruns.py -S -C3 || true
```
Length of output: 2487
Replace linux-m2xlarge/arm64 with linux/arm64 — generator requires linux/arm64
scripts/generate_pull_request_pipelineruns.py defines allowed build_platforms (includes "linux/arm64") and populates the build-platforms param from that list; "linux-m2xlarge/arm64" will not be recognized and may be overwritten.
File: .tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (lines 36–40)
```diff
 - name: build-platforms
   value:
   - linux/x86_64
-  - linux-m2xlarge/arm64
+  - linux/arm64
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 - name: build-platforms
   value:
   - linux/x86_64
-  - linux-m2xlarge/arm64
+  - linux/arm64
 - name: dockerfile
```
🤖 Prompt for AI Agents
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
lines 36-40: the build-platforms list contains "linux-m2xlarge/arm64" which is
not in the generator's allowed platforms and will be ignored/overwritten; change
that entry to "linux/arm64" so it matches
scripts/generate_pull_request_pipelineruns.py allowed build_platforms and the
generator can populate the param correctly.
```toml
[[packages]]
name = "annotated-types"
version = "0.7.0"
marker = "platform_machine != 's390x'"
sdist = { url = "https://files.pythonhosted.org/packages/ee/67/531ea369ba64dcff5ec9c3402f9f51bf748cec26dde048a2f973a4eea7f5/annotated_types-0.7.0.tar.gz", upload-time = 2024-05-20T21:33:25Z, size = 16081, hashes = { sha256 = "aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89" } }
wheels = [{ url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", upload-time = 2024-05-20T21:33:24Z, size = 13643, hashes = { sha256 = "1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53" } }]
```
🛠️ Refactor suggestion
Do not hand-edit pylock.toml; move the s390x marker to pyproject.toml and re-lock.
pylock.toml is uv-generated. Manual edits will be overwritten and break determinism. Put the platform marker in the source pyproject.toml (on the driving dependency) and regenerate the lock.
Apply this diff here (revert the manual edit):
```diff
-marker = "platform_machine != 's390x'"
```
Then update the corresponding pyproject.toml, for example:
```toml
# in jupyter/datascience/ubi9-python-3.12/pyproject.toml
[project]
dependencies = [
    # If pydantic drives annotated-types:
    "pydantic>=2,<3; platform_machine != 's390x'",
]
```
Regenerate:
- Re-run the lock, e.g. `uv lock --python 3.12`, then re-export the repo's `pylock.toml`
- Commit the refreshed pylock.toml
I can draft the exact pyproject.toml changes once you confirm which top-level deps pull in annotated-types here.
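For reference, a PEP 508 requirement string with a marker like the hypothetical pydantic pin above can be sanity-checked with `packaging.requirements`:

```python
from packaging.requirements import Requirement

# Hypothetical pin mirroring the suggestion above; not taken from the repo.
req = Requirement("pydantic>=2,<3; platform_machine != 's390x'")

print(req.name)                                             # pydantic
print(req.marker.evaluate({"platform_machine": "x86_64"}))  # True
print(req.marker.evaluate({"platform_machine": "s390x"}))   # False
```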
```toml
name = "torchvision"
version = "0.21.0+cu126"
marker = "platform_machine == 'x86_64'"
wheels = [
    { url = "https://download.pytorch.org/whl/cu126/torchvision-0.21.0%2Bcu126-cp310-cp310-linux_x86_64.whl", hashes = { sha256 = "db4369a89b866b319c8dd73931c3e5f314aa535f7035ae2336ce9a26d7ace15a" } },
    { url = "https://download.pytorch.org/whl/cu126/torchvision-0.21.0%2Bcu126-cp310-cp310-win_amd64.whl", hashes = { sha256 = "d6b23af252e8f4fc923d57efeab5aad7a33b6e15a72a119d576aa48ec1e0d924" } },
```
💡 Verification agent
🧩 Analysis chain
Ensure a matching torchvision CPU entry exists for aarch64, otherwise arm64 builds will fail.
You added an x86_64-only `torchvision==0.21.0+cu126` with a marker. Verify the lock also contains a second `[[packages]] name = "torchvision"` block for aarch64 (CPU wheel, likely `0.21.0` without CUDA) with `marker = "platform_machine == 'aarch64'"`, aligned with the per-arch indices in pyproject.toml. If missing, re-run a multi-platform lock.

If you need to re-lock:
- Ensure pyproject has:
  - `torchvision==0.21.0+cu126; platform_machine == "x86_64"` with `index = pytorch-cuda`
  - `torchvision==0.21.0; platform_machine == "aarch64"` with `index = pytorch-cpu`
- Then regenerate the lock. Note that `uv lock` resolves for all platforms by default and takes no `--platform` flag; re-run `uv lock` and re-export `pylock.toml`.

If desired, I can open a tracking issue and push the regenerated lock.
🏁 Script executed:
```shell
#!/bin/bash
# Expect TWO torchvision blocks with distinct markers
rg -nPU '(?s)\[\[packages\]\]\s+name\s*=\s*"torchvision".*?marker\s*=\s*"platform_machine == '\''x86_64'\''"' runtimes/pytorch/ubi9-python-3.12/pylock.toml >/dev/null && echo "x86_64 torchvision: OK"
rg -nPU '(?s)\[\[packages\]\]\s+name\s*=\s*"torchvision".*?marker\s*=\s*"platform_machine == '\''aarch64'\''"' runtimes/pytorch/ubi9-python-3.12/pylock.toml >/dev/null && echo "aarch64 torchvision: OK"
# Optional: confirm aarch64 entry is CPU (no +cu tag)
rg -nPU '(?s)\[\[packages\]\]\s+name\s*=\s*"torchvision".*?marker\s*=\s*"platform_machine == '\''aarch64'\''".*?\n\s*version\s*=\s*"(0\.21\.0)(?!\+cu)' runtimes/pytorch/ubi9-python-3.12/pylock.toml && echo "aarch64 is CPU wheel"
```
Length of output: 107
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
file="runtimes/pytorch/ubi9-python-3.12/pylock.toml"
if [ ! -f "$file" ]; then
  echo "MISSING: $file"
  exit 0
fi
# Note: the three-argument match() below requires gawk.
awk '
/^\[\[packages\]\]/ {
  if (block_name == "torchvision") {
    if (block_marker ~ /platform_machine == '\''x86_64'\''/) found_x=1
    if (block_marker ~ /platform_machine == '\''aarch64'\''/) {
      found_a=1
      if (block_version ~ /\+cu/) aarch64_pluscu=1
      else aarch64_cpu=1
    }
  }
  block_name=""; block_marker=""; block_version=""; next
}
{
  if ($0 ~ /^\s*name\s*=/) {
    if (match($0, /"([^"]+)"/, m)) block_name=m[1]
  } else if ($0 ~ /^\s*marker\s*=/) {
    if (match($0, /"([^"]+)"/, m)) block_marker=m[1]
  } else if ($0 ~ /^\s*version\s*=/) {
    if (match($0, /"([^"]+)"/, m)) block_version=m[1]
  }
}
END {
  if (block_name == "torchvision") {
    if (block_marker ~ /platform_machine == '\''x86_64'\''/) found_x=1
    if (block_marker ~ /platform_machine == '\''aarch64'\''/) {
      found_a=1
      if (block_version ~ /\+cu/) aarch64_pluscu=1
      else aarch64_cpu=1
    }
  }
  if (found_x) print "x86_64 torchvision: OK"
  if (found_a) {
    if (aarch64_cpu) print "aarch64 torchvision: OK (CPU wheel)"
    if (aarch64_pluscu) print "aarch64 torchvision: WARNING (CUDA wheel present)"
  }
  if (!found_x && !found_a) print "no torchvision blocks found"
}
' "$file"
```
Length of output: 148
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
file="runtimes/pytorch/ubi9-python-3.12/pylock.toml"
if [ ! -f "$file" ]; then
  echo "MISSING: $file"
  exit 0
fi
python3 - <<'PY'
import re
f = "runtimes/pytorch/ubi9-python-3.12/pylock.toml"
blocks = []
curr = None
with open(f, 'r', encoding='utf-8') as fh:
    for line in fh:
        if line.strip().startswith('[[packages]]'):
            if curr: blocks.append(curr)
            curr = {'name': None, 'version': None, 'marker': None}
            continue
        if curr is None:
            continue
        m = re.match(r'\s*name\s*=\s*"([^"]+)"', line)
        if m: curr['name'] = m.group(1)
        m = re.match(r'\s*version\s*=\s*"([^"]+)"', line)
        if m: curr['version'] = m.group(1)
        m = re.match(r'\s*marker\s*=\s*"([^"]+)"', line)
        if m: curr['marker'] = m.group(1)
if curr: blocks.append(curr)
found = False
for b in blocks:
    if b.get('name') == 'torchvision':
        found = True
        marker = b.get('marker') or '<no marker>'
        version = b.get('version') or '<no version>'
        print(f"torchvision -> marker: {marker} ; version: {version}")
if not found:
    print("NO_TORCHVISION_BLOCKS_FOUND")
PY
```
Length of output: 75
Add missing aarch64 (CPU) torchvision entry and re-lock
runtimes/pytorch/ubi9-python-3.12/pylock.toml currently only contains an x86_64 torchvision (0.21.0+cu126); add a second `[[packages]]` block for `platform_machine == 'aarch64'` with version 0.21.0 (CPU wheel) and regenerate the lock (`uv lock` resolves all platforms by default; re-run it and re-export `pylock.toml`).
🤖 Prompt for AI Agents
In runtimes/pytorch/ubi9-python-3.12/pylock.toml around lines 3302-3307 the file
only defines an x86_64 torchvision package (0.21.0+cu126); add a second
[[packages]] block for platform_machine == 'aarch64' with name "torchvision",
version "0.21.0" and the CPU wheel URL and its sha256 hash (the CPU wheel target
for cp310/linux_aarch64), then regenerate the lockfile for both architectures
(uv lock resolves all platforms by default; re-run it and re-export pylock.toml)
so the new aarch64 entry is populated and hashes/resolved dependencies are updated.
```toml
torchvision = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }]
```
🛠️ Refactor suggestion
Also split the uv source for torch per-arch to avoid using the CUDA index on ARM. Without this, uv will still try `pytorch-cuda` for torch on aarch64.
```diff
-[tool.uv.sources]
-torch = { index = "pytorch-cuda" }
+[tool.uv.sources]
+torch = [
+    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
+    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }
+]
 torchvision = [
     { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
     { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }]
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```toml
[tool.uv.sources]
torch = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }
]
torchvision = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }]
```
@jiridanek: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
@jiridanek i guess we can close this now, right? |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Description
How Has This Been Tested?
Self checklist (all need to be checked):
- Run `make test` (`gmake` on macOS) before asking for review
- Changes to `Dockerfile.konflux` files should be done in `odh/notebooks` and automatically synced to `rhds/notebooks`. For Konflux-specific changes, modify `Dockerfile.konflux` files directly in `rhds/notebooks` as these require special attention in the downstream repository and flow to the upcoming RHOAI release.

Merge criteria:
Summary by CodeRabbit
New Features
Chores