arm64(cuda): try building remaining non-ROCm images for arm64 #2439
base: main
Conversation
Walkthrough

Adds arm64 build-platforms to two Tekton PipelineRun YAMLs. Updates jupyter and PyTorch runtime pylock.toml constraints (markers), bumps pynacl to 1.6.0 with new wheels, adds a torchvision x86_64 marker, and splits the torchvision dependency and sources per architecture in pyproject.toml, adding a new CPU UV index.
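The per-architecture dependency/source split described in the walkthrough can be sketched roughly as below. This is a hypothetical pyproject.toml fragment, not copied from the PR; the index names are illustrative, and the syntax follows uv's documented `[[tool.uv.index]]` and `[tool.uv.sources]` features.

```toml
[project]
name = "example-runtime"
version = "0.1.0"
dependencies = [
    # x86_64 keeps the CUDA build; aarch64 falls back to the CPU build
    'torchvision==0.21.0+cu126; platform_machine == "x86_64"',
    'torchvision==0.21.0; platform_machine == "aarch64"',
]

# explicit = true keeps uv from consulting these indexes for other packages
[[tool.uv.index]]
name = "pytorch-cuda"
url = "https://download.pytorch.org/whl/cu126"
explicit = true

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

# Route torchvision to a different index per architecture via markers
[tool.uv.sources]
torchvision = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" },
]
```

With `explicit = true`, uv only uses an index for packages explicitly routed to it in `tool.uv.sources`, which keeps the CPU and CUDA indexes from leaking into unrelated resolutions.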
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks (2 passed, 1 warning)
- ❌ Failed checks: 1 warning
- ✅ Passed checks: 2 passed
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ide-developer. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files.
Actionable comments posted: 4
♻️ Duplicate comments (1)
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1)
`14-16`: Noisy trigger paths still present (tracked separately). The on‑cel‑expression still lists unrelated Minimal/DataScience paths. This was already tracked for cleanup in a prior issue; just noting it remains unchanged here.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)

- .tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1 hunk)
- .tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1 hunk)
- jupyter/datascience/ubi9-python-3.12/pylock.toml (3 hunks)
- runtimes/pytorch/ubi9-python-3.12/pylock.toml (4 hunks)
- runtimes/pytorch/ubi9-python-3.12/pyproject.toml (2 hunks)
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-08-05T17:24:08.616Z
Learning: jiridanek requested PR review for #1521 covering s390x architecture support improvements, demonstrating continued focus on systematic multi-architecture compatibility enhancements in the opendatahub-io/notebooks repository through clean implementation with centralized configuration, proper CI integration, and architecture-aware testing patterns.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1320
File: jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda:42-52
Timestamp: 2025-07-09T12:29:56.162Z
Learning: jiridanek requested GitHub issue creation for OpenShift client architecture mapping problem affecting 29 Dockerfiles during PR #1320 review. Issue was created with comprehensive analysis covering all affected files using $(uname -m) returning 'aarch64' but OpenShift mirror expecting 'arm64', systematic solution using BuildKit TARGETARCH mapping with proper amd64→x86_64 and arm64→arm64 conversion, detailed acceptance criteria, and implementation guidance, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm:50-50
Timestamp: 2025-07-08T19:30:01.738Z
Learning: jiridanek requested GitHub issue creation for multi-architecture support in ROCm TensorFlow image during PR #1333 review. Issue #1346 was created with comprehensive problem description covering hardcoded x86_64 architecture breaking multi-arch support, detailed impact analysis, three solution options (runtime detection, BuildKit TARGETARCH integration, hybrid approach) with pros/cons analysis, comprehensive acceptance criteria covering core requirements and testing, phased implementation guidance, related files identification, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml:5-15
Timestamp: 2025-07-02T18:59:15.788Z
Learning: jiridanek creates targeted GitHub issues for specific test quality improvements identified during PR reviews in opendatahub-io/notebooks. Issue #1268 demonstrates this by converting a review comment about insufficient tf2onnx conversion test validation into a comprehensive improvement plan with clear acceptance criteria, code examples, and ROCm-specific context.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1909
File: runtimes/pytorch+llmcompressor/ubi9-python-3.11/Dockerfile.cuda:11-15
Timestamp: 2025-08-12T08:40:55.286Z
Learning: jiridanek requested GitHub issue creation for redundant CUDA upgrade optimization during PR #1909 review. Analysis revealed all 14 CUDA Dockerfiles contain redundant `yum upgrade -y` commands in cuda-base stages that execute after base stages already performed comprehensive `dnf upgrade` via pre-upgrade blocks, causing unnecessary CI latency and build inefficiency. Issue includes complete scope analysis with specific line numbers, investigation framework requiring NVIDIA upstream documentation review, multiple solution options, comprehensive acceptance criteria covering systematic testing and performance measurement, and proper context linking to PR #1909 review comment.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1909
File: runtimes/pytorch+llmcompressor/ubi9-python-3.11/Dockerfile.cuda:11-15
Timestamp: 2025-08-12T08:40:55.286Z
Learning: jiridanek requested GitHub issue creation for redundant CUDA upgrade optimization during PR #1909 review. Issue covers duplicate yum/dnf upgrade commands in cuda-base stages that execute after base stages already performed comprehensive upgrades, causing unnecessary CI latency and build inefficiency across multiple CUDA Dockerfiles. The solution requires investigating NVIDIA upstream documentation requirements before removing redundant upgrades, with systematic testing of all CUDA variants and performance measurement. Issue follows established pattern of comprehensive problem analysis, multiple solution options, detailed acceptance criteria, and proper context linking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1995
File: jupyter/rocm/tensorflow/ubi9-python-3.11/Dockerfile.rocm:0-0
Timestamp: 2025-08-19T06:31:05.424Z
Learning: jiridanek requested GitHub issue creation for ROCm device detection test during PR #1995 review. The issue covers comprehensive test requirements for validating that device_lib.list_local_devices() successfully detects ROCm/HIP devices after ROCm package optimizations, including multiple solution options (pytest, shell script, notebook-based), detailed acceptance criteria for core requirements and test coverage, implementation guidance for test framework selection and CI integration, example test code for TensorFlow device detection validation, and proper context linking to PR #1995 and review comment, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda:38-38
Timestamp: 2025-07-08T19:30:20.513Z
Learning: jiridanek requested GitHub issue creation for multi-architecture support in TensorFlow CUDA runtime image during PR #1333 review. Issue was created with comprehensive problem description covering hardcoded NVARCH limitation, multiple solution options using TARGETARCH build argument with architecture mapping, acceptance criteria for multi-architecture builds, implementation guidance with code examples, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1320
File: jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda:65-66
Timestamp: 2025-07-09T12:31:02.033Z
Learning: jiridanek requested GitHub issue creation for MSSQL repo file hardcoding problem during PR #1320 review. Issue #1363 was created and updated with comprehensive problem description covering hardcoded x86_64 MSSQL repo files breaking multi-architecture builds across 10 affected Dockerfiles (including datascience, CUDA, ROCm, and TrustyAI variants), detailed root cause analysis, three solution options with code examples, clear acceptance criteria for all image types, implementation guidance following established multi-architecture patterns, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/rocm-tensorflow/ubi9-python-3.12/pyproject.toml:75-77
Timestamp: 2025-08-27T15:33:28.871Z
Learning: jiridanek requested GitHub issue creation for multi-architecture environment exploration during PR #2145 review, specifically to investigate architecture-specific environments for ROCm TensorFlow wheel optimization across x86_64, aarch64, ppc64le, and s390x architectures, noting uncertainty about implementation benefits but wanting systematic exploration through issue tracking. Issue #2158 was created with comprehensive analysis covering current limitations, investigation areas, multiple solution options, and clear acceptance criteria.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2180
File: .tekton/odh-base-image-cuda-py312-ubi9-pull-request.yaml:36-39
Timestamp: 2025-08-28T12:42:23.404Z
Learning: Konflux build system in opendatahub-io/notebooks repository requires "linux/x86_64" platform identifier format for Tekton PipelineRun build-platforms parameter, not the OCI-standard "linux/amd64". This is consistently used across all .tekton/ pipeline files and enforced by scripts/generate_pull_request_pipelineruns.py type definitions. Docker/buildah contexts use "linux/amd64" while Konflux/Tekton contexts use "linux/x86_64".
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1396
File: runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda:124-127
Timestamp: 2025-07-20T20:47:36.509Z
Learning: jiridanek identified that ARM64 wheels for h5py 3.14.0 are available on PyPI but being ignored due to AMD64-only dependency locking with --platform=linux/amd64. This causes unnecessary hdf5-devel package installation in ARM64 TensorFlow images when the ARM64 wheel h5py-3.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl could be used instead. The Pipfile.lock only contains 2 hashes for h5py, confirming limited platform consideration during lock generation.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue #2148 was created addressing the mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. The issue includes comprehensive problem description covering affected files (runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda and pylock.toml), detailed solution with PyTorch index URL update from cu124 to cu126, lock regeneration steps using uv, clear acceptance criteria for wheel alignment verification, and proper context linking to PR #2145 review comment, assigned to jiridanek.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue addresses mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. Comprehensive issue created with detailed problem description covering affected files, current state analysis, step-by-step solution including PyTorch index URL update and lock regeneration, clear acceptance criteria for cu126 wheel verification, and proper context linking to PR #2145 review comment.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2000
File: runtimes/rocm-pytorch/ubi9-python-3.11/requirements.txt:807-809
Timestamp: 2025-08-20T12:39:54.827Z
Learning: Issue #2055 tracks the remaining runtimes/rocm-tensorflow/ubi9-python-3.12/requirements.txt file that stayed at jupyter-core==5.7.2 during PR #2000, which will be addressed separately due to specific challenges with that ROCm TensorFlow image.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:110-110
Timestamp: 2025-08-29T15:18:43.229Z
Learning: When uv compiles dependencies with platform restrictions (like `platform_machine != 'ppc64le'`), it automatically propagates these markers to transitive dependencies in the generated lock files. This is expected behavior and doesn't require manual intervention in source files.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2185
File: jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda:109-117
Timestamp: 2025-08-29T08:48:55.985Z
Learning: jiridanek prefers to implement systematic cleanup improvements through dedicated GitHub issues (like #2076) rather than applying individual point fixes during PR reviews, maintaining consistency with their established pattern of comprehensive code quality improvements.
📚 Learning: 2025-08-28T12:42:23.404Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2180
File: .tekton/odh-base-image-cuda-py312-ubi9-pull-request.yaml:36-39
Timestamp: 2025-08-28T12:42:23.404Z
Learning: Konflux build system in opendatahub-io/notebooks repository requires "linux/x86_64" platform identifier format for Tekton PipelineRun build-platforms parameter, not the OCI-standard "linux/amd64". This is consistently used across all .tekton/ pipeline files and enforced by scripts/generate_pull_request_pipelineruns.py type definitions. Docker/buildah contexts use "linux/amd64" while Konflux/Tekton contexts use "linux/x86_64".
Applied to files:
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
📚 Learning: 2025-08-27T14:49:24.112Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue #2148 was created addressing the mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. The issue includes comprehensive problem description covering affected files (runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda and pylock.toml), detailed solution with PyTorch index URL update from cu124 to cu126, lock regeneration steps using uv, clear acceptance criteria for wheel alignment verification, and proper context linking to PR #2145 review comment, assigned to jiridanek.
Applied to files:
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
runtimes/pytorch/ubi9-python-3.12/pyproject.toml
runtimes/pytorch/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-08-27T14:49:24.112Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue addresses mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. Comprehensive issue created with detailed problem description covering affected files, current state analysis, step-by-step solution including PyTorch index URL update and lock regeneration, clear acceptance criteria for cu126 wheel verification, and proper context linking to PR #2145 review comment.
Applied to files:
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
runtimes/pytorch/ubi9-python-3.12/pyproject.toml
runtimes/pytorch/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-05T12:10:28.916Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2265
File: .tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml:16-16
Timestamp: 2025-09-05T12:10:28.916Z
Learning: jiridanek requested GitHub issue creation for trigger path cleanup in pytorch+llmcompressor pipeline during PR #2265 review. The issue addresses copy-paste errors where irrelevant Minimal/DataScience trigger paths were included in the pytorch+llmcompressor pipeline on-cel-expression, causing unnecessary pipeline triggers. Solution involves removing unrelated paths and keeping only pytorch+llmcompressor-specific paths, build-args/cuda.conf, jupyter/utils, and the pipeline YAML itself, with comprehensive acceptance criteria and proper context linking.
Applied to files:
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
📚 Learning: 2025-09-12T09:51:55.382Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-09-12T09:51:55.382Z
Learning: jiridanek identified orphaned TensorFlow ROCm Python 3.12 entries in manifests/base/params-latest.env during PR #2103 review. The params file references odh-workbench-jupyter-tensorflow-rocm-py312-ubi9 and odh-pipeline-runtime-tensorflow-rocm-py312-ubi9 images with 2025a-v1.35 tags, but the corresponding source directories (jupyter/rocm/tensorflow/ubi9-python-3.12/ and runtimes/rocm-tensorflow/ubi9-python-3.12/) don't exist in the repository, creating a broken build situation where Tekton pipelines exist but have no source to build from.
Applied to files:
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
📚 Learning: 2025-08-29T15:18:43.229Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:110-110
Timestamp: 2025-08-29T15:18:43.229Z
Learning: The file `runtimes/datascience/ubi9-python-3.12/pylock.toml` is a generated file and should not be edited directly. Changes should be made to the source pyproject.toml file instead.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
runtimes/pytorch/ubi9-python-3.12/pyproject.toml
runtimes/pytorch/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-12T08:38:43.685Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2432
File: jupyter/datascience/ubi9-python-3.12/pylock.toml:105-111
Timestamp: 2025-09-12T08:38:43.685Z
Learning: The pylock.toml file in jupyter/datascience/ubi9-python-3.12/ is autogenerated by uv and should not be manually edited. Changes should be made to the source files like pyproject.toml instead.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-02T07:55:31.824Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2233
File: tests/test_main.py:46-56
Timestamp: 2025-09-02T07:55:31.824Z
Learning: jiridanek deferred a suggestion for enhanced pyproject.toml/pylock.toml validation in PR #2233 tests/test_main.py. The current test implementation works fine, but future enhancements may need: name normalization using canonicalize_name() to handle case/separator mismatches, environment markers evaluation to skip dependencies not applicable to the image's Python/OS, and handling non-PEP440 versions from VCS/URL dependencies by checking if locked_version exists before asserting specifier.contains(). The deferred implementation includes evaluating markers against image environment (_env = {"python_version": python, "sys_platform": "linux", "platform_system": "Linux"}) and using tomllib.load() with binary mode for consistency.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-09-05T10:05:35.575Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1513
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:180-180
Timestamp: 2025-09-05T10:05:35.575Z
Learning: In Python lock files for the datascience runtime, both bcrypt and paramiko packages are excluded from s390x platform using the marker "platform_machine != 's390x'" due to compatibility issues on IBM System z mainframe architecture.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-07-20T20:47:36.509Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1396
File: runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda:124-127
Timestamp: 2025-07-20T20:47:36.509Z
Learning: jiridanek identified that ARM64 wheels for h5py 3.14.0 are available on PyPI but being ignored due to AMD64-only dependency locking with --platform=linux/amd64. This causes unnecessary hdf5-devel package installation in ARM64 TensorFlow images when the ARM64 wheel h5py-3.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl could be used instead. The Pipfile.lock only contains 2 hashes for h5py, confirming limited platform consideration during lock generation.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
📚 Learning: 2025-08-29T15:17:41.893Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/pylock.toml:10-10
Timestamp: 2025-08-29T15:17:41.893Z
Learning: Files named pylock.toml in the notebooks repository are autogenerated by uv from pyproject.toml files and should not be directly modified. Changes should be made to the source pyproject.toml file instead.
Applied to files:
jupyter/datascience/ubi9-python-3.12/pylock.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
- GitHub Check: build (runtime-cuda-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
- GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
- GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request
- GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-cuda-py312-ubi9-on-pull-request
- GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
🔇 Additional comments (9)
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1)
`36-40`: Build readiness check for the aarch64 CUDA image. Since this is a CUDA runtime, double‑check that the Dockerfile and build args aren't hardcoding x86_64-only artifacts (e.g., NVARCH, MSSQL repo files, OpenShift client arch mapping). This prevents red arm64 pipeline builds.
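The arch-naming mismatch behind several of these hardcoding bugs is that Go/OCI-style names (`amd64`, `arm64`, as in BuildKit's `TARGETARCH`) differ from `uname -m` kernel-style names (`x86_64`, `aarch64`), and download mirrors may expect either. A tiny illustrative mapping (a sketch, not code from this repository):

```python
# BuildKit's TARGETARCH uses Go/OCI names; `uname -m` and many mirrors use
# kernel names. Hardcoding either convention breaks the other architecture.

GO_TO_UNAME = {"amd64": "x86_64", "arm64": "aarch64"}
UNAME_TO_GO = {v: k for k, v in GO_TO_UNAME.items()}

def uname_arch(targetarch: str) -> str:
    """Map a BuildKit TARGETARCH value to a `uname -m` style identifier."""
    return GO_TO_UNAME[targetarch]

def go_arch(machine: str) -> str:
    """Map a `uname -m` value to a Go/OCI style identifier."""
    return UNAME_TO_GO[machine]

print(uname_arch("arm64"))   # aarch64
print(go_arch("x86_64"))     # amd64
```

The same two-entry table is what a `TARGETARCH`-based Dockerfile fix would encode, typically as a shell `case` statement.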
.tekton/odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (1)
`36-40`: Same platform identifier concern: ensure Konflux and generator alignment. Mirrors the runtime pipeline. Please verify `linux-m2xlarge/arm64` is the intended value and that the autogenerator emits it; otherwise this edit will be lost on regeneration. If your cluster expects `linux/arm64`, switch as below.

```diff
   - name: build-platforms
     value:
       - linux/x86_64
-      - linux-m2xlarge/arm64
+      - linux/arm64
```

You can reuse the script from the other comment to scan conventions and the generator.
jupyter/datascience/ubi9-python-3.12/pylock.toml (2)
`2778-2784`: No action required — dependents already exclude pydantic-core on s390x.
Verified by running the provided script against jupyter/datascience/ubi9-python-3.12/pylock.toml; output: OK: all dependents carry s390x marker.
`3804-3810`:

```shell
#!/bin/bash
set -euo pipefail

echo "PWD: $(pwd)"
echo

# Find pylock.toml files and inspect occurrences of typing-inspection
# without using process substitution
echo "Searching for pylock.toml files..."
found=0
for f in $(find . -type f -name 'pylock.toml' 2>/dev/null); do
  found=1
  echo
  echo "=== $f ==="
  if command -v rg >/dev/null 2>&1; then
    rg -n --hidden -S "typing-inspection|typing_inspection" "$f" -C4 || true
  else
    grep -nE "typing-inspection|typing_inspection" "$f" || true
  fi

  echo "--- parsed block(s) that mention typing-inspection ---"
  python3 - "$f" <<'PY'
import sys, re
fname = sys.argv[1]
txt = open(fname, 'rb').read().decode('utf-8', errors='replace')
blocks = txt.split("\n[[packages]]\n")
for b in blocks:
    if 'typing-inspection' in b or 'typing_inspection' in b:
        m = re.search(r'name\s*=\s*"([^"]+)"', b)
        name = m.group(1) if m else ''
        print("Block package:", name)
        for line in b.splitlines():
            if any(k in line for k in ('marker', 'requires', 'dependencies', 'version', 'name')):
                print("  " + line.strip())
        for line in b.splitlines():
            if 'typing-inspection' in line or 'typing_inspection' in line:
                print("  ->", line.strip())
        print("--")
PY
done

if [ $found -eq 0 ]; then
  echo "No pylock.toml files found."
  exit 0
fi

echo
echo "Collecting package names whose pylock blocks reference typing-inspection (direct dependents)..."
dependents=$(python3 - <<'PY'
import re, glob
deps = set()
for fname in glob.glob("**/pylock.toml", recursive=True):
    txt = open(fname, 'rb').read().decode('utf-8', errors='replace')
    blocks = txt.split("\n[[packages]]\n")
    for b in blocks:
        m = re.search(r'name\s*=\s*"([^"]+)"', b)
        name = m.group(1) if m else None
        if not name:
            continue
        # skip the typing-inspection package itself
        if name in ("typing-inspection", "typing_inspection"):
            continue
        # if typing-inspection appears anywhere in this block (dependencies/requires)
        if 'typing-inspection' in b or 'typing_inspection' in b:
            deps.add(name)
print("\n".join(sorted(deps)))
PY
)

if [ -z "$dependents" ]; then
  echo "No dependent packages found in pylock files."
else
  echo "Dependents found:"
  printf "%s\n" "$dependents"
  echo
  echo "Searching for those dependents in top-level files (pyproject.toml / Pipfile / requirements*):"
  IFS=$'\n'
  for name in $dependents; do
    echo
    echo "== Top-level search for: $name =="
    if command -v rg >/dev/null 2>&1; then
      rg -n --hidden -S --glob '**/pyproject.toml' --glob '**/Pipfile' --glob '**/requirements*' "\b${name}\b" -C3 || true
    else
      grep -RIn --include=pyproject.toml --include=Pipfile --include="requirements*" "\b${name}\b" || true
    fi
  done
fi

echo
echo "Repository-wide occurrences of 'platform_machine' and 's390x':"
if command -v rg >/dev/null 2>&1; then
  rg -n --hidden -S "platform_machine" -C2 || true
  rg -n --hidden -S "s390x" -C2 || true
else
  grep -RIn "platform_machine" || true
  grep -RIn "s390x" || true
fi

echo
echo "Finished checks."
```

runtimes/pytorch/ubi9-python-3.12/pyproject.toml (2)

`69-74`: Approve — ARM64 wheels for torch 2.6.0 and torchvision 0.21.0 confirmed. curl output shows torch-2.6.0+cpu (cp311/cp312 manylinux_2_28_aarch64) and torchvision-0.21.0 (cp311/cp312 linux_aarch64) wheels present on the CPU index; the index addition is fine.

`9-11`: Blocker: the torch pin will break aarch64 resolution — split it per-arch like torchvision. An unconditional `torch==2.6.0+cu126` forces the CUDA index on ARM and will fail (there are no aarch64 cu126 wheels). Mirror the per-arch approach you used for torchvision.

```diff
-    'torch==2.6.0+cu126',
+    'torch==2.6.0+cu126; platform_machine == "x86_64"',
+    'torch==2.6.0; platform_machine == "aarch64"',
```
⛔ Skipped due to learnings
Learnt from: jiridanek PR: opendatahub-io/notebooks#2145 File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159 Timestamp: 2025-08-27T14:49:24.112Z Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue #2148 was created addressing the mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. The issue includes comprehensive problem description covering affected files (runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda and pylock.toml), detailed solution with PyTorch index URL update from cu124 to cu126, lock regeneration steps using uv, clear acceptance criteria for wheel alignment verification, and proper context linking to PR #2145 review comment, assigned to jiridanek.
Learnt from: jiridanek PR: opendatahub-io/notebooks#0 File: :0-0 Timestamp: 2025-08-27T19:02:33.353Z Learning: Version alignment requested: Align TensorBoard pins across CUDA and ROCm PyTorch images (ROCm currently ~=2.18.0 vs CUDA ~=2.19.0) to reduce duplicate wheels and lock bloat; create dedicated GitHub issue with evidence and acceptance criteria.
Learnt from: jiridanek PR: opendatahub-io/notebooks#2145 File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159 Timestamp: 2025-08-27T14:49:24.112Z Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue addresses mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. Comprehensive issue created with detailed problem description covering affected files, current state analysis, step-by-step solution including PyTorch index URL update and lock regeneration, clear acceptance criteria for cu126 wheel verification, and proper context linking to PR #2145 review comment.
Learnt from: jiridanek PR: opendatahub-io/notebooks#0 File: :0-0 Timestamp: 2025-08-27T15:18:02.768Z Learning: jiridanek requested GitHub issue creation for PyTorch CUDA variant explicit pinning investigation during PR #2145 review, specifically to explore when +cu124 suffixes are desirable versus index-only configuration. Issue #2156 was created with comprehensive investigation framework covering wheel selection behavior, deployment scenarios, risk analysis, repository pattern analysis, and clear acceptance criteria for establishing best practices across all PyTorch-enabled notebook environments.
runtimes/pytorch/ubi9-python-3.12/pylock.toml (3)

`2621-2651`: PyNaCl 1.6.0 bump LGTM. Wide wheel coverage (manylinux/musllinux, aarch64/x86_64, ABI3) is good for multi-arch builds. No concerns.

`279-284`:
```shell
#!/bin/bash
set -euo pipefail
echo "PWD: $(pwd)"
echo
echo "List runtimes/pytorch tree:"
ls -la runtimes/pytorch || true
echo
echo "Find all pylock.toml files:"
rg -n --hidden --files --glob '**/pylock.toml' || true
echo
TARGET='runtimes/pytorch/ubi9-python-3.12/pylock.toml'
if [ -f "$TARGET" ]; then
  echo "$TARGET exists: size=$(stat -c%s "$TARGET") bytes"
  echo
  echo "Search for 'cffi' occurrences in the target file with context:"
  rg -n --hidden -S 'cffi' "$TARGET" -C5 || true
  echo
  LINE=$(rg -n 'cffi' "$TARGET" | head -n1 | cut -d: -f1 || true)
  if [ -n "$LINE" ]; then
    START=$(( LINE > 20 ? LINE - 20 : 1 ))
    echo "Printing lines ${START} to $((LINE + 40)) from $TARGET:"
    sed -n "${START},$((LINE + 40))p" "$TARGET"
  else
    echo "No 'cffi' in $TARGET; printing first 120 lines:"
    sed -n '1,120p' "$TARGET"
  fi
else
  echo "$TARGET not found"
fi
echo
echo "Search whole repo for explicit 'name = \"cffi\"' occurrences and wheel urls mentioning aarch64/arm64:"
rg -n --hidden -S 'name\s*=\s*"cffi"' -g '!**/node_modules/**' -C3 || true
rg -n --hidden -S 'aarch64|arm64|manylinux_aarch64|macosx_arm64|linux_aarch64' -g '**/pylock.toml' -C3 || true
echo
echo "Check runtimes pyproject for cffi mention (if present):"
if [ -f runtimes/pytorch/ubi9-python-3.12/pyproject.toml ]; then
  rg -n --hidden -S 'cffi' runtimes/pytorch/ubi9-python-3.12/pyproject.toml || true
else
  echo "pyproject.toml not found at runtimes/pytorch/ubi9-python-3.12/pyproject.toml"
fi
```

---

`2387-2391`: **Overly complex marker; simplify and avoid manual edits.**

The marker mixes redundant checks and incorrectly compares `implementation_name` to `'PyPy'` (`sys.implementation.name` is lower-case, e.g. `'pypy'`; `platform_python_implementation` returns `'PyPy'`). ([peps.python.org](https://peps.python.org/pep-0421/)) Let the lock tool re-lock; if you must edit manually, prefer either `platform_python_implementation != 'PyPy'` or `implementation_name != 'pypy'` to match the PEP semantics and reduce future diffs.

> Likely an incorrect or invalid review comment.

</blockquote></details> </blockquote></details> </details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->
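To see the casing pitfall concretely, the markers can be evaluated with `packaging.markers`. The environment dicts below are illustrative stand-ins for real interpreters, not values read from this repository:

```python
from packaging.markers import Marker

# Hypothetical environments showing the PEP 508/PEP 421 spellings: note the
# casing difference between the two implementation fields.
cpython_env = {"implementation_name": "cpython", "platform_python_implementation": "CPython"}
pypy_env = {"implementation_name": "pypy", "platform_python_implementation": "PyPy"}

# Comparing implementation_name against capitalized 'PyPy' never matches,
# so this marker is effectively always true, even on PyPy.
bad = Marker("implementation_name != 'PyPy'")
print(bad.evaluate(pypy_env))  # True (the intended exclusion silently fails)

# Either of these expresses the intent correctly:
good_a = Marker("implementation_name != 'pypy'")
good_b = Marker("platform_python_implementation != 'PyPy'")
print(good_a.evaluate(pypy_env), good_b.evaluate(pypy_env))        # False False
print(good_a.evaluate(cpython_env), good_b.evaluate(cpython_env))  # True True
```

`Marker.evaluate()` merges the supplied dict over the defaults for the running interpreter, so partial environments like these are enough for the check.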
- name: build-platforms | ||
value: | ||
- linux/x86_64 | ||
- linux-m2xlarge/arm64 | ||
- name: dockerfile |
💡 Verification agent
🧩 Analysis chain
Confirm platform key is valid in Konflux and in the generator (avoid drift).
The new entry uses `linux-m2xlarge/arm64`. Past convention here is `linux/x86_64` (not `linux/amd64`). Please verify that Konflux accepts `linux-m2xlarge/arm64` (vs a possible `linux/arm64`) and that `scripts/generate_pull_request_pipelineruns.py` has been updated so this autogen file won't be overwritten back to single-arch.

If your Konflux expects plain `linux/arm64`, apply:
```diff
 - name: build-platforms
   value:
   - linux/x86_64
-  - linux-m2xlarge/arm64
+  - linux/arm64
```
Run to confirm repo-wide conventions and generator support:
🏁 Script executed:
```shell
#!/usr/bin/env bash
set -euo pipefail
echo "Occurrences of linux-m2xlarge/arm64:"
rg -n "linux-m2xlarge/arm64" -g ".tekton/**" -S || true
echo
echo "Other arm64 platform spellings in pipelines:"
rg -n "linux.?/arm64" -g ".tekton/**" -S | sed 's/^/  /' || true
echo
echo "Check generator declares/validates build-platforms including arm64:"
rg -n "build-platforms|platforms|arm64|x86_64" scripts/generate_pull_request_pipelineruns.py -S -C3 || true
```
Length of output: 2487
Replace linux-m2xlarge/arm64 with linux/arm64 — generator requires linux/arm64
scripts/generate_pull_request_pipelineruns.py defines allowed build_platforms (includes "linux/arm64") and populates the build-platforms param from that list; "linux-m2xlarge/arm64" will not be recognized and may be overwritten.
File: .tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml (lines 36–40)
```diff
 - name: build-platforms
   value:
   - linux/x86_64
-  - linux-m2xlarge/arm64
+  - linux/arm64
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 - name: build-platforms
   value:
   - linux/x86_64
-  - linux-m2xlarge/arm64
+  - linux/arm64
 - name: dockerfile
```
🤖 Prompt for AI Agents
.tekton/odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-pull-request.yaml
lines 36-40: the build-platforms list contains "linux-m2xlarge/arm64" which is
not in the generator's allowed platforms and will be ignored/overwritten; change
that entry to "linux/arm64" so it matches
scripts/generate_pull_request_pipelineruns.py allowed build_platforms and the
generator can populate the param correctly.
```toml
[[packages]]
name = "annotated-types"
version = "0.7.0"
marker = "platform_machine != 's390x'"
sdist = { url = "https://files.pythonhosted.org/packages/ee/67/531ea369ba64dcff5ec9c3402f9f51bf748cec26dde048a2f973a4eea7f5/annotated_types-0.7.0.tar.gz", upload-time = 2024-05-20T21:33:25Z, size = 16081, hashes = { sha256 = "aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89" } }
wheels = [{ url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", upload-time = 2024-05-20T21:33:24Z, size = 13643, hashes = { sha256 = "1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53" } }]
```
🛠️ Refactor suggestion
Do not hand-edit pylock.toml; move the s390x marker to pyproject.toml and re-lock.
pylock.toml is uv-generated. Manual edits will be overwritten and break determinism. Put the platform marker in the source pyproject.toml (on the driving dependency) and regenerate the lock.
Apply this diff here (revert the manual edit):
```diff
-marker = "platform_machine != 's390x'"
```
Then update the corresponding pyproject.toml, for example:
```toml
# in jupyter/datascience/ubi9-python-3.12/pyproject.toml
[project]
dependencies = [
    # If pydantic drives annotated-types:
    "pydantic>=2,<3; platform_machine != 's390x'",
]
```
Regenerate:
- Re-run the lock, e.g. `uv lock --python 3.12`, then re-export the repo's `pylock.toml`
- Commit the refreshed pylock.toml
I can draft the exact pyproject.toml changes once you confirm which top-level deps pull in annotated-types here.
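For reference, a PEP 508 requirement string with a marker like the hypothetical pydantic pin above can be sanity-checked with `packaging.requirements`:

```python
from packaging.requirements import Requirement

# Hypothetical pin mirroring the suggestion above; not taken from the repo.
req = Requirement("pydantic>=2,<3; platform_machine != 's390x'")

print(req.name)                                             # pydantic
print(req.marker.evaluate({"platform_machine": "x86_64"}))  # True
print(req.marker.evaluate({"platform_machine": "s390x"}))   # False
```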
```toml
name = "torchvision"
version = "0.21.0+cu126"
marker = "platform_machine == 'x86_64'"
wheels = [
    { url = "https://download.pytorch.org/whl/cu126/torchvision-0.21.0%2Bcu126-cp310-cp310-linux_x86_64.whl", hashes = { sha256 = "db4369a89b866b319c8dd73931c3e5f314aa535f7035ae2336ce9a26d7ace15a" } },
    { url = "https://download.pytorch.org/whl/cu126/torchvision-0.21.0%2Bcu126-cp310-cp310-win_amd64.whl", hashes = { sha256 = "d6b23af252e8f4fc923d57efeab5aad7a33b6e15a72a119d576aa48ec1e0d924" } },
```
💡 Verification agent
🧩 Analysis chain
Ensure a matching torchvision CPU entry exists for aarch64, otherwise arm64 builds will fail.
You added an x86_64-only `torchvision==0.21.0+cu126` with a marker. Verify the lock also contains a second `[[packages]] name = "torchvision"` block for aarch64 (CPU wheel, likely `0.21.0` without CUDA) with `marker = "platform_machine == 'aarch64'"`, aligned with the per-arch indices in pyproject.toml. If missing, re-run a multi-platform lock.

If you need to re-lock:
- Ensure pyproject has:
  - `torchvision==0.21.0+cu126; platform_machine == "x86_64"` with `index = pytorch-cuda`
  - `torchvision==0.21.0; platform_machine == "aarch64"` with `index = pytorch-cpu`
- Then regenerate the lock. Note that `uv lock` resolves for all platforms by default and takes no `--platform` flag; re-run `uv lock` and re-export `pylock.toml`.

If desired, I can open a tracking issue and push the regenerated lock.
🏁 Script executed:
```shell
#!/bin/bash
# Expect TWO torchvision blocks with distinct markers
rg -nPU '(?s)\[\[packages\]\]\s+name\s*=\s*"torchvision".*?marker\s*=\s*"platform_machine == '\''x86_64'\''"' runtimes/pytorch/ubi9-python-3.12/pylock.toml >/dev/null && echo "x86_64 torchvision: OK"
rg -nPU '(?s)\[\[packages\]\]\s+name\s*=\s*"torchvision".*?marker\s*=\s*"platform_machine == '\''aarch64'\''"' runtimes/pytorch/ubi9-python-3.12/pylock.toml >/dev/null && echo "aarch64 torchvision: OK"
# Optional: confirm aarch64 entry is CPU (no +cu tag)
rg -nPU '(?s)\[\[packages\]\]\s+name\s*=\s*"torchvision".*?marker\s*=\s*"platform_machine == '\''aarch64'\''".*?\n\s*version\s*=\s*"(0\.21\.0)(?!\+cu)' runtimes/pytorch/ubi9-python-3.12/pylock.toml && echo "aarch64 is CPU wheel"
```
Length of output: 107
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
file="runtimes/pytorch/ubi9-python-3.12/pylock.toml"
if [ ! -f "$file" ]; then
  echo "MISSING: $file"
  exit 0
fi
# Note: the three-argument match() below requires gawk.
awk '
/^\[\[packages\]\]/ {
  if (block_name == "torchvision") {
    if (block_marker ~ /platform_machine == '\''x86_64'\''/) found_x=1
    if (block_marker ~ /platform_machine == '\''aarch64'\''/) {
      found_a=1
      if (block_version ~ /\+cu/) aarch64_pluscu=1
      else aarch64_cpu=1
    }
  }
  block_name=""; block_marker=""; block_version=""; next
}
{
  if ($0 ~ /^\s*name\s*=/) {
    if (match($0, /"([^"]+)"/, m)) block_name=m[1]
  } else if ($0 ~ /^\s*marker\s*=/) {
    if (match($0, /"([^"]+)"/, m)) block_marker=m[1]
  } else if ($0 ~ /^\s*version\s*=/) {
    if (match($0, /"([^"]+)"/, m)) block_version=m[1]
  }
}
END {
  if (block_name == "torchvision") {
    if (block_marker ~ /platform_machine == '\''x86_64'\''/) found_x=1
    if (block_marker ~ /platform_machine == '\''aarch64'\''/) {
      found_a=1
      if (block_version ~ /\+cu/) aarch64_pluscu=1
      else aarch64_cpu=1
    }
  }
  if (found_x) print "x86_64 torchvision: OK"
  if (found_a) {
    if (aarch64_cpu) print "aarch64 torchvision: OK (CPU wheel)"
    if (aarch64_pluscu) print "aarch64 torchvision: WARNING (CUDA wheel present)"
  }
  if (!found_x && !found_a) print "no torchvision blocks found"
}
' "$file"
```
Length of output: 148
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
file="runtimes/pytorch/ubi9-python-3.12/pylock.toml"
if [ ! -f "$file" ]; then
  echo "MISSING: $file"
  exit 0
fi
python3 - <<'PY'
import re
f = "runtimes/pytorch/ubi9-python-3.12/pylock.toml"
blocks = []
curr = None
with open(f, 'r', encoding='utf-8') as fh:
    for line in fh:
        if line.strip().startswith('[[packages]]'):
            if curr: blocks.append(curr)
            curr = {'name': None, 'version': None, 'marker': None}
            continue
        if curr is None:
            continue
        m = re.match(r'\s*name\s*=\s*"([^"]+)"', line)
        if m: curr['name'] = m.group(1)
        m = re.match(r'\s*version\s*=\s*"([^"]+)"', line)
        if m: curr['version'] = m.group(1)
        m = re.match(r'\s*marker\s*=\s*"([^"]+)"', line)
        if m: curr['marker'] = m.group(1)
if curr: blocks.append(curr)
found = False
for b in blocks:
    if b.get('name') == 'torchvision':
        found = True
        marker = b.get('marker') or '<no marker>'
        version = b.get('version') or '<no version>'
        print(f"torchvision -> marker: {marker} ; version: {version}")
if not found:
    print("NO_TORCHVISION_BLOCKS_FOUND")
PY
```
Length of output: 75
Add missing aarch64 (CPU) torchvision entry and re-lock
runtimes/pytorch/ubi9-python-3.12/pylock.toml currently only contains an x86_64 torchvision (0.21.0+cu126); add a second `[[packages]]` block for `platform_machine == 'aarch64'` with version 0.21.0 (CPU wheel) and regenerate the lock (`uv lock` resolves all platforms by default; re-run it and re-export `pylock.toml`).
🤖 Prompt for AI Agents
In runtimes/pytorch/ubi9-python-3.12/pylock.toml around lines 3302-3307 the file
only defines an x86_64 torchvision package (0.21.0+cu126); add a second
[[packages]] block for platform_machine == 'aarch64' with name "torchvision",
version "0.21.0" and the CPU wheel URL and its sha256 hash (the CPU wheel target
for cp310/linux_aarch64), then regenerate the lockfile for both architectures
(uv lock resolves all platforms by default; re-run it and re-export pylock.toml)
so the new aarch64 entry is populated and hashes/resolved dependencies are updated.
```toml
torchvision = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }]
```
🛠️ Refactor suggestion
Also split the uv source for torch per-arch to avoid using the CUDA index on ARM. Without this, uv will still try `pytorch-cuda` for torch on aarch64.
```diff
-[tool.uv.sources]
-torch = { index = "pytorch-cuda" }
+[tool.uv.sources]
+torch = [
+    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
+    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }
+]
 torchvision = [
     { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
     { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }]
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```toml
[tool.uv.sources]
torch = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }
]
torchvision = [
    { index = "pytorch-cuda", marker = "platform_machine == 'x86_64'" },
    { index = "pytorch-cpu", marker = "platform_machine == 'aarch64'" }]
```
@jiridanek: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
@jiridanek i guess we can close this now, right? |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Description
How Has This Been Tested?
Self checklist (all need to be checked):
- Run `make test` (`gmake` on macOS) before asking for review
- Changes to `Dockerfile.konflux` files should be done in `odh/notebooks` and automatically synced to `rhds/notebooks`. For Konflux-specific changes, modify `Dockerfile.konflux` files directly in `rhds/notebooks` as these require special attention in the downstream repository and flow to the upcoming RHOAI release.

Merge criteria:
Summary by CodeRabbit
New Features
Chores