RHOAIENG-27434: create ROCm Tensorflow Python 3.12 Image #1259

dibryant · 2025-07-02T14:14:36Z

Fixes for https://issues.redhat.com/browse/RHOAIENG-27434

Description

Create the ROCm TensorFlow Python 3.12 image.

How Has This Been Tested?

$ make rocm-jupyter-tensorflow-ubi9-python-3.12 \
    -e RELEASE_PYTHON_VERSION=3.12 \
    -e IMAGE_REGISTRY="quay.io/$USER/workbench-images" \
    -e RELEASE="2025b" \
    -e CONTAINER_BUILD_CACHE_ARGS="" \
    -e PUSH_IMAGES="no"
$ export QUAY_IMAGE=$(podman images --format "{{.Repository}}:{{.Tag}}" | head -1)
$ podman run -it -p 8888:8888 $QUAY_IMAGE

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work.

Summary by CodeRabbit

New Features
- Added a ROCm-enabled TensorFlow Jupyter image for UBI9 with Python 3.12, Kubernetes manifests (Service/StatefulSet/kustomize), DB/MSSQL tooling, JupyterLab addons, PDF export and Elyra integration.
Chores
- Added Pipfiles/requirements, pinned ROCm TensorFlow wheel and related packages, bumped multiple dependency versions, adjusted Makefile targets, and removed a problematic pip install constraint.
Tests
- Added a notebook test suite validating Python/TensorFlow versions, tf2onnx conversion, MNIST training/evaluation, and TensorBoard logging.
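
The version checks in a suite like this usually compare only the major and minor components of a version string. A minimal sketch of that comparison follows; the helper name `matches_major_minor` is hypothetical and is not taken from the PR's notebook.

```python
import sys


def matches_major_minor(actual: str, expected: str) -> bool:
    """Return True when `actual` has the same major.minor as `expected`."""
    # Compare only the first two dot-separated components,
    # so "3.12.4" matches "3.12" but "3.1.2" does not.
    return actual.split(".")[:2] == expected.split(".")[:2]


if __name__ == "__main__":
    # Report whether the running interpreter is a 3.12.x release.
    running = ".".join(str(part) for part in sys.version_info[:3])
    print(running, matches_major_minor(running, "3.12"))
```

The same helper could be applied to `tf.__version__` inside the image; that call is left out here so the sketch runs without TensorFlow installed.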

coderabbitai · 2025-07-02T14:14:43Z

Walkthrough

Adds a new ROCm-enabled multi-stage Jupyter TensorFlow image for UBI9/Python 3.12, with Pipfile/requirements updates, Kustomize manifests, a test notebook, and Makefile changes to enable building the new images.

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| ROCm TF Jupyter 3.12 (new image & tests) `jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm`, `jupyter/rocm/tensorflow/ubi9-python-3.12/Pipfile`, `jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb` | New multi-stage ROCm Jupyter Dockerfile (mongocli-builder, base, rocm-base, rocm-jupyter-* stages), Pipfile with explicit tensorflow-rocm cp312 wheel and pinned deps, and a test notebook exercising TF/ONNX/TensorBoard and version checks. |
| Kustomize deployment manifests `jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/kustomization.yaml`, `.../service.yaml`, `.../statefulset.yaml` | New kustomize base, ClusterIP Service (port 8888), and StatefulSet (single-replica notebook container, probes, NOTEBOOK_ARGS env, resource requests/limits) targeting the new image. |
| Makefile build orchestration `Makefile` | Enables building ROCm runtime/Jupyter TensorFlow images for Python 3.12 (adds/uncomments BASE_DIRS and all-images entries) and fixes a comment typo. |
| ROCm TensorFlow runtime updates (3.12) `runtimes/rocm-tensorflow/ubi9-python-3.12/Pipfile`, `.../requirements.txt`, `runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm` | Pipfile: switch tensorflow-rocm to explicit 2.18.1 wheel URL and bump tensorboard; fix tf2onnx spacing. requirements.txt: multiple package version/hash updates. Dockerfile: remove --build-constraints option from uv pip install (comment added). |
| ROCm TensorFlow 3.11 formatting fixes `jupyter/rocm/tensorflow/ubi9-python-3.11/Pipfile`, `runtimes/rocm-tensorflow/ubi9-python-3.11/Pipfile` | Minor formatting: remove space in tf2onnx "~=" version spec. |
| Datascience 3.12 dependency bumps `jupyter/datascience/ubi9-python-3.12/Pipfile`, `jupyter/datascience/ubi9-python-3.12/requirements.txt` | Bump `kubeflow-training` 1.9.2 → 1.9.3 and update associated sha256 hashes (two occurrences). |
| Runtime deps pin updates `runtimes/rocm-tensorflow/ubi9-python-3.12/requirements.txt` | Multiple package version upgrades and updated hashes (aiohttp, certifi, debugpy, distlib, fonttools, fsspec, h5py hashes expanded, jsonschema, minio, narwhals, pyarrow, virtualenv). |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Systematic fix: ROCm TensorFlow kustomization files incorrectly reference PyTorch image tags #1267 — Kustomize image tag mismatch: kustomization.yaml references a PyTorch image tag (copy-paste) that likely needs correction to the rocm-jupyter-tensorflow image.
Enable cuda-rstudio-c9s-python build target for Python 3.12 #1302 — Makefile targets for Python 3.12: this PR uncomments/adds Python 3.12 ROCm image build targets, matching the issue's scope.
tensorflow_rocm package lacks Python 3.12 wheel support causing build failures #1354 — TensorFlow wheel availability: this PR replaces the tensorflow-rocm spec with an explicit cp312 wheel (2.18.1), addressing the reported wheel/build failure.
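
Whether a pinned wheel URL targets Python 3.12 can be read from the wheel filename, which follows the `name-version[-build]-python-abi-platform.whl` convention. The sketch below parses that filename; the URL in the usage section is a hypothetical placeholder, not the URL used in the PR's Pipfile.

```python
from urllib.parse import urlparse


def wheel_tags(wheel_url: str) -> dict:
    """Split the filename of a wheel URL into its name, version, and tag fields."""
    filename = urlparse(wheel_url).path.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # A wheel filename has 5 fields, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def supports_cpython_312(wheel_url: str) -> bool:
    """Return True when the wheel's python tag includes cp312."""
    # The python tag may be a dot-separated set such as "cp311.cp312".
    return "cp312" in wheel_tags(wheel_url)["python"].split(".")


if __name__ == "__main__":
    # Hypothetical URL, used only to illustrate the filename layout.
    url = "https://example.invalid/wheels/tensorflow_rocm-2.18.1-cp312-cp312-manylinux_2_28_x86_64.whl"
    print(wheel_tags(url)["version"], supports_cpython_312(url))
```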

Possibly related PRs

RHOAIENG-21690: chore(Dockerfiles): augument micropipenv with uv for faster package installs #968 — Related packaging/install pattern: both add uv alongside micropipenv and switch installs to uv pip install for similar ROCm Jupyter/TensorFlow images.
[2024b] RHOAIENG-30247 chore(deps): Bump codeflare-sdk version from 0.26.0 to 0.28.0 #1489 — Dependency bump overlap: both PRs modify codeflare-sdk dependency versions in Pipfile(s).

Suggested labels

review-requested

Suggested reviewers

daniellutz

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c83cdda and 26b399f.

📒 Files selected for processing (1)

jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: build (rocm-runtime-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
GitHub Check: build (rocm-jupyter-tensorflow-ubi9-python-3.11, 3.11, linux/amd64, false) / build
GitHub Check: build (rocm-runtime-tensorflow-ubi9-python-3.11, 3.11, linux/amd64, false) / build
GitHub Check: build (rocm-jupyter-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-rocm-py311-ubi9-on-pull-request
GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-rocm-py312-ubi9-on-pull-request
GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-rocm-py311-ubi9-on-pull-request

coderabbitai

Actionable comments posted: 11

🧹 Nitpick comments (1)

jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb (1)

48-53: tf2onnx conversion test needs more comprehensive validation.

The current test only checks that the conversion result is not None, but doesn't validate the actual ONNX model structure or functionality.

     def test_tf2onnx_conversion(self):
-        # Replace this with an actual TensorFlow model conversion using tf2onnx
         model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(10,))])
-        onnx_model = tf2onnx.convert.from_keras(model)
+        onnx_model, _ = tf2onnx.convert.from_keras(model)
         
         self.assertTrue(onnx_model is not None)
+        # Verify ONNX model has expected structure
+        self.assertTrue(hasattr(onnx_model, 'graph'), "ONNX model should have a graph")
+        self.assertTrue(len(onnx_model.graph.node) > 0, "ONNX model should have nodes")

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bce04f9 and dc5976f.

⛔ Files ignored due to path filters (1)

jupyter/rocm/tensorflow/ubi9-python-3.12/Pipfile.lock is excluded by !**/*.lock

📒 Files selected for processing (6)

jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm (1 hunks)
jupyter/rocm/tensorflow/ubi9-python-3.12/Pipfile (1 hunks)
jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/kustomization.yaml (1 hunks)
jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml (1 hunks)
jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml (1 hunks)
jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb (1 hunks)

🧰 Additional context used

🧠 Learnings (7)

📓 Common learnings

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/minimal/ubi9-python-3.12/Dockerfile.cpu:21-24
Timestamp: 2025-07-01T06:48:13.154Z
Learning: jiridanek creates comprehensive follow-up issues from review comments that expand scope appropriately, include clear acceptance criteria, proper backlinks, and structured implementation guidance. Issue #1241 demonstrates this by turning a specific oc client checksum concern into a thorough security enhancement plan covering all downloaded binaries across the Python 3.12 implementation.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1247
File: .github/workflows/build-notebooks-TEMPLATE.yaml:50-53
Timestamp: 2025-07-01T14:36:52.852Z
Learning: In the opendatahub-io/notebooks repository, the test runner's Python version (configured in GitHub Actions UV setup) intentionally doesn't need to match the Python version of the container images being tested. jiridanek's team uses Python 3.12 for running tests while images may use different Python versions (like 3.11), and this approach works fine since the test code is separate from the application code running inside the containers.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-20T11:51:59.716Z
Learning: This project follows the practice of associating PRs with Jira tickets from https://issues.redhat.com for traceability between requirements, release process, and product documentation. This is critical for enterprise software development compliance and cross-team coordination.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: TrustyAI's jupyter-bokeh was pinned to 3.0.5 due to compatibility requirements with TrustyAI's visualization components, but the actual deployed version in requirements.txt shows 3.0.7, indicating incremental testing. The upgrade to 4.0.5 in this PR represents the completion of a gradual migration strategy from the 3.x series after confirming compatibility with Bokeh 3.7.3.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: jiridanek manages StatefulSet selector issues systematically across multiple images in opendatahub-io/notebooks. When the same configuration issue (empty spec.selector and template.metadata.labels) appears in different images like jupyter/minimal and jupyter/tensorflow, he tracks them under a single coordinated effort rather than creating duplicate issues for each affected image.
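
A StatefulSet needs a non-empty `spec.selector` whose labels also appear on the pod template. The sketch below shows one way to check this on a manifest already loaded as a dictionary; the function name `selector_problems` is hypothetical, and the check is meant for the rendered manifest, since labels may be injected at build time.

```python
def selector_problems(statefulset: dict) -> list[str]:
    """List selector/label problems found in a StatefulSet manifest dict."""
    spec = statefulset.get("spec") or {}
    match_labels = (spec.get("selector") or {}).get("matchLabels") or {}
    template_meta = (spec.get("template") or {}).get("metadata") or {}
    pod_labels = template_meta.get("labels") or {}

    problems = []
    if not match_labels:
        problems.append("spec.selector.matchLabels is empty")
    if not pod_labels:
        problems.append("spec.template.metadata.labels is empty")
    # Every selector label must be present on the pod template with the same value.
    for key, value in match_labels.items():
        if pod_labels.get(key) != value:
            problems.append(f"selector label {key}={value} is not on the pod template")
    return problems


if __name__ == "__main__":
    # The empty-selector shape described in the learning above.
    broken = {"spec": {"selector": {}, "template": {"metadata": {"labels": {}}}}}
    print(selector_problems(broken))
```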

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/pod-patch.yaml:11-22
Timestamp: 2025-06-30T14:36:53.890Z
Learning: The pod-patch.yaml file in jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/ is used only for running tests, not production deployments. This affects the risk assessment for resource management configurations like sizeLimit on emptyDir volumes.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/minimal/ubi9-python-3.12/Dockerfile.rocm:43-55
Timestamp: 2025-07-01T06:48:21.070Z
Learning: When security concerns are raised during PR reviews in opendatahub-io/notebooks, comprehensive follow-up issues are created (often by CodeRabbit) to track all related security enhancements with clear acceptance criteria and implementation guidance. This ensures security improvements are systematically addressed in dedicated efforts rather than blocking current deliverables.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: jiridanek's team uses containerized dependency locking for cross-platform compatibility in opendatahub-io/notebooks. They run `pipenv lock` inside UBI9 containers with specific platform arguments (`--platform=linux/amd64 --python-version 3.12`) to avoid host OS dependency conflicts when generating Pipfile.lock files.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: In the opendatahub-io/notebooks repository, TensorFlow packages with `extras = ["and-cuda"]` can cause build conflicts on macOS due to platform-specific CUDA packages. When the Dockerfile installs CUDA system-wide, removing the extras and letting TensorFlow find CUDA at runtime resolves these conflicts.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: The jupyter-bokeh package was previously pinned to version 3.0.5 in the TrustyAI notebook image due to compatibility requirements with TrustyAI components, as indicated by the comment "Should be pinned down to this version in order to be compatible with trustyai" that was removed in this update.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/datascience/ubi9-python-3.11/Pipfile:34-36
Timestamp: 2025-06-28T14:13:27.890Z
Learning: In the opendatahub-io/notebooks repository, the dependency pinning strategy follows a deliberate pattern: core `jupyterlab` package uses exact pinning (==) across all notebook images to ensure UI consistency, while JupyterLab extensions and all server components (jupyter-server, jupyter-server-proxy, jupyter-server-terminals) use compatible release (~=) pinning to allow automatic security updates and bug fixes while maintaining API compatibility.
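
The practical difference between `==` and `~=` can be sketched for plain numeric versions. This is a simplified illustration of the PEP 440 rules (no pre-releases, epochs, or local versions); the function names and the version numbers in the usage section are illustrative, not taken from the repository's Pipfiles.

```python
def _parse(version: str) -> tuple[int, ...]:
    """Turn "4.2.5" into (4, 2, 5); only plain numeric releases are handled."""
    return tuple(int(part) for part in version.split("."))


def _pad(release: tuple[int, ...], length: int) -> tuple[int, ...]:
    """Pad a release tuple with zeros so that 4.2 compares equal to 4.2.0."""
    return release + (0,) * (length - len(release))


def satisfies_exact(candidate: str, pinned: str) -> bool:
    """`==pinned`: only the pinned release itself is accepted."""
    c, p = _parse(candidate), _parse(pinned)
    width = max(len(c), len(p))
    return _pad(c, width) == _pad(p, width)


def satisfies_compatible(candidate: str, pinned: str) -> bool:
    """`~=pinned`: at least `pinned`, with all but its last segment unchanged."""
    c, p = _parse(candidate), _parse(pinned)
    if len(p) < 2:
        raise ValueError("~= needs at least two release segments")
    width = max(len(c), len(p))
    c_padded, p_padded = _pad(c, width), _pad(p, width)
    prefix = len(p) - 1  # ~=4.2.5 fixes "4.2"; ~=2.2 fixes "2"
    return c_padded >= p_padded and c_padded[:prefix] == p[:prefix]


if __name__ == "__main__":
    # Illustrative versions only.
    print(satisfies_exact("4.2.7", "4.2.5"))       # a patch bump is rejected by ==
    print(satisfies_compatible("4.2.7", "4.2.5"))  # a patch bump is accepted by ~=
    print(satisfies_compatible("4.3.0", "4.2.5"))  # a minor bump is rejected by ~=
```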

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1197
File: runtimes/minimal/ubi9-python-3.11/requirements.txt:395-405
Timestamp: 2025-06-26T15:28:35.416Z
Learning: psutil version 7.x is compatible with UBI9, CentOS Stream 9, and RHEL 9 platforms in the opendatahub-io/notebooks repository. The upgrade from psutil 5.x to 7.x has been validated for these environments.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: TrustyAI 0.6.1 (latest version as of June 2025) has a hard dependency constraint on jupyter-bokeh~=3.0.5, preventing upgrades to jupyter-bokeh 4.x in notebook images that include TrustyAI. This requires either waiting for TrustyAI to update their dependency or excluding TrustyAI from jupyter-bokeh upgrades.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/base/statefulset.yaml:54-60
Timestamp: 2025-06-30T14:43:08.138Z
Learning: Issue #1212 in opendatahub-io/notebooks demonstrates that missing securityContext configuration (allowPrivilegeEscalation, runAsNonRoot, seccompProfile) causes runtime pods to fail reaching ready state and timeout after 300s on OpenShift due to PodSecurity policy violations.
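
The three settings named above can be checked on a container spec loaded as a dictionary. The sketch below assumes the expected values are those of the Kubernetes "restricted" Pod Security Standard, and it inspects only the container-level `securityContext`, although `runAsNonRoot` and `seccompProfile` may also be set at pod level. The function name `missing_security_settings` is hypothetical.

```python
def missing_security_settings(container: dict) -> list[str]:
    """List the securityContext settings a container spec is missing."""
    ctx = container.get("securityContext") or {}
    missing = []
    # Assumed expectations, following the "restricted" Pod Security Standard.
    if ctx.get("allowPrivilegeEscalation") is not False:
        missing.append("allowPrivilegeEscalation: false")
    if ctx.get("runAsNonRoot") is not True:
        missing.append("runAsNonRoot: true")
    seccomp_type = (ctx.get("seccompProfile") or {}).get("type")
    if seccomp_type not in ("RuntimeDefault", "Localhost"):
        missing.append("seccompProfile.type: RuntimeDefault")
    return missing


if __name__ == "__main__":
    # A container with no securityContext at all reports all three settings.
    print(missing_security_settings({"name": "notebook"}))
```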

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: Runtime deployment tests in opendatahub-io/notebooks may show PodSecurity warnings about allowPrivilegeEscalation, capabilities, runAsNonRoot, and seccompProfile settings. These warnings occur on OpenShift but not on GitHub Actions because GitHub Actions uses upstream Kubernetes without SecurityContextConstraints (SCC).

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/kustomization.yaml (5)

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/pod-patch.yaml:11-22
Timestamp: 2025-06-30T14:36:53.890Z
Learning: The pod-patch.yaml file in jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/ is used only for running tests, not production deployments. This affects the risk assessment for resource management configurations like sizeLimit on emptyDir volumes.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: jiridanek manages StatefulSet selector issues systematically across multiple images in opendatahub-io/notebooks. When the same configuration issue (empty spec.selector and template.metadata.labels) appears in different images like jupyter/minimal and jupyter/tensorflow, he tracks them under a single coordinated effort rather than creating duplicate issues for each affected image.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/rocm/pytorch/ubi9-python-3.11/requirements.txt:987-989
Timestamp: 2025-06-28T15:06:03.344Z
Learning: In the opendatahub-io/notebooks repository, checks for broken links and missing files in kustomization manifests are already performed by `ci/kustomize.sh`, which is invoked from `.github/workflows/code-quality.yaml` (lines 112–116). No additional pytest is needed for this purpose.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: StatefulSet selector issue with empty `spec.selector: {}` is a known systemic problem across multiple images in opendatahub-io/notebooks repository, tracked in issue #1236. This affects multiple StatefulSet manifests and is being addressed centrally rather than in individual PRs.

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml (3)

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: jiridanek manages StatefulSet selector issues systematically across multiple images in opendatahub-io/notebooks. When the same configuration issue (empty spec.selector and template.metadata.labels) appears in different images like jupyter/minimal and jupyter/tensorflow, he tracks them under a single coordinated effort rather than creating duplicate issues for each affected image.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/pod-patch.yaml:11-22
Timestamp: 2025-06-30T14:36:53.890Z
Learning: The pod-patch.yaml file in jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/ is used only for running tests, not production deployments. This affects the risk assessment for resource management configurations like sizeLimit on emptyDir volumes.

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml (7)

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T07:11:44.903Z
Learning: Resource limits in StatefulSet manifests in opendatahub-io/notebooks are configured for testing purposes, not production deployments. This affects risk assessment when reviewing resource configurations like memory and CPU limits.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: jiridanek manages StatefulSet selector issues systematically across multiple images in opendatahub-io/notebooks. When the same configuration issue (empty spec.selector and template.metadata.labels) appears in different images like jupyter/minimal and jupyter/tensorflow, he tracks them under a single coordinated effort rather than creating duplicate issues for each affected image.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: StatefulSet selector issue with empty `spec.selector: {}` is a known systemic problem across multiple images in opendatahub-io/notebooks repository, tracked in issue #1236. This affects multiple StatefulSet manifests and is being addressed centrally rather than in individual PRs.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/pod-patch.yaml:11-22
Timestamp: 2025-06-30T14:36:53.890Z
Learning: The pod-patch.yaml file in jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/ is used only for running tests, not production deployments. This affects the risk assessment for resource management configurations like sizeLimit on emptyDir volumes.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/base/statefulset.yaml:54-60
Timestamp: 2025-06-30T14:43:08.138Z
Learning: Issue #1212 in opendatahub-io/notebooks demonstrates that missing securityContext configuration (allowPrivilegeEscalation, runAsNonRoot, seccompProfile) causes runtime pods to fail reaching ready state and timeout after 300s on OpenShift due to PodSecurity policy violations.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: Runtime deployment tests in opendatahub-io/notebooks may show PodSecurity warnings about allowPrivilegeEscalation, capabilities, runAsNonRoot, and seccompProfile settings. These warnings occur on OpenShift but not on GitHub Actions because GitHub Actions uses upstream Kubernetes without SecurityContextConstraints (SCC).

jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm (9)

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: In the opendatahub-io/notebooks repository, TensorFlow packages with `extras = ["and-cuda"]` can cause build conflicts on macOS due to platform-specific CUDA packages. When the Dockerfile installs CUDA system-wide, removing the extras and letting TensorFlow find CUDA at runtime resolves these conflicts.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: jiridanek's team uses containerized dependency locking for cross-platform compatibility in opendatahub-io/notebooks. They run `pipenv lock` inside UBI9 containers with specific platform arguments (`--platform=linux/amd64 --python-version 3.12`) to avoid host OS dependency conflicts when generating Pipfile.lock files.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: jiridanek manages StatefulSet selector issues systematically across multiple images in opendatahub-io/notebooks. When the same configuration issue (empty spec.selector and template.metadata.labels) appears in different images like jupyter/minimal and jupyter/tensorflow, he tracks them under a single coordinated effort rather than creating duplicate issues for each affected image.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1154
File: manifests/base/jupyter-pytorch-notebook-imagestream.yaml:0-0
Timestamp: 2025-06-16T11:06:33.139Z
Learning: In the opendatahub-io/notebooks repository, N-1 versions of images in manifest files (like imagestream.yaml files) should not be updated regularly. The versions of packages like codeflare-sdk in N-1 images are frozen to what was released when the image was moved from N to N-1 version. N-1 images are only updated for security vulnerabilities of packages, not for regular version bumps. This is why the version of packages in N-1 images may be quite old compared to the latest N version.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/pod-patch.yaml:11-22
Timestamp: 2025-06-30T14:35:34.805Z
Learning: In the opendatahub-io/notebooks repository, mounting emptyDir volumes over /opt/app-root/src is intentional behavior that matches production deployment patterns where odh-dashboard mounts empty PVCs at this location (the $HOME directory). This mounting is expected to hide base image content.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/base/statefulset.yaml:54-60
Timestamp: 2025-06-30T14:43:08.138Z
Learning: Issue #1212 in opendatahub-io/notebooks demonstrates that missing securityContext configuration (allowPrivilegeEscalation, runAsNonRoot, seccompProfile) causes runtime pods to fail reaching ready state and timeout after 300s on OpenShift due to PodSecurity policy violations.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: Runtime deployment tests in opendatahub-io/notebooks may show PodSecurity warnings about allowPrivilegeEscalation, capabilities, runAsNonRoot, and seccompProfile settings. These warnings occur on OpenShift but not on GitHub Actions because GitHub Actions uses upstream Kubernetes without SecurityContextConstraints (SCC).

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: OpenShift CI infrastructure issues in opendatahub-io/notebooks can manifest as "ContainerFailed one or more containers exited" errors in release steps, or as "Entrypoint received interrupt: terminated" messages when pods are killed during CI runs. These are typically infrastructure-level issues rather than code problems.

jupyter/rocm/tensorflow/ubi9-python-3.12/Pipfile (7)

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: jiridanek's team uses containerized dependency locking for cross-platform compatibility in opendatahub-io/notebooks. They run `pipenv lock` inside UBI9 containers with specific platform arguments (`--platform=linux/amd64 --python-version 3.12`) to avoid host OS dependency conflicts when generating Pipfile.lock files.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/datascience/ubi9-python-3.11/Pipfile:34-36
Timestamp: 2025-06-28T14:13:27.890Z
Learning: In the opendatahub-io/notebooks repository, the dependency pinning strategy follows a deliberate pattern: core `jupyterlab` package uses exact pinning (==) across all notebook images to ensure UI consistency, while JupyterLab extensions and all server components (jupyter-server, jupyter-server-proxy, jupyter-server-terminals) use compatible release (~=) pinning to allow automatic security updates and bug fixes while maintaining API compatibility.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: The jupyter-bokeh package was previously pinned to version 3.0.5 in the TrustyAI notebook image due to compatibility requirements with TrustyAI components, as indicated by the comment "Should be pinned down to this version in order to be compatible with trustyai" that was removed in this update.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: In the opendatahub-io/notebooks repository, TensorFlow packages with `extras = ["and-cuda"]` can cause build conflicts on macOS due to platform-specific CUDA packages. When the Dockerfile installs CUDA system-wide, removing the extras and letting TensorFlow find CUDA at runtime resolves these conflicts.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: The jupyter-bokeh pinning to 3.0.5 in TrustyAI notebook image was not due to TrustyAI code compatibility issues, but because the trustyai package itself explicitly declares jupyter-bokeh~=3.0.5 as a hard dependency, causing pip dependency resolution conflicts when trying to upgrade to jupyter-bokeh 4.x.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:21:09.429Z
Learning: TrustyAI explicitly declares jupyter-bokeh~=3.0.5 as a hard dependency in both their requirements.txt and pyproject.toml files, with no open issues tracking jupyter-bokeh 4.x compatibility. This creates an unresolvable pip dependency conflict when trying to upgrade jupyter-bokeh to 4.x in notebook images that include TrustyAI.

jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb (7)

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1247
File: .github/workflows/build-notebooks-TEMPLATE.yaml:50-53
Timestamp: 2025-07-01T14:36:52.852Z
Learning: In the opendatahub-io/notebooks repository, the test runner's Python version (configured in GitHub Actions UV setup) intentionally doesn't need to match the Python version of the container images being tested. jiridanek's team uses Python 3.12 for running tests while images may use different Python versions (like 3.11), and this approach works fine since the test code is separate from the application code running inside the containers.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: TrustyAI's jupyter-bokeh was pinned to 3.0.5 due to compatibility requirements with TrustyAI's visualization components, but the actual deployed version in requirements.txt shows 3.0.7, indicating incremental testing. The upgrade to 4.0.5 in this PR represents the completion of a gradual migration strategy from the 3.x series after confirming compatibility with Bokeh 3.7.3.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: In the opendatahub-io/notebooks repository, TensorFlow packages with `extras = ["and-cuda"]` can cause build conflicts on macOS due to platform-specific CUDA packages. When the Dockerfile installs CUDA system-wide, removing the extras and letting TensorFlow find CUDA at runtime resolves these conflicts.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: The jupyter-bokeh package was previously pinned to version 3.0.5 in the TrustyAI notebook image due to compatibility requirements with TrustyAI components, as indicated by the comment "Should be pinned down to this version in order to be compatible with trustyai" that was removed in this update.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: Test failures in opendatahub-io/notebooks should be categorized as either repository bugs (code/configuration issues) or infrastructure issues (CI platform problems) for proper diagnosis and resolution.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/rocm/pytorch/ubi9-python-3.11/requirements.txt:987-989
Timestamp: 2025-06-28T15:06:03.344Z
Learning: In the opendatahub-io/notebooks repository, checks for broken links and missing files in kustomization manifests are already performed by `ci/kustomize.sh`, which is invoked from `.github/workflows/code-quality.yaml` (lines 112–116). No additional pytest is needed for this purpose.

🪛 Checkov (3.2.334)

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml

[MEDIUM] 2-60: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)

[MEDIUM] 2-60: Minimize the admission of root containers

(CKV_K8S_23)

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/kustomization.yaml

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml

jupyter/rocm/tensorflow/ubi9-python-3.12/Pipfile

jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm

jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb

jiridanek

Please add your new image to Makefile. There are two places. Uncommenting the relevant line(s) should do the trick. I'll try to ask rabbit if it can give precise instructions.

jiridanek · 2025-07-02T18:12:54Z

@coderabbitai the new Python 3.12 image has to be added to two lists in the Makefile: one for Pipfile lock updates and one for the all-images target, so that GitHub Actions CI runs on it. Can you please explain how this is to be accomplished?

coderabbitai · 2025-07-02T18:14:31Z

🧩 Analysis chain

@jiridanek, I'll help you identify where to add the new Python 3.12 ROCm TensorFlow image in the Makefile. Let me examine the current Makefile structure first.

🏁 Script executed:

#!/bin/bash
# Examine the Makefile to understand its structure and find the relevant sections
cat Makefile

Length of output: 23173

Perfect! I can see exactly what needs to be done. The Makefile already has the infrastructure in place for the ROCm TensorFlow Python 3.12 image, but the entries are currently commented out. You need to uncomment two specific lines:

1. For Pipfile lock updates (around line 445):

In the BASE_DIRS variable for Python 3.12, uncomment this line:

# jupyter/rocm/tensorflow/ubi9-python-$(PYTHON_VERSION)

So it becomes:

else ifeq ($(PYTHON_VERSION), 3.12)
	BASE_DIRS := \
	    jupyter/minimal/ubi9-python-$(PYTHON_VERSION) \
		jupyter/datascience/ubi9-python-$(PYTHON_VERSION) \
		jupyter/pytorch/ubi9-python-$(PYTHON_VERSION) \
		jupyter/tensorflow/ubi9-python-$(PYTHON_VERSION) \
		jupyter/rocm/pytorch/ubi9-python-$(PYTHON_VERSION) \
		jupyter/rocm/tensorflow/ubi9-python-$(PYTHON_VERSION)
		# jupyter/trustyai/ubi9-python-$(PYTHON_VERSION)
		# codeserver/ubi9-python-$(PYTHON_VERSION)

2. For all-images target (around line 515):

In the all-images target for Python 3.12, uncomment this line:

# rocm-jupyter-tensorflow-ubi9-python-$(RELEASE_PYTHON_VERSION)

So it becomes:

else ifeq ($(RELEASE_PYTHON_VERSION), 3.12)
all-images: \
	jupyter-minimal-ubi9-python-$(RELEASE_PYTHON_VERSION) \
	jupyter-datascience-ubi9-python-$(RELEASE_PYTHON_VERSION) \
	cuda-jupyter-minimal-ubi9-python-$(RELEASE_PYTHON_VERSION) \
	cuda-jupyter-tensorflow-ubi9-python-$(RELEASE_PYTHON_VERSION) \
	cuda-jupyter-pytorch-ubi9-python-$(RELEASE_PYTHON_VERSION) \
 	rocm-jupyter-pytorch-ubi9-python-$(RELEASE_PYTHON_VERSION) \
 	rocm-jupyter-tensorflow-ubi9-python-$(RELEASE_PYTHON_VERSION)

These changes will:

- Include the ROCm TensorFlow Python 3.12 image in the Pipfile lock renewal process
- Add it to the GitHub Actions CI matrix so it gets built and tested automatically
- Ensure consistency with the other Python 3.12 images in the build system

The build target rocm-jupyter-tensorflow-ubi9-python-$(RELEASE_PYTHON_VERSION) is already defined earlier in the Makefile (around line 186), so no additional build configuration is needed.
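After editing, a quick check can confirm that neither entry is still commented out. The following is a minimal sketch and not part of the repository: the helper name `commented_entries` and the sample text are hypothetical, and the check only looks at lines whose first non-blank character is `#`.

```python
def commented_entries(makefile_text: str, entry: str) -> list[int]:
    """Return 1-based line numbers where `entry` sits on a commented-out line."""
    hits = []
    for number, line in enumerate(makefile_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#") and entry in stripped:
            hits.append(number)
    return hits


# Hypothetical excerpt mirroring the BASE_DIRS block before the fix.
sample = (
    "BASE_DIRS := \\\n"
    "\tjupyter/rocm/pytorch/ubi9-python-$(PYTHON_VERSION) \\\n"
    "\t# jupyter/rocm/tensorflow/ubi9-python-$(PYTHON_VERSION)\n"
)

still_commented = commented_entries(sample, "jupyter/rocm/tensorflow/ubi9-python-")
print(still_commented)  # [3]
```

An empty list for both the `jupyter/rocm/tensorflow/...` directory entry and the `rocm-jupyter-tensorflow-...` target would suggest both lines have been uncommented.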

🧠 Learnings used

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1247
File: .github/workflows/build-notebooks-TEMPLATE.yaml:50-53
Timestamp: 2025-07-01T14:36:52.852Z
Learning: In the opendatahub-io/notebooks repository, the test runner's Python version (configured in GitHub Actions UV setup) intentionally doesn't need to match the Python version of the container images being tested. jiridanek's team uses Python 3.12 for running tests while images may use different Python versions (like 3.11), and this approach works fine since the test code is separate from the application code running inside the containers.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/minimal/ubi9-python-3.12/Dockerfile.cpu:21-24
Timestamp: 2025-07-01T06:48:13.154Z
Learning: jiridanek creates comprehensive follow-up issues from review comments that expand scope appropriately, include clear acceptance criteria, proper backlinks, and structured implementation guidance. Issue #1241 demonstrates this by turning a specific oc client checksum concern into a thorough security enhancement plan covering all downloaded binaries across the Python 3.12 implementation.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/minimal/ubi9-python-3.12/Dockerfile.rocm:43-55
Timestamp: 2025-07-01T06:48:21.070Z
Learning: When security concerns are raised during PR reviews in opendatahub-io/notebooks, comprehensive follow-up issues are created (often by CodeRabbit) to track all related security enhancements with clear acceptance criteria and implementation guidance. This ensures security improvements are systematically addressed in dedicated efforts rather than blocking current deliverables.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-26T16:19:31.295Z
Learning: In the opendatahub-io/notebooks repository, the Playwright Docker image version in `.github/workflows/build-notebooks-TEMPLATE.yaml` (format: `mcr.microsoft.com/playwright:v1.53.1-noble`) must always match the `@playwright/test` version specified in the `tests/browser/package.json` file. Both versions need to be updated together to maintain consistency between CI/CD pipeline and project dependencies.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: jiridanek's team uses containerized dependency locking for cross-platform compatibility in opendatahub-io/notebooks. They run `pipenv lock` inside UBI9 containers with specific platform arguments (`--platform=linux/amd64 --python-version 3.12`) to avoid host OS dependency conflicts when generating Pipfile.lock files.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb:24-30
Timestamp: 2025-07-01T06:57:26.974Z
Learning: In the opendatahub-io/notebooks repository, jiridanek's team relies on strict CI process enforcement to handle potentially fragile code patterns like hard-coded relative paths. They accept some code fragility when it's mitigated through reliable CI setup and process controls, rather than always requiring code-level fixes.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-30T19:44:57.656Z
Learning: In the opendatahub-io/notebooks repository, jiridanek effectively manages comprehensive PR reviews by tracking all feedback as follow-up GitHub issues rather than blocking merge of high-quality implementations. This demonstrates excellent project management practices for balancing current deliverables with future improvements.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml:11-17
Timestamp: 2025-07-01T06:50:37.115Z
Learning: jiridanek manages StatefulSet selector issues systematically across multiple images in opendatahub-io/notebooks. When the same configuration issue (empty spec.selector and template.metadata.labels) appears in different images like jupyter/minimal and jupyter/tensorflow, he tracks them under a single coordinated effort rather than creating duplicate issues for each affected image.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb:31-34
Timestamp: 2025-07-01T07:03:05.385Z
Learning: jiridanek demonstrates excellent pattern recognition for identifying duplicated code issues across the opendatahub-io/notebooks repository. When spotting a potential problem in test notebooks, he correctly assesses that such patterns are likely replicated across multiple similar files rather than being isolated incidents, leading to more effective systematic solutions.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-20T11:51:59.716Z
Learning: This project follows the practice of associating PRs with Jira tickets from https://issues.redhat.com for traceability between requirements, release process, and product documentation. This is critical for enterprise software development compliance and cross-team coordination.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1218
File: jupyter/trustyai/ubi9-python-3.11/Pipfile:49-49
Timestamp: 2025-06-28T14:15:41.168Z
Learning: TrustyAI's jupyter-bokeh was pinned to 3.0.5 due to compatibility requirements with TrustyAI's visualization components, but the actual deployed version in requirements.txt shows 3.0.7, indicating incremental testing. The upgrade to 4.0.5 in this PR represents the completion of a gradual migration strategy from the 3.x series after confirming compatibility with Bokeh 3.7.3.

Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/pod-patch.yaml:11-22
Timestamp: 2025-06-30T14:36:53.890Z
Learning: The pod-patch.yaml file in jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/ is used only for running tests, not production deployments. This affects the risk assessment for resource management configurations like sizeLimit on emptyDir volumes.

openshift-ci · 2025-07-03T03:18:21Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jiridanek. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…ccording to Documentation in TensorFlow Dockerfile

```
Error:
 Problem: package rocm-opencl-sdk-6.4.3.60403-128.el9.x86_64 from ROCm requires rocm-opencl-devel = 2.0.0.60403-128.el9, but none of the providers can be installed
  - package rocm-opencl-devel-2.0.0.60403-128.el9.x86_64 from ROCm requires rocm-opencl, but none of the providers can be installed
  - conflicting requests
  - nothing provides opencl-filesystem needed by rocm-opencl-5.4.4-1.el9.x86_64 from epel
  - nothing provides ocl-icd(x86-64) needed by rocm-opencl-5.4.4-1.el9.x86_64 from epel
  - nothing provides ocl-icd needed by rocm-opencl-2.0.0.60403-128.el9.x86_64 from ROCm
```

jiridanek · 2025-08-20T18:37:14Z

RHOAIENG-27434: create ROCm Tensorflow Python 3.12 Image #1259 (comment)

@EmilienM hopefully this is the last time we're doing something like this on our own

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (5)

jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml (2)
18-60: Missing minimal pod/container securityContext (tracked)

Hardening fields like runAsNonRoot, allowPrivilegeEscalation, and seccompProfile are missing. This is already tracked across images and managed centrally; no need to block this PR.

Reference: Issue #1212 (repository-wide pod security hardening).

12-16: Empty selector and template labels will be rejected by Kubernetes

The selector must match labels on the pod template. Use a non-empty matchLabels and mirror it on the template.

Apply:
-  selector: {}
+  selector:
+    matchLabels:
+      app: notebook
   serviceName: notebook
   template:
     metadata:
-      labels: {}
+      labels:
+        app: notebook
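
The selector rule can be checked mechanically once the manifest has been parsed into a dictionary. This is a minimal sketch under stated assumptions: the function name `selector_matches_template` is hypothetical, YAML parsing is left out, and only `matchLabels` is considered (`matchExpressions` is ignored).

```python
def selector_matches_template(statefulset: dict) -> bool:
    """True when spec.selector.matchLabels is non-empty and each pair is on the pod template."""
    spec = statefulset.get("spec") or {}
    match_labels = (spec.get("selector") or {}).get("matchLabels") or {}
    metadata = (spec.get("template") or {}).get("metadata") or {}
    template_labels = metadata.get("labels") or {}
    if not match_labels:
        return False
    return all(template_labels.get(key) == value for key, value in match_labels.items())


# The manifest as flagged in the review: empty selector and empty template labels.
broken = {"spec": {"selector": {}, "template": {"metadata": {"labels": {}}}}}

# The manifest after applying the suggested diff.
fixed = {
    "spec": {
        "selector": {"matchLabels": {"app": "notebook"}},
        "template": {"metadata": {"labels": {"app": "notebook"}}},
    }
}

print(selector_matches_template(broken), selector_matches_template(fixed))
```
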
jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb (3)

22-29: Improve load_expected_versions() error handling (tracked)

Lacks handling for missing file or invalid JSON. Already tracked for systematic fix across notebooks.

Reference: Issue #1266.
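
One possible shape for the hardened loader is sketched below. The signature, the file names, and the choice to return an empty mapping are assumptions for illustration; the notebook's actual helper may differ, and raising a clearer error instead of returning `{}` may suit a test suite better.

```python
import json
import tempfile
from pathlib import Path


def load_expected_versions(path: str) -> dict:
    """Load a JSON mapping of package names to versions; return {} if missing or malformed."""
    try:
        with Path(path).open(encoding="utf-8") as handle:
            data = json.load(handle)
    except FileNotFoundError:
        print(f"Expected-versions file not found: {path}")
        return {}
    except json.JSONDecodeError as error:
        print(f"Invalid JSON in {path}: {error}")
        return {}
    # Anything other than a JSON object is treated as unusable.
    return data if isinstance(data, dict) else {}


# Exercise the three cases with throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "expected_versions.json"
    good.write_text('{"TensorFlow": "2.18.1"}', encoding="utf-8")
    bad = Path(tmp) / "broken.json"
    bad.write_text("{not json", encoding="utf-8")

    loaded = load_expected_versions(str(good))
    malformed = load_expected_versions(str(bad))
    missing = load_expected_versions(str(Path(tmp) / "absent.json"))
```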

31-34: Guard get_expected_version() for missing keys (tracked)

Accessing a missing key leads to a TypeError in re.sub. Handle not-found and empty versions. Already tracked.

Reference: Issue #1243.
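
A guard of this kind could look like the sketch below. It assumes the helper trims a version string to major.minor with `re.sub`; the signature and the regular expression are illustrative and not taken from the notebook.

```python
import re
from typing import Optional


def get_expected_version(expected_versions: dict, name: str) -> Optional[str]:
    """Return the major.minor part of the recorded version, or None if missing or empty."""
    raw = expected_versions.get(name)
    if not raw:
        # Without this guard, re.sub would receive None and raise TypeError.
        return None
    # Keep the first two numeric components, e.g. "2.18.1" becomes "2.18".
    return re.sub(r"^(\d+\.\d+).*", r"\1", str(raw))


versions = {"TensorFlow": "2.18.1", "Empty": ""}
print(get_expected_version(versions, "TensorFlow"))  # 2.18
print(get_expected_version(versions, "NotListed"))   # None
```

The caller can then turn `None` into an explicit test failure or skip, rather than an unrelated `TypeError`.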

80-95: Verify TensorBoard logs are actually created (tracked)

The callback is used, but the test doesn’t check that event files were written. Prefer a TemporaryDirectory and assert non-empty contents. Already tracked.

Reference: Issue #1263.
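
The suggested check can be sketched without TensorFlow installed. The helper name `event_files` is hypothetical, and a placeholder file stands in for the training run; the sketch assumes event file names start with `events.out.tfevents.`, the prefix TensorBoard summary writers normally use.

```python
import tempfile
from pathlib import Path


def event_files(log_dir: str) -> list[Path]:
    """Return TensorBoard event files found anywhere under log_dir."""
    return sorted(Path(log_dir).rglob("events.out.tfevents.*"))


with tempfile.TemporaryDirectory() as log_dir:
    # In the real test, model.fit(..., callbacks=[tf.keras.callbacks.TensorBoard(log_dir=log_dir)])
    # would run here. A placeholder file stands in so the sketch runs on its own.
    train_dir = Path(log_dir) / "train"
    train_dir.mkdir()
    (train_dir / "events.out.tfevents.0.placeholder").write_bytes(b"\x00")

    found = [path.name for path in event_files(log_dir)]

print(found)
```

In the notebook, an `assertTrue(event_files(log_dir))` inside the `with` block would fail if the callback wrote nothing, and the temporary directory is cleaned up afterwards either way.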

🧹 Nitpick comments (5)

runtimes/rocm-tensorflow/ubi9-python-3.12/Pipfile (1)
13-15: Direct wheel reference is correct for AMD’s manylinux repo; drop redundant version to avoid drift

AMD’s repo is not a PyPI index, so using file=… with a hash is the right approach. Minor: keeping both version==… and file=… is redundant and risks divergence in the future. Suggest removing the version field and relying on the pinned wheel URL + hash.

Apply within this hunk:
- tensorflow-rocm = {version = "==2.18.1", file = "https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4/tensorflow_rocm-2.18.1-cp312-cp312-manylinux_2_28_x86_64.whl", hashes = ["sha256:814a5e4842b0c92d63c7a0bb4df0baf51ff2db8615535d83fe8958204b840598"]}
+ tensorflow-rocm = {file = "https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4/tensorflow_rocm-2.18.1-cp312-cp312-manylinux_2_28_x86_64.whl", hashes = ["sha256:814a5e4842b0c92d63c7a0bb4df0baf51ff2db8615535d83fe8958204b840598"]}
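
For context, the `hashes` entry lets the installer reject a wheel whose contents have changed. pip and pipenv perform this check themselves when hashes are pinned; the sketch below only illustrates the comparison, with a hypothetical helper name and an empty temporary file standing in for the downloaded wheel.

```python
import hashlib
import tempfile


def sha256_matches(path: str, expected: str) -> bool:
    """Compare a file's SHA-256 digest with an expected 'sha256:<hex>' or bare hex value."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower().removeprefix("sha256:")


# SHA-256 of zero bytes, used here because the stand-in file is empty.
EMPTY_SHA256 = "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

with tempfile.NamedTemporaryFile() as stand_in:
    matches = sha256_matches(stand_in.name, EMPTY_SHA256)
    mismatch = sha256_matches(stand_in.name, "sha256:" + "0" * 64)

print(matches, mismatch)  # True False
```
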
jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml (1)
23-31: Auth disabled in NOTEBOOK_ARGS — clarify test-only intent

Token/password are set to empty strings. If this manifest is used only in tests, consider adding an inline comment to avoid accidental reuse in production contexts.

Example:
         - name: NOTEBOOK_ARGS
           value: |-
+            # NOTE: Authentication intentionally disabled for test deployments.
             --ServerApp.port=8888
             --ServerApp.token=''
             --ServerApp.password=''

jupyter/rocm/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb (3)
15-15: Remove unused import

The tensorboard module is not used (the callback is referenced via tf.keras), and Ruff flags it (F401). Safe to remove.
-    "import tensorboard\n",

48-53: Stabilize tf2onnx conversion by providing an input signature

Without an input_signature, tf2onnx has to infer the input spec from the model, which can behave inconsistently across versions. Also, from_keras returns a (model_proto, external_tensor_storage) tuple, so unpack the result accordingly.
-    "        model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(10,))])\n",
-    "        onnx_model = tf2onnx.convert.from_keras(model)\n",
-    "        \n",
-    "        self.assertTrue(onnx_model is not None)\n",
+    "        model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(10,))])\n",
+    "        spec = (tf.TensorSpec((None, 10), tf.float32, name=\"input\"),)\n",
+    "        onnx_model, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=13)\n",
+    "        self.assertTrue(onnx_model is not None)\n",

66-69: Fix useless expression; add a minimal assertion

The bare “predictions” expression is a no-op (Ruff B018). Make it validate shape to turn this into an actual check.
-    "        predictions = model(x_train[:1]).numpy()\n",
-    "        predictions\n",
+    "        predictions = model(x_train[:1]).numpy()\n",
+    "        self.assertEqual(predictions.shape, (1, 10), \"Predictions should be (1, 10)\")\n",

Note: Broader assertions for accuracy/loss are already tracked separately (Issue #1261).

openshift-ci · 2025-08-20T20:56:53Z

@dibryant: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-12-pr-image-mirror	`ae422fe`	link	true	`/test notebook-rocm-jupyter-tf-ubi9-python-3-12-pr-image-mirror`
ci/prow/rocm-notebooks-py312-e2e-tests	`ae422fe`	link	true	`/test rocm-notebooks-py312-e2e-tests`
ci/prow/rocm-notebooks-e2e-tests	`26b399f`	link	true	`/test rocm-notebooks-e2e-tests`
ci/prow/rocm-runtimes-ubi9-e2e-tests	`26b399f`	link	true	`/test rocm-runtimes-ubi9-e2e-tests`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

jiridanek · 2025-08-21T07:59:37Z

Review done on slack, https://redhat-internal.slack.com/archives/C096ZR053RQ/p1755633454094739

ckhordiasma · 2025-08-21T19:08:44Z

/retest

jiridanek · 2025-08-22T10:20:00Z

/kf-build odh-workbench-jupyter-tensorflow-rocm-py312-ubi9-on-pull-request

openshift-ci bot requested review from atheo89 and daniellutz July 2, 2025 14:14

openshift-ci bot added size/xxl and removed size/xxl labels Jul 2, 2025

coderabbitai bot reviewed Jul 2, 2025

jiridanek requested changes Jul 2, 2025

openshift-ci bot assigned jiridanek Jul 2, 2025

dibryant force-pushed the rocm-27434 branch from dc5976f to 94f4737 Compare July 3, 2025 03:18

openshift-ci bot removed the size/xxl label Jul 3, 2025

jiridanek and others added 10 commits August 20, 2025 20:30

NO-JIRA: add ROCm TensorFlow wheel file to Pipfile directly

481b11e

Update Pipfile.lock for Python 3.12

ef5814d

NO-JIRA: normalize tf2onnx version formatting across Pipfiles

159766e

NO-JIRA: remove --build-constraints from pip install due to unnamed…

d1fd612

… requirements error

NO-JIRA: remove missing OpenCL ICD packages from ROCm TensorFlow Dock…

bbdad6e

…erfile as these aren't available on ubi9+epel9

NO-JIRA: add hash verification for TensorFlow ROCm wheel in Pipfiles

2467404

Update Pipfile.lock for Python 3.11

3faaf84

Update Pipfile.lock for Python 3.12

89e3244

NO-JIRA: update ROCm and AMDGPU versions to 6.4.3 in TensorFlow Docke…

70309d2

…rfile

jiridanek force-pushed the rocm-27434 branch from da911e7 to c83cdda Compare August 20, 2025 18:34

openshift-ci bot added size/xxl and removed size/xxl labels Aug 20, 2025

NO-JIRA: placate hadolint with an empty line

26b399f

```
./jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm:84:1 unexpected 'R' expecting a new line followed by the next instruction
```

openshift-ci bot added size/xxl and removed size/xxl labels Aug 20, 2025

coderabbitai bot reviewed Aug 20, 2025

openshift-ci bot added size/xxl and removed size/xxl labels Aug 20, 2025

jiridanek merged commit 93150ff into opendatahub-io:main Aug 21, 2025
21 of 27 checks passed

jiridanek pushed a commit that referenced this pull request Aug 21, 2025

RHOAIENG-27434: create ROCm Tensorflow Python 3.12 Image (#1259)

406690c

This was referenced Aug 27, 2025

RHAIENG-304: add uv pyproject.tomls for multiple notebooks and runtimes across Jupyter, RStudio, and VSCode #2145

Merged

RHAIENG-304, RHAIENG-785: remove all requirements.txt and Pipfile.lock #2182

Merged
