Conversation

@AaruniAggarwal (Contributor) commented Sep 16, 2025

Currently, image builds are failing on IBM Power due to issues with the pyarrow installation, so this updates the Dockerfile to install pyarrow in a separate builder stage for ppc64le.
Other architectures are not affected, and the image builds fine locally.

Summary by CodeRabbit

  • New Features
    • PyArrow 17.0.0 is now preinstalled in the PPC64LE datascience Python 3.12 image, enabling Arrow-based workflows out of the box.
    • Rust and Cargo are available on PPC64LE, allowing installation/build of native Python packages that require Rust.
  • Chores
    • Improved wheel-based installation flow in the PPC64LE image for faster, more reliable package setup.
    • General image cleanup to reduce temporary artifacts.

coderabbitai bot (Contributor) commented Sep 16, 2025

Walkthrough

Added Rust tools for PPC64LE, introduced a new arrow-builder stage to build PyArrow 17.0.0 wheels, copied the wheels into the runtime, updated the PPC64LE wheel install to include PyArrow, cleaned up build artifacts, exported PYARROW_VERSION in the PPC64LE env script, and duplicated the arrow-builder stage in the Dockerfile.

Changes

Cohort / File(s): Summary

Dockerfile updates (PPC64LE build and runtime)
runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
- Added rust and cargo to PPC64LE OS packages
- Introduced arrow-builder stage to clone/build Apache Arrow and produce PyArrow wheels in /arrowwheels (PYARROW_VERSION=17.0.0)
- Copied wheels into runtime: COPY --from=arrow-builder /arrowwheels /tmp/arrowwheels
- Installed ONNX and PyArrow wheels in PPC64LE runtime
- Extended cleanup to remove /tmp/arrowwheels
- Note: duplicate arrow-builder stage blocks present

PPC64LE environment script
.../ppc64le.sh
- Exported PYARROW_VERSION=17.0.0 to propagate version into runtime builds
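
A minimal sketch of the builder-stage pattern summarized above (not the PR's exact Dockerfile; the arrow-builder stage name and /arrowwheels path come from the walkthrough, while the base images, package list, and build steps are illustrative assumptions):

    ARG PYARROW_VERSION=17.0.0

    FROM registry.access.redhat.com/ubi9/python-312 AS arrow-builder
    ARG PYARROW_VERSION
    USER 0
    # Toolchain for compiling Arrow C++ and the pyarrow wheel (illustrative list)
    RUN dnf install -y git cmake ninja-build gcc-c++ && dnf clean all
    RUN git clone --depth 1 -b apache-arrow-${PYARROW_VERSION} \
            https://github.com/apache/arrow.git --recursive && \
        mkdir /arrowwheels
    # ... build Arrow C++ and the pyarrow wheel here, dropping the .whl into /arrowwheels ...

    FROM registry.access.redhat.com/ubi9/python-312 AS runtime
    # Only the built wheels cross the stage boundary; the toolchain stays behind
    COPY --from=arrow-builder /arrowwheels /tmp/arrowwheels
    RUN pip install --no-cache-dir /tmp/arrowwheels/*.whl && rm -rf /tmp/arrowwheels

The payoff of the pattern is that a pyarrow source-build failure surfaces in the builder stage, and the final image only ever sees prebuilt wheels, never the compilers.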

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit's advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit's agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Up to 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Pre-merge checks

❌ Failed checks (1 warning)
Description Check: ⚠️ Warning
Explanation: The PR description states the problem and the high-level fix (installing PyArrow in a separate builder stage for ppc64le) but does not follow the repository's required template: it lacks a full "Description" with file-level details and a complete "How Has This Been Tested?" section with exact commands, environment, and test outputs, and the self-checklist/merge-criteria items are not addressed. Because those required template sections are missing or incomplete, the description is insufficient for reviewers to validate testing and merge readiness.
Resolution: Update the PR body to match the repository template by expanding the Description to list changed files and rationale, adding a "How Has This Been Tested?" section with platform details, exact build/test commands, and results or links to logs, and completing the self-checklist and merge-criteria items (or explaining exceptions); also include explicit testing instructions for reviewers and confirm whether the test suite (e.g., make test) was run and commits are ready/squashed for merge.
✅ Passed checks (2 passed)
Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.
Title Check: ✅ Passed. The title clearly and accurately summarizes the primary change: it identifies the affected architecture (ppc64le), the target component (runtime/datascience), the problem (pyarrow installation brought in by feast), and the implemented fix (adding a dedicated builder stage), so a reviewer scanning history will understand the main intent. The phrasing is concise and specific rather than vague or noisy. This aligns with the changes shown in the Dockerfile and PR description.

@github-actions github-actions bot added the review-requested label (GitHub Bot creates notification on #pr-review-ai-ide-team slack channel) Sep 16, 2025
@openshift-ci openshift-ci bot added the size/m label Sep 16, 2025
@AaruniAggarwal (Contributor, Author)

Updated with similar changes here: red-hat-data-services#1538 (comment), and the images are building fine on ppc64le through Konflux.

[screenshot: konflux-runtime-datascience-pyarrow]

@AaruniAggarwal (Contributor, Author)

@jiridanek, could you please review the PR?
Thanks

@openshift-ci openshift-ci bot added size/m and removed size/m labels Sep 16, 2025
@AaruniAggarwal AaruniAggarwal changed the title add pyarrow installation for ppc64le Fix runtime-datascience Dockerfile for ppc64le Sep 16, 2025
@openshift-ci openshift-ci bot added size/m and removed size/m labels Sep 16, 2025
coderabbitai bot (Contributor) left a comment


Actionable comments posted: 1

🧹 Nitpick comments (7)
runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu (7)

55-58: Don’t persist build-only variables in /etc/profile.d; remove PYARROW_VERSION export here.

Per prior guidance, keep versioning as ARG/ENV within relevant stages. Persisting to profile scripts adds drift and unreliable sourcing at build time.

Apply this minimal diff:

-        echo 'export PYARROW_VERSION=17.0.0' >> /etc/profile.d/ppc64le.sh && \

Optionally also drop the cargo PATH export for ppc64le since rust is installed via dnf, not rustup.
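
For illustration, a minimal sketch of the ARG/ENV-scoped alternative (the stage and variable names match the PR; the base image and clone step are assumptions):

    # Declared once at the top of the Dockerfile...
    ARG PYARROW_VERSION=17.0.0

    FROM registry.access.redhat.com/ubi9/python-312 AS arrow-builder
    # ...and re-declared in each stage that needs it, since a global ARG
    # is not visible inside a stage without this line
    ARG PYARROW_VERSION
    RUN git clone --depth 1 -b apache-arrow-${PYARROW_VERSION} https://github.com/apache/arrow.git

This keeps the version pinned and visible at build time without writing it to /etc/profile.d, which only takes effect in login shells.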


216-223: Tighten clone and pip; add shallow clone and no-cache.

Reduce network and layer size; make installs reproducible.

-        git clone -b apache-arrow-${PYARROW_VERSION} https://github.com/apache/arrow.git --recursive && \
+        git clone --depth 1 -b apache-arrow-${PYARROW_VERSION} https://github.com/apache/arrow.git --recursive && \
     	cd arrow && rm -rf .git && mkdir dist                                  && \
-    	pip3 install -r python/requirements-build.txt                          && \
+    	pip3 install --no-cache-dir -r python/requirements-build.txt           && \

225-243: Ensure codec dependencies are available; prefer BUNDLED deps and cpu-parallelism from nproc.

Without -DARROW_DEPENDENCY_SOURCE=BUNDLED, enabling ZSTD/LZ4/Snappy/Brotli can fail if system -devel libs are missing. Also replace fixed parallelism with nproc.

-    	cmake -S cpp -B cpp/build                                                 \
+    	cmake -S cpp -B cpp/build                                                 \
             -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
             -DCMAKE_BUILD_TYPE=release         \
+            -DARROW_DEPENDENCY_SOURCE=BUNDLED  \
             -DARROW_WITH_BZ2=ON                \
             -DARROW_WITH_ZLIB=ON               \
             -DARROW_WITH_ZSTD=ON               \
             -DARROW_WITH_LZ4=ON                \
             -DARROW_WITH_SNAPPY=ON             \
             -DARROW_WITH_BROTLI=ON             \
@@
-        make -j20 install                   && \
-        export PYARROW_PARALLEL=20          && \
+        make -j"$(nproc)" install           && \
+        export PYARROW_PARALLEL="$(nproc)"  && \

249-254: Prefer pip -m invocation and disable cache.

Slightly safer and avoids wheel cache growth in the layer.

-        pip3 install wheel                  && \
+        python -m pip install --no-cache-dir wheel && \

203-258: Consider BuildKit cache mounts for faster rebuilds.

Use pip cache mount in the arrow-builder RUN to speed iterative builds.

Example:

-RUN if [ "$TARGETARCH" = "ppc64le" ]; then \
+RUN --mount=type=cache,target=/root/.cache/pip \
+    if [ "$TARGETARCH" = "ppc64le" ]; then \
         ...
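
A fuller sketch of the cache-mount idea, assuming BuildKit is enabled and pip's default cache location when running as root:

    # Requires BuildKit; the `# syntax=docker/dockerfile:1` directive at the
    # top of the Dockerfile enables this mount syntax.
    RUN --mount=type=cache,target=/root/.cache/pip \
        pip3 install -r python/requirements-build.txt
    # The mounted cache persists on the builder across builds, so repeated
    # installs reuse downloaded wheels without baking them into the image layer.

Note that cache mounts and --no-cache-dir pull in opposite directions: with a cache mount you generally want pip's cache enabled so later builds can reuse it.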

286-294: Use --no-cache-dir on wheel installs and guard empty globs.

Prevents pip cache bloat; also avoid failure when no wheels exist (future-proofing).

-    	echo "Installing ppc64le ONNX, pyarrow wheels and OpenBLAS..." && \
-    	HOME=/root pip install /tmp/onnx_wheels/*.whl /tmp/arrowwheels/*.whl && \
+    	echo "Installing ppc64le ONNX, pyarrow wheels and OpenBLAS..." && \
+    	shopt -s nullglob && \
+    	HOME=/root pip install --no-cache-dir /tmp/onnx_wheels/*.whl /tmp/arrowwheels/*.whl && \

169-173: wget progress and reproducibility nit.

Add a progress flag for CI logs and consider pinning a checksum.

-        wget https://github.com/OpenMathLib/OpenBLAS/releases/download/v${OPENBLAS_VERSION}/OpenBLAS-${OPENBLAS_VERSION}.zip && \
+        wget --progress=dot:giga https://github.com/OpenMathLib/OpenBLAS/releases/download/v${OPENBLAS_VERSION}/OpenBLAS-${OPENBLAS_VERSION}.zip && \
+        echo "<sha256sum>  OpenBLAS-${OPENBLAS_VERSION}.zip" | sha256sum -c - && \

Replace with the official checksum.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e4c866 and 4ffa4ef.

📒 Files selected for processing (1)
  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu (4 hunks)
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2460
File: jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu:206-221
Timestamp: 2025-09-16T10:39:23.295Z
Learning: jiridanek requested GitHub issue creation for OpenBLAS installation staging during ppc64le builds in jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu during PR #2460 review. Issue #2466 was created addressing permission errors where OpenBLAS make install fails when attempting to write to /usr/local system paths from USER 1001 context in final stage, proposing DESTDIR staging pattern to build and install OpenBLAS artifacts within openblas-builder stage then COPY pre-installed files to final stage, with comprehensive problem description covering specific permission denied errors, detailed technical solution with code examples, clear acceptance criteria for build reliability and multi-architecture compatibility, and proper context linking to PR #2460 review comment, continuing the systematic infrastructure improvement tracking methodology for Power architecture support.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1513
File: runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu:104-108
Timestamp: 2025-09-05T10:07:53.476Z
Learning: jiridanek requested GitHub issue creation for Arrow codec configuration problem during PR #1513 review. Issue #2305 was created addressing disabled core Arrow codecs (LZ4, Zstd, Snappy) in s390x pyarrow build that prevents reading compressed Parquet/Arrow datasets. The issue includes comprehensive problem description covering data compatibility impact, detailed solution enabling codecs with BUNDLED dependencies, clear acceptance criteria for functionality verification, and proper context linking to PR #1513 review comment, assigned to jiridanek.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2432
File: jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu:102-159
Timestamp: 2025-09-12T08:20:45.369Z
Learning: jiridanek requested GitHub issue creation for pyarrow build reproducibility improvement during PR #2432 review. Issue #2433 was created addressing HEAD builds causing reproducibility issues in s390x wheel-builder stage, proposing pinned apache-arrow-20.0.0 tag to match pylock.toml specification, with comprehensive problem description covering version mismatch risks, detailed solution with implementation steps, clear acceptance criteria for build consistency verification, and proper context linking to PR #2432 review comment, continuing the established pattern of systematic infrastructure improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-08-05T17:24:08.616Z
Learning: jiridanek requested PR review for #1521 covering s390x architecture support improvements, demonstrating continued focus on systematic multi-architecture compatibility enhancements in the opendatahub-io/notebooks repository through clean implementation with centralized configuration, proper CI integration, and architecture-aware testing patterns.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu:0-0
Timestamp: 2025-09-05T11:27:31.040Z
Learning: jiridanek requested GitHub issue creation for build toolchain optimization in datascience runtime during PR #2215 review. Issue #2308 was created addressing unnecessary build dependencies (gcc-toolset-13, cmake, ninja-build, rust, cargo) in final runtime image for ppc64le architecture, covering comprehensive problem analysis with specific line numbers, multiple solution options for builder-only toolchains, clear acceptance criteria for size reduction and security improvement, detailed implementation guidance for package segregation, and proper context linking to PR #2215 review comment, continuing the established pattern of systematic infrastructure improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2317
File: codeserver/ubi9-python-3.12/get_code_server_rpm.sh:31-37
Timestamp: 2025-09-05T13:16:48.754Z
Learning: jiridanek requested GitHub issue creation for build tools installation unification across builder images during PR #2317 review. Issue #2322 was created addressing inconsistent build dependency management patterns across different builder images, proposing multiple solution approaches including Development Tools group installation, centralized configuration, and layered approaches, with comprehensive acceptance criteria covering auditing, standardization, regression prevention, and multi-architecture support (x86_64, ppc64le, aarch64, s390x), continuing the established pattern of systematic infrastructure improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1179
File: jupyter/utils/install_pandoc.sh:1-1
Timestamp: 2025-09-05T07:46:50.781Z
Learning: jiridanek requested GitHub issue creation during PR #1179 review to explore installing Pandoc from EPEL repository for ppc64le architecture as an alternative to building from source, noting that EPEL packages are acceptable unlike CentOS Stream in red-hat-data-services/notebooks. Issue #2281 was successfully created with comprehensive problem description covering build complexity concerns, multiple solution options, clear acceptance criteria, and proper context linking, continuing the established pattern of systematic infrastructure improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1720
File: jupyter/pytorch+llmcompressor/ubi9-python-3.11/requirements.txt:2659-2680
Timestamp: 2025-08-07T12:41:48.997Z
Learning: For opendatahub-io/notebooks, rpds-py 0.27.0 provides manylinux wheels for Python 3.11 and 3.12 on x86_64 and aarch64, so no Rust build is needed for these platforms. For s390x and ppc64le, wheels are not available, so a Rust build stage or version pinning is required if those images are built.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2432
File: jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu:232-249
Timestamp: 2025-09-12T08:27:00.439Z
Learning: jiridanek requested GitHub issue creation for Rust toolchain availability during s390x builds in jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu during PR #2432 review. Issue #2435 was created addressing PATH availability problems where Rust/cargo installed in cpu-base stage at /opt/.cargo/bin may not be accessible during uv pip install step in jupyter-datascience stage, proposing three solution approaches: immediate environment variable fix, builder stage pattern following codeserver approach, and ENV declaration fix, with comprehensive acceptance criteria covering build reliability, multi-architecture compatibility, and alignment with established patterns, continuing the systematic infrastructure improvement tracking methodology.
📚 Learning: 2025-09-12T08:27:00.439Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2432
File: jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu:232-249
Timestamp: 2025-09-12T08:27:00.439Z
Learning: jiridanek requested GitHub issue creation for Rust toolchain availability during s390x builds in jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu during PR #2432 review. Issue #2435 was created addressing PATH availability problems where Rust/cargo installed in cpu-base stage at /opt/.cargo/bin may not be accessible during uv pip install step in jupyter-datascience stage, proposing three solution approaches: immediate environment variable fix, builder stage pattern following codeserver approach, and ENV declaration fix, with comprehensive acceptance criteria covering build reliability, multi-architecture compatibility, and alignment with established patterns, continuing the systematic infrastructure improvement tracking methodology.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-09-05T12:10:50.856Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu:0-0
Timestamp: 2025-09-05T12:10:50.856Z
Learning: jiridanek requested GitHub issue creation for Dockerfile environment variable refactoring during PR #2215 review. Issue #2311 was created addressing build-only variables (OPENBLAS_VERSION, ONNX_VERSION, GRPC_PYTHON_BUILD_SYSTEM_OPENSSL) being unnecessarily written to /etc/profile.d/ppc64le.sh in runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu, causing variable duplication across stages, unreliable sourcing in non-login build contexts, and violation of DRY principles. The issue includes comprehensive problem description covering affected lines 30-37, detailed impact analysis of build reliability and maintenance overhead, three solution options with centralized ARG/ENV approach as recommended, clear acceptance criteria for version centralization and build-only variable cleanup, and specific implementation guidance with code examples, assigned to jiridanek, continuing the established pattern of systematic infrastructure improvements through detailed issue tracking.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-09-05T12:25:09.719Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2227
File: codeserver/ubi9-python-3.12/Dockerfile.cpu:122-123
Timestamp: 2025-09-05T12:25:09.719Z
Learning: jiridanek requested GitHub issue creation for Docker multi-stage synchronization improvement in codeserver/ubi9-python-3.12/Dockerfile.cpu during PR #2227 review. The issue addresses sentinel file pattern using /tmp/control copied to /dev/null for stage coordination between rpm-base, whl-cache, and codeserver stages, proposing semantic improvements with descriptive file names, inline documentation, and elimination of /dev/null hack while maintaining multi-architecture build functionality for ppc64le support.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-09-05T11:27:31.040Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2215
File: runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu:0-0
Timestamp: 2025-09-05T11:27:31.040Z
Learning: jiridanek requested GitHub issue creation for build toolchain optimization in datascience runtime during PR #2215 review. Issue #2308 was created addressing unnecessary build dependencies (gcc-toolset-13, cmake, ninja-build, rust, cargo) in final runtime image for ppc64le architecture, covering comprehensive problem analysis with specific line numbers, multiple solution options for builder-only toolchains, clear acceptance criteria for size reduction and security improvement, detailed implementation guidance for package segregation, and proper context linking to PR #2215 review comment, continuing the established pattern of systematic infrastructure improvements through detailed issue tracking.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-09-16T10:39:23.295Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2460
File: jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu:206-221
Timestamp: 2025-09-16T10:39:23.295Z
Learning: jiridanek requested GitHub issue creation for OpenBLAS installation staging during ppc64le builds in jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu during PR #2460 review. Issue #2466 was created addressing permission errors where OpenBLAS make install fails when attempting to write to /usr/local system paths from USER 1001 context in final stage, proposing DESTDIR staging pattern to build and install OpenBLAS artifacts within openblas-builder stage then COPY pre-installed files to final stage, with comprehensive problem description covering specific permission denied errors, detailed technical solution with code examples, clear acceptance criteria for build reliability and multi-architecture compatibility, and proper context linking to PR #2460 review comment, continuing the systematic infrastructure improvement tracking methodology for Power architecture support.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-08-07T12:41:48.997Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1720
File: jupyter/pytorch+llmcompressor/ubi9-python-3.11/requirements.txt:2659-2680
Timestamp: 2025-08-07T12:41:48.997Z
Learning: For opendatahub-io/notebooks, rpds-py 0.27.0 provides manylinux wheels for Python 3.11 and 3.12 on x86_64 and aarch64, so no Rust build is needed for these platforms. For s390x and ppc64le, wheels are not available, so a Rust build stage or version pinning is required if those images are built.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-08-27T14:49:24.112Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2145
File: runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda:152-159
Timestamp: 2025-08-27T14:49:24.112Z
Learning: jiridanek requested GitHub issue creation for CUDA version alignment in pytorch+llmcompressor runtime during PR #2145 review. Issue #2148 was created addressing the mismatch between Dockerfile CUDA 12.6 and pylock.toml cu124 PyTorch wheels. The issue includes comprehensive problem description covering affected files (runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda and pylock.toml), detailed solution with PyTorch index URL update from cu124 to cu126, lock regeneration steps using uv, clear acceptance criteria for wheel alignment verification, and proper context linking to PR #2145 review comment, assigned to jiridanek.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-09-05T12:29:07.819Z
Learnt from: jiridanek
PR: opendatahub-io/notebooks#2227
File: codeserver/ubi9-python-3.12/Dockerfile.cpu:218-218
Timestamp: 2025-09-05T12:29:07.819Z
Learning: jiridanek requested GitHub issue creation for uv multi-stage Docker build architectural investigation during PR #2227 review. The current implementation uses a three-stage build with whl-cache stage for wheel building/caching, base stage for OS setup, and final codeserver stage for offline installation using --offline flag and cache mounts. The pattern separates build phase (internet access, build tools) from install phase (offline, faster) while supporting multi-architecture builds (x86_64, ppc64le) with sentinel file coordination using /tmp/control files.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
📚 Learning: 2025-07-04T17:08:02.399Z
Learnt from: grdryn
PR: opendatahub-io/notebooks#1320
File: rstudio/rhel9-python-3.11/Dockerfile.cuda:40-42
Timestamp: 2025-07-04T17:08:02.399Z
Learning: In the opendatahub-io/notebooks repository, when using multi-architecture Dockerfiles with BuildKit, the implicit build argument TARGETARCH is automatically available in the global scope for FROM instructions without explicit declaration. However, if TARGETARCH is used within a build stage, it must be declared explicitly within that stage. The current placement pattern (declaring ARG TARGETARCH after FROM instructions that use it) is correct for modern Docker/Podman/Buildah environments and does not require compatibility with older Docker versions.

Applied to files:

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
🪛 GitHub Actions: Code static analysis
runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu

[warning] 21-21: DL3022: 'COPY --from' should reference a previously defined 'FROM' alias
[warning] 50-50: SC2016: Expressions don't expand in single quotes, use double quotes for that.
[warning] 61-61: SC2016: Expressions don't expand in single quotes, use double quotes for that.
[warning] 104-104: DL3002: Last USER should not be root
[warning] 159-159: DL3002: Last USER should not be root
[warning] 169-169: DL3047: Avoid use of wget without progress bar. Use wget --progress=dot:giga <url>.
[warning] 183-183: DL3002: Last USER should not be root
[warning] 191-191: SC2155: Declare and assign separately to avoid masking return values.
[warning] 216-216: DL3042: Avoid use of cache directory with pip. Use pip install --no-cache-dir <package>.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build (runtime-datascience-ubi9-python-3.12, 3.12, linux/s390x, false) / build
  • GitHub Check: build (runtime-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-datascience-cpu-py312-ubi9-on-pull-request
🔇 Additional comments (2)
runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu (2)

51-58: SC2016 warnings are intentional here.

Single quotes prevent expansion at build-time; profile scripts should expand at runtime. No change needed.

If ShellCheck remains noisy, consider adding a brief comment above explaining the intent.
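
For illustration (hypothetical file name, not from this PR), the behavior SC2016 flags:

    # Single quotes: the string is written literally, and $HOME expands
    # later, when the profile script is sourced at login
    RUN echo 'export PATH=$HOME/.local/bin:$PATH' >> /etc/profile.d/example.sh

    # Double quotes: $HOME expands now, at build time, baking in the
    # builder's value (usually /root)
    RUN echo "export PATH=$HOME/.local/bin:$PATH" >> /etc/profile.d/example.sh

Here the single-quoted form is the intended behavior, so the warning can be suppressed or ignored.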


214-214: Resolved — PYARROW_VERSION already matches pylock.toml (17.0.0).
runtimes/datascience/ubi9-python-3.12/pylock.toml pins pyarrow = "17.0.0", matching the builder variable.

openshift-ci bot (Contributor) commented Sep 16, 2025

@AaruniAggarwal: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

  • ci/prow/runtime-ds-ubi9-python-3-12-pr-image-mirror (commit 4ffa4ef, link, required): rerun with /test runtime-ds-ubi9-python-3-12-pr-image-mirror
  • ci/prow/images (commit 4ffa4ef, link, required): rerun with /test images

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@AaruniAggarwal (Contributor, Author)

/assign @jiridanek

@jiridanek jiridanek changed the title Fix runtime-datascience Dockerfile for ppc64le ppc64le(runtime/datascience): fix pyarrow installation (brought in by feast) by adding a dedicated builder stage Sep 16, 2025
@openshift-ci openshift-ci bot added size/m and removed size/m labels Sep 16, 2025
@openshift-ci openshift-ci bot added the lgtm label Sep 16, 2025
openshift-ci bot (Contributor) commented Sep 16, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jiridanek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiridanek jiridanek merged commit cb68d7e into opendatahub-io:main Sep 16, 2025
11 of 19 checks passed

Labels

approved, lgtm, review-requested (GitHub Bot creates notification on #pr-review-ai-ide-team slack channel), size/m

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants