Optimize datascience runtime image by removing build-only dependencies from final stage #2308

@coderabbitai

Problem Description

The datascience runtime Dockerfile currently installs build toolchains (gcc-toolset-13, cmake, ninja-build, rust, cargo) directly into the final runtime image for the ppc64le architecture. These packages are needed only while compiling OpenBLAS and ONNX, yet they remain in the production runtime image, where they increase image size and attack surface for no benefit.

Affected Files

  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu
    • Lines 18-28: Base stage package installation
    • Lines 57-76: openblas-builder stage
    • Lines 81-101: onnx-builder stage
    • Lines 124-136: Runtime stage artifact copying
    • Lines 143-155: Runtime stage package installation

Current Issues

  1. Increased Image Size: Build toolchains (roughly 500 MB or more) are included in the runtime image unnecessarily
  2. Security Exposure: Compilers and build tools are available in the production environment
  3. Architectural Inconsistency: Build dependencies are mixed with runtime dependencies

Solution Options

Option 1: Builder-Only Toolchains (Recommended)

  • Install build toolchains only in openblas-builder and onnx-builder stages
  • Keep only runtime libraries in final image (mesa-libGL, skopeo, libxcrypt-compat, unixODBC runtime)
  • Copy built artifacts (OpenBLAS to /usr/local, ONNX wheels) into runtime stage
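A minimal sketch of how Option 1 might look, not the actual contents of Dockerfile.cpu. The stage names openblas-builder and onnx-builder and the package lists come from this issue. The base image default, the /wheels directory, the `runtime` stage name, and the final `USER 1001` are assumptions made for illustration, and the OpenBLAS and ONNX build commands are elided.

```dockerfile
# Assumption: placeholder base image; Dockerfile.cpu defines its own base stage.
ARG BASE_IMAGE=registry.access.redhat.com/ubi9/python-312:latest

FROM ${BASE_IMAGE} AS openblas-builder
# TARGETARCH is populated automatically by BuildKit-style builders.
ARG TARGETARCH
USER 0
# The toolchain is installed in this stage only, and only for ppc64le.
RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
        dnf install -y gcc-toolset-13 cmake ninja-build git wget unzip && \
        dnf clean all; \
    fi
# OpenBLAS build and install into /usr/local elided.

FROM ${BASE_IMAGE} AS onnx-builder
ARG TARGETARCH
USER 0
RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
        dnf install -y gcc-toolset-13 cmake ninja-build rust cargo git && \
        dnf clean all; \
    fi
# Hypothetical output directory for built wheels; the wheel build is elided.
RUN mkdir -p /wheels

FROM ${BASE_IMAGE} AS runtime
USER 0
# Runtime libraries only; no compilers or build tools.
RUN dnf install -y mesa-libGL skopeo libxcrypt-compat unixODBC && \
    dnf clean all
# Bring in the built artifacts from the builder stages.
COPY --from=openblas-builder /usr/local/ /usr/local/
COPY --from=onnx-builder /wheels/ /tmp/wheels/
# Install wheels only if the builder produced any (other architectures may not).
RUN if ls /tmp/wheels/*.whl >/dev/null 2>&1; then \
        pip install --no-cache-dir /tmp/wheels/*.whl; \
    fi && \
    rm -rf /tmp/wheels
# Assumption: the image normally runs as an unprivileged user.
USER 1001
```

Because the runtime stage starts from the base image rather than from a builder stage, nothing installed in the builders reaches the final image unless it is copied explicitly.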

Option 2: Multi-Stage Segregation

  • Create dedicated build environment stage for ppc64le
  • Segregate all development packages to builder stages
  • Install only runtime equivalents in final stage

Option 3: Conditional Runtime Cleanup

  • Install build dependencies conditionally
  • Remove build toolchains after compilation in same RUN layer
  • Keep runtime libraries for ongoing operation
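For comparison, Option 3 could be sketched roughly as follows. This is an illustration under assumptions, not the project's Dockerfile: the base image default is a placeholder, and the build commands are replaced by a no-op. The key point is that install, build, and removal share one RUN instruction, since packages removed in a later layer would still occupy space in the earlier one.

```dockerfile
# Assumption: placeholder base image; Dockerfile.cpu defines its own base stage.
ARG BASE_IMAGE=registry.access.redhat.com/ubi9/python-312:latest

FROM ${BASE_IMAGE}
ARG TARGETARCH
USER 0
# Install, build, and remove in a single layer so the toolchain never persists.
RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
        dnf install -y gcc-toolset-13 cmake ninja-build rust cargo && \
        : "placeholder: OpenBLAS and ONNX build steps go here" && \
        dnf remove -y gcc-toolset-13 cmake ninja-build rust cargo && \
        dnf autoremove -y && \
        dnf clean all; \
    fi
# Runtime libraries stay installed for ongoing operation.
RUN dnf install -y mesa-libGL skopeo libxcrypt-compat unixODBC && \
    dnf clean all
# Assumption: the image normally runs as an unprivileged user.
USER 1001
```

This keeps a single stage but is easier to get wrong than Option 1: any dependency that `dnf remove` leaves behind stays in the image, and each rebuild repeats the toolchain install.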

Acceptance Criteria

Core Requirements

  • Build toolchains (gcc-toolset-13, cmake, ninja-build, rust, cargo) removed from final runtime image
  • Runtime libraries (mesa-libGL, skopeo, libxcrypt-compat, unixODBC) preserved in final image
  • OpenBLAS and ONNX functionality maintained through proper artifact copying
  • Successful multi-architecture builds for all supported platforms (x86_64, aarch64, ppc64le, s390x)

Verification Requirements

  • Final image size reduction measurable (target: ~500MB smaller)
  • No build tools accessible in runtime container (gcc, cmake, cargo commands unavailable)
  • OpenBLAS and ONNX libraries functional in runtime environment
  • All existing functionality tests pass
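One way to check the "no build tools" and "libraries functional" criteria is a small smoke-test image built on top of the candidate runtime image. The sketch below is an assumption-laden example: the `RUNTIME_IMAGE` default is a hypothetical tag, and the `import onnx` check assumes the ONNX Python package is expected to be installed in the runtime.

```dockerfile
# Hypothetical tag; pass the real one with --build-arg RUNTIME_IMAGE=...
ARG RUNTIME_IMAGE=localhost/datascience-runtime:candidate

FROM ${RUNTIME_IMAGE}
# Fail the build if any build tool is still on PATH in the runtime image.
RUN for tool in gcc cmake cargo; do \
        if command -v "$tool" >/dev/null 2>&1; then \
            echo "unexpected build tool in runtime image: $tool" >&2; \
            exit 1; \
        fi; \
    done
# Assumption: onnx is importable when the wheel was installed correctly.
RUN python -c "import onnx"
```

Building this file succeeds only when the checks pass, so it can run as a CI step after the main image build.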

Implementation Guidance

Package Segregation Strategy

  1. Builder Stages Only: gcc-toolset-13, cmake, ninja-build, rust, cargo, git, wget, unzip
  2. Runtime Stage Only: mesa-libGL, skopeo, libxcrypt-compat, unixODBC (runtime components)
  3. Artifact Copying: OpenBLAS binaries to /usr/local, ONNX wheels for pip installation

Testing Approach

  • Build all architecture variants to ensure no regressions
  • Measure image size differences pre/post optimization
  • Validate runtime functionality with sample ML workloads
  • Verify security posture improvement through container scanning

Context
