onboard universal training images to ODH konflux-central#184

Merged
alexxfan merged 4 commits into main from onboard-universal-training-images
Mar 5, 2026

Conversation


@alexxfan alexxfan commented Mar 5, 2026

Description

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Chores
    • Added automated PipelineRun configurations to enable multi-architecture container builds for distributed workloads (CPU, CUDA, ROCm). Pipelines are wired to trigger on pull requests and pushes, standardizing build parameters and auth so hardware-specific images are built and published automatically, improving CI automation for model/runtime images.


coderabbitai bot commented Mar 5, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 4315119f-8cb1-4a86-9f8a-76f562e4eb77

📥 Commits

Reviewing files that changed from the base of the PR and between f1748e0 and f75d332.

📒 Files selected for processing (6)
  • pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-pull-request.yaml
  • pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-push.yaml
  • pipelineruns/distributed-workloads/odh-th06-cuda130-torch291-py312-pull-request.yaml
  • pipelineruns/distributed-workloads/odh-th06-cuda130-torch291-py312-push.yaml
  • pipelineruns/distributed-workloads/odh-th06-rocm64-torch291-py312-pull-request.yaml
  • pipelineruns/distributed-workloads/odh-th06-rocm64-torch291-py312-push.yaml

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting


📝 Walkthrough

Walkthrough

Adds six new Tekton PipelineRun YAML manifests under pipelineruns/distributed-workloads for multi-arch container builds targeting CPU, CUDA 13.0, and ROCm 6.4 variants. Each variant includes both pull-request and push trigger manifests. Files declare metadata, pipeline parameters (git-url, revision, output-image, dockerfile, path-context, pipeline-type, additional-tags), a git resolver referencing opendatahub-io/odh-konflux-central.git (pathInRepo: pipeline/multi-arch-container-build.yaml), taskRunTemplate with a serviceAccountName, and a git-auth workspace backed by a secret. Annotations include pipelines-as-code settings and CEL expressions for trigger conditions. All changes are additive (≈305 lines across six files).
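The walkthrough says that each of the three variants ships both a pull-request and a push manifest. A minimal sketch of how that pairing could be checked locally is shown here; the file names are taken from the list in this review, while the helper name `check_trigger_pairs` is hypothetical and not part of the PR.

```shell
# Sketch: confirm that every hardware variant has both trigger manifests.
# check_trigger_pairs is a hypothetical helper; the variant and trigger
# names come from the six files listed in this review.
check_trigger_pairs() {
  local dir="$1" status=0 variant trigger file
  for variant in cpu cuda130 rocm64; do
    for trigger in pull-request push; do
      file="$dir/odh-th06-${variant}-torch291-py312-${trigger}.yaml"
      if [ ! -f "$file" ]; then
        echo "MISSING: $(basename "$file")"
        status=1
      fi
    done
  done
  return "$status"
}

# Example use from the repository root:
#   check_trigger_pairs pipelineruns/distributed-workloads
```

The function only checks that the files exist; it does not validate their contents.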

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Security & Quality Issues

  • CWE-494 (Download of Code Without Integrity Check): pipelineRef.revision uses branch names (e.g., main) instead of pinned commit SHAs. Action: pin pipelineRef.revision to an immutable commit SHA to ensure reproducible and auditable pipeline behavior.
  • CWE-200 (Exposure of Sensitive Information to an Unauthorized Actor) / Credential Management: git-auth workspace references a secret but no policies for rotation, scope, or usage are declared. Action: ensure the secret follows least-privilege access, has rotation policies, and restrict secret mount/use to required tasks only.
  • CWE-276 (Incorrect Default Permissions): serviceAccountName is set but required RBAC permissions are not documented. Action: document and apply least-privilege RBAC roles for the service account; avoid granting cluster-admin.
  • CWE-345 (Insufficient Verification of Data Authenticity): output-image parameters allow tag-based image references without digest enforcement. Action: require image digests (sha256) for published artifacts or add a step to resolve and verify digests before publishing.
  • CEL trigger correctness risk: on-cel-expression annotations contain branch/event matching logic that may mis-trigger if event payloads differ. Action: validate CEL expressions against actual event payloads in CI and add unit tests or synthetic event checks to confirm trigger behavior.
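
The digest-enforcement item in the list above could be backed by a small guard like the following. This is only a sketch under stated assumptions: `is_digest_pinned` is a hypothetical helper, the image name in the usage comment is a placeholder, and the check covers only the textual form of the reference, not whether the digest exists in a registry.

```shell
# Sketch: succeed only when an image reference ends in a sha256 digest.
# is_digest_pinned is a hypothetical helper, not part of this PR.
is_digest_pinned() {
  # A digest-pinned reference ends in "@sha256:" plus 64 lowercase hex chars.
  printf '%s' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Example use (placeholder image name):
#   is_digest_pinned "quay.io/example/image:latest" || echo "tag-only reference"
```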

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title 'onboard universal training images to ODH konflux-central' directly and clearly describes the primary change: adding new Tekton PipelineRun manifests to onboard multiple training image variants (CPU, CUDA, ROCm) to the ODH konflux-central repository.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-pull-request.yaml`:
- Around line 16-18: The manifest for the CPU pull-request is incorrectly using
CUDA identifiers: replace occurrences of the CUDA variant in the
appstudio.openshift.io/component value and the resource name (currently
containing "odh-th06-cuda130-torch291-py312" and
"odh-th06-cuda130-torch291-py312-on-pull-request") with the correct CPU variant
identifier (e.g., "odh-th06-cpu-torch291-py312" and
"odh-th06-cpu-torch291-py312-on-pull-request"); also update any other keys in
this file that contain "cuda130" (notably the values around lines 27–32 and
line 42) so the CPU pull-request manifest targets the CPU
image/context/service account instead of CUDA. Ensure all occurrences of the
string "cuda130" in resource names and component labels are swapped to the CPU
equivalent to prevent building/pushing the wrong artifact.
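
A copy-paste mismatch like the one described above can be caught mechanically. The following is a minimal sketch, assuming the naming scheme of the six manifests in this PR (`odh-th06-<variant>-torch291-py312-<trigger>.yaml`); the helper name `check_variant_consistency` is hypothetical.

```shell
# Sketch: flag manifests whose contents mention a hardware variant other
# than the one in their file name. check_variant_consistency is a
# hypothetical helper; the variant names come from the files in this PR.
check_variant_consistency() {
  local dir="$1" status=0 file base variant other
  for file in "$dir"/odh-th06-*-torch291-py312-*.yaml; do
    [ -e "$file" ] || continue
    base="$(basename "$file")"
    # Extract the variant token between "odh-th06-" and "-torch291".
    variant="${base#odh-th06-}"
    variant="${variant%%-torch291*}"
    for other in cpu cuda130 rocm64; do
      [ "$other" = "$variant" ] && continue
      if grep -q -- "odh-th06-${other}-" "$file"; then
        echo "MISMATCH: $base references odh-th06-${other}-"
        status=1
      fi
    done
  done
  return "$status"
}

# Example use from the repository root:
#   check_variant_consistency pipelineruns/distributed-workloads
```

The check is purely textual, so it would also flag a legitimate cross-reference between variants if one were ever added.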

In `@pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-push.yaml`:
- Around line 33-41: The PipelineRun uses a mutable branch reference: locate the
pipelineRef block (keys pipelineRef, resolver: git, params list with name:
revision currently set to "main") and replace the mutable "main" revision with a
specific immutable git commit SHA for the referenced repo; do this consistently
across all six manifests (odh-th06-rocm64-torch291-py312-push.yaml,
odh-th06-rocm64-torch291-py312-pull-request.yaml,
odh-th06-cuda130-torch291-py312-push.yaml,
odh-th06-cuda130-torch291-py312-pull-request.yaml,
odh-th06-cpu-torch291-py312-push.yaml,
odh-th06-cpu-torch291-py312-pull-request.yaml), so that the params entry where
name: revision uses the chosen commit SHA instead of "main".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 8e91eb37-d994-404d-bd60-c0499c03888a

📥 Commits

Reviewing files that changed from the base of the PR and between f567ac0 and f1748e0.

📒 Files selected for processing (6)
  • pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-pull-request.yaml
  • pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-push.yaml
  • pipelineruns/distributed-workloads/odh-th06-cuda130-torch291-py312-pull-request.yaml
  • pipelineruns/distributed-workloads/odh-th06-cuda130-torch291-py312-push.yaml
  • pipelineruns/distributed-workloads/odh-th06-rocm64-torch291-py312-pull-request.yaml
  • pipelineruns/distributed-workloads/odh-th06-rocm64-torch291-py312-push.yaml

Comment on lines +33 to +41
   pipelineRef:
     resolver: git
     params:
     - name: url
       value: https://github.com/opendatahub-io/odh-konflux-central.git
     - name: revision
       value: main
     - name: pathInRepo
       value: pipeline/multi-arch-container-build.yaml

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, verify the specific file exists and its contents
if [ -f "pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-push.yaml" ]; then
  echo "=== File contents ==="
  cat -n "pipelineruns/distributed-workloads/odh-th06-cpu-torch291-py312-push.yaml"
else
  echo "File not found. Listing distributed-workloads directory:"
  find pipelineruns/distributed-workloads -type f -name "*.yaml" 2>/dev/null | head -20
fi

Repository: opendatahub-io/odh-konflux-central

Length of output: 2168


🏁 Script executed:

#!/bin/bash
# Search for git resolver patterns in the directory
echo "=== Git resolver patterns in distributed-workloads ==="
rg -l 'resolver:\s*git' pipelineruns/distributed-workloads/ 2>/dev/null || echo "No git resolvers found"

Repository: opendatahub-io/odh-konflux-central

Length of output: 608


🏁 Script executed:

#!/bin/bash
# Search for revision patterns in pipelineRef sections
echo "=== Revision value patterns in pipelineRef ==="
rg -B2 -A2 'name:\s*revision' pipelineruns/distributed-workloads/ 2>/dev/null | head -50

Repository: opendatahub-io/odh-konflux-central

Length of output: 4612


Pin pipelineRef to an immutable revision across all git resolvers (Critical, CWE-494/CWE-829).

All six PipelineRun manifests in pipelineruns/distributed-workloads/ use mutable main branch for remotely resolved pipeline definitions. Exploit scenario: upstream compromise or force-push enables attacker-controlled build logic execution, leading to secret exfiltration or poisoned container image publication.

🔧 Remediation (pin to commit SHA)
   pipelineRef:
     resolver: git
     params:
     - name: url
       value: https://github.com/opendatahub-io/odh-konflux-central.git
     - name: revision
-      value: main
+      value: "<40-char-immutable-commit-sha>"
     - name: pathInRepo
       value: pipeline/multi-arch-container-build.yaml

Apply to all files:

  • odh-th06-rocm64-torch291-py312-push.yaml
  • odh-th06-rocm64-torch291-py312-pull-request.yaml
  • odh-th06-cuda130-torch291-py312-push.yaml
  • odh-th06-cuda130-torch291-py312-pull-request.yaml
  • odh-th06-cpu-torch291-py312-push.yaml
  • odh-th06-cpu-torch291-py312-pull-request.yaml
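
Applying the remediation to all six files by hand is error-prone, so it could be scripted. The sketch below assumes each manifest has a `- name: revision` line immediately followed by `value: main`, as in the snippet above; `pin_revision` is a hypothetical helper, and the SHA has to be chosen by the maintainers.

```shell
# Sketch: replace the mutable "main" revision of a git-resolver params list
# with a full commit SHA. pin_revision is a hypothetical helper, not part
# of this PR.
pin_revision() {
  local sha="$1" file="$2"
  # Accept only a full 40-character lowercase hex SHA.
  if ! printf '%s' "$sha" | grep -Eq '^[0-9a-f]{40}$'; then
    echo "not a full commit SHA: $sha" >&2
    return 1
  fi
  # On the line after "- name: revision", rewrite "value: main".
  sed -e '/- name: revision/{n;s/value: main$/value: "'"$sha"'"/;}' "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}

# Example use (needs network access; run from the repository root):
#   sha="$(git ls-remote https://github.com/opendatahub-io/odh-konflux-central.git refs/heads/main | cut -f1)"
#   for f in pipelineruns/distributed-workloads/odh-th06-*.yaml; do
#     pin_revision "$sha" "$f"
#   done
```

Only a `value: main` line directly under `- name: revision` is rewritten, so other `revision` parameters with different values are left untouched.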
@alexxfan alexxfan merged commit 534c066 into main Mar 5, 2026
@alexxfan alexxfan deleted the onboard-universal-training-images branch March 5, 2026 15:11