57 commits
bea5a97
Update AppWrappers from 0.27.0 to v0.30.0
dgrove-oss Dec 17, 2024
7de1ad4
Rename GPU GitHub runner to avoid version confusion
sutaakar Dec 20, 2024
1a14e90
Migrate GitHub runners to newest Ubuntu
sutaakar Jan 3, 2025
5e354dc
[CARRY]: Fix CVE-2024-45338 - update golang.org/x/net
varshaprasad96 Jan 6, 2025
d459dd6
Update dependency versions for release v1.13.0
codeflare-machine-account Jan 8, 2025
1945b65
Remove auto-merge workflow
ChristianZaccaria Jan 9, 2025
864bb41
Don't delete Ray head Pod when ImagePullSecret is provided
sutaakar Jan 17, 2025
028b67d
Upgrade openshift/client-go version
ChristianZaccaria Jan 21, 2025
d888985
Upgrade KubeRay to 1.2.2
sutaakar Jan 21, 2025
a6794de
Update dependency versions for release v1.14.0
codeflare-machine-account Jan 21, 2025
9519938
Add automated update step for component metadata file in CodeFlare re…
oksanabaza Jan 14, 2025
ed1339f
update codeflare common to pick up fix for nvidia gpu setup
dgrove-oss Jan 28, 2025
33dba99
bump AppWrapper to 1.0.0
dgrove-oss Jan 16, 2025
ee5038c
update AppWrapper RBACs for Kueue 0.10.0
dgrove-oss Jan 22, 2025
14ac0fc
workaround kuberay changes until a proper fix can be implemented
dgrove-oss Jan 22, 2025
2f499ba
Pull in Kueue 0.10.1
dgrove-oss Jan 27, 2025
3d3fde0
Go 1.23 workaround
dgrove-oss Jan 27, 2025
103580c
use publicly available images
dgrove-oss Jan 27, 2025
ef70cf9
fix golang image repo/tag
dgrove-oss Jan 28, 2025
1895a05
Create oauth objects when RayCluster spec suspend is false
sutaakar Feb 6, 2025
f59ec01
fix: CVE-2024-45339 - update golang/glog
chipspeak Feb 6, 2025
b7e3f34
Adjust test timeout on AppWrappers to decrease e2e tests duration
ChristianZaccaria Feb 10, 2025
e99e560
Upgrade AppWrapper from v1.0.0 to v1.0.4
dgrove-oss Feb 12, 2025
f639621
Update dependency versions for release v1.15.0
codeflare-machine-account Feb 14, 2025
bb927ef
Fix workflow generate-component-metadata step
oksanabaza Feb 14, 2025
bcfc7dd
Point links in compatibility matrix to upstream tags
ChristianZaccaria Feb 14, 2025
e2ba556
Upgrade cert generator image to use py3.11
Feb 27, 2025
643756e
Add constant namespaces for RayDashboard tests
ChristianZaccaria Mar 6, 2025
b525c59
Update codeflare-common Go import
ChristianZaccaria Mar 6, 2025
b612ce3
Update outdated README.md, update Makefile commands to support local …
Feb 7, 2025
98fe247
Add Github Action workflow for updating DW Components Release matrix …
Srihari1192 Mar 10, 2025
a96625a
Add AMD GPU test for ray clusters
ChughShilpa Feb 28, 2025
c03f1b5
update codeflare-common import
dgrove-oss Mar 21, 2025
3cd5eba
only use the applicationsNamespace in openshift
szaher Mar 27, 2025
133cc4c
use fallback namespaces if dsci cluster isn't found
szaher Mar 28, 2025
ef278c0
only use default-dsci namespace when available
szaher Mar 28, 2025
cf1d822
use ray-system as default ray namespace
szaher Apr 1, 2025
9d1836d
update copyrights for constants.go
szaher Apr 1, 2025
6ef824b
Update AppWrapper from 1.0.4 to 1.0.7
dgrove-oss Apr 7, 2025
bebce01
task(RHOAIENG-22446): Updated OWNERS with Ray Team
chipspeak Apr 14, 2025
35e76cb
Auto-Detect kuberay-operator namespace
szaher Apr 14, 2025
d2d22ed
update route link as previous file was removed
laurafitzgerald May 14, 2025
362e2e5
Merge pull request #685 from laurafitzgerald/unit-test-route-fix
laurafitzgerald May 15, 2025
caa859d
task(RHOAIENG-25519): Update CodeFlare Compat Matrix
chipspeak May 14, 2025
751cfa9
Bump AppWrapper version to 1.1.1
dgrove-oss Mar 20, 2025
167c65f
Looser coupling of Kueue versions
dgrove-oss Mar 24, 2025
a3e2846
Update to AppWrapper v1.1.2
dgrove-oss May 28, 2025
91eb170
deploy Kueue 0.11.5 in CI tests
dgrove-oss May 28, 2025
266891b
bump kueue from 0.11.5 to 0.11.6
dgrove-oss Jun 9, 2025
cf4026d
conf: update dependencies to match versions used in the latest planne…
pawelpaszki Jun 16, 2025
7e10e68
conf: revert kueue update
pawelpaszki Jun 16, 2025
9cd648c
Update dependency versions for release v1.16.0
codeflare-machine-account Jun 16, 2025
ec0e562
Update dockerfile to start supporting multi-arch builds
AjayJagan Jul 2, 2025
b696c86
Separate args
AjayJagan Jul 10, 2025
7fc158c
task(RHOAIENG-27838): Update CodeFlare Compat matrix
chipspeak Jul 11, 2025
dbf405b
task(RHOAIENG-33036):Update Codeflare compatibility matrix
LilyLinh Sep 11, 2025
114f101
task(RHOAIENG-34228): Updated compatibility matrix after SDK patch re…
chipspeak Sep 17, 2025
5 changes: 3 additions & 2 deletions .github/workflows/e2e_tests.yaml
@@ -27,7 +27,7 @@ concurrency:
jobs:
kubernetes-e2e:

runs-on: ubuntu-20.04-4core-gpu
runs-on: gpu-t4-4-core

steps:
- name: Checkout code
@@ -89,7 +89,8 @@ jobs:
export CODEFLARE_TEST_OUTPUT_DIR=${{ env.TEMP_DIR }}

set -euo pipefail
go test -timeout 60m -v -skip "^Test.*Cpu$" ./test/e2e -json 2>&1 | tee ${CODEFLARE_TEST_OUTPUT_DIR}/gotest.log | gotestfmt
go test -timeout 120m -v -skip "^Test.*Cpu$|^Test.*ROCmGpu$" ./test/e2e -json 2>&1 | tee ${CODEFLARE_TEST_OUTPUT_DIR}/gotest.log | gotestfmt


- name: Print CodeFlare operator logs
if: always() && steps.deploy.outcome == 'success'
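
Review note on the `-skip` change above: the sketch below checks which test names the new pattern would exclude. It uses `grep -E` as a stand-in for Go's regexp matching, which should behave the same for a pattern this simple, though that equivalence is an assumption. The test names and the helper `would_run` are hypothetical and are not taken from `./test/e2e`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Pattern copied from the workflow's -skip flag.
skip_pattern='^Test.*Cpu$|^Test.*ROCmGpu$'

# Hypothetical test names, for illustration only.
tests=(TestExampleRayCpu TestExampleRayCudaGpu TestExampleRayROCmGpu)

# Succeeds when the given name does NOT match the skip pattern.
would_run() {
  ! grep -Eq "$skip_pattern" <<<"$1"
}

for t in "${tests[@]}"; do
  if would_run "$t"; then
    echo "run  $t"
  else
    echo "skip $t"
  fi
done
```

Under these assumptions only the CUDA-style name survives the filter, which matches the intent of keeping ROCm tests off the T4 runner.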
2 changes: 1 addition & 1 deletion .github/workflows/olm_tests.yaml
@@ -19,7 +19,7 @@ concurrency:

jobs:
kubernetes-olm-upgrade:
runs-on: ubuntu-20.04-4core
runs-on: ubuntu-latest-4core
timeout-minutes: 60
env:
OLM_VERSION: v0.25.0
55 changes: 48 additions & 7 deletions .github/workflows/project-codeflare-release.yml
@@ -30,14 +30,13 @@ on:
description: 'GitHub organization/user containing repositories used for release'
required: true
default: 'project-codeflare'
quay-organization:
description: 'Quay organization used to push the built images to'
required: true
default: 'project-codeflare'
community-operators-prod-organization:
description: 'Owner of target community-operators-prod repository used to open a PR against'
required: true
default: 'redhat-openshift-ecosystem'
rhoai-release-version:
description: "RHOAI Release version for updating Component Release Matrix Versions Info "
required: true

jobs:
release-parameters:
@@ -54,7 +53,6 @@ jobs:
echo "Tested Kueue Version: ${{ github.event.inputs.kueue-version }}"
echo "Is Stable: ${{ github.event.inputs.is-stable }}"
echo "CodeFlare Repository Organization: ${{ github.event.inputs.codeflare-repository-organization }}"
echo "Quay Organization: ${{ github.event.inputs.quay-organization }}"
echo "Community Operators Prod Organization: ${{ github.event.inputs.community-operators-prod-organization }}"

release-codeflare-sdk:
@@ -74,7 +72,7 @@ jobs:
run: |
semver_version="${{ github.event.inputs.codeflare-sdk-version }}"
plain_version="${semver_version:1}"
gh workflow run release.yaml --repo ${{ github.event.inputs.codeflare-repository-organization }}/codeflare-sdk --ref ${{ github.ref }} --field release-version=${plain_version} --field is-stable=${{ github.event.inputs.is-stable }} --field quay-organization=${{ github.event.inputs.quay-organization }}
gh workflow run release.yaml --repo ${{ github.event.inputs.codeflare-repository-organization }}/codeflare-sdk --ref ${{ github.ref }} --field release-version=${plain_version} --field is-stable=${{ github.event.inputs.is-stable }} --field quay-organization=project-codeflare
env:
GITHUB_TOKEN: ${{ secrets.CODEFLARE_MACHINE_ACCOUNT_TOKEN }}
shell: bash
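
Review note on `plain_version="${semver_version:1}"` in the hunk above: the substring expansion drops the first character whatever it is, so it relies on the input always carrying a leading `v`. The sketch below contrasts it with prefix removal, which only strips a literal `v`. Both helper names and the sample version strings are hypothetical, and the second variant is offered as a possible alternative, not as what the workflow does.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Mirrors the workflow: drop the first character unconditionally.
strip_first_char() {
  local v=$1
  echo "${v:1}"
}

# Hypothetical alternative: drop a leading "v" only when it is present.
strip_v_prefix() {
  local v=$1
  echo "${v#v}"
}

for version in v0.29.0 0.29.0; do
  echo "$version -> $(strip_first_char "$version") / $(strip_v_prefix "$version")"
done
```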
@@ -107,7 +105,7 @@ jobs:
--field appwrapper-version=${{ github.event.inputs.appwrapper-version }} \
--field kuberay-version=${{ github.event.inputs.kuberay-version }} \
--field kueue-version=${{ github.event.inputs.kueue-version }} \
--field quay-organization=${{ github.event.inputs.quay-organization }} \
--field quay-organization=project-codeflare \
--field community-operators-prod-fork-organization=${{ github.event.inputs.codeflare-repository-organization }} \
--field community-operators-prod-organization=${{ github.event.inputs.community-operators-prod-organization }}
env:
@@ -165,3 +163,46 @@ jobs:
echo "Kueue release with version ${{ github.event.inputs.kueue-version }} does not exist. Please select an existing version."
exit 1
fi

generate-component-metadata:
runs-on: ubuntu-latest

steps:
- name: Ensure config folder exists
run: mkdir -p config

- name: Generate component_metadata.yaml
run: |
cat <<EOL > config/component_metadata.yaml
releases:
- name: CodeFlare Operator
version: ${{ github.event.inputs.operator-version }}
repoUrl: https://github.com/project-codeflare/codeflare-operator
EOL

- name: Verify generated file
run: cat config/component_metadata.yaml

Update_release_version_info_to_confluence:
runs-on: ubuntu-latest
steps:
- name: Checkout Repository
uses: actions/checkout@v4

- name: Trigger and Update Component Release Matrix Versions Info to Confluence
run: |
gh workflow run update-release-matrix-to-confluence.yml --ref ${{ github.ref }} \
--field rhoai-release-version=${{ github.event.inputs.rhoai-release-version }} \
--field kueue-version=${{ github.event.inputs.kueue-version }} \
--field codeflare-sdk-version=${{ github.event.inputs.codeflare-sdk-version }} \
--field codeflare-operator-version=${{ github.event.inputs.operator-version }} \
--field kuberay-version=${{ github.event.inputs.kuberay-version }} \
--field appwrapper-version=${{ github.event.inputs.appwrapper-version }}

# wait for a while for Run to be started
sleep 5
run_id=$(gh run list --workflow update-release-matrix-to-confluence.yml --repo https://github.com/project-codeflare/codeflare-operator --limit 1 --json databaseId --jq .[].databaseId)
gh run watch ${run_id} --repo https://github.com/project-codeflare/codeflare-operator --interval 10 --exit-status
env:
GITHUB_TOKEN: ${{ secrets.CODEFLARE_MACHINE_ACCOUNT_TOKEN }}
shell: bash
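
Review note on the `sleep 5` followed by `gh run list --limit 1` above: a fixed delay can read the run list before the dispatched run is registered. One possible alternative is to poll until the lookup returns something, as sketched below. The helper `retry_until_nonempty` is hypothetical; the `gh` invocation in the comment reuses only the flags already present in the workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a command up to <attempts> times, pausing <delay> seconds between
# tries, and print its output as soon as that output is non-empty.
# Returns 1 if every attempt produced no output.
retry_until_nonempty() {
  local attempts=$1 delay=$2
  shift 2
  local out i
  for ((i = 1; i <= attempts; i++)); do
    out=$("$@" 2>/dev/null || true)
    if [[ -n "$out" ]]; then
      printf '%s\n' "$out"
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Intended use inside the workflow step (not executed here):
#   run_id=$(retry_until_nonempty 10 3 gh run list \
#     --workflow update-release-matrix-to-confluence.yml \
#     --limit 1 --json databaseId --jq '.[].databaseId')
```

Polling does not by itself guarantee that the newest run is the one just dispatched; it only removes the dependency on a fixed delay.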
121 changes: 121 additions & 0 deletions .github/workflows/update-release-matrix-to-confluence.yml
@@ -0,0 +1,121 @@
name: Update Release Matrix to Confluence

on:
workflow_dispatch:
inputs:
rhoai-release-version:
description: 'RHOAI Release Version'
required: true
kueue-version:
description: 'Kueue Version'
required: true
codeflare-sdk-version:
description: 'CodeFlare SDK Version'
required: true
codeflare-operator-version:
description: 'CodeFlare operator Version'
required: true
kuberay-version:
description: 'Tested KubeRay version'
required: true
appwrapper-version:
description: 'Tested appwrapper version'
required: true

jobs:
update-confluence:
runs-on: ubuntu-latest
steps:
- name: Checkout Repository
uses: actions/checkout@v4

- name: Release info Parameters
run: |
echo "RHOAI_RELEASE_VERSION=${{ github.event.inputs.rhoai-release-version }}" >> $GITHUB_ENV
echo "KUEUE_VERSION=${{ github.event.inputs.kueue-version }}" >> $GITHUB_ENV
echo "CODEFLARE_SDK_VERSION=${{ github.event.inputs.codeflare-sdk-version }}" >> $GITHUB_ENV
echo "CODEFLARE_OPERATOR_VERSION=${{ github.event.inputs.codeflare-operator-version }}" >> $GITHUB_ENV
echo "KUBERAY_VERSION=${{ github.event.inputs.kuberay-version }}" >> $GITHUB_ENV
echo "APPWRAPPER_VERSION=${{ github.event.inputs.appwrapper-version }}" >> $GITHUB_ENV

- name: Fetch and Update Existing Release Matrix Page Content
run: |
echo "Fetching Release Matrix Confluence Page..."
response=$(curl -H "Authorization: Bearer ${{ secrets.CONFLUENCE_API_TOKEN }}" \
"${{ secrets.CONFLUENCE_BASE_URL }}/rest/api/content?title=${{ secrets.PAGE_TITLE }}&spaceKey=${{ secrets.SPACE_KEY }}&expand=body.storage,version")

echo "$response" | jq '.' > page_data.json
echo "Raw API Response: $response"

PAGE_VERSION=$(jq '.results[0].version.number' page_data.json)

if [[ -z "$PAGE_VERSION" || "$PAGE_VERSION" == "null" ]]; then
echo "Error: Could not retrieve current page version."
exit 1
fi
echo "PAGE_VERSION=$PAGE_VERSION" >> $GITHUB_ENV

EXISTING_CONTENT=$(jq -r '.results[0].body.storage.value' page_data.json)

echo "Existing Release Matrix Page Content: $EXISTING_CONTENT"

if [[ -z "$EXISTING_CONTENT" || "$EXISTING_CONTENT" == "null" ]]; then
echo "Error: Could not retrieve existing page content."
exit 1
fi

# Convert newlines to a placeholder to handle multi-line processing
PLACEHOLDER="__NL__"
MODIFIED_CONTENT=$(echo "$EXISTING_CONTENT" | tr '\n' "$PLACEHOLDER")

# Update the page content with release info also check and update if the release version already exists in the table
if echo "$MODIFIED_CONTENT" | grep -q "<tr[^>]*><td>$RHOAI_RELEASE_VERSION</td>"; then
UPDATED_PAGE_CONTENT=$(echo "$MODIFIED_CONTENT" | sed -E "s|(<tr[^>]*><td>$RHOAI_RELEASE_VERSION</td><td>)[^<]+(</td><td>)[^<]+(</td><td>)[^<]+(</td><td>)[^<]+(</td><td>)[^<]+(</td></tr>)|\1$KUEUE_VERSION\2$CODEFLARE_SDK_VERSION\3$CODEFLARE_OPERATOR_VERSION\4$KUBERAY_VERSION\5$APPWRAPPER_VERSION\6|")
else
UPDATED_ROW="<tr class=\"\"><td>$RHOAI_RELEASE_VERSION</td><td>$KUEUE_VERSION</td><td>$CODEFLARE_SDK_VERSION</td><td>$CODEFLARE_OPERATOR_VERSION</td><td>$KUBERAY_VERSION</td><td>$APPWRAPPER_VERSION</td></tr>"
UPDATED_PAGE_CONTENT=$(echo "$MODIFIED_CONTENT" | sed "s|</tbody>|$UPDATED_ROW</tbody>|")
fi

# Correct JSON encoding without double escaping
UPDATED_PAGE_CONTENT=$(echo "$UPDATED_PAGE_CONTENT" | sed 's/_$//') # Remove trailing underscores
UPDATED_PAGE_CONTENT=$(jq -n --arg content "$UPDATED_PAGE_CONTENT" '$content' | tr -d '\r')
# Store as output
echo "UPDATED_PAGE_CONTENT=$UPDATED_PAGE_CONTENT" >> "$GITHUB_ENV"

- name: Publish updated page content to confluence
run: |

NEW_VERSION=$(( PAGE_VERSION + 1 ))

if [[ -n "$UPDATED_PAGE_CONTENT" && "$UPDATED_PAGE_CONTENT" != "null" ]]; then
echo "Updating Confluence Page using PUT request..."
HTTP_RESPONSE=$(curl -s -o response.json -w "%{http_code}" -X PUT "${{ secrets.CONFLUENCE_BASE_URL }}/rest/api/content/${{ secrets.CONFLUENCE_PAGE_ID }}" \
-H "Authorization: Bearer ${{ secrets.CONFLUENCE_API_TOKEN }}" \
-H "Content-Type: application/json" \
-d "{
\"id\": \"${{ secrets.CONFLUENCE_PAGE_ID }}\",
\"type\": \"page\",
\"title\": \"Distributed Workloads Release Details\",
\"space\": { \"key\": \"${{ secrets.SPACE_KEY }}\" },
\"body\": {
\"storage\": {
\"value\": $UPDATED_PAGE_CONTENT,
\"representation\": \"storage\"
}
},
\"version\": {
\"number\": $NEW_VERSION
}
}")
if [[ "$HTTP_RESPONSE" == "200" || "$HTTP_RESPONSE" == "201" ]]; then
echo "Successfully updated Confluence Page with release version details !"
echo "Response from Confluence:"
cat response.json
else
echo "Error: Failed to update Confluence page. HTTP Response Code: $HTTP_RESPONSE"
exit 1
fi
else
echo "Error: UPDATED_PAGE_CONTENT is null or empty."
exit 1
fi
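
Review note on the row update logic in this workflow: `tr` translates characters one for one, so `tr '\n' "$PLACEHOLDER"` replaces each newline with a single `_` and not with the full `__NL__` string, which appears to be why a later line strips a trailing underscore. The sketch below shows that behaviour and a compact insert-or-update of a table row with `sed`. The function `upsert_row`, the sample table, and the version strings are hypothetical. Unlike the workflow, the update branch replaces all cells after the release cell instead of exactly five, and the release string is interpolated into the regex unescaped, so `.` matches any character.

```shell
#!/usr/bin/env bash
set -euo pipefail

# tr maps single characters: only the first "_" of the replacement set is used.
flattened=$(printf 'a\nb\n' | tr '\n' '__NL__')
echo "$flattened"

# Update the row whose first cell equals <release>, or append a new row
# before </tbody> when no such row exists.
#   $1 = single-line table markup, $2 = release, $3 = replacement cells
upsert_row() {
  local table=$1 release=$2 cells=$3
  if printf '%s' "$table" | grep -q "<tr[^>]*><td>${release}</td>"; then
    printf '%s' "$table" |
      sed -E "s|(<tr[^>]*><td>${release}</td>)(<td>[^<]*</td>)+</tr>|\1${cells}</tr>|"
  else
    printf '%s' "$table" |
      sed "s|</tbody>|<tr><td>${release}</td>${cells}</tr></tbody>|"
  fi
}

table='<table><tbody><tr><td>2.19</td><td>old</td><td>old2</td></tr></tbody></table>'
upsert_row "$table" "2.19" "<td>new</td><td>new2</td>"
echo
upsert_row "$table" "2.20" "<td>x</td>"
echo
```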
18 changes: 0 additions & 18 deletions .github/workflows/upstream-downstream-sync.yml

This file was deleted.

9 changes: 5 additions & 4 deletions Dockerfile
@@ -1,6 +1,8 @@
# Build the manager binary
FROM registry.access.redhat.com/ubi9/go-toolset:1.23 AS builder

FROM registry.access.redhat.com/ubi8/go-toolset:1.22@sha256:780ab5f3874a6e2b1e04bb3719e614e835af3f8ab150922d6e84c2f9fd2bdb27 AS builder
ARG TARGETOS
ARG TARGETARCH

WORKDIR /workspace
# Copy the Go Modules manifests
@@ -13,11 +15,10 @@ RUN go mod download
COPY main.go main.go
COPY pkg/ pkg/

# Build
USER root
RUN CGO_ENABLED=1 GOOS=linux GOARCH=${GOARCH} make go-build-for-image
RUN CGO_ENABLED=1 GOOS=linux GOARCH=${TARGETARCH:-amd64} make go-build-for-image

FROM registry.access.redhat.com/ubi8/ubi-minimal:8.8
FROM registry.access.redhat.com/ubi9/ubi-minimal:latest
WORKDIR /
COPY --from=builder /workspace/manager .
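
Review note on `GOARCH=${TARGETARCH:-amd64}` above: BuildKit fills in `TARGETARCH` for a declared `ARG`, and the `:-` expansion supplies `amd64` when the variable is unset or empty, for example under a builder that does not set it. The sketch below reproduces only the shell expansion. The function `resolve_goarch` is hypothetical and stands in for the `RUN` line's environment.

```shell
#!/usr/bin/env bash
set -euo pipefail

# $1 plays the role of the TARGETARCH build argument (may be absent or empty).
resolve_goarch() {
  local TARGETARCH=${1-}
  echo "${TARGETARCH:-amd64}"
}

echo "with arm64: $(resolve_goarch arm64)"
echo "unset:      $(resolve_goarch)"
echo "empty:      $(resolve_goarch "")"
```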

11 changes: 5 additions & 6 deletions Makefile
@@ -12,16 +12,16 @@ VERSION ?= v0.0.0-dev
BUNDLE_VERSION ?= $(VERSION:v%=%)

# APPWRAPPER_VERSION defines the default version of the AppWrapper controller
APPWRAPPER_VERSION ?= v0.27.0
APPWRAPPER_VERSION ?= v1.1.2
APPWRAPPER_REPO ?= github.com/project-codeflare/appwrapper
APPWRAPPER_CRD ?= ${APPWRAPPER_REPO}/config/crd?ref=${APPWRAPPER_VERSION}

# KUEUE_VERSION defines the default version of Kueue (used for testing)
KUEUE_VERSION ?= v0.8.3
# KUEUE_VERSION defines the version of Kueue deployed for testing
KUEUE_VERSION ?= v0.11.6

USE_RHOAI ?= true
# KUBERAY_VERSION defines the default version of the KubeRay operator (used for testing)
KUBERAY_VERSION ?= v1.1.0
KUBERAY_VERSION ?= v1.3.2

# RAY_VERSION defines the default version of Ray (used for testing)
RAY_VERSION ?= 2.5.0
@@ -160,7 +160,6 @@ vet: ## Run go vet against code.
.PHONY: modules
modules: ## Update Go dependencies.
go get github.com/ray-project/kuberay/ray-operator@$(KUBERAY_VERSION)
go get sigs.k8s.io/kueue@$(KUEUE_VERSION)
go get github.com/project-codeflare/appwrapper@$(APPWRAPPER_VERSION)
go mod tidy

@@ -393,7 +392,7 @@ test-component: envtest ginkgo ## Run component tests.

.PHONY: test-e2e
test-e2e: manifests fmt vet ## Run e2e tests.
go test -timeout 30m -v ./test/e2e
CODEFLARE_TEST_OUTPUT_DIR=/tmp/ CLUSTER_HOSTNAME=kind CODEFLARE_TEST_TIMEOUT_MEDIUM=5m CODEFLARE_TEST_TIMEOUT_LONG=40m go test -v -skip "^Test.*Gpu$$" ./test/e2e -timeout=60m

.PHONY: kind-e2e
kind-e2e: ## Set up e2e KinD cluster
10 changes: 10 additions & 0 deletions OWNERS
@@ -1,21 +1,31 @@
approvers:
- astefanutti
- chipspeak
- ChristianZaccaria
- jbusche
- kpostoffice
- kryanbeane
- laurafitzgerald
- pawelpaszki
- sutaakar
- szaher
- tedhtchang
- varshaprasad96

reviewers:
- astefanutti
- Bobbins228
- chipspeak
- ChristianZaccaria
- dimakis
- Fiona-Waters
- jbusche
- kpostoffice
- kryanbeane
- laurafitzgerald
- pawelpaszki
- sutaakar
- szaher
- tedhtchang
- varshaprasad96
