
Commit 0d64b75

dayshah and kevin85421 authored
[cherry-pick] Integrate with rayci (#3215) (#3234)
Signed-off-by: dayshah <[email protected]>
Signed-off-by: Dhyey Shah <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>
1 parent 4d7e43c commit 0d64b75

5 files changed: 120 additions and 12 deletions

.buildkite/build-start-operator.sh

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+#!/bin/bash
+
+# This script is used to start the operator in the buildkite test-e2e steps.
+
+# When starting from the ray ci release automation, we want to install the latest
+# released version from helm as actual users might. Ray ci is also always expected
+# to kick off from the release branch so tests should match up accordingly.
+
+if [ "$IS_FROM_RAY_RELEASE_AUTOMATION" = 1 ]; then
+    helm repo update && helm install kuberay-operator kuberay/kuberay-operator
+    KUBERAY_TEST_IMAGE="rayproject/ray:nightly.$(date +'%y%m%d').${RAY_NIGHTLY_COMMIT:0:6}-py39" && export KUBERAY_TEST_IMAGE
+else
+    IMG=kuberay/operator:nightly make docker-image &&
+    kind load docker-image kuberay/operator:nightly &&
+    IMG=kuberay/operator:nightly make deploy
+fi
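The nightly image tag in the script is assembled from today's date and the first six characters of `RAY_NIGHTLY_COMMIT`. The following is a minimal sketch of that construction, not part of the commit; `nightly_ray_image` is a hypothetical helper name, and the optional second argument (a fixed date stamp) is an assumption added only to make the result reproducible.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): builds the nightly Ray image tag
# the same way build-start-operator.sh does.
#   $1: full commit SHA
#   $2: optional date stamp in yymmdd form (defaults to today)
nightly_ray_image() {
  local commit="$1"
  local stamp="${2:-$(date +'%y%m%d')}"
  # ${commit:0:6} is bash substring expansion: the first six characters.
  echo "rayproject/ray:nightly.${stamp}.${commit:0:6}-py39"
}

# Example with a made-up SHA and a fixed date stamp:
nightly_ray_image "0123456789abcdef" "250401"
# prints: rayproject/ray:nightly.250401.012345-py39
```

Because the tag embeds the current date, the script implicitly assumes that a nightly image for that date and commit has already been published.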

.buildkite/format.awk

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+###############################################################################
+# This AWK script processes lines from test logs.
+#
+# - If a line starts with "=== RUN":
+#   1) Split the third field on the "/" delimiter to check if it's a subtest.
+#      (e.g., $3 might be "Test/SomeSubtest")
+#   2) If it is a subtest, print as is.
+#      (=== RUN Test/SomeSubtest)
+#   3) Otherwise, replace "===" with "---" (indicating a main test).
+#      (--- RUN Test)
+#
+# - If a line contains "---" but does NOT contain "RUN", we interpret it as
+#   non-RUN log lines. We replace "---" with "###".
+#   ("--- PASS" to "### PASS")
+#
+# - All other lines are printed unchanged.
+###############################################################################
+
+# Match lines starting with "=== RUN"
+/^=== RUN/ {
+
+    # Split the 3rd field on "/"
+    split($3, parts, "/")
+
+    # Check if more than one part was created after splitting on "/"
+    # This indicates it's a subtest (e.g., "Test/SomeSubtest").
+    if (length(parts) > 1) {
+        # For subtests, print the line unchanged
+        print
+    } else {
+        # If not a subtest, it's the main test. Replace "===" with "---".
+        gsub(/===/, "---")
+        print
+    }
+
+    # Skip any further rules for this line
+    next
+}
+
+# Match lines containing "---" but not containing "RUN"
+/---/ && !/RUN/ {
+
+    # Replace "---" with "###"
+    gsub(/---/, "###")
+    print
+    next
+}
+
+# Print all other lines unchanged
+{ print }
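To see what these rules do to `go test -v` output, here is a small sketch that is not part of the commit. `format_test_log` is a hypothetical wrapper around a condensed inline copy of the same rules; it uses the return value of `split` instead of `length(parts)`, an assumption made for portability across awk implementations. The test names are invented. The likely motivation is that Buildkite treats log lines beginning with `---` as collapsible group headers, so each top-level test gets its own group while result lines do not open new ones, although the commit itself does not state this.

```shell
#!/bin/bash
# Hypothetical wrapper (not in the commit): applies a condensed version of the
# format.awk rules to standard input.
format_test_log() {
  awk '
    /^=== RUN/ {
      # split() returns the number of pieces; 1 means no "/" and so a main test
      n = split($3, parts, "/")
      if (n == 1) gsub(/===/, "---")
      print
      next
    }
    # Result lines such as "--- PASS" have their marker rewritten to "###"
    /---/ && !/RUN/ { gsub(/---/, "###") }
    { print }
  '
}

# Sample input resembling go test -v output (test names are made up):
printf '%s\n' \
  '=== RUN   TestRayJob' \
  '=== RUN   TestRayJob/SubmitJob' \
  '--- PASS: TestRayJob (1.20s)' \
  '    --- PASS: TestRayJob/SubmitJob (1.00s)' \
  'PASS' | format_test_log
# prints:
# --- RUN   TestRayJob
# === RUN   TestRayJob/SubmitJob
# ### PASS: TestRayJob (1.20s)
#     ### PASS: TestRayJob/SubmitJob (1.00s)
# PASS
```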

.buildkite/ray-ci-integration.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+
+# Ray CI Integration
+
+| Version | Latest Ray Release | Ray Nightly |
+| ----------- | :------------------- | :-------------- |
+| Latest KubeRay Release | During Ray & KubeRay Releases | Nightly from Ray Release Automation |
+| KubeRay Nightly | In KubeRay CI | Not tested |
+
+This table lays out the state of testing between Ray and KubeRay nightlies and releases.
+The goal is to eventually test all four combinations consistently.
+All tests run in the KubeRay CI pipeline; the only difference is where the pipeline is kicked off from.
+"KubeRay Nightly" currently refers to running on master, and "Latest KubeRay Release" refers to running
+on the latest release branch. Both the "Latest Ray Release" and "Ray Nightly" images are pulled from DockerHub.
+
+In the future, if a test needs the Ray nightly to run, add a step to .buildkite/test-e2e.yml that
+follows the other steps but sets the KUBERAY_TEST_RAY_IMAGE env variable to "rayproject/ray:nightly".
+When Ray releases a new version, you can change the step to use the latest Ray release instead.
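As an illustration of the last paragraph, a step that sets `KUBERAY_TEST_RAY_IMAGE` could report which image it is about to use with a check like the one below. This is a sketch rather than code from the commit: `describe_ray_image` is a hypothetical helper, and the fallback message assumes the tests pick a default image when the variable is unset. Note the space before the closing `]`, which the `[` command requires.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): reports the Ray image override.
describe_ray_image() {
  # [ -n STRING ] is true when STRING is non-empty; "]" must be its own word.
  if [ -n "${KUBERAY_TEST_RAY_IMAGE:-}" ]; then
    echo "Using Ray Image ${KUBERAY_TEST_RAY_IMAGE}"
  else
    # Assumption: the tests fall back to a default image when nothing is set.
    echo "KUBERAY_TEST_RAY_IMAGE is not set"
  fi
}

# Set the variable for this one call only:
KUBERAY_TEST_RAY_IMAGE="rayproject/ray:nightly" describe_ray_image
# prints: Using Ray Image rayproject/ray:nightly
```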

.buildkite/test-e2e.yml

Lines changed: 35 additions & 12 deletions
@@ -7,12 +7,14 @@
     - kubectl config set clusters.kind-kind.server https://docker:6443
     # Build nightly KubeRay operator image
     - pushd ray-operator
-    - IMG=kuberay/operator:nightly make docker-image
-    - kind load docker-image kuberay/operator:nightly
-    - IMG=kuberay/operator:nightly make deploy
+    - bash ../.buildkite/build-start-operator.sh
     - kubectl wait --timeout=90s --for=condition=Available=true deployment kuberay-operator
     # Run e2e tests and print KubeRay operator logs if tests fail
-    - KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m go test -timeout 30m -v ./test/e2e || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay && exit 1)
+    - echo "--- START:Running e2e rayservice (nightly operator) tests"
+    - if [ -n "${KUBERAY_TEST_RAY_IMAGE}" ]; then echo "Using Ray Image ${KUBERAY_TEST_RAY_IMAGE}"; fi
+    - set -o pipefail
+    - KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m go test -timeout 30m -v ./test/e2e 2>&1 | awk -f ../.buildkite/format.awk || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay && exit 1)
+    - echo "--- END:e2e rayservice (nightly operator) tests finished"
 
 - label: 'Test E2E rayservice (nightly operator)'
   instance_size: large
@@ -23,12 +25,14 @@
     - kubectl config set clusters.kind-kind.server https://docker:6443
     # Build nightly KubeRay operator image
     - pushd ray-operator
-    - IMG=kuberay/operator:nightly make docker-image
-    - kind load docker-image kuberay/operator:nightly
-    - IMG=kuberay/operator:nightly make deploy
+    - bash ../.buildkite/build-start-operator.sh
     - kubectl wait --timeout=90s --for=condition=Available=true deployment kuberay-operator
     # Run e2e tests and print KubeRay operator logs if tests fail
-    - KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m go test -timeout 30m -v ./test/e2erayservice || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay && exit 1)
+    - echo "--- START:Running e2e rayservice (nightly operator) tests"
+    - if [ -n "${KUBERAY_TEST_RAY_IMAGE}" ]; then echo "Using Ray Image ${KUBERAY_TEST_RAY_IMAGE}"; fi
+    - set -o pipefail
+    - KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m go test -timeout 30m -v ./test/e2erayservice 2>&1 | awk -f ../.buildkite/format.awk || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay && exit 1)
+    - echo "--- END:e2e rayservice (nightly operator) tests finished"
 
 - label: 'Test Autoscaler E2E (nightly operator)'
   instance_size: large
@@ -39,9 +43,28 @@
     - kubectl config set clusters.kind-kind.server https://docker:6443
     # Build nightly KubeRay operator image
     - pushd ray-operator
-    - IMG=kuberay/operator:nightly make docker-image
-    - kind load docker-image kuberay/operator:nightly
-    - IMG=kuberay/operator:nightly make deploy
+    - bash ../.buildkite/build-start-operator.sh
     - kubectl wait --timeout=90s --for=condition=Available=true deployment kuberay-operator
     # Run e2e tests and print KubeRay operator logs if tests fail
-    - KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m go test -timeout 30m -v ./test/e2eautoscaler || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay && exit 1)
+    - echo "--- START:Running Autoscaler e2e (nightly operator) tests"
+    - if [ -n "${KUBERAY_TEST_RAY_IMAGE}" ]; then echo "Using Ray Image ${KUBERAY_TEST_RAY_IMAGE}"; fi
+    - set -o pipefail
+    - KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m go test -timeout 30m -v ./test/e2eautoscaler 2>&1 | awk -f ../.buildkite/format.awk || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay && exit 1)
+    - echo "--- END:Autoscaler e2e (nightly operator) tests finished"
+
+- label: 'Test E2E Operator Version Upgrade (v1.3.0)'
+  instance_size: large
+  image: golang:1.22
+  commands:
+    - source .buildkite/setup-env.sh
+    - kind create cluster --wait 900s --config ./tests/framework/config/kind-config-buildkite.yml
+    - kubectl config set clusters.kind-kind.server https://docker:6443
+    # Deploy previous KubeRay operator release (v1.2.2) using helm
+    - echo Deploying KubeRay operator
+    - pushd ray-operator
+    - helm install kuberay-operator kuberay/kuberay-operator --version 1.2.2
+    - kubectl wait --timeout=90s --for=condition=Available=true deployment kuberay-operator
+    # Run e2e tests and print KubeRay operator logs if tests fail
+    - echo "--- START:Running e2e Operator upgrade (v1.2.2 to v1.3.0 operator) tests"
+    - KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m KUBERAY_TEST_UPGRADE_IMAGE=v1.3.0 go test -timeout 30m -v ./test/e2eupgrade | awk -f ../.buildkite/format.awk || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay && exit 1)
+    - echo "--- END:e2e Operator upgrade (v1.2.2 to v1.3.0 operator) tests finished"
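The `set -o pipefail` lines matter because the test output is now piped through `awk`. By default a pipeline's exit status is that of its last command, so a failing `go test` would be hidden behind a successful `awk`, and the `|| (kubectl logs ... && exit 1)` fallback would never run. The sketch below, which is not part of the commit, demonstrates the difference; `failing_tests` is a hypothetical stand-in for a failing `go test`, and `cat` stands in for the `awk` filter.

```shell
#!/bin/bash
# Hypothetical stand-in for a failing test run (not in the commit).
failing_tests() { echo "--- FAIL: TestExample"; return 1; }

# Each function runs the pipeline in a subshell so that the option change
# does not leak into the caller, then prints the pipeline's exit status.
status_without_pipefail() {
  ( set +o pipefail; failing_tests | cat >/dev/null; echo $? )
}
status_with_pipefail() {
  ( set -o pipefail; failing_tests | cat >/dev/null; echo $? )
}

echo "without pipefail: $(status_without_pipefail)"  # prints: without pipefail: 0
echo "with pipefail: $(status_with_pipefail)"        # prints: with pipefail: 1
```

The upgrade step in this diff pipes into `awk` without a preceding `set -o pipefail`. Unless the option is enabled elsewhere (for example in the sourced setup script), a test failure in that step would be reported with `awk`'s exit status.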

docs/development/release.md

Lines changed: 2 additions & 0 deletions
@@ -98,6 +98,8 @@ You will be prompted for a commit reference and an image tag. The commit referen
 
 * Open a PR into the Ray repo updating the operator version used in the autoscaler integration test. Make any adjustments necessary for the test to pass ([example](https://github.com/ray-project/ray/pull/40918)). Make sure the test labelled [kubernetes-operator](https://buildkite.com/ray-project/oss-ci-build-pr/builds/17146#01873a69-5ccf-4c71-b06c-ae3a4dd9aecb) passes before merging.
 
+* Open another PR in the Ray repo to update the branch used to kick off tests from the Ray release automation pipeline to test with the nightly (context and step location: <https://github.com/ray-project/ray/pull/51539>).
+
 * Announce the `rc0` release on the KubeRay slack, with deployment instructions ([example](https://ray-distributed.slack.com/archives/C02GFQ82JPM/p1680555251566609)).
 
 #### Step 4. Create more release candidates (`rc1`, `rc2`, ...) if necessary
