
@kryanbeane
Contributor

@kryanbeane kryanbeane commented Oct 7, 2025

Issue link

RHOAIENG-33283

What changes have been made

  • If you omit runtime_env for local files, the SDK parses imports and mounts the files it thinks are needed. This applies to both existing and lifecycled clusters.
  • With a remote working_dir nothing needs to be mounted; the SDK just parses the requirements file and populates runtimeEnvYAML with what it needs from runtime_env.
  • A GitHub zip URL can be passed as the working directory; Ray fetches and unzips it on the head node for us.
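
As a rough illustration of the first two points, here is a minimal sketch. It is not the SDK's actual code: `is_remote_working_dir` and `imported_module_names` are hypothetical helpers, and the GitHub URL is a placeholder. It shows one way to tell a remote working_dir from a local one, and one way to list the modules a script imports using the standard `ast` module.

```python
import ast
from urllib.parse import urlparse


def is_remote_working_dir(working_dir: str) -> bool:
    """Return True for URL-style working dirs, e.g. a GitHub zip archive."""
    # Assumption: only http(s) URLs are treated as remote in this sketch.
    return urlparse(working_dir).scheme in {"http", "https"}


def imported_module_names(source: str) -> set[str]:
    """Return the top-level module names imported by Python source text."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            names.add(node.module.split(".")[0])
    return names


# Placeholder URL, not a real repository.
remote = "https://github.com/example-org/example-repo/archive/refs/heads/main.zip"
print(is_remote_working_dir(remote))
print(is_remote_working_dir("./my-dir"))
print(sorted(imported_module_names("import helper\nfrom utils.io import load\n")))
```

A caller could then compare the returned names against the `.py` files next to the entrypoint to decide which local files to mount.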

Verification steps

**IMPORTANT:** These steps assume you have files in the directory to use. Create a directory with a sample Python file inside for testing!

Setup

poetry build
%pip install dist/codeflare_sdk-0.31.1-py3-none-any.whl --force-reinstall

Test 1: Lifecycled Cluster

from codeflare_sdk import RayJob, ManagedClusterConfig

rayjob = RayJob(
    job_name="test-lifecycled",
    namespace="rhods-notebooks",
    cluster_config=ManagedClusterConfig(),
    entrypoint="python test.py",
    runtime_env={"working_dir": "./<directory>"},
)

rayjob.submit()
rayjob.status()

Expected: Creates ConfigMap, runs job, auto-deletes cluster when done.

Test 2: Long-Lived Cluster

from codeflare_sdk import Cluster, ClusterConfiguration, RayJob

# Create cluster
cluster = Cluster(ClusterConfiguration(
    name='existing-cluster',
    num_workers=1,
    namespace='rhods-notebooks'
))
cluster.apply()
cluster.wait_ready()

# Submit job
rayjob = RayJob(
    job_name="test-existing",
    cluster_name="existing-cluster",
    namespace="rhods-notebooks",
    entrypoint="python -c 'import ray; ray.init(); print(\"Hello\")'",
)

rayjob.submit()
rayjob.status()

# Cleanup
rayjob.delete()
cluster.down()

Expected: Job runs on existing cluster, cluster persists after job completes.

Validation Test (Should Fail)

rayjob = RayJob(
    job_name="test-validation",
    namespace="rhods-notebooks",
    cluster_config=ManagedClusterConfig(),
    entrypoint="python ./<directory>/test.py",  # Wrong!
    runtime_env={"working_dir": "./<directory>"},
)
rayjob.submit()

Expected: Fails with clear error about working_dir conflict.
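
For reference, the kind of check this test exercises can be sketched as follows. This is an illustrative assumption, not the SDK's implementation: `check_entrypoint` is a hypothetical helper that rejects an entrypoint whose script path repeats the working_dir prefix.

```python
import shlex
from pathlib import PurePosixPath


def check_entrypoint(entrypoint: str, working_dir: str) -> None:
    """Raise ValueError if a script path in the entrypoint repeats working_dir."""
    wd = PurePosixPath(working_dir)
    if wd == PurePosixPath("."):
        # With working_dir "./", "./test.py" and "test.py" name the same file.
        return
    for token in shlex.split(entrypoint):
        if not token.endswith(".py"):
            continue
        script = PurePosixPath(token)
        if wd in script.parents:
            relative = script.relative_to(wd)
            raise ValueError(
                f"entrypoint {token!r} already includes working_dir "
                f"{working_dir!r}; use {str(relative)!r} instead"
            )


check_entrypoint("python test.py", "./my-dir")  # passes silently
try:
    check_entrypoint("python ./my-dir/test.py", "./my-dir")
except ValueError as err:
    print(err)
```

In this sketch a working_dir of `./` never conflicts, because every relative path already sits under the current directory.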

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • Testing is not required for this change

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Oct 7, 2025

@kryanbeane: This pull request references RHOAIENG-33283 which is a valid jira issue.


@kryanbeane kryanbeane requested review from LilyLinh and pawelpaszki and removed request for dimakis October 7, 2025 14:58
@codecov

codecov bot commented Oct 7, 2025

Codecov Report

❌ Patch coverage is 98.15498% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.26%. Comparing base (219d1c5) to head (a97d969).
⚠️ Report is 4 commits behind head on ray-jobs-feature.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/codeflare_sdk/ray/rayjobs/rayjob.py | 94.93% | 4 Missing ⚠️ |
| src/codeflare_sdk/ray/rayjobs/runtime_env.py | 99.37% | 1 Missing ⚠️ |
Additional details and impacted files
@@                 Coverage Diff                  @@
##           ray-jobs-feature     #922      +/-   ##
====================================================
+ Coverage             94.04%   94.26%   +0.22%     
====================================================
  Files                    22       24       +2     
  Lines                  1914     2040     +126     
====================================================
+ Hits                   1800     1923     +123     
- Misses                  114      117       +3     


@kryanbeane kryanbeane force-pushed the RHOAIENG-33283 branch 6 times, most recently from 39320fc to 8974240 Compare October 8, 2025 14:06
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 8, 2025
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 8, 2025
@kryanbeane
Contributor Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 8, 2025
@kryanbeane kryanbeane force-pushed the RHOAIENG-33283 branch 3 times, most recently from 7cd5a95 to b195de7 Compare October 10, 2025 11:18
@kryanbeane
Contributor Author

/override codecov/patch

@openshift-ci
Contributor

openshift-ci bot commented Oct 10, 2025

@kryanbeane: Overrode contexts on behalf of kryanbeane: codecov/patch


@pawelpaszki
Contributor

Thanks! Verified the changes against a ROSA cluster and ran the RayJob e2e tests (also against a ROSA cluster) successfully. What do we do with the failing e2e tests here?

pawelpaszki previously approved these changes Oct 13, 2025
@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 13, 2025
@laurafitzgerald
Contributor

Verification

Test 1: Lifecycled Cluster - Verified as described

Test 2: Long-Lived Cluster - Verified

The verification steps in the description needed an update, as they did not provide a runtime_env; with the version below it worked.

from codeflare_sdk import Cluster, ClusterConfiguration, RayJob

# Create cluster
cluster = Cluster(ClusterConfiguration(
    name='existing-cluster',
    num_workers=1,
    namespace='rhods-notebooks'
))
cluster.apply()
cluster.wait_ready()

# Submit job
rayjob = RayJob(
    job_name="test-existing",
    cluster_name="existing-cluster",
    namespace="rhods-notebooks",
    entrypoint="python test.py",
    runtime_env={"working_dir": "./"},
)
rayjob.submit()

Submitting a second RayJob to that RayCluster also worked!!

One thing: it would be good to give the ConfigMap an ownerRef so that it's deleted when the submitter pod is deleted. cc @kryanbeane, but we can add that in a separate PR if you'd prefer to get this merged.
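
To make the ownerRef suggestion concrete, here is a minimal sketch that works on plain manifest dicts. `with_owner_reference` is a hypothetical helper and the names and uid are placeholders; the `ownerReferences` fields themselves (`apiVersion`, `kind`, `name`, `uid`) are the standard Kubernetes ones. Using the submitter Pod as the owner follows the comment above; another owner may turn out to be a better fit.

```python
import copy


def with_owner_reference(manifest: dict, owner: dict) -> dict:
    """Return a copy of `manifest` that lists `owner` in ownerReferences."""
    owned = copy.deepcopy(manifest)
    refs = owned.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append(
        {
            "apiVersion": owner["apiVersion"],
            "kind": owner["kind"],
            "name": owner["metadata"]["name"],
            # The uid is only known once the owner exists in the cluster.
            "uid": owner["metadata"]["uid"],
        }
    )
    return owned


# Placeholder objects for illustration only.
config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "example-files", "namespace": "rhods-notebooks"},
    "data": {"test.py": "print('hello')\n"},
}
submitter_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-submitter", "uid": "00000000-0000-0000-0000-000000000000"},
}

owned = with_owner_reference(config_map, submitter_pod)
print(owned["metadata"]["ownerReferences"])
```

Because the owner's uid is required, the owner has to be created (or looked up) first, and a namespaced owner must live in the same namespace as the object it owns.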

@laurafitzgerald
Contributor

laurafitzgerald commented Oct 14, 2025

I didn't see a failure on

rayjob = RayJob(
    job_name="test-validation",
    namespace="rhods-notebooks",
    cluster_config=ManagedClusterConfig(),
    entrypoint="python ./test.py",  # Wrong!
    runtime_env={"working_dir": "./"},
)
rayjob.submit()

OR

rayjob = RayJob(
    job_name="test-validation",
    namespace="rhods-notebooks",
    cluster_name="existing-cluster",
    entrypoint="python ./test.py",  # Wrong!
    runtime_env={"working_dir": "./"},
)
rayjob.submit()

@openshift-ci openshift-ci bot removed lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 14, 2025
@kryanbeane kryanbeane force-pushed the RHOAIENG-33283 branch 2 times, most recently from c2d866b to c5ef9a6 Compare October 14, 2025 12:13
@laurafitzgerald
Contributor

laurafitzgerald commented Oct 14, 2025

Verification - Existing Cluster

  • A Secret is created rather than a ConfigMap.
  • Other functionality works fine.

Verification - Lifecycled Cluster

  • A Secret is created rather than a ConfigMap.

Verifying that file not present produces an error

ValueError: ❌ Entrypoint file not found:
   Looking for: '../test.py'
   (working_dir: '../', entrypoint file: 'test.py')

Please ensure the file exists at the expected location.
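
A check that produces an error along these lines could look roughly like the sketch below. It is an assumption for illustration, not the SDK's code: `ensure_entrypoint_exists` is a hypothetical helper and its message wording is simplified.

```python
import shlex
from pathlib import Path
from typing import Optional


def ensure_entrypoint_exists(entrypoint: str, working_dir: str) -> Optional[Path]:
    """Return the entrypoint script path, or raise ValueError if it is missing.

    Returns None when the entrypoint names no .py file (e.g. `python -c ...`).
    """
    for token in shlex.split(entrypoint):
        if token.endswith(".py"):
            candidate = Path(working_dir) / token
            if not candidate.is_file():
                raise ValueError(
                    f"Entrypoint file not found: {str(candidate)!r} "
                    f"(working_dir: {working_dir!r}, entrypoint file: {token!r})"
                )
            return candidate
    return None


try:
    ensure_entrypoint_exists("python test.py", "./directory-that-does-not-exist")
except ValueError as err:
    print(err)
```

Running the check at submit time gives the user the error before anything is created on the cluster.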

Verification - .ipynb file is not included
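
For context, excluding notebooks when gathering files can be sketched as below. This is only an illustration under stated assumptions: `files_to_package` and `EXCLUDED_SUFFIXES` are hypothetical names, and the sketch only looks at the top level of the directory.

```python
from pathlib import Path

# Assumption: notebooks are the only excluded file type in this sketch.
EXCLUDED_SUFFIXES = {".ipynb"}


def files_to_package(working_dir: str) -> list[str]:
    """Return sorted names of top-level files, skipping excluded suffixes."""
    root = Path(working_dir)
    return sorted(
        path.name
        for path in root.iterdir()
        if path.is_file() and path.suffix not in EXCLUDED_SUFFIXES
    )
```

Calling `files_to_package("./")` on a directory holding `test.py` and `demo.ipynb` would return only `test.py`.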

@laurafitzgerald
Contributor

/approve
/lgtm

Verified and feel free to remove the hold when you are ready @kryanbeane

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 14, 2025
@openshift-ci
Contributor

openshift-ci bot commented Oct 14, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: laurafitzgerald


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 14, 2025
@kryanbeane
Contributor Author

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 14, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 1a61bfd into project-codeflare:ray-jobs-feature Oct 14, 2025
11 checks passed
@kryanbeane kryanbeane deleted the RHOAIENG-33283 branch October 14, 2025 15:55
pawelpaszki added a commit to red-hat-data-services/ods-ci that referenced this pull request Oct 20, 2025
## WHAT
Addition of rayjob tests (with Bryan's latest changes, matching the
timestamp of this PR, from [this
PR](project-codeflare/codeflare-sdk#922)) and
bumping the codeflare-sdk tag (to match the upcoming release version)

## VERIFICATION
Verified locally with a [test
branch](https://github.com/red-hat-data-services/ods-ci/compare/master...pawelpaszki:ods-ci:RHOAIENG-33408-test)
with temp tags to execute only relevant tests
```
bash-5.2# git checkout RHOAIENG-33408-test
branch 'RHOAIENG-33408-test' set up to track 'pawel/RHOAIENG-33408-test'.
Switched to a new branch 'RHOAIENG-33408-test'

bash-5.2# export WORKSPACE="/workspace/ods-ci/ods_ci"
bash-5.2# cd ods_ci/
bash-5.2# ./run_robot_test.sh --include ttt --skip-oclogin true
test-variables.yml
./run_robot_test.sh: line 211: distro: command not found
INFO: we found a yq executable
skipping OC login as per parameter --skip-oclogin
Git revision refname='RHOAIENG-33408-test', venvdir='RHOAIENG-33408-test'.
Checking whether '/root/.local/ods-ci/RHOAIENG-33408-test/.venv' exists.
Checking whether '/root/.local/ods-ci/master/.venv' exists.
Pre-created virtual environment has not been found in '/root/.local/ods-ci/master/.venv'. All dependencies will be installed from scratch.
Python '' is not of the correct version
Configuring poetry to use Python /root/.pyenv/shims/python3.11
Creating virtualenv ods-ci-VVJNOhYl-py3.11 in /root/.cache/pypoetry/virtualenvs
Using virtualenv: /root/.cache/pypoetry/virtualenvs/ods-ci-VVJNOhYl-py3.11
Installing dependencies from lock file

Package operations: 239 installs, 0 updates, 0 removals

  - Installing attrs (24.2.0)
  - Installing pyasn1 (0.6.1)

...

  - Installing robotframework-openshift (1.0.0 1297347)

Installing the current project: ods-ci (0.1.0)
==============================================================================
Tests                                                                         
==============================================================================
Tests.Distributed Workloads                                                   
==============================================================================
Tests.Distributed Workloads.Workloads Orchestration                           
==============================================================================
2025-10-14 13:59:32,344 - RPA.core.certificates - INFO - Truststore not in use, HTTPS traffic validated against `certifi` package. (requires Python 3.10.12 and 'pip' 23.2.1 at minimum)
Tests.Distributed Workloads.Workloads Orchestration.Test-Run-Codeflare-Sdk-...
==============================================================================
Cloning into 'codeflare-sdk'...
Note: switching to 'c5ef9a6c5384e167a24c5c1ac261d3a1f6e3d432'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

Product:RHODS Version:2.24.0
[ WARN ] No Prometheus found
Run TestRayJobRayVersionValidationOauth test with Python 3.11 :: R... "Running codeflare-sdk test: ray_version_validation_oauth_test.py"
HEAD is now at c5ef9a6 RHOAIENG-33283: Change ConfigMaps to Secrets
* (no branch)
Creating virtualenv codeflare-sdk-_B-kuLxP-py3.11 in /root/.cache/pypoetry/virtualenvs
Using virtualenv: /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11
Installing dependencies from lock file

Package operations: 175 installs, 0 updates, 0 removals

  - Installing attrs (25.3.0)
  - Installing rpds-py (0.26.0)
 
..

  - Installing python-client (0.0.0-dev b2fd91b)
Warning: The file chosen for install of virtualenv 20.35.1 (virtualenv-20.35.1-py3-none-any.whl) is yanked. Reason for being yanked: Backwards incompatible changes

Installing the current project: codeflare-sdk (0.31.1)
============================= test session starts ==============================
platform linux -- Python 3.11.5, pytest-7.4.0, pluggy-1.6.0 -- /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/bin/python
cachedir: .pytest_cache
rootdir: /workspace/ods-ci/ods_ci/codeflare-sdk
configfile: pyproject.toml
plugins: anyio-4.9.0, mock-3.11.1, timeout-2.3.1
timeout: 900.0s
timeout method: signal
timeout func_only: False
collecting ... collected 2 items

tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_incompatible_ray_version_oauth creating Kueue resources ...
'test-resource-flavor-zdzxw' created!
'test-cluster-queue-6p6d6' created
'test-local-queue-97jxx' created in namespace 'test-ns-11cqv'
Creating RayJob with incompatible Ray image in cluster config: quay.io/modh/ray:2.46.1-py311-cu121
Attempting to submit RayJob 'incompatible-lifecycle-rayjob' with incompatible Ray version...
✅ Ray version validation correctly prevented RayJob submission with incompatible cluster config!
PASSED
'test-cluster-queue-6p6d6' cluster-queue deleted
'test-resource-flavor-zdzxw' resource-flavor deleted

tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_unknown_ray_version_oauth creating Kueue resources ...
'test-resource-flavor-apb9z' created!
'test-cluster-queue-lfkds' created
'test-local-queue-6tos0' created in namespace 'test-ns-nlaen'
Creating RayJob with image where Ray version cannot be determined: quay.io/modh/ray@sha256:6d076aeb38ab3c34a6a2ef0f58dc667089aa15826fa08a73273c629333e12f1e
Attempting to submit RayJob 'unknown-version-rayjob' with unknown Ray version...
✅ RayJob submission succeeded with warning for unknown Ray version!
Note: RayJob 'unknown-version-rayjob' was submitted successfully but may need manual cleanup.
PASSED
'test-cluster-queue-lfkds' cluster-queue deleted
'test-resource-flavor-apb9z' resource-flavor deleted


=============================== warnings summary ===============================
../../../../root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py:1373
  /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py:1373: PytestConfigWarning: Unknown config option: collect_ignore
  
    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_incompatible_ray_version_oauth
tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_incompatible_ray_version_oauth
tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_incompatible_ray_version_oauth
tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_unknown_ray_version_oauth
tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_unknown_ray_version_oauth
tests/e2e/rayjob/ray_version_validation_oauth_test.py::TestRayJobRayVersionValidationOauth::test_rayjob_lifecycled_cluster_unknown_ray_version_oauth
  /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/kubernetes/client/rest.py:44: DeprecationWarning: HTTPResponse.getheaders() is deprecated and will be removed in urllib3 v2.6.0. Instead access HTTPResponse.headers directly.
    return self.urllib3_response.getheaders()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================== 2 passed, 7 warnings in 8.21s =========================
Run TestRayJobRayVersionValidationOauth test with Python 3.11 :: R... | PASS |
------------------------------------------------------------------------------
Run TestRayJobExistingCluster test with Python 3.11 :: Run Python ... "Running codeflare-sdk test: rayjob_existing_cluster_test.py"
HEAD is now at c5ef9a6 RHOAIENG-33283: Change ConfigMaps to Secrets
* (no branch)
Using virtualenv: /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11
Installing dependencies from lock file

No dependencies to install or update

Installing the current project: codeflare-sdk (0.31.1)
============================= test session starts ==============================
platform linux -- Python 3.11.5, pytest-7.4.0, pluggy-1.6.0 -- /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/bin/python
cachedir: .pytest_cache
rootdir: /workspace/ods-ci/ods_ci/codeflare-sdk
configfile: pyproject.toml
plugins: anyio-4.9.0, mock-3.11.1, timeout-2.3.1
timeout: 900.0s
timeout method: signal
timeout func_only: False
collecting ... collected 1 item

tests/e2e/rayjob/rayjob_existing_cluster_test.py::TestRayJobExistingCluster::test_existing_kueue_cluster creating Kueue resources ...
'test-resource-flavor-ly938' created!
'test-cluster-queue-q79cp' created
'test-local-queue-tn4bl' created in namespace 'test-ns-kh658'
Insecure request warnings have been disabled
Warning: TLS verification has been disabled - Endpoint checks will be bypassed
Written to: /root/.codeflare/resources/kueue-cluster.yaml
Written to: /root/.codeflare/resources/kueue-cluster.yaml
Ray Cluster: 'kueue-cluster' has successfully been applied. For optimal resource management, you should delete this Ray Cluster when no longer in use.
Waiting for cluster 'kueue-cluster' to be ready...
Waiting for requested resources to be set up...
Requested cluster is up and running!
Dashboard is ready!
✓ Cluster 'kueue-cluster' is ready
Ray Cluster: 'kueue-cluster' has successfully been deleted
PASSED
'test-cluster-queue-q79cp' cluster-queue deleted
'test-resource-flavor-ly938' resource-flavor deleted


=============================== warnings summary ===============================
../../../../root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py:1373
  /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py:1373: PytestConfigWarning: Unknown config option: collect_ignore
  
    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

tests/e2e/rayjob/rayjob_existing_cluster_test.py::TestRayJobExistingCluster::test_existing_kueue_cluster
tests/e2e/rayjob/rayjob_existing_cluster_test.py::TestRayJobExistingCluster::test_existing_kueue_cluster
tests/e2e/rayjob/rayjob_existing_cluster_test.py::TestRayJobExistingCluster::test_existing_kueue_cluster
  /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/kubernetes/client/rest.py:44: DeprecationWarning: HTTPResponse.getheaders() is deprecated and will be removed in urllib3 v2.6.0. Instead access HTTPResponse.headers directly.
    return self.urllib3_response.getheaders()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================== 1 passed, 4 warnings in 76.72s (0:01:16) ===================
Run TestRayJobExistingCluster test with Python 3.11 :: Run Python ... | PASS |
------------------------------------------------------------------------------
Run TestRayJobLifecycledCluster test with Python 3.11 :: Run Pytho... "Running codeflare-sdk test: rayjob_lifecycled_cluster_test.py"
HEAD is now at c5ef9a6 RHOAIENG-33283: Change ConfigMaps to Secrets
* (no branch)
Using virtualenv: /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11
Installing dependencies from lock file

No dependencies to install or update

Installing the current project: codeflare-sdk (0.31.1)
============================= test session starts ==============================
platform linux -- Python 3.11.5, pytest-7.4.0, pluggy-1.6.0 -- /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/bin/python
cachedir: .pytest_cache
rootdir: /workspace/ods-ci/ods_ci/codeflare-sdk
configfile: pyproject.toml
plugins: anyio-4.9.0, mock-3.11.1, timeout-2.3.1
timeout: 900.0s
timeout method: signal
timeout func_only: False
collecting ... collected 2 items

tests/e2e/rayjob/rayjob_lifecycled_cluster_test.py::TestRayJobLifecycledCluster::test_lifecycled_kueue_managed creating Kueue resources ...
'test-resource-flavor-4eshz' created!
'test-cluster-queue-6kxf0' created
'test-local-queue-6l8bq' created in namespace 'test-ns-u48bn'
✓ Secret kueue-lifecycled-files verified with proper owner reference
PASSED
'test-cluster-queue-6kxf0' cluster-queue deleted
'test-resource-flavor-4eshz' resource-flavor deleted

tests/e2e/rayjob/rayjob_lifecycled_cluster_test.py::TestRayJobLifecycledCluster::test_lifecycled_kueue_resource_queueing Creating limited Kueue resources for preemption testing...
'limited-flavor-lvt8y' created!
✓ Created limited ClusterQueue: limited-cq-slkj3
'limited-lq-8k4tm' created in namespace 'test-ns-95l04'
✓ Limited Kueue resources created successfully
Waiting for Kueue admission of job 'waiter'...
✓ Job 'waiter' admitted by Kueue (no longer suspended)
PASSED
'limited-cq-slkj3' cluster-queue deleted
'limited-flavor-lvt8y' resource-flavor deleted


=============================== warnings summary ===============================
../../../../root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py:1373
  /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py:1373: PytestConfigWarning: Unknown config option: collect_ignore
  
    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

tests/e2e/rayjob/rayjob_lifecycled_cluster_test.py::TestRayJobLifecycledCluster::test_lifecycled_kueue_managed
tests/e2e/rayjob/rayjob_lifecycled_cluster_test.py::TestRayJobLifecycledCluster::test_lifecycled_kueue_managed
tests/e2e/rayjob/rayjob_lifecycled_cluster_test.py::TestRayJobLifecycledCluster::test_lifecycled_kueue_managed
tests/e2e/rayjob/rayjob_lifecycled_cluster_test.py::TestRayJobLifecycledCluster::test_lifecycled_kueue_resource_queueing
tests/e2e/rayjob/rayjob_lifecycled_cluster_test.py::TestRayJobLifecycledCluster::test_lifecycled_kueue_resource_queueing
  /root/.cache/pypoetry/virtualenvs/codeflare-sdk-_B-kuLxP-py3.11/lib/python3.11/site-packages/kubernetes/client/rest.py:44: DeprecationWarning: HTTPResponse.getheaders() is deprecated and will be removed in urllib3 v2.6.0. Instead access HTTPResponse.headers directly.
    return self.urllib3_response.getheaders()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================== 2 passed, 6 warnings in 98.01s (0:01:38) ===================
Run TestRayJobLifecycledCluster test with Python 3.11 :: Run Pytho... | PASS |
------------------------------------------------------------------------------
"Removing directory codeflare-sdk"
"Log back as cluster admin"
"Logging in as cluster admin to cleanup RBAC permissions"
"Removing kueue-batch-user-rolebinding ClusterRoleBinding"
Error from server (NotFound): error when deleting "tests/Resources/Files/kueue-batch-user-rolebinding.yaml": clusterrolebindings.rbac.authorization.k8s.io "kueue-batch-user-rolebinding" not found
"Warning: Unable to delete kueue-batch-user-rolebinding ClusterRoleBinding (may not exist)"
"Removing kueue-batch-user-specific-rolebinding ClusterRoleBinding"
Error from server (NotFound): error when deleting "tests/Resources/Files/kueue-batch-user-specific-rolebinding.yaml": clusterrolebindings.rbac.authorization.k8s.io "kueue-batch-user-specific-rolebinding" not found
"Warning: Unable to delete kueue-batch-user-specific-rolebinding ClusterRoleBinding (may not exist)"
"Removing kueue-batch-user-role ClusterRole"
Error from server (NotFound): error when deleting "tests/Resources/Files/kueue-batch-user-role.yaml": clusterroles.rbac.authorization.k8s.io "kueue-batch-user-role" not found
"Warning: Unable to delete kueue-batch-user-role ClusterRole (may not exist)"
Tests.Distributed Workloads.Workloads Orchestration.Test-Run-Codef... | PASS |
3 tests, 3 passed, 0 failed
==============================================================================
Tests.Distributed Workloads.Workloads Orchestration                   | PASS |
3 tests, 3 passed, 0 failed
==============================================================================
Tests.Distributed Workloads                                           | PASS |
3 tests, 3 passed, 0 failed
==============================================================================
Tests                                                                 | PASS |
3 tests, 3 passed, 0 failed
==============================================================================
Output:  /workspace/ods-ci/ods_ci/test-output/ods-ci-2025-10-14-13-59-qDi05w2uXs/output.xml
XUnit:   /workspace/ods-ci/ods_ci/test-output/ods-ci-2025-10-14-13-59-qDi05w2uXs/xunit_test_result.xml
Log:     /workspace/ods-ci/ods_ci/test-output/ods-ci-2025-10-14-13-59-qDi05w2uXs/log.html
Report:  /workspace/ods-ci/ods_ci/test-output/ods-ci-2025-10-14-13-59-qDi05w2uXs/test_report.html
0
```