RHOAIENG-32532: Update kueue integration #910
@kryanbeane: This pull request references RHOAIENG-32532 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
338b0ce to d259ec1
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@              Coverage Diff               @@
##   ray-jobs-feature   #910    +/-   ##
====================================================
- Coverage     94.17%   94.04%   -0.14%
====================================================
  Files            22       22
  Lines          1924     1914     -10
====================================================
- Hits           1812     1800     -12
- Misses          112      114      +2
|
cb5dbdf to 80d1307
|
If these tests are to be executed during release testing, can you explain whether any setup (or other) steps are required to run them? |
|
I have verified the changes on a ROSA cluster by running the tests, and they look good. Waiting for the kind e2e tests to pass. |
All we'll need is RHBoK installed and the RayJob and RayCluster integrations enabled. |
|
E2E is still failing. I'll add a fix in a new commit so we only need to re-review that commit 👍🏻 |
aeccfc5 to e5950a0
f06d22e to 2f0e79c
6051ec4 to eb21aac
538d345 to b556fab
ad304cc to 665dcb2
eb21aac to f2a4cd0
f2a4cd0 to 01c8a7b
|
/approve Verified this works as expected. We do need to ensure that, for cluster-level Kueue, the config map named |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: laurafitzgerald
219d1c5 into project-codeflare:ray-jobs-feature
Issue link
RHOAIENG-32532
What changes have been made
Verification steps
Be logged in with oc to an OpenShift cluster with KubeRay enabled. Kueue should be disabled in the DSC, and the latest version of RHBoK (Red Hat Build of Kueue) should be installed; search for Kueue in OperatorHub to find it. Run the commands below:
pip install -e .
poetry run pytest tests/e2e/rayjob/ -v -s -x
You can also test this manually. Try to create a RayJob with the usual Kueue resources created (follow Pat's KS to set up the resources). You should see the local queue admit the job and the cluster get created.
Check the Workload CRs: there should be only one, for the job, and none for the RayCluster.
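The Workload check in the manual steps could be scripted roughly as follows. This is a minimal sketch, not code from this PR: the helper names (`count_workloads_by_owner_kind`, `has_single_rayjob_workload`, `list_workloads`) are hypothetical, and it assumes Workloads are served as `kueue.x-k8s.io/v1beta1` and carry an owner reference to the RayJob or RayCluster that created them.

```python
from collections import Counter
from typing import Any, Dict, List


def count_workloads_by_owner_kind(workloads: List[Dict[str, Any]]) -> Counter:
    """Count Workload objects by the kind of their first owner reference."""
    counts: Counter = Counter()
    for workload in workloads:
        owners = workload.get("metadata", {}).get("ownerReferences") or []
        kind = owners[0].get("kind", "<unknown>") if owners else "<none>"
        counts[kind] += 1
    return counts


def has_single_rayjob_workload(workloads: List[Dict[str, Any]]) -> bool:
    """True when exactly one Workload is owned by a RayJob and none by a RayCluster."""
    counts = count_workloads_by_owner_kind(workloads)
    return counts["RayJob"] == 1 and counts["RayCluster"] == 0


def list_workloads(namespace: str) -> List[Dict[str, Any]]:
    """Fetch Workloads from the current kube context (needs cluster access)."""
    # Imported lazily so the pure helpers above work without the client installed.
    from kubernetes import client, config

    config.load_kube_config()
    api = client.CustomObjectsApi()
    result = api.list_namespaced_custom_object(
        group="kueue.x-k8s.io",  # assumption: upstream Kueue API group
        version="v1beta1",       # assumption: adjust to the version served on the cluster
        namespace=namespace,
        plural="workloads",
    )
    return result.get("items", [])
```

With a cluster available, `has_single_rayjob_workload(list_workloads("<namespace>"))` should return `True` once the RayJob has been admitted, if the assumptions above hold.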
Checks