Skip to content

Conversation

jstourac
Copy link
Member

@jstourac jstourac commented Jul 1, 2025

This commit propagates labels from Notebooks to statefulsets. This is required to enable NBs to be managed by Kueue. When a user adds the label to point a NB to the right LocalQueue where it needs to be admitted, it needs to propagate to the underlying StatefulSet, so that Kueue can manage it.

(cherry picked from commit f6abf13) --- kubeflow#7674

https://issues.redhat.com/browse/RHOAIENG-28544

How Has This Been Tested?

I created a very simple test to the existing bdd ginkgo suite.

Regarding the manual testing I did:

  1. Installed the RHOAI 2.22RC1
  2. Created one simple workbench in a new DS project
  3. Scaled down the rhods-operator deployment to 0
  4. Updated the Deployment of the notebook-controller to use my custom image build from this branch (quay.io/opendatahub/kubeflow-notebook-controller:pr-646)
  5. I can see that the running workbench StatefulSet now contains the same labels as are defined for its Notebook CR:
  labels:
    app: ffff
    opendatahub.io/dashboard: 'true'
    opendatahub.io/odh-managed: 'true'
    opendatahub.io/user: htpasswd-2dcluster-2dadmin-2duser
  1. I created a new workbench and I can see that the labels are identical between Notebook and StatefulSet CRs (similarly as above)
  2. When I manually add a new label to the Notebook CR, it is propagated to the StatefulSet right away
  3. When I manually remove this new label from the Notebook CR, it is then removed from the StatefulSet too right away
  4. When I manually remove the new label from the StatefulSet CR, it is added back to match the value in the Notebook CR

I haven't performed any actual functional checks with the kueue yet.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Bug Fixes

    • Ensured that all labels from the Notebook resource are correctly applied to the associated StatefulSet, improving label consistency.
  • Tests

    • Added and updated tests to verify that labels on the StatefulSet exactly match those on the Notebook resource.

Copy link

openshift-ci bot commented Jul 1, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link

coderabbitai bot commented Jul 1, 2025

Walkthrough

The changes update the notebook controller to ensure that all labels from a Notebook resource are copied to the corresponding StatefulSet's metadata. The associated test is updated to include a custom label and asserts that the StatefulSet's labels match those of the Notebook.

Changes

Files/Paths Change Summary
components/notebook-controller/controllers/notebook_controller.go Initializes StatefulSet Labels as an empty map and copies all Notebook labels into it.
components/notebook-controller/controllers/notebook_controller_bdd_test.go Adds test label to Notebook, simplifies existence check, and asserts StatefulSet labels match source.

Suggested labels

tide/merge-method-squash, size/l


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5db657d and 80438c6.

📒 Files selected for processing (2)
  • components/notebook-controller/controllers/notebook_controller.go (2 hunks)
  • components/notebook-controller/controllers/notebook_controller_bdd_test.go (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.948Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.
components/notebook-controller/controllers/notebook_controller_bdd_test.go (3)
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.926Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.948Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.
components/notebook-controller/controllers/notebook_controller.go (3)
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.948Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (5)
components/notebook-controller/controllers/notebook_controller.go (1)

442-442: LGTM! Proper initialization prevents nil pointer issues.

The empty map initialization is necessary to safely assign labels in the subsequent code block.

components/notebook-controller/controllers/notebook_controller_bdd_test.go (4)

40-42: LGTM! Test constants are well-defined.

The test constants follow proper naming conventions and will be used to verify label propagation functionality.


53-55: LGTM! Notebook object properly includes test labels.

The test Notebook object now includes the test label which is essential for verifying the label propagation functionality.


71-71: LGTM! Simplified error checking is more concise.

The direct return of err == nil is cleaner and more readable than the previous implementation.


89-90: LGTM! Test assertion correctly verifies label propagation.

The assertion properly verifies that the StatefulSet labels exactly match the Notebook labels, which is the core functionality being tested. This ensures the label propagation feature works as expected for Kueue integration.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@openshift-ci openshift-ci bot added size/xs and removed size/xs labels Jul 1, 2025
@codecov-commenter
Copy link

codecov-commenter commented Jul 1, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 35.09%. Comparing base (5db657d) to head (80438c6).

❗ There is a different number of reports uploaded between BASE (5db657d) and HEAD (80438c6). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (5db657d) HEAD (80438c6)
2 1
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #646       +/-   ##
===========================================
- Coverage   57.47%   35.09%   -22.38%     
===========================================
  Files          11        2        -9     
  Lines        3071      966     -2105     
===========================================
- Hits         1765      339     -1426     
+ Misses       1136      586      -550     
+ Partials      170       41      -129     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

This commit propagates labels from Notebooks to statefulsets.
This is required to enable NBs to be managed by Kueue. When a user
adds the label to point a NB to the right LocalQueue where it
needs to be admitted, it needs to propagate to the underlying
StatefulSet, so that Kueue can manage it.

Signed-off-by: Varsha Prasad Narsing <[email protected]>
(cherry picked from commit f6abf13)
@jstourac jstourac changed the title [Add] Propogate labels from NB to StatefulSets [Add] Propagate labels from NB to StatefulSets Jul 3, 2025
@openshift-ci openshift-ci bot added size/xs and removed size/xs labels Jul 3, 2025
@jiridanek
Copy link
Member

/lgtm

@jiridanek
Copy link
Member

@varshaprasad96 can you please review this? we're cherry-picking your change that you've originally proposed in kubeflow notebooks controller upstream repo

Copy link

@varshaprasad96 varshaprasad96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek the last I tested this change in NBs, it worked as intended to get labels into the pods from SS for Kueue to pickup. So looks good from my end.
/lgtm

I'm not super familiar with the latest changes that may have been made in NB controller implementation, so probably someone from that team can verify it again. But all good from my end.

@jstourac
Copy link
Member Author

jstourac commented Jul 9, 2025

@varshaprasad96 do you recall whether you just tested the labels propagation or also whether the kueue integration works as expected with such change?

@jstourac jstourac marked this pull request as ready for review July 9, 2025 10:33
@jiridanek
Copy link
Member

/lgtm
still

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Labels are propogated as expected
Tested on cluster:
On setting up a kueue enabled ODH cluster

  • including labels on Notebook CR
metadata:
  labels:
    kueue-managed: 'true'
    kueue.x-k8s.io/queue-name: kueue-local-queue
  • successfully set the label in stateful set
Screenshot from 2025-07-16 02-11-29 Screenshot from 2025-07-16 02-12-20

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further can confirm:
kueue is able to attach to the statefulset
Screenshot from 2025-07-16 02-09-39
Screenshot from 2025-07-16 02-04-33

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Thanks for the work 💯

Copy link

openshift-ci bot commented Jul 16, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harshad16, varshaprasad96

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jstourac jstourac merged commit af2d995 into opendatahub-io:main Jul 16, 2025
16 of 18 checks passed
@jstourac jstourac deleted the RHOAIENG-28544 branch July 16, 2025 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants