-
Notifications
You must be signed in to change notification settings - Fork 44
RHOAIENG-23751: Add finalizer to clean up OAuthClients on notebook deletion #605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
By("Manually deleting OAuthClient to simulate controller behavior") | ||
Expect(cli.Delete(ctx, oauthClient)).Should(Succeed()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the controller running in the background? Should not it issue this deletion? How come the test looks almost like controller is not present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for raising this up, I believe this was good information that I should have put in the description (and I will do so as well).
during the tests, I've had a scenario where I could see all logs from the reconciler and finalizer appearing, but on the test environment this wasn't happening (sometimes, I believe I have even found the testing environment kind of not consistent enough to test finalizers), but I don't know if this is some extra configuration that needs to be added, if it's related to Kubebuilder, or something else.
if you have any suggestion or idea, I will look forward to trying it out and changing this to something more suitable, but since I couldn't exactly test the finalizers, I forced some scenarios to try to replicate the most portion of its execution, the way was possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is weird. We shouldn't simulate controller behavior in these tests. We should test it - so we should have test for:
- a Notebook creation - where we check that it gets the OAuthClient finalizer in it
- a Notebook deletion - where we check that both Notebook and OAuthClient objects are deleted properly (only these and not some other - e.g. in case we have two similar Notebooks present)
Or am I missing something? 🤔
// a clean up, without overloading OpenShift resources. | ||
func (r *OpenshiftNotebookReconciler) cleanupOAuthClient(notebook *nbv1.Notebook, ctx context.Context) error { | ||
// Name of the OAuthClient based on the Notebook name | ||
oauthClientName := fmt.Sprintf("%s-%s-%s", notebook.Name, notebook.Namespace, "oauth-client") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering if we (in the original oauth pr) though about what implications this has for the limit on maximal notebook name length...
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #605 +/- ##
===========================================
+ Coverage 56.21% 68.39% +12.17%
===========================================
Files 11 9 -2
Lines 3072 2161 -911
===========================================
- Hits 1727 1478 -249
+ Misses 1185 551 -634
+ Partials 160 132 -28 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
notebook.Finalizers = append(notebook.Finalizers, Finalizer) | ||
Expect(cli.Update(ctx, notebook)).Should(Succeed()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who's to add the finalizer? the controller or the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, actually, it's complicated! https://book.kubebuilder.io/reference/envtest.html#testing-considerations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and, to answer my question, controller should add finalizer IMO, so please don't set it in the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I did mention before, the full test to check if it was deleted didn't work out properly with me before (maybe the explanation is in the link you sent above, or my lack of knowledge, or both), so I tried to test as much of the code to ensure the code would be consistent, even if the test itself wasn't what we expected in terms of functionality (the real deletion of the objects, etc)
feel free to suggest otherwise, as I believe we have some limitations on kubeflow itself and might need to get to an understanding when this kind of scenario happens
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I believe we have some limitations on kubeflow itself and might need to get to an understanding when this kind of scenario happens
that could be a consideration, we'd need a more powerful test framework then
GinkgoT().Logf("OAuthClient still exists, which suggests the controller isn't handling deletion in the test environment") | ||
} | ||
|
||
// No assertion here - we're just observing behavior |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it's not possible to assert that OAuthClient
has been actually deleted? Because the test does not assert that now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is why this pull request has been sent before as "test if finalizers get added, checks if it gets removed after", etc, because I couldn't make it work entirely, maybe because of the limitation of the https://book.kubebuilder.io/reference/envtest.html#testing-considerations
I can try again and send so you can check, but before it didn't worked out for me before
we did not check if it did get deleted, but tried to assert that the majority of the code was doing work towards the deletion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try again, and if you know a way to make it work properly, I'd be really up for suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try something. Worst case, something like https://book-v3.book.kubebuilder.io/reference/envtest where they mention KinD, and https://github.com/kubeflow/notebooks/blob/notebooks-v2/workspaces/controller/test/e2e/e2e_suite_test.go#L47 which lets the kubebuilder test infra interacti with genuine kind should solve all problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will take a look into it, thanks!
@daniellutz can you fix conflicts, I'll try to look into the tests then |
I'll try to rebase myself, so I don't bother you with such chores. |
I take that back. Resolving conflicts requires some thought, so I won't be resolving it in this pr and wait for @daniellutz to take care of it. |
I will rebase it now, as I was trying a couple of things yesterday. Just give me one moment. Thanks for trying it out! |
d09726f
to
105b60a
Compare
@CodeRabbit review |
✅ Actions performedReview triggered.
|
WalkthroughA new CustomResourceDefinition (CRD) for Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant K8s API
participant NotebookController
participant OAuthClient (CRD)
User->>K8s API: Create Notebook (with OAuth)
K8s API->>NotebookController: Trigger reconcile
NotebookController->>Notebook: Add finalizer (if missing)
NotebookController->>OAuthClient: Create/Update OAuthClient
User->>K8s API: Delete Notebook
K8s API->>NotebookController: Trigger reconcile (deletion)
NotebookController->>OAuthClient: Delete OAuthClient
NotebookController->>Notebook: Remove finalizer
K8s API->>Notebook: Complete deletion
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
⏰ Context from checks skipped due to timeout of 90000ms (2)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
@coderabbitai full |
Tip For best results, initiate chat on the files or code changes. @jiridanek I'll perform a full review of this PR from scratch as requested. ✅ Actions performed Full review triggered.
📥 Commits being reviewedReviewing all files that changed from the base of the PR and the current head commit 105b60a. 📒 Files selected for processing (6)
The comprehensive review will analyze all aspects of the changes including:
Please note that the PR currently has merge conflicts that will need to be resolved before it can be merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)
1076-1077
: Manual finalizer addition contradicts controller responsibility.The test manually adds the finalizer to the Notebook, but the controller should be responsible for adding finalizers during normal operation. This approach doesn't test the actual controller behavior.
Based on previous reviewer feedback, the controller should add the finalizer automatically. Consider testing:
- Notebook creation → verify controller adds finalizer
- Notebook deletion → verify controller removes finalizer and deletes OAuthClient
- notebook.Finalizers = append(notebook.Finalizers, Finalizer) - Expect(cli.Update(ctx, notebook)).Should(Succeed()) + // Wait for controller to add finalizer + Eventually(func() bool { + _ = cli.Get(ctx, key, notebook) + return containsFinalizer(notebook.Finalizers, Finalizer) + }, duration, interval).Should(BeTrue())components/odh-notebook-controller/controllers/notebook_oauth.go (1)
366-366
: Potential issue with OAuthClient name length.The constructed name pattern
{notebook-name}-{namespace}-oauth-client
could exceed Kubernetes' 253 character limit for object names if notebook names or namespaces are long. Consider adding validation or truncation logic to ensure the name stays within limits.func (r *OpenshiftNotebookReconciler) cleanupOAuthClient(notebook *nbv1.Notebook, ctx context.Context) error { // Name of the OAuthClient based on the Notebook name oauthClientName := fmt.Sprintf("%s-%s-%s", notebook.Name, notebook.Namespace, "oauth-client") + + // Ensure the name doesn't exceed Kubernetes name limits + const maxNameLength = 253 + if len(oauthClientName) > maxNameLength { + // Consider logging a warning or implementing a truncation strategy + log := logf.FromContext(ctx) + log.Error(nil, "OAuthClient name exceeds maximum length", "name", oauthClientName, "length", len(oauthClientName)) + return fmt.Errorf("OAuthClient name exceeds maximum length of %d characters", maxNameLength) + }
🧹 Nitpick comments (1)
components/odh-notebook-controller/controllers/notebook_oauth.go (1)
328-359
: Consider adding logging for better observability.The finalizer implementation is correct, but adding log statements would help with debugging and monitoring the lifecycle of OAuthClient cleanup.
func (r *OpenshiftNotebookReconciler) HandleOAuthClientFinalizer(notebook *nbv1.Notebook, ctx context.Context) (bool, error) { finalizerName := "notebooks.kubeflow.org/oauthclient" + log := logf.FromContext(ctx) // Check if object is being deleted if notebook.GetDeletionTimestamp() != nil { if controllerutil.ContainsFinalizer(notebook, finalizerName) { + log.Info("Notebook deletion detected, cleaning up OAuthClient", "notebook", notebook.Name) if err := r.cleanupOAuthClient(notebook, ctx); err != nil { return true, err } controllerutil.RemoveFinalizer(notebook, finalizerName) if err := r.Update(ctx, notebook); err != nil { return true, err } + log.Info("OAuthClient cleanup completed", "notebook", notebook.Name) } // Return true to indicate finalization is in progress return true, nil } // Add finalizer if it doesn't exist yet if !controllerutil.ContainsFinalizer(notebook, finalizerName) { + log.Info("Adding finalizer to notebook", "notebook", notebook.Name) controllerutil.AddFinalizer(notebook, finalizerName) if err := r.Update(ctx, notebook); err != nil { return true, err } // Return true to indicate we're setting up finalization return true, nil } // Return false to indicate normal reconciliation should proceed return false, nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml
(1 hunks)components/odh-notebook-controller/config/rbac/role.yaml
(2 hunks)components/odh-notebook-controller/controllers/notebook_controller.go
(1 hunks)components/odh-notebook-controller/controllers/notebook_controller_test.go
(2 hunks)components/odh-notebook-controller/controllers/notebook_oauth.go
(3 hunks)components/odh-notebook-controller/controllers/suite_test.go
(2 hunks)
🔇 Additional comments (12)
components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml (3)
4-5
: Clarify the purpose of the "fake" annotation.The annotation
crd/fake: "true"
suggests this CRD is only for testing purposes. Please clarify whether this is intended for production use or if this is a test-only CRD definition.
14-14
: Cluster scope is appropriate for OAuthClient resources.The cluster-scoped nature aligns with the PR objective of managing cluster-level OAuthClient resources that cannot be garbage-collected using standard owner references.
17-65
: Schema definition is comprehensive and correct.The OpenAPI schema properly defines all standard OAuth client properties including grant methods, secrets, redirect URIs, token timeouts, and scope restrictions. The structure follows OpenShift OAuth client conventions.
components/odh-notebook-controller/controllers/suite_test.go (2)
40-40
: Import addition is necessary for OAuthClient support.Adding the OAuth API import enables the test environment to handle OAuthClient resources as part of the finalizer testing.
160-160
: Scheme registration enables OAuthClient testing.Registering the OAuth scheme with the test environment is essential for the new OAuthClient lifecycle tests to function properly.
components/odh-notebook-controller/config/rbac/role.yaml (2)
45-45
: Update permission on notebooks enables finalizer management.Adding the "update" verb is necessary for the controller to manage finalizers on Notebook resources during the deletion lifecycle.
76-76
: Delete permission on OAuthClients enables cleanup.Adding the "delete" verb allows the controller to properly clean up OAuthClient resources when Notebooks are deleted, which is the core functionality of this PR.
components/odh-notebook-controller/controllers/notebook_controller.go (2)
69-69
: RBAC annotation correctly includes update permission.Adding the "update" verb to the notebooks RBAC annotation ensures the generated ClusterRole will allow finalizer management, matching the manual role.yaml changes.
76-76
: RBAC annotation correctly includes delete permission.Adding the "delete" verb to the oauthclients RBAC annotation enables OAuthClient cleanup functionality, consistent with the role.yaml updates.
components/odh-notebook-controller/controllers/notebook_controller_test.go (2)
31-31
: Import addition supports OAuth testing.Adding the OAuth API import is necessary for the new OAuthClient finalizer tests.
1015-1042
: Test setup is well-structured.The test context setup properly defines constants, variables, and BeforeEach initialization for consistent test execution.
components/odh-notebook-controller/controllers/notebook_oauth.go (1)
307-318
: Finalizer handling placement is correct.The finalizer check before OAuthClient creation/update ensures proper cleanup during notebook deletion.
components/odh-notebook-controller/controllers/notebook_controller_test.go
Outdated
Show resolved
Hide resolved
components/odh-notebook-controller/controllers/notebook_controller_test.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)
1218-1285
: Consider refactoring to reduce code duplicationThe helper function is functionally correct but duplicates significant code from the existing OAuth test setup (lines 687-715). The notebook structure, labels, annotations, and volumes are nearly identical.
Consider extracting common OAuth notebook configuration into a shared helper or parameterizing the existing
createNotebook
function to handle OAuth scenarios:+func createNotebookWithOAuth(name, namespace string) *nbv1.Notebook { + notebook := createNotebook(name, namespace) + return configureNotebookForOAuth(notebook, name, namespace) +} + +func configureNotebookForOAuth(notebook *nbv1.Notebook, name, namespace string) *nbv1.Notebook { + // Extract common OAuth configuration logic here + // This can be reused by both test contexts +}This would improve maintainability and ensure consistency between OAuth test configurations.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
components/odh-notebook-controller/controllers/notebook_controller_test.go
(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
components/odh-notebook-controller/controllers/notebook_controller_test.go (2)
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
🪛 golangci-lint (1.64.8)
components/odh-notebook-controller/controllers/notebook_controller_test.go
1060-1060: Error return value of cli.Get
is not checked
(errcheck)
1082-1082: Error return value of cli.Get
is not checked
(errcheck)
🔇 Additional comments (1)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)
31-31
: LGTM: Import addition for OAuth testingThe OAuth API import is correctly added and necessary for the new OAuthClient testing functionality.
components/odh-notebook-controller/controllers/notebook_controller_test.go
Outdated
Show resolved
Hide resolved
70b92d0
to
c64b93d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (4)
components/odh-notebook-controller/controllers/notebook_controller_test.go (4)
1018-1133
: Consider test environment limitations for finalizer testing.Based on previous review discussions, the envtest environment may not properly support controller lifecycle testing with finalizers. This aligns with the concerns raised by previous reviewers about the need for more robust testing approaches like Kind clusters.
The current tests validate the logic but may not fully verify the controller's runtime behavior. Consider documenting these limitations or exploring integration tests against a real cluster for complete validation.
1084-1088
: Fix unchecked error return in Eventually block.Another instance of ignoring
cli.Get
errors that should be handled properly.Eventually(func() []string { var updated nbv1.Notebook - cli.Get(ctx, notebookKey, &updated) + if err := cli.Get(ctx, notebookKey, &updated); err != nil { + return nil + } return updated.Finalizers }, duration, interval).Should(ContainElement(Finalizer))
1062-1066
: Fix unchecked error return in Eventually block.The
cli.Get
call ignores errors, which could lead to panics or misleading test results.Eventually(func() []string { var updated nbv1.Notebook - cli.Get(ctx, notebookKey, &updated) + if err := cli.Get(ctx, notebookKey, &updated); err != nil { + return nil + } return updated.Finalizers }, duration, interval).Should(ContainElement(Finalizer))
1073-1074
: Reduce excessive timeout duration.The 90-second timeout is extremely long and suggests underlying test environment issues. This could significantly slow down the test suite.
- }, time.Second*90, time.Second*3).Should(Succeed(), + }, duration, interval).Should(Succeed(),If the longer timeout is truly necessary due to test environment limitations, document why this specific case requires extended timing.
🧹 Nitpick comments (1)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)
1038-1045
: Improve test timing consistency.The hardcoded 3-second sleep suggests timing dependencies that could lead to test flakiness. Consider using Eventually assertions instead.
- // Wait for the Reconciler to create the route, or else the OAuth reconciler - // cannot create the OAuthClient object - time.Sleep(3 * time.Second) + By("Waiting for route to be created") + Eventually(func() error { + route := &routev1.Route{} + key := types.NamespacedName{Name: Name, Namespace: Namespace} + return cli.Get(ctx, key, route) + }, duration, interval).Should(Succeed())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml
(1 hunks)components/odh-notebook-controller/config/rbac/role.yaml
(2 hunks)components/odh-notebook-controller/controllers/notebook_controller.go
(1 hunks)components/odh-notebook-controller/controllers/notebook_controller_test.go
(3 hunks)components/odh-notebook-controller/controllers/notebook_oauth.go
(3 hunks)components/odh-notebook-controller/controllers/suite_test.go
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- components/odh-notebook-controller/config/rbac/role.yaml
- components/odh-notebook-controller/controllers/suite_test.go
- components/odh-notebook-controller/controllers/notebook_controller.go
- components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml
- components/odh-notebook-controller/controllers/notebook_oauth.go
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
components/odh-notebook-controller/controllers/notebook_controller_test.go (2)
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
🪛 GitHub Check: golangci-lint (components/odh-notebook-controller)
components/odh-notebook-controller/controllers/notebook_controller_test.go
[failure] 1104-1104:
Error return value of cli.Get
is not checked (errcheck)
[failure] 1082-1082:
Error return value of cli.Get
is not checked (errcheck)
🪛 golangci-lint (1.64.8)
components/odh-notebook-controller/controllers/notebook_controller_test.go
1064-1064: Error return value of cli.Get
is not checked
(errcheck)
1086-1086: Error return value of cli.Get
is not checked
(errcheck)
🪛 GitHub Actions: Code static analysis
components/odh-notebook-controller/controllers/notebook_controller_test.go
[error] 1082-1082: golangci-lint errcheck: Error return value of cli.Get
is not checked.
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: build
- GitHub Check: build
🔇 Additional comments (3)
components/odh-notebook-controller/controllers/notebook_controller_test.go (3)
32-32
: LGTM: OAuth API import added.The import is correctly added to support the new OAuthClient testing functionality.
1018-1037
: Test context setup looks good.The test structure and variable initialization are well-organized, with clear constants and proper context setup.
1467-1534
: Well-structured helper function.The
createNotebookWithOAuth
function follows good testing patterns by providing a clean way to create notebooks with OAuth configuration. The structure matches the expected notebook specification with proper OAuth annotations and volumes.
components/odh-notebook-controller/controllers/notebook_controller_test.go
Show resolved
Hide resolved
components/odh-notebook-controller/controllers/notebook_controller_test.go
Outdated
Show resolved
Hide resolved
0e264a1
to
ba7e765
Compare
) | ||
|
||
// Initialize context and keys once | ||
ctx = context.Background() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai we already have a perfectly good context at https://github.com/opendatahub-io/kubeflow/blob/main/components/odh-notebook-controller/controllers/suite_test.go#L68 defined, what's the reason to define new context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
@jiridanek You're absolutely right! Looking at the code patterns in the test file, there are two issues with the current approach:
-
Unnecessary variable declaration: The test defines
ctx
as a variable in theWhen
block scope, but then only assignscontext.Background()
to it once. This is inconsistent with the existing pattern where most tests usectx := context.Background()
directly within eachIt
block. -
Missing context reuse: If there's already a context available in the test suite setup, that should be preferred for consistency.
The current code should be simplified to follow the existing pattern. Instead of:
var (
notebook *nbv1.Notebook
ctx context.Context
notebookKey types.NamespacedName
)
// Initialize context and keys once
ctx = context.Background()
It should use the standard pattern seen throughout the file:
var (
notebook *nbv1.Notebook
notebookKey types.NamespacedName
)
And then in each It
block where context is needed:
It("Should create a Notebook CR successfully", func() {
ctx := context.Background()
// ... rest of test
})
This maintains consistency with the existing test structure and follows Go testing best practices.
🏁 Script executed:
#!/bin/bash
# Check the context definition in suite_test.go around line 68
cat components/odh-notebook-controller/controllers/suite_test.go | head -80 | tail -20
Length of output: 577
🏁 Script executed:
#!/bin/bash
# Look for context usage patterns in suite_test.go
rg "ctx" components/odh-notebook-controller/controllers/suite_test.go -A 2 -B 2
Length of output: 871
🏁 Script executed:
#!/bin/bash
# Check if there's a global context variable definition in suite_test.go
rg "var.*ctx|ctx.*=" components/odh-notebook-controller/controllers/suite_test.go
Length of output: 139
@jiridanek You're absolutely correct! I can see from the suite_test.go
file that there's indeed a global context properly defined and initialized:
var (
ctx context.Context
cancel context.CancelFunc
// ...
)
// In BeforeSuite:
ctx, cancel = context.WithCancel(context.Background())
This global context is:
- Properly managed - Created with cancellation support in
BeforeSuite
- Already used - Used for manager setup and namespace creation in the test suite
- Available everywhere - Accessible throughout all test files in the package
The new test should definitely use this existing global ctx
instead of creating its own. The variable declarations should be simplified to:
var (
notebook *nbv1.Notebook
notebookKey types.NamespacedName
)
And remove the ctx = context.Background()
assignment since the global ctx
is already available and properly initialized.
This would be more consistent with the test suite architecture and take advantage of the cancelable context that's properly managed in the test lifecycle. Thank you for catching this - it's a good example of following existing patterns and reusing properly set up infrastructure!
🐰 ears perked with curiosity
🧬 Code Graph Analysis Results
Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller_test.go
(full file provided)
- The test file defines multiple test cases for the Openshift Notebook controller, including creation, update, deletion, and reconciliation of various Kubernetes resources related to Notebooks such as Routes, RoleBindings, NetworkPolicies, OAuth sidecars, ServiceAccounts, Secrets, and ConfigMaps.
- The tests consistently create a new context with
ctx := context.Background()
inside eachIt
block before performing client operations likecli.Create(ctx, ...)
,cli.Get(ctx, ...)
,cli.Delete(ctx, ...)
, andcli.Update(ctx, ...)
. - The context is used as a standard Go idiomatic way to pass cancellation, deadlines, and request-scoped values to client calls.
- There is no shared or global context reused across tests; each test creates its own fresh context.
- Example snippet from multiple tests:
It("Should create a Route to expose the traffic externally", func() { ctx := context.Background() Expect(cli.Create(ctx, notebook)).Should(Succeed()) ... })
- The context is not imported or referenced from any external test suite file like
suite_test.go
.
Possibly relevant snippets from components/odh-notebook-controller/controllers/suite_test.go
(referenced by user comment)
[No direct snippet provided here, but user references line 68 in suite_test.go
where a context is defined.]
Relevant snippets from other files in components/odh-notebook-controller/controllers/
notebook_controller.go
(lines 85-89)
func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
- Used in tests to compare Notebook resources for equality.
notebook_route.go
(lines 90-97)
func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
// Omit the host field since it is reconciled by the ingress controller
r1.Spec.Host, r2.Spec.Host = "", ""
// Two routes will be equal if the labels and spec are identical
return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
reflect.DeepEqual(r1.Spec, r2.Spec)
}
- Used in tests to compare Route resources.
notebook_network.go
(lines 124-128)
func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
// Two network policies will be equal if the labels and specs are identical
return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
reflect.DeepEqual(np1.Spec, np2.Spec)
}
- Used in tests to compare NetworkPolicy resources.
notebook_oauth.go
(lines 87-92)
func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
// Two service accounts will be equal if the labels and annotations are
// identical
return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
- Used in tests to compare ServiceAccount resources.
Summary of context usage in notebook_controller_test.go
- Each test case creates its own
context.Background()
instance locally. - No shared or global context is imported or reused from
suite_test.go
. - The context is used to scope client calls (
cli.Create
,cli.Get
,cli.Delete
,cli.Update
). - This is a common pattern in Go tests to ensure isolation and avoid side effects between tests.
Summary of relevant helper functions in notebook_controller_test.go
-
createNotebook(name, namespace string) *nbv1.Notebook
Creates a basic Notebook resource with given name and namespace. -
createNotebookWithOAuth(name, namespace string) *nbv1.Notebook
Creates a Notebook resource with OAuth sidecar and related volumes. -
getRouteFromList(route *routev1.Route, notebook *nbv1.Notebook, name, namespace string) (*routev1.Route, error)
Lists Routes in the namespace with labelnotebook-name
and returns the one controlled by the given Notebook. -
createOAuthServiceAccount(name, namespace string) corev1.ServiceAccount
Returns a ServiceAccount object configured for OAuth. -
createOAuthContainer(name, namespace string) corev1.Container
Returns a container spec for the OAuth proxy sidecar. -
createOAuthNetworkPolicy(name, namespace string, npProtocol corev1.Protocol, port int32) netv1.NetworkPolicy
Returns a NetworkPolicy object for OAuth traffic. -
createOAuthConfigmap(name, namespace string, label map[string]string, configMapData map[string]string) *corev1.ConfigMap
Creates a ConfigMap with given data. -
checkCertConfigMap(ctx context.Context, namespace string, configMapName string, certFileName string, expNumberCerts int)
Checks that a ConfigMap contains the expected number of valid PEM certificates.
Summary of relevant external helper from components/common/reconcilehelper/util.go
(lines 46-71)
func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
foundService := &corev1.Service{}
justCreated := false
if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
if apierrs.IsNotFound(err) {
log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
if err = r.Create(ctx, service); err != nil {
log.Error(err, "unable to create service")
return err
}
justCreated = true
} else {
log.Error(err, "error getting service")
return err
}
}
if !justCreated && CopyServiceFields(service, foundService) {
log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
if err := r.Update(ctx, foundService); err != nil {
log.Error(err, "unable to update Service")
return err
}
}
return nil
}
- Shows typical usage of context in reconcile helper functions for Kubernetes client operations.
Summary for human reviewer
- The test file
notebook_controller_test.go
usescontext.Background()
freshly in each test case. - There is no reuse of a shared context from
suite_test.go
or elsewhere. - This is standard practice in Go tests to avoid shared mutable state and to keep tests isolated.
- The user comment questions why a new context is defined when a "perfectly good context" exists in
suite_test.go
at line 68. - The code here shows no usage of that external context; instead, it creates new contexts locally.
- The relevant code snippets for understanding this are the multiple
ctx := context.Background()
lines inside eachIt
block in the test file. - Also relevant are the helper functions that use this context for client calls.
- The comparison functions from other files are relevant for understanding resource equality checks in tests but not directly related to context usage.
[End of relevant code snippets and summaries.]
✏️ Learnings added
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see other 10 places where ctx := context.Background()
is called on the tests.
any specific reason of why we should have it on this file if the suite_test.go file provides a complete global ctx context or should we remove them all as well?
just to clarify the usage based on these comments above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CodeRabbit sorry, I meant to talk to you in the comment above, let me repeat it then:
I see other 10 places where ctx := context.Background() is called on the tests.
any specific reason of why we should have it on this file if the suite_test.go file provides a complete global ctx context or should we remove them all as well?
just to clarify the usage based on these comments above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
@daniellutz Great question! Let me analyze all the context usage patterns in the test file to give you a comprehensive answer.
Based on the analysis, here's my recommendation:
You should update ALL instances to use the global context
Here's why:
Benefits of using the global context:
- Proper cancellation support - The global context has
context.WithCancel()
which allows proper cleanup when tests are interrupted - Consistency - All tests would use the same context management approach
- Resource management - The global context is properly managed in
BeforeSuite
/AfterSuite
lifecycle - Test isolation - The global context is still isolated per test run, just shared within a single test suite execution
Current inconsistency:
The pattern you've found shows that tests are inconsistently creating their own contexts instead of leveraging the properly managed global one. This creates unnecessary overhead and bypasses the cancellation capabilities.
Recommended changes:
Replace all instances of:
It("Should do something", func() {
ctx := context.Background()
// test code...
})
With:
It("Should do something", func() {
// Use the global ctx directly
Expect(cli.Create(ctx, resource)).Should(Succeed())
// test code...
})
No downside to this change:
- Test isolation is maintained (each test run gets a fresh context)
- Performance is slightly better (no repeated context creation)
- Cancellation works properly if tests need to be interrupted
- Code is more consistent with the test suite architecture
This is a good cleanup that improves consistency across the entire test suite. The global context in suite_test.go
was specifically designed for this purpose.
🏁 Script executed:
#!/bin/bash
# Find all instances of context creation in the test file
echo "=== All context creation patterns ==="
rg "ctx.*:=.*context\." components/odh-notebook-controller/controllers/notebook_controller_test.go -n
echo -e "\n=== Count of context.Background() calls ==="
rg "context\.Background\(\)" components/odh-notebook-controller/controllers/notebook_controller_test.go -c
echo -e "\n=== Context usage in each test block ==="
rg "ctx.*:=.*context\.Background\(\)" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 2 -B 2 -n
Length of output: 2740
🏁 Script executed:
#!/bin/bash
# Check if the global context is actually used anywhere in the test file
echo "=== Usage of global ctx variable ==="
rg "(?<!:= )ctx(?! :=)" components/odh-notebook-controller/controllers/notebook_controller_test.go -n
echo -e "\n=== Check for any existing global context usage ==="
rg "cli\.(Create|Get|Update|Delete|List)\(ctx," components/odh-notebook-controller/controllers/notebook_controller_test.go -n | head -10
Length of output: 1172
🏁 Script executed:
#!/bin/bash
# Look at the global context definition and usage in suite_test.go
echo "=== Global context setup in suite_test.go ==="
rg "var.*ctx|ctx.*=" components/odh-notebook-controller/controllers/suite_test.go -A 3 -B 1
echo -e "\n=== Global context initialization ==="
rg "ctx.*context\.WithCancel|BeforeSuite" components/odh-notebook-controller/controllers/suite_test.go -A 5 -B 2
Length of output: 743
@daniellutz Perfect! The script analysis confirms exactly what we suspected. Here's the complete picture:
Current State:
- 15 total instances of
context.Background()
calls in the test file - 10 instances following the
ctx := context.Background()
pattern inIt
blocks - Global context is properly set up in
suite_test.go
withctx, cancel = context.WithCancel(context.Background())
inBeforeSuite
Clear Recommendation: Replace ALL instances
You should remove all 15 instances of local context creation and use the global context consistently. Here's why:
The global context is superior:
- Cancellation support -
context.WithCancel()
allows proper cleanup - Lifecycle management - Properly initialized in
BeforeSuite
and cleaned up - Consistency - One context management strategy across the entire test suite
Current inconsistency:
The architecture shows that:
- Someone intentionally set up a global context with proper cancellation
- But tests are bypassing it and creating their own basic contexts
- This defeats the purpose of the global context setup
Simple transformation:
Replace this pattern (10+ times):
It("Should do something", func() {
ctx := context.Background()
Expect(cli.Create(ctx, resource)).Should(Succeed())
})
With this:
It("Should do something", func() {
Expect(cli.Create(ctx, resource)).Should(Succeed())
})
Benefits:
- ✅ Proper cancellation if tests are interrupted
- ✅ Consistent architecture - use the infrastructure that's already there
- ✅ Cleaner code - no repetitive context creation
- ✅ Better resource management - leverage the suite lifecycle
No drawbacks:
- Test isolation is still maintained (global context is per test run)
- Performance is slightly better
- Code is simpler and more consistent
Bottom line: The global context exists for exactly this purpose. Using it consistently across all tests is the right architectural choice and improves code quality.
🧬 Code Graph Analysis Results
Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller_test.go
(entire file provided)
- The test file defines multiple test cases for the Openshift Notebook controller.
- It uses
context.Background()
repeatedly inside individual test cases (It
blocks) to create a fresh context for each test. - The file imports Ginkgo and Gomega for BDD-style testing.
- There is a mention in the user comment about a global
ctx
context provided bysuite_test.go
which is not shown here. - The tests create and manipulate Kubernetes resources such as Notebooks, Routes, RoleBindings, NetworkPolicies, ServiceAccounts, Secrets, and ConfigMaps.
- The tests use
cli
(likely a controller-runtime client) to interact with the Kubernetes API. - The
ctx := context.Background()
is declared inside manyIt
blocks, e.g.:
It("Should create a Route to expose the traffic externally", func() {
ctx := context.Background()
// ...
})
- The user is asking about the reason for using
context.Background()
here instead of a globalctx
fromsuite_test.go
.
Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go
func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
- This comparator function is used in tests to compare Notebook resources.
Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go
func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
// Omit the host field since it is reconciled by the ingress controller
r1.Spec.Host, r2.Spec.Host = "", ""
// Two routes will be equal if the labels and spec are identical
return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
reflect.DeepEqual(r1.Spec, r2.Spec)
}
- Comparator for Route resources used in tests.
Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go
func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
// Two network policies will be equal if the labels and specs are identical
return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
reflect.DeepEqual(np1.Spec, np2.Spec)
}
- Comparator for NetworkPolicy resources used in tests.
Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go
func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
// Two service accounts will be equal if the labels and annotations are
// identical
return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
- Comparator for ServiceAccount resources used in tests.
Possibly relevant snippets from components/common/reconcilehelper/util.go
func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
foundService := &corev1.Service{}
justCreated := false
if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
if apierrs.IsNotFound(err) {
log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
if err = r.Create(ctx, service); err != nil {
log.Error(err, "unable to create service")
return err
}
justCreated = true
} else {
log.Error(err, "error getting service")
return err
}
}
if !justCreated && CopyServiceFields(service, foundService) {
log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
if err := r.Update(ctx, foundService); err != nil {
log.Error(err, "unable to update Service")
return err
}
}
return nil
}
- This utility function uses a
context.Context
parameter for Kubernetes client operations. - Shows typical usage pattern of passing context to client calls.
Summary of relevant context for the user comment:
- The test file
notebook_controller_test.go
usesctx := context.Background()
repeatedly inside individual test cases. - The user mentions a global
ctx
provided bysuite_test.go
which is not shown here but presumably exists. - The question is about whether to keep or remove these local
context.Background()
calls in favor of the global context. - The code snippets show that the context is passed to client calls (
cli.Create
,cli.Get
,cli.Delete
, etc.). - The utility functions and controller-runtime client calls expect a
context.Context
parameter. - The test code currently creates a fresh background context per test case, which is a common pattern to isolate tests.
- The global context from
suite_test.go
might provide a shared context with cancellation or timeout, or some test-specific setup. - The user is asking for clarification on the usage pattern and whether to unify context usage across tests.
[End of relevant snippets and context.]
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai, please create an issue from this your finding - let's replace these contexts in a followup PR.
@daniellutz For this PR, we may try to update the code so it uses the existing global context right away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
@jstourac I'll create an issue to track this context usage improvement right away!
For this current PR, updating the new test code to use the global context would be a great start! The specific lines that need updating in the new OAuth deletion test are:
// Remove this:
ctx = context.Background()
// And remove ctx from the variable declarations:
var (
notebook *nbv1.Notebook
notebookKey types.NamespacedName
)
This would make the new test consistent with the recommended pattern while the broader cleanup can be handled in the follow-up issue.
🏁 Script executed:
#!/bin/bash
# Create GitHub issue for context usage inconsistency
gh issue create \
--title "Replace local context.Background() calls with global context in notebook controller tests" \
--body "## Problem
The notebook controller tests in \`components/odh-notebook-controller/controllers/notebook_controller_test.go\` contain **15 instances** of local \`context.Background()\` calls that should be replaced with the global context defined in \`suite_test.go\`.
## Current State
- Tests create individual contexts with \`ctx := context.Background()\` in each \`It\` block
- Global context is properly set up in \`suite_test.go\` with \`ctx, cancel = context.WithCancel(context.Background())\` in \`BeforeSuite\`
- This creates architectural inconsistency and bypasses proper cancellation capabilities
## Benefits of Using Global Context
1. **Proper cancellation support** - Global context has \`context.WithCancel()\` for cleanup when tests are interrupted
2. **Consistency** - All tests use the same context management approach
3. **Resource management** - Global context is properly managed in test lifecycle
4. **Performance** - Eliminates unnecessary context creation overhead
## Scope
Replace all 15 instances of:
\`\`\`go
It(\"Should do something\", func() {
ctx := context.Background()
// test code...
})
\`\`\`
With:
\`\`\`go
It(\"Should do something\", func() {
// Use global ctx directly
Expect(cli.Create(ctx, resource)).Should(Succeed())
// test code...
})
\`\`\`
## References
- Original discussion: https://github.com/opendatahub-io/kubeflow/pull/605#discussion_r2177890036
- Related PR: https://github.com/opendatahub-io/kubeflow/pull/605
## Acceptance Criteria
- [ ] Remove all 15 instances of local \`context.Background()\` creation
- [ ] Verify all tests use the global \`ctx\` variable consistently
- [ ] Ensure test functionality remains unchanged
- [ ] Update any similar patterns in other test files if found" \
--assignee jstourac
Length of output: 1926
Perfect! I've successfully created the issue to track this improvement. However, I notice the issue was created in the kubeflow/kubeflow
repository, but this PR is in opendatahub-io/kubeflow
.
Issue created: kubeflow#7733
You may want to either:
- Move this issue to the correct
opendatahub-io/kubeflow
repository, or - Create a new issue in
opendatahub-io/kubeflow
and close the one in the main kubeflow repo
For the current PR, @daniellutz could indeed update the new test code to use the global context right away. The specific changes needed in the new OAuth deletion test are:
Remove these lines:
var (
notebook *nbv1.Notebook
ctx context.Context // ← Remove this line
notebookKey types.NamespacedName
)
// Initialize context and keys once
ctx = context.Background() // ← Remove this line
Keep only:
var (
notebook *nbv1.Notebook
notebookKey types.NamespacedName
)
This would make the new test immediately consistent with the recommended architecture, while the broader cleanup across all 15 instances can be handled in the follow-up issue. This approach gives us the best of both worlds - immediate consistency for new code and a tracked improvement for existing code.
🧬 Code Graph Analysis Results
Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go
(lines 85-89)
func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
- Compares two Notebook objects by their labels, annotations, and spec for equality.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go
(lines 90-97)
func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
// Omit the host field since it is reconciled by the ingress controller
r1.Spec.Host, r2.Spec.Host = "", ""
// Two routes will be equal if the labels and spec are identical
return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
reflect.DeepEqual(r1.Spec, r2.Spec)
}
- Compares two Route objects ignoring the host field, focusing on labels and spec equality.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go
(lines 124-128)
func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
// Two network policies will be equal if the labels and specs are identical
return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
reflect.DeepEqual(np1.Spec, np2.Spec)
}
- Compares two NetworkPolicy objects by labels and spec.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go
(lines 87-92)
func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
// Two service accounts will be equal if the labels and annotations are
// identical
return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
- Compares two ServiceAccount objects by labels and annotations.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go
(lines 162-166)
func CompareNotebookServices(s1 corev1.Service, s2 corev1.Service) bool {
// Two services will be equal if the labels and annotations are identical
return reflect.DeepEqual(s1.ObjectMeta.Labels, s2.ObjectMeta.Labels) &&
reflect.DeepEqual(s1.ObjectMeta.Annotations, s2.ObjectMeta.Annotations)
}
- Compares two Service objects by labels and annotations.
Relevant snippet from components/odh-notebook-controller/controllers/notebook_controller_test.go
(entire file)
- The file contains extensive Ginkgo/Gomega test cases for the Openshift Notebook controller.
- Tests cover creation, update, reconciliation, and deletion of Notebooks and related resources such as Routes, RoleBindings, NetworkPolicies, OAuth sidecars, ServiceAccounts, Secrets, and ConfigMaps.
- Contexts are created using
context.Background()
repeatedly in tests. - The user comment references replacing contexts with a global context, which is relevant here since the tests instantiate contexts locally.
- Helper functions like
createNotebook
,createNotebookWithOAuth
,getRouteFromList
, andcreateOAuthContainer
are defined to create test objects. - The tests use
cli
(a controller-runtime client) to create, get, update, patch, and delete Kubernetes resources. - The tests use
Eventually
andExpect
assertions to verify resource states. - The tests also handle cleanup and simulate resource lifecycle events.
Relevant snippet from components/common/reconcilehelper/util.go
(lines 46-71)
func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
foundService := &corev1.Service{}
justCreated := false
if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
if apierrs.IsNotFound(err) {
log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
if err = r.Create(ctx, service); err != nil {
log.Error(err, "unable to create service")
return err
}
justCreated = true
} else {
log.Error(err, "error getting service")
return err
}
}
if !justCreated && CopyServiceFields(service, foundService) {
log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
if err := r.Update(ctx, foundService); err != nil {
log.Error(err, "unable to update Service")
return err
}
}
return nil
}
- Utility function to reconcile a Kubernetes Service resource.
- Uses a context parameter
ctx
for API calls. - Handles creation if not found, or updates if fields differ.
- Logs errors and returns them.
Summary of relevance to user comment
- The test file
notebook_controller_test.go
usescontext.Background()
repeatedly to create contexts locally in each test. - The user comment suggests replacing these contexts with a global context in a follow-up PR.
- The utility reconcile function in
util.go
also uses context parameters, showing typical usage patterns. - The comparison functions in other controller files are used in tests to verify resource equality.
- The test file and these snippets provide the context needed to understand how contexts are currently used and how they might be replaced or unified.
[End of relevant snippets.]
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.948Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.926Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai you were supposed to create this issue on the opendatahub-io/kubeflow repository; please create it there and I will close this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
@jiridanek You're absolutely right! I apologize for creating the issue on the wrong repository. Let me create it on the correct opendatahub-io/kubeflow
repository right now.
🏁 Script executed:
#!/bin/bash
# Create GitHub issue on the correct opendatahub-io/kubeflow repository
gh issue create \
--repo opendatahub-io/kubeflow \
--title "Replace local context.Background() calls with global context in notebook controller tests" \
--body "## Problem
The notebook controller tests in \`components/odh-notebook-controller/controllers/notebook_controller_test.go\` contain **15 instances** of local \`context.Background()\` calls that should be replaced with the global context defined in \`suite_test.go\`.
## Current State
- Tests create individual contexts with \`ctx := context.Background()\` in each \`It\` block
- Global context is properly set up in \`suite_test.go\` with \`ctx, cancel = context.WithCancel(context.Background())\` in \`BeforeSuite\`
- This creates architectural inconsistency and bypasses proper cancellation capabilities
## Benefits of Using Global Context
1. **Proper cancellation support** - Global context has \`context.WithCancel()\` for cleanup when tests are interrupted
2. **Consistency** - All tests use the same context management approach
3. **Resource management** - Global context is properly managed in test lifecycle
4. **Performance** - Eliminates unnecessary context creation overhead
## Scope
Replace all 15 instances of:
\`\`\`go
It(\"Should do something\", func() {
ctx := context.Background()
// test code...
})
\`\`\`
With:
\`\`\`go
It(\"Should do something\", func() {
// Use global ctx directly
Expect(cli.Create(ctx, resource)).Should(Succeed())
// test code...
})
\`\`\`
## References
- Original discussion: https://github.com/opendatahub-io/kubeflow/pull/605#discussion_r2177890036
- Related PR: https://github.com/opendatahub-io/kubeflow/pull/605
## Acceptance Criteria
- [ ] Remove all 15 instances of local \`context.Background()\` creation
- [ ] Verify all tests use the global \`ctx\` variable consistently
- [ ] Ensure test functionality remains unchanged
- [ ] Update any similar patterns in other test files if found" \
--assignee jstourac
Length of output: 1962
Perfect! ✅ I've successfully created the issue on the correct repository:
Issue created: #648
The issue has been properly assigned to @jstourac and contains all the details about replacing the 15 instances of local context.Background()
calls with the global context from suite_test.go
.
You can now close the incorrectly created issue kubeflow#7733 on the main kubeflow/kubeflow repository. This new issue #648 is properly placed in the opendatahub-io/kubeflow
repository where this PR and the actual code changes need to happen.
🐰 hops back satisfied with getting the right burrow this time
🧬 Code Graph Analysis Results
Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go
(lines 85-89)
func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
This function compares two Notebook custom resources by checking equality of their labels, annotations, and spec fields.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go
(lines 90-97)
func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
// Omit the host field since it is reconciled by the ingress controller
r1.Spec.Host, r2.Spec.Host = "", ""
// Two routes will be equal if the labels and spec are identical
return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
reflect.DeepEqual(r1.Spec, r2.Spec)
}
This function compares two OpenShift Route objects ignoring the host field, focusing on labels and spec equality.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go
(lines 124-128)
func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
// Two network policies will be equal if the labels and specs are identical
return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
reflect.DeepEqual(np1.Spec, np2.Spec)
}
This function compares two Kubernetes NetworkPolicy objects by labels and spec.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go
(lines 87-92)
func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
// Two service accounts will be equal if the labels and annotations are
// identical
return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
This function compares two ServiceAccount objects by labels and annotations.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go
(lines 162-166)
func CompareNotebookServices(s1 corev1.Service, s2 corev1.Service) bool {
// Two services will be equal if the labels and annotations are identical
return reflect.DeepEqual(s1.ObjectMeta.Labels, s2.ObjectMeta.Labels) &&
reflect.DeepEqual(s1.ObjectMeta.Annotations, s2.ObjectMeta.Annotations)
}
This function compares two Kubernetes Service objects by labels and annotations.
Relevant snippet from components/odh-notebook-controller/controllers/notebook_controller_test.go
(entire file)
The file contains extensive Ginkgo/Gomega test cases for the Openshift Notebook controller. It tests creation, update, reconciliation, and deletion of various Kubernetes and OpenShift resources related to Notebook CRs, including:
- Notebook creation and Route creation/reconciliation/deletion.
- RoleBinding reconciliation when a Notebook is created.
- Mounting trusted CA certificates from ConfigMaps.
- Handling long-named Notebooks.
- Updating Notebook specs.
- NetworkPolicy creation and reconciliation.
- OAuth sidecar injection, ServiceAccount, Secret, Service, Route creation and reconciliation.
- OAuthClient creation and deletion with finalizers.
- Behavior when notebooks are created in Service Mesh namespaces.
- Lifecycle of ds-pipeline-config Secret related to DataSciencePipelinesApplication (DSPA) and Dashboard CRs.
Helper functions defined in this file include:
createNotebook(name, namespace string) *nbv1.Notebook
: creates a basic Notebook CR.createNotebookWithOAuth(name, namespace string) *nbv1.Notebook
: creates a Notebook CR with OAuth annotations and sidecar.getRouteFromList(route *routev1.Route, notebook *nbv1.Notebook, name, namespace string) (*routev1.Route, error)
: fetches a Route controlled by the Notebook.createOAuthServiceAccount(name, namespace string) corev1.ServiceAccount
: creates a ServiceAccount object for OAuth.createOAuthContainer(name, namespace string) corev1.Container
: creates the OAuth proxy container spec.createOAuthNetworkPolicy(name, namespace string, npProtocol corev1.Protocol, port int32) netv1.NetworkPolicy
: creates a NetworkPolicy for OAuth.createOAuthConfigmap(name, namespace string, label map[string]string, configMapData map[string]string) *corev1.ConfigMap
: creates a ConfigMap.checkCertConfigMap(ctx context.Context, namespace string, configMapName string, certFileName string, expNumberCerts int)
: verifies certificates in a ConfigMap.
Relevant snippet from components/common/reconcilehelper/util.go
(lines 46-71)
func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
foundService := &corev1.Service{}
justCreated := false
if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
if apierrs.IsNotFound(err) {
log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
if err = r.Create(ctx, service); err != nil {
log.Error(err, "unable to create service")
return err
}
justCreated = true
} else {
log.Error(err, "error getting service")
return err
}
}
if !justCreated && CopyServiceFields(service, foundService) {
log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
if err := r.Update(ctx, foundService); err != nil {
log.Error(err, "unable to update Service")
return err
}
}
return nil
}
This utility function reconciles a Kubernetes Service resource: it attempts to get the Service, creates it if not found, or updates it if fields differ.
Summary
The test file notebook_controller_test.go
is a comprehensive suite testing the reconciliation logic of the Openshift Notebook controller, including creation, update, and deletion of Notebook CRs and related resources such as Routes, RoleBindings, NetworkPolicies, OAuth sidecars, ServiceAccounts, Secrets, and OAuthClients. It uses helper functions and comparison functions defined in other controller files to verify resource equality and correctness.
The comparison functions from other files are relevant for understanding how the tests verify resource equality. The utility function for Service reconciliation is relevant for understanding how Services are managed in the controller logic, which is indirectly tested here.
These snippets provide context for the test file and assist in understanding the resources and reconciliation behaviors tested.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.948Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.926Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-25T07:05:15.410Z
Learning: When creating GitHub issues for the opendatahub-io/kubeflow repository, never create them in the kubeflow/kubeflow repository - always use opendatahub-io/kubeflow unless explicitly instructed otherwise.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:47-47
Timestamp: 2025-06-25T06:53:14.825Z
Learning: GitHub issues should be created in the opendatahub-io/kubeflow project unless explicitly instructed otherwise.
components/odh-notebook-controller/controllers/notebook_controller_test.go
Show resolved
Hide resolved
createdOAuthClient := &oauthv1.OAuthClient{} | ||
Eventually(func() error { | ||
return cli.Get(ctx, oauthClientKey, createdOAuthClient) | ||
}, time.Second*90, time.Second*3).Should(Succeed(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai suggest a suitable constant so that we don't have hardcoded magic numbers here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiridanek Looking at the hardcoded timeout values in the OAuth client test, I suggest adding these constants at the top of the test file alongside the existing duration
and interval
constants:
const (
duration = 10 * time.Second
interval = 200 * time.Millisecond
// OAuth-specific timeouts for longer operations
oauthTimeout = 90 * time.Second
oauthInterval = 3 * time.Second
)
Then replace the hardcoded values:
Eventually(func() error {
return cli.Get(ctx, oauthClientKey, createdOAuthClient)
}, oauthTimeout, oauthInterval).Should(Succeed(),
The longer timeout makes sense since OAuth client creation involves cluster-level resources and may take more time than typical namespace-scoped operations. The longer polling interval (3 seconds vs 200ms) is also appropriate to avoid overwhelming the API server when waiting for these slower operations.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
By("Ensuring prerequisite resources exist") | ||
Eventually(func() []string { | ||
var updated nbv1.Notebook | ||
err := cli.Get(ctx, notebookKey, &updated) | ||
if err != nil { | ||
return nil | ||
} | ||
return updated.Finalizers | ||
}, duration, interval).Should(ContainElement(Finalizer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai does this make sense? the test is strictly going to be run from top to bottom, so the prerequisite resources will always exist, because the preceding tests did run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiridanek You raise an excellent point about test design. You're absolutely correct that in Ginkgo's sequential execution model within a When
block, these prerequisite checks are technically redundant since:
- The tests run strictly top-to-bottom
- Previous tests already created and verified these resources:
- "Should create a Notebook CR successfully" → creates the notebook
- "Should have finalizer set by controller or webhook" → verifies finalizer exists
- "Should create OAuthClient for the Notebook CR" → verifies OAuthClient creation
The prerequisite verification in "Should delete OAuthClient when Notebook is deleted" is defensive programming that adds test overhead without functional benefit in this context.
You could simplify this to:
It("Should delete OAuthClient when Notebook is deleted", func() {
By("Deleting the notebook")
var notebookToDelete nbv1.Notebook
Expect(cli.Get(ctx, notebookKey, ¬ebookToDelete)).Should(Succeed())
Expect(cli.Delete(ctx, ¬ebookToDelete)).Should(Succeed())
// Rest of deletion verification logic...
})
The only potential trade-off is losing some debugging clarity if an earlier test partially succeeded but left resources in an unexpected state. But given the sequential nature and that each previous test has proper assertions, your observation is spot-on - these prerequisite checks are unnecessary complexity.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was added for context-awareness, about the OAuthClient object, the Notebook, Finalizer and that it need to have these pre-conditions to move forward, in such way that it's easier to understand.
No problem from me to remove it, it was more for readability and clarification purposes rather than a specific requirement itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jstourac and other ppl on the controller team, it's your call then
Eventually(func() error { | ||
var oauth oauthv1.OAuthClient | ||
return cli.Get(ctx, oauthClientKey, &oauth) | ||
}, time.Second*90, time.Second*3).Should( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai is there some constant for the magic numbers? also, 90 seconds seems surprisingly long
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
@jiridanek, you're absolutely right to question these magic numbers! Looking at the test code, I can see some inconsistencies:
Existing Constants:
The file already defines standard timeout constants:
const (
duration = 10 * time.Second
interval = 200 * time.Millisecond
)
Issues with the new test:
-
Inconsistent timeout usage: Most tests in the file use the standard
duration
(10s) andinterval
(200ms) constants, but the OAuth deletion tests use hardcodedtime.Second*90, time.Second*3
. -
90 seconds is indeed surprisingly long: Looking at other tests in the same file, most OAuth operations complete within the standard 10-second timeout. Even complex operations like DSPA secret creation use 30 seconds maximum.
Recommendations:
- Use existing constants for consistency:
// Instead of: time.Second*90, time.Second*3
// Use: duration, interval (10s timeout, 200ms interval)
- If longer timeout is truly needed, define a constant:
const (
duration = 10 * time.Second
interval = 200 * time.Millisecond
oauthTimeout = 30 * time.Second // If really needed
)
- Consider if 90s is necessary: Most OAuth operations in the tests complete much faster. The 90-second timeout might be masking an underlying issue or could be reduced to 30-60 seconds maximum.
The time.Sleep(3 * time.Second)
could also be replaced with a more robust Eventually() check waiting for the route to actually exist, rather than a fixed sleep.
Would you like me to search for any specific reasons why OAuth operations might need such long timeouts, or shall we plan to standardize these timeouts?
🏁 Script executed:
#!/bin/bash
# Search for existing timeout constants in the test file and related files
echo "=== Checking for existing timeout constants in the test file ==="
rg "const.*duration|const.*timeout|const.*interval" components/odh-notebook-controller/controllers/
echo -e "\n=== Looking for existing constants in the current test file ==="
rg "duration.*=|interval.*=" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 2 -B 2
echo -e "\n=== Checking other Eventually calls in the same file for timeout patterns ==="
rg "Eventually.*time\.Second" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 1 -B 1
echo -e "\n=== Searching for OAuth-related test timeouts ==="
rg "OAuthClient.*Eventually|oauth.*time\.Second" components/odh-notebook-controller/controllers/ -A 2 -B 2
Length of output: 1315
🏁 Script executed:
#!/bin/bash
# Let's see what the current constants are defined as in the test file
echo "=== Current timeout constants defined in the test file ==="
head -50 components/odh-notebook-controller/controllers/notebook_controller_test.go | rg -A 10 "const.*="
echo -e "\n=== All Eventually calls with their timeouts to see the pattern ==="
rg "Eventually.*func.*error" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 5 | rg "}, duration, interval|}, time\.Second.*interval|}, time\.Second.*time\.Second"
echo -e "\n=== Looking for any comments about why certain timeouts are long ==="
rg -B 2 -A 2 "90.*second|time\.Second\*90" components/odh-notebook-controller/controllers/
Length of output: 3586
@jiridanek, you're absolutely right! The search results clearly show the inconsistency:
Current Pattern in the File:
- Standard constants defined:
duration = 10 * time.Second
,interval = 200 * time.Millisecond
- Consistent usage: Almost every other test uses
}, duration, interval).Should(...)
- Evidence: The search shows dozens of consistent
}, duration, interval)
patterns throughout the file
Inconsistent New Code:
- Uses hardcoded
time.Second*90, time.Second*3
andtime.Second*60, time.Second*2
- No constants defined for these values
- No explanation for why 90 seconds is needed
Recommendations:
- Follow existing patterns - The file already has well-established constants that should be used:
}, duration, interval).Should(Succeed(), // Standard 10s timeout
- If longer timeout is truly needed, define a constant:
const (
duration = 10 * time.Second
interval = 200 * time.Millisecond
oauthCreationTimeout = 30 * time.Second // Document why this is needed
)
-
Question the necessity: 90 seconds is 6x longer than the standard timeout used for all other operations, including other OAuth tests in the same file. This suggests either:
- The test environment has unusual delays that should be investigated
- The implementation is inefficient and needs optimization
- There's a race condition being masked by the long timeout
-
Fix the magic
time.Sleep(3 * time.Second)
- Replace with a proper Eventually() check:
// Instead of: time.Sleep(3 * time.Second)
Eventually(func() error {
return cli.Get(ctx, routeKey, &route)
}, duration, interval).Should(Succeed())
Would you like me to investigate why OAuth operations might legitimately need longer timeouts, or should we plan to standardize these to use the existing constants?
🧬 Code Graph Analysis Results
Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go
(lines 54, 85-89)
const (
AnnotationServiceMesh = "opendatahub.io/service-mesh"
)
func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
- Defines a constant for a service mesh annotation.
- Provides a function to compare two Notebook resources by labels, annotations, and spec.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go
(lines 35-36, 124-128)
const (
NotebookOAuthPort = 8443
NotebookPort = 8888
)
func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
// Two network policies will be equal if the labels and specs are identical
return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
reflect.DeepEqual(np1.Spec, np2.Spec)
}
- Defines constants for notebook ports used in network policies.
- Provides a function to compare two NetworkPolicy resources by labels and spec.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go
(lines 90-97)
func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
// Omit the host field since it is reconciled by the ingress controller
r1.Spec.Host, r2.Spec.Host = "", ""
// Two routes will be equal if the labels and spec are identical
return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
reflect.DeepEqual(r1.Spec, r2.Spec)
}
- Provides a function to compare two Route resources ignoring the host field, comparing labels and spec.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go
(lines 43-44, 48, 162-166)
const (
OAuthServicePortName = "oauth-proxy"
OAuthServicePort = 443
OAuthProxyImage = "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4f8d66597feeb32bb18699326029f9a71a5aca4a57679d636b876377c2e95695"
)
func CompareNotebookServices(s1 corev1.Service, s2 corev1.Service) bool {
// Two services will be equal if the labels and annotations are identical
return reflect.DeepEqual(s1.ObjectMeta.Labels, s2.ObjectMeta.Labels) &&
reflect.DeepEqual(s1.ObjectMeta.Annotations, s2.ObjectMeta.Annotations)
}
func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
// Two service accounts will be equal if the labels and annotations are
// identical
return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
- Defines constants for OAuth proxy container image, service port, and service port name.
- Provides functions to compare Service and ServiceAccount resources by labels and annotations.
Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller_test.go
(lines 700-740 approx)
When("Creating a Notebook with OAuth and then deleting it", func() {
const (
Name = "test-notebook-oauth-delete"
Namespace = "default"
Finalizer = "notebooks.kubeflow.org/oauthclient"
)
var (
notebook *nbv1.Notebook
ctx context.Context
notebookKey types.NamespacedName
)
// Initialize context and keys once
ctx = context.Background()
notebookKey = types.NamespacedName{Name: Name, Namespace: Namespace}
// Create notebook once for all tests - WITH OAUTH ENABLED
notebook = createNotebookWithOAuth(Name, Namespace)
It("Should create a Notebook CR successfully", func() {
By("Creating a new Notebook")
Expect(cli.Create(ctx, notebook)).Should(Succeed())
// Wait for the Reconciler to create the route, or else the OAuth reconciler
// cannot create the OAuthClient object
time.Sleep(3 * time.Second)
})
It("Should have finalizer set by controller or webhook", func() {
By("Waiting for controller to add the finalizer")
Eventually(func() []string {
var updated nbv1.Notebook
err := cli.Get(ctx, notebookKey, &updated)
if err != nil {
return nil
}
return updated.Finalizers
}, duration, interval).Should(ContainElement(Finalizer),
"Expected finalizer '%s' to be added by controller", Finalizer)
})
It("Should create OAuthClient for the Notebook CR", func() {
By("Ensuring finalizer exists first")
Eventually(func() []string {
var updated nbv1.Notebook
err := cli.Get(ctx, notebookKey, &updated)
if err != nil {
return nil
}
return updated.Finalizers
}, duration, interval).Should(ContainElement(Finalizer))
By("Waiting for controller to create OAuthClient")
oauthClientKey := types.NamespacedName{Name: Name + "-" + Namespace + "-oauth-client"}
createdOAuthClient := &oauthv1.OAuthClient{}
Eventually(func() error {
return cli.Get(ctx, oauthClientKey, createdOAuthClient)
}, time.Second*90, time.Second*3).Should(Succeed(),
"Expected OAuthClient to be created by odh-notebook-controller")
By("Verifying OAuthClient has proper labels")
Expect(createdOAuthClient.Labels).Should(
HaveKeyWithValue("notebook-owner", Name),
"OAuthClient should have labels linking it to the notebook")
})
// ... Additional tests for deletion and finalizer removal ...
})
- Shows usage of a 90-second timeout in
Eventually
calls waiting for OAuthClient creation and deletion. - The 90 seconds is explicitly used as a timeout duration in the test for OAuthClient lifecycle.
Relevant snippet from components/odh-notebook-controller/controllers/notebook_controller_test.go
(lines 20-25)
// Define utility constants for testing timeouts/durations and intervals.
const (
duration = 10 * time.Second
interval = 200 * time.Millisecond
)
- Defines constants used throughout the tests for durations and intervals.
- No constant defined for the 90-second timeout used in OAuthClient tests.
Summary of relevant constants and magic numbers related to timeouts and ports:
duration
= 10 seconds (used as a general timeout in tests)interval
= 200 milliseconds (used as polling interval in tests)- 90 seconds timeout used explicitly in some
Eventually
calls for OAuthClient creation/deletion (no constant defined) - Ports:
NotebookPort
= 8888NotebookOAuthPort
= 8443OAuthServicePort
= 443
- OAuth proxy image and service port name constants defined.
Summary of relevant constants for magic numbers in the test file:
- The test file defines
duration
andinterval
constants for general use. - The 90-second timeout used in OAuthClient related tests is hardcoded, not defined as a constant.
- Port numbers and OAuth proxy image are defined as constants in other files and used in the tests.
- No constants defined in the test file for the 90-second timeout or other magic numbers related to OAuthClient lifecycle waits.
[End of relevant snippets]
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe these numbers came from tests that I was executing previously, where the logic of my implementation wasn't working as expected, sometimes even taking longer times to see the finalizers get added to the resources, deleting the resources properly, etc, but I believe that in its current state, yes, these numbers can be much lower and based on constants for the consistency, no problem on changing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
@@ -1014,8 +1015,132 @@ var _ = Describe("The Openshift Notebook controller", func() { | |||
}) | |||
}) | |||
|
|||
When("Creating notebook as part of Service Mesh", func() { | |||
When("Creating a Notebook with OAuth and then deleting it", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai does this test implement this sketched out test plan?
automated test that
- creates a Notebook CR
- checks that the Notebook CR has Finalizer set (by odh-notebook-controller, or maybe the webhook, don't care, but one of the two should set finalizer)
- checks that eventually OAuthClient for the Notebook CR has been created (by the odh-notebook-controller) in the cluster for the notebook CR
- deletes the Notebook CR
- checks that eventually the OAuthClient for the Notebook CR gets deleted (by the odh-notebook-controller)
- checks that eventually finalizer has been removed from the Notebook CR
I (we) are unsure about step 7, whether it is reasonable in envtest environment to expect the Notebook CR to be eventually actually deleted. But I am pretty sure about test that does steps 1 to 6.
edit: upon reflection, and reading https://book.kubebuilder.io/reference/using-finalizers and kubernetes-sigs/controller-runtime#2184 (comment) yes, I think the test has to also
- check that eventually Notebook CR has been actually deleted
- this step may race with step 6, it is entirely possible to check 5 and then immediately see 7, without observing the system at 6, and that's entirely fine by me and the test should still pass either way
- I guess that means we should not attempt to check for 6 in a test, therefore, check 5, then 7
- yeah, reading the issue comment on gh, we should never observe 6, actually; updating an object that is being deleted by removing it's last remaining finalizer deletes the object, as the comment says; and I believe that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiridanek Yes, this test does implement your sketched out test plan! Let me map each step to the corresponding test assertions:
Step 1: Creates a Notebook CR
✅ Implemented in "Should create a Notebook CR successfully"
:
Expect(cli.Create(ctx, notebook)).Should(Succeed())
Step 2: Checks that the Notebook CR has Finalizer set
✅ Implemented in "Should have finalizer set by controller or webhook"
:
Eventually(func() []string {
var updated nbv1.Notebook
err := cli.Get(ctx, notebookKey, &updated)
return updated.Finalizers
}, duration, interval).Should(ContainElement(Finalizer))
Step 3: Checks that eventually OAuthClient gets created
✅ Implemented in "Should create OAuthClient for the Notebook CR"
:
Eventually(func() error {
return cli.Get(ctx, oauthClientKey, createdOAuthClient)
}, time.Second*90, time.Second*3).Should(Succeed())
Step 4: Deletes the Notebook CR
✅ Implemented in "Should delete OAuthClient when Notebook is deleted"
:
Expect(cli.Delete(ctx, ¬ebookToDelete)).Should(Succeed())
Step 5: Checks that eventually the OAuthClient gets deleted
✅ Implemented in the same test block:
Eventually(func() error {
var oauth oauthv1.OAuthClient
return cli.Get(ctx, oauthClientKey, &oauth)
}, time.Second*90, time.Second*3).Should(
MatchError(ContainSubstring("not found")))
Step 6: Checks finalizer removal (not observable)
As you correctly noted, this step races with step 7 and might not be observable.
Step 7: Checks that eventually Notebook CR is actually deleted
✅ Implemented in "Should remove finalizer and delete Notebook after OAuthClient cleanup"
:
Eventually(func() error {
var updated nbv1.Notebook
return cli.Get(ctx, notebookKey, &updated)
}, time.Second*60, time.Second*2).Should(
MatchError(ContainSubstring("not found")))
The test correctly follows your insight about finalizer behavior - it focuses on verifying OAuthClient deletion (step 5) followed by Notebook deletion (step 7), without trying to catch the intermediate state of finalizer removal that may not be observable.
The test also uses appropriate timeouts (90 seconds for OAuthClient operations, 60 seconds for final deletion) and the createNotebookWithOAuth
helper function that sets up the OAuth-enabled notebook with proper annotations and spec.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jstourac I guess I am happy, then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
} | ||
return updated.DeletionTimestamp != nil | ||
}, duration, interval).Should(BeTrue(), | ||
"Notebook should be marked for deletion but persist due to finalizer") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai isn't this check flaky? depending on timing, we may not observe the resource in the process of being finalized in the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai your previous response does not actually contain any answer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiridanek You're absolutely right - my previous response was incomplete. Let me address your original question about the test flakiness.
Yes, this check is indeed potentially flaky due to timing issues. The test is trying to observe an intermediate state during finalizer processing:
By("Verifying notebook is marked for deletion but persists due to finalizer")
Eventually(func() bool {
var updated nbv1.Notebook
if err := cli.Get(ctx, notebookKey, &updated); err != nil {
return false
}
return updated.DeletionTimestamp != nil
}, duration, interval).Should(BeTrue(),
"Notebook should be marked for deletion but persist due to finalizer")
The race condition occurs because:
cli.Delete()
sets the deletion timestamp- Finalizer prevents immediate deletion
- Controller processes finalizer (deletes OAuthClient)
- Controller removes finalizer
- Notebook gets fully deleted
If steps 3-5 happen very quickly, the test might miss observing the intermediate state where DeletionTimestamp != nil
but the resource still exists.
To make this more robust, you could:
- Accept either state as valid:
By("Verifying notebook deletion process")
Eventually(func() bool {
var updated nbv1.Notebook
err := cli.Get(ctx, notebookKey, &updated)
if apierrors.IsNotFound(err) {
return true // Already fully deleted - that's fine
}
if err != nil {
return false // Unexpected error
}
return updated.DeletionTimestamp != nil // In deletion process
}, duration, interval).Should(BeTrue())
-
Add a brief sleep after the delete call to increase chances of observing the intermediate state
-
Split into separate checks - first verify deletion was initiated, then verify the cleanup process
This timing issue aligns with the known flakiness mentioned in the learnings for the odh-notebook-controller-unit tests.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
By("Verifying OAuthClient was deleted in previous test") | ||
oauthClientKey := types.NamespacedName{Name: Name + "-" + Namespace + "-oauth-client"} | ||
Eventually(func() error { | ||
var oauth oauthv1.OAuthClient | ||
return cli.Get(ctx, oauthClientKey, &oauth) | ||
}, duration, interval).Should(MatchError(ContainSubstring("not found"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try not to repeat checks like this, the test is already quite long and it may be difficult to see the essential in noise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test is quite long and this is why I wanted to have these additional checks, to guarantee that the logic was easily readable and understandable by everyone.
I can remove it, no problem from my perspective as well. =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai can you suggest alternative, more streamlined implementation of this test, that will be easier to follow?
automated test that
- creates a Notebook CR
- checks that the Notebook CR has Finalizer set (by odh-notebook-controller, or maybe the webhook, don't care, but one of the two should set finalizer)
- checks that eventually OAuthClient for the Notebook CR has been created (by the odh-notebook-controller) in the cluster for the notebook CR
- deletes the Notebook CR
- checks that eventually the OAuthClient for the Notebook CR gets deleted (by the odh-notebook-controller)
- checks that eventually finalizer has been removed from the Notebook CR
I (we) are unsure about step 7, whether it is reasonable in envtest environment to expect the Notebook CR to be eventually actually deleted. But I am pretty sure about test that does steps 1 to 6.
edit: upon reflection, and reading https://book.kubebuilder.io/reference/using-finalizers and kubernetes-sigs/controller-runtime#2184 (comment) yes, I think the test has to also
- check that eventually Notebook CR has been actually deleted
- this step may race with step 6, it is entirely possible to check 5 and then immediately see 7, without observing the system at 6, and that's entirely fine by me and the test should still pass either way
- I guess that means we should not attempt to check for 6 in a test, therefore, check 5, then 7
- yeah, reading the issue comment on gh, we should never observe 6, actually; updating an object that is being deleted by removing it's last remaining finalizer deletes the object, as the comment says; and I believe that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiridanek You're absolutely right about the test being too verbose. Here's a more streamlined implementation that follows your outlined flow:
When("Creating a Notebook with OAuth and then deleting it", func() {
const (
Name = "test-notebook-oauth-delete"
Namespace = "default"
Finalizer = "notebooks.kubeflow.org/oauthclient"
)
It("Should handle OAuth finalizer lifecycle correctly", func() {
// Step 1: Create a Notebook CR with OAuth enabled
notebook := createNotebookWithOAuth(Name, Namespace)
Expect(cli.Create(ctx, notebook)).Should(Succeed())
// Wait for reconciliation to complete
time.Sleep(3 * time.Second)
notebookKey := types.NamespacedName{Name: Name, Namespace: Namespace}
oauthClientKey := types.NamespacedName{Name: Name + "-" + Namespace + "-oauth-client"}
// Step 2: Check that finalizer is set
Eventually(func() []string {
var updated nbv1.Notebook
if err := cli.Get(ctx, notebookKey, &updated); err != nil {
return nil
}
return updated.Finalizers
}, duration, interval).Should(ContainElement(Finalizer))
// Step 3: Check that OAuthClient is created
Eventually(func() error {
return cli.Get(ctx, oauthClientKey, &oauthv1.OAuthClient{})
}, time.Second*90, time.Second*3).Should(Succeed())
// Step 4: Delete the Notebook CR
var notebookToDelete nbv1.Notebook
Expect(cli.Get(ctx, notebookKey, ¬ebookToDelete)).Should(Succeed())
Expect(cli.Delete(ctx, ¬ebookToDelete)).Should(Succeed())
// Step 5: Check that OAuthClient gets deleted
Eventually(func() error {
return cli.Get(ctx, oauthClientKey, &oauthv1.OAuthClient{})
}, time.Second*90, time.Second*3).Should(
MatchError(ContainSubstring("not found")))
// Step 7: Check that Notebook CR is eventually deleted
// (Step 6 - finalizer removal - is not observable as it races with deletion)
Eventually(func() error {
return cli.Get(ctx, notebookKey, &nbv1.Notebook{})
}, time.Second*60, time.Second*2).Should(
MatchError(ContainSubstring("not found")))
})
})
This streamlined version:
- Combines all steps into a single focused test - easier to follow the complete flow
- Eliminates redundant checks - removes the repetitive verification blocks
- Uses descriptive comments to mark each step clearly
- Skips step 6 (finalizer removal check) as you correctly noted it races with deletion and is unobservable
- Uses the global context consistently (from the retrieved learning about the
ctx
variable insuite_test.go
)
The test now reads like a clear narrative: create → verify setup → delete → verify cleanup, which makes it much easier to understand the intended behavior and debug any failures.
✏️ Learnings added
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.900Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok then, @jiri, feel free to replace the entire test with something more aligned with your expectations, no problem from my side :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is, whatever we may think and prefer actually does not matter. We're not on the notebooks team, neither you or me. So I'm gonna call on the indigo folks to give them chance.
In case they decide to pass, here's your out-of-jail card
/lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daniellutz where does this definition come from? Just wanna check it.
Also, we should probably review the place where we put this (if not now, then in the future) - if this is supposed to be only for tests, it gets mixed with the rest of the resources that are truly taken and applied with the rhods operator on the actual OpenShift cluster in the production.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as I remember, I exported from my own OpenShift the CRD.
I don't have the command anymore because I did provision my server again, but I will check if I have in my own notes.
I did put there for the tests, feel free to move it or suggest another place where this feels more suitable.
This PR fixes RHOAIENG-23751
Description
In the PR #449 , a change was done in such way that we would use an OAuthClient object to bypass the OAuth permission form (authentication is confirmed, only the permission is not required anymore), but since OAuthClient objects are cluster-level objects, they cannot be garbage-collected using the SetControllerReference method, and to fix it, a Finalizer needs to be created.
The main fix on this PR is to add the finalizer to the Notebook CRD, in such way that it will try to identify if a Notebook has been marked for deletion or not. If so, will wait to delete the OAuthClient properly, so no orphaned OAuthClient objects will remain in the server.
How Has This Been Tested?
Merge criteria:
Summary by CodeRabbit