Skip to content

RHOAIENG-23751: Add finalizer to clean up OAuthClients on notebook deletion #605

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

daniellutz
Copy link

@daniellutz daniellutz commented May 14, 2025

This PR fixes RHOAIENG-23751

Description

In the PR #449 , a change was done in such way that we would use an OAuthClient object to bypass the OAuth permission form (authentication is confirmed, only the permission is not required anymore), but since OAuthClient objects are cluster-level objects, they cannot be garbage-collected using the SetControllerReference method, and to fix it, a Finalizer needs to be created.

The main fix on this PR is to add the finalizer to the Notebook CRD, in such way that it will try to identify if a Notebook has been marked for deletion or not. If so, will wait to delete the OAuthClient properly, so no orphaned OAuthClient objects will remain in the server.

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • New Features
    • Introduced support for managing OAuth clients as custom resources, including automatic cleanup of associated OAuth clients when a Notebook is deleted.
  • Bug Fixes
    • Improved handling of resource deletion to ensure proper cleanup of OAuth clients.
  • Tests
    • Added new tests to verify the creation and deletion flow of Notebooks with associated OAuth clients.
  • Chores
    • Updated permissions and test environment setup to support OAuth client management.

@openshift-ci openshift-ci bot requested review from jstourac and paulovmr May 14, 2025 12:26
Copy link

openshift-ci bot commented May 14, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign paulovmr for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the size/l label May 14, 2025
Comment on lines 991 to 992
By("Manually deleting OAuthClient to simulate controller behavior")
Expect(cli.Delete(ctx, oauthClient)).Should(Succeed())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the controller running in the background? Should not it issue this deletion? How come the test looks almost like controller is not present?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for raising this up, I believe this was good information that I should have put in the description (and I will do so as well).

during the tests, I've had a scenario where I could see all logs from the reconciler and finalizer appearing, but on the test environment this wasn't happening (sometimes, I believe I have even found the testing environment kind of not consistent enough to test finalizers), but I don't know if this is some extra configuration that needs to be added, if it's related to Kubebuilder, or something else.

if you have any suggestion or idea, I will look forward to trying it out and changing this to something more suitable, but since I couldn't exactly test the finalizers, I forced some scenarios to try to replicate the most portion of its execution, the way was possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird. We shouldn't simulate controller behavior in these tests. We should test it - so we should have test for:

  1. a Notebook creation - where we check that it gets the OAuthClient finalizer in it
  2. a Notebook deletion - where we check that both Notebook and OAuthClient objects are deleted properly (only these and not some other - e.g. in case we have two similar Notebooks present)

Or am I missing something? 🤔

// a clean up, without overloading OpenShift resources.
func (r *OpenshiftNotebookReconciler) cleanupOAuthClient(notebook *nbv1.Notebook, ctx context.Context) error {
// Name of the OAuthClient based on the Notebook name
oauthClientName := fmt.Sprintf("%s-%s-%s", notebook.Name, notebook.Namespace, "oauth-client")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if we (in the original oauth pr) though about what implications this has for the limit on maximal notebook name length...

@codecov-commenter
Copy link

codecov-commenter commented May 14, 2025

Codecov Report

Attention: Patch coverage is 73.07692% with 14 lines in your changes missing coverage. Please review.

Project coverage is 68.39%. Comparing base (eacf63c) to head (ba7e765).
Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
...-notebook-controller/controllers/notebook_oauth.go 73.07% 10 Missing and 4 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #605       +/-   ##
===========================================
+ Coverage   56.21%   68.39%   +12.17%     
===========================================
  Files          11        9        -2     
  Lines        3072     2161      -911     
===========================================
- Hits         1727     1478      -249     
+ Misses       1185      551      -634     
+ Partials      160      132       -28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@openshift-merge-robot openshift-merge-robot added the needs-rebase The PR needs a rebase or there are conflicts label May 21, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels May 28, 2025
Comment on lines 950 to 1077
notebook.Finalizers = append(notebook.Finalizers, Finalizer)
Expect(cli.Update(ctx, notebook)).Should(Succeed())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who's to add the finalizer? the controller or the test?

Copy link
Member

@jiridanek jiridanek May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and, to answer my question, controller should add finalizer IMO, so please don't set it in the test

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I did mention before, the full test to check if it was deleted didn't work out properly with me before (maybe the explanation is in the link you sent above, or my lack of knowledge, or both), so I tried to test as much of the code to ensure the code would be consistent, even if the test itself wasn't what we expected in terms of functionality (the real deletion of the objects, etc)

feel free to suggest otherwise, as I believe we have some limitations on kubeflow itself and might need to get to an understanding when this kind of scenario happens

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I believe we have some limitations on kubeflow itself and might need to get to an understanding when this kind of scenario happens

that could be a consideration, we'd need a more powerful test framework then

GinkgoT().Logf("OAuthClient still exists, which suggests the controller isn't handling deletion in the test environment")
}

// No assertion here - we're just observing behavior
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's not possible to assert that OAuthClient has been actually deleted? Because the test does not assert that now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is why this pull request has been sent before as "test if finalizers get added, checks if it gets removed after", etc, because I couldn't make it work entirely, maybe because of the limitation of the https://book.kubebuilder.io/reference/envtest.html#testing-considerations

I can try again and send so you can check, but before it didn't worked out for me before

we did not check if it did get deleted, but tried to assert that the majority of the code was doing work towards the deletion

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try again, and if you know a way to make it work properly, I'd be really up for suggestions

Copy link
Member

@jiridanek jiridanek May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try something. Worst case, something like https://book-v3.book.kubebuilder.io/reference/envtest where they mention KinD, and https://github.com/kubeflow/notebooks/blob/notebooks-v2/workspaces/controller/test/e2e/e2e_suite_test.go#L47 which lets the kubebuilder test infra interacti with genuine kind should solve all problems.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take a look into it, thanks!

@jiridanek
Copy link
Member

@daniellutz can you fix conflicts, I'll try to look into the tests then

@jiridanek
Copy link
Member

I'll try to rebase myself, so I don't bother you with such chores.

@jiridanek
Copy link
Member

I take that back. Resolving conflicts requires some thought, so I won't be resolving it in this pr and wait for @daniellutz to take care of it.

@daniellutz
Copy link
Author

I will rebase it now, as I was trying a couple of things yesterday. Just give me one moment. Thanks for trying it out!

@daniellutz daniellutz force-pushed the oauth-client-cleanup branch from d09726f to 105b60a Compare May 30, 2025 12:30
@openshift-ci openshift-ci bot added size/l and removed size/l labels May 30, 2025
@openshift-merge-robot openshift-merge-robot removed the needs-rebase The PR needs a rebase or there are conflicts label May 30, 2025
@jiridanek
Copy link
Member

@CodeRabbit review

Copy link

coderabbitai bot commented Jun 20, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

coderabbitai bot commented Jun 20, 2025

Walkthrough

A new CustomResourceDefinition (CRD) for OAuthClient resources was introduced, enabling the management of OAuth clients as cluster-scoped resources. The notebook controller and its RBAC permissions were updated to handle OAuthClient lifecycle via finalizers, ensuring proper cleanup on Notebook deletion. Corresponding tests and test environment setup were also enhanced.

Changes

File(s) Change Summary
.../config/crd/external/oauthclients.oauth.openshift.io.yaml Added CRD definition for cluster-scoped OAuthClient resources under oauth.openshift.io/v1.
.../config/rbac/role.yaml
.../controllers/notebook_controller.go
Expanded RBAC permissions: added update for Notebooks and delete for OAuthClients.
.../controllers/notebook_oauth.go Added finalizer handling for Notebooks to manage OAuthClient cleanup on deletion; new methods for finalizer logic.
.../controllers/notebook_controller_test.go Added tests for Notebook creation/deletion with OAuthClient and finalizer handling.
.../controllers/suite_test.go Registered OpenShift OAuth API scheme for test environment.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant K8s API
    participant NotebookController
    participant OAuthClient (CRD)

    User->>K8s API: Create Notebook (with OAuth)
    K8s API->>NotebookController: Trigger reconcile
    NotebookController->>Notebook: Add finalizer (if missing)
    NotebookController->>OAuthClient: Create/Update OAuthClient
    User->>K8s API: Delete Notebook
    K8s API->>NotebookController: Trigger reconcile (deletion)
    NotebookController->>OAuthClient: Delete OAuthClient
    NotebookController->>Notebook: Remove finalizer
    K8s API->>Notebook: Complete deletion
Loading

Poem

In the warren of code, a new CRD appears,
OAuthClients hop in, allaying old fears.
Finalizers now tidy, when Notebooks depart,
Permissions expanded, all roles play their part.
With tests in the meadow and schemes in the sun,
The controller and clients now hop as one!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e264a1 and ba7e765.

📒 Files selected for processing (2)
  • components/odh-notebook-controller/controllers/notebook_controller_test.go (3 hunks)
  • components/odh-notebook-controller/controllers/notebook_oauth.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • components/odh-notebook-controller/controllers/notebook_controller_test.go
  • components/odh-notebook-controller/controllers/notebook_oauth.go
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: build
  • GitHub Check: build
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@openshift-merge-robot openshift-merge-robot added the needs-rebase The PR needs a rebase or there are conflicts label Jun 20, 2025
@jiridanek
Copy link
Member

@coderabbitai full

Copy link

coderabbitai bot commented Jun 20, 2025

Tip

For best results, initiate chat on the files or code changes.

@jiridanek I'll perform a full review of this PR from scratch as requested.

✅ Actions performed

Full review triggered.

Note: This will review all files in the PR comprehensively, including any previously reviewed content. Processing all changes now...

📥 Commits being reviewed

Reviewing all files that changed from the base of the PR and the current head commit 105b60a.

📒 Files selected for processing (6)
  • components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml (1 hunks)
  • components/odh-notebook-controller/config/rbac/role.yaml (2 hunks)
  • components/odh-notebook-controller/controllers/notebook_controller.go (1 hunks)
  • components/odh-notebook-controller/controllers/notebook_controller_test.go (2 hunks)
  • components/odh-notebook-controller/controllers/notebook_oauth.go (3 hunks)
  • components/odh-notebook-controller/controllers/suite_test.go (2 hunks)

The comprehensive review will analyze all aspects of the changes including:

  • OAuth client integration and management
  • RBAC permission configurations
  • Controller logic and error handling
  • Test coverage and implementation
  • CRD definitions and specifications

Please note that the PR currently has merge conflicts that will need to be resolved before it can be merged.

@openshift-ci openshift-ci bot added size/l and removed size/l labels Jun 20, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)

1076-1077: Manual finalizer addition contradicts controller responsibility.

The test manually adds the finalizer to the Notebook, but the controller should be responsible for adding finalizers during normal operation. This approach doesn't test the actual controller behavior.

Based on previous reviewer feedback, the controller should add the finalizer automatically. Consider testing:

  1. Notebook creation → verify controller adds finalizer
  2. Notebook deletion → verify controller removes finalizer and deletes OAuthClient
-			notebook.Finalizers = append(notebook.Finalizers, Finalizer)
-			Expect(cli.Update(ctx, notebook)).Should(Succeed())
+			// Wait for controller to add finalizer
+			Eventually(func() bool {
+				_ = cli.Get(ctx, key, notebook)
+				return containsFinalizer(notebook.Finalizers, Finalizer)
+			}, duration, interval).Should(BeTrue())
components/odh-notebook-controller/controllers/notebook_oauth.go (1)

366-366: Potential issue with OAuthClient name length.

The constructed name pattern {notebook-name}-{namespace}-oauth-client could exceed Kubernetes' 253 character limit for object names if notebook names or namespaces are long. Consider adding validation or truncation logic to ensure the name stays within limits.

 func (r *OpenshiftNotebookReconciler) cleanupOAuthClient(notebook *nbv1.Notebook, ctx context.Context) error {
 	// Name of the OAuthClient based on the Notebook name
 	oauthClientName := fmt.Sprintf("%s-%s-%s", notebook.Name, notebook.Namespace, "oauth-client")
+	
+	// Ensure the name doesn't exceed Kubernetes name limits
+	const maxNameLength = 253
+	if len(oauthClientName) > maxNameLength {
+		// Consider logging a warning or implementing a truncation strategy
+		log := logf.FromContext(ctx)
+		log.Error(nil, "OAuthClient name exceeds maximum length", "name", oauthClientName, "length", len(oauthClientName))
+		return fmt.Errorf("OAuthClient name exceeds maximum length of %d characters", maxNameLength)
+	}
🧹 Nitpick comments (1)
components/odh-notebook-controller/controllers/notebook_oauth.go (1)

328-359: Consider adding logging for better observability.

The finalizer implementation is correct, but adding log statements would help with debugging and monitoring the lifecycle of OAuthClient cleanup.

 func (r *OpenshiftNotebookReconciler) HandleOAuthClientFinalizer(notebook *nbv1.Notebook, ctx context.Context) (bool, error) {
 	finalizerName := "notebooks.kubeflow.org/oauthclient"
+	log := logf.FromContext(ctx)
 
 	// Check if object is being deleted
 	if notebook.GetDeletionTimestamp() != nil {
 		if controllerutil.ContainsFinalizer(notebook, finalizerName) {
+			log.Info("Notebook deletion detected, cleaning up OAuthClient", "notebook", notebook.Name)
 			if err := r.cleanupOAuthClient(notebook, ctx); err != nil {
 				return true, err
 			}
 
 			controllerutil.RemoveFinalizer(notebook, finalizerName)
 			if err := r.Update(ctx, notebook); err != nil {
 				return true, err
 			}
+			log.Info("OAuthClient cleanup completed", "notebook", notebook.Name)
 		}
 		// Return true to indicate finalization is in progress
 		return true, nil
 	}
 
 	// Add finalizer if it doesn't exist yet
 	if !controllerutil.ContainsFinalizer(notebook, finalizerName) {
+		log.Info("Adding finalizer to notebook", "notebook", notebook.Name)
 		controllerutil.AddFinalizer(notebook, finalizerName)
 		if err := r.Update(ctx, notebook); err != nil {
 			return true, err
 		}
 		// Return true to indicate we're setting up finalization
 		return true, nil
 	}
 
 	// Return false to indicate normal reconciliation should proceed
 	return false, nil
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5dd457e and 105b60a.

📒 Files selected for processing (6)
  • components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml (1 hunks)
  • components/odh-notebook-controller/config/rbac/role.yaml (2 hunks)
  • components/odh-notebook-controller/controllers/notebook_controller.go (1 hunks)
  • components/odh-notebook-controller/controllers/notebook_controller_test.go (2 hunks)
  • components/odh-notebook-controller/controllers/notebook_oauth.go (3 hunks)
  • components/odh-notebook-controller/controllers/suite_test.go (2 hunks)
🔇 Additional comments (12)
components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml (3)

4-5: Clarify the purpose of the "fake" annotation.

The annotation crd/fake: "true" suggests this CRD is only for testing purposes. Please clarify whether this is intended for production use or if this is a test-only CRD definition.


14-14: Cluster scope is appropriate for OAuthClient resources.

The cluster-scoped nature aligns with the PR objective of managing cluster-level OAuthClient resources that cannot be garbage-collected using standard owner references.


17-65: Schema definition is comprehensive and correct.

The OpenAPI schema properly defines all standard OAuth client properties including grant methods, secrets, redirect URIs, token timeouts, and scope restrictions. The structure follows OpenShift OAuth client conventions.

components/odh-notebook-controller/controllers/suite_test.go (2)

40-40: Import addition is necessary for OAuthClient support.

Adding the OAuth API import enables the test environment to handle OAuthClient resources as part of the finalizer testing.


160-160: Scheme registration enables OAuthClient testing.

Registering the OAuth scheme with the test environment is essential for the new OAuthClient lifecycle tests to function properly.

components/odh-notebook-controller/config/rbac/role.yaml (2)

45-45: Update permission on notebooks enables finalizer management.

Adding the "update" verb is necessary for the controller to manage finalizers on Notebook resources during the deletion lifecycle.


76-76: Delete permission on OAuthClients enables cleanup.

Adding the "delete" verb allows the controller to properly clean up OAuthClient resources when Notebooks are deleted, which is the core functionality of this PR.

components/odh-notebook-controller/controllers/notebook_controller.go (2)

69-69: RBAC annotation correctly includes update permission.

Adding the "update" verb to the notebooks RBAC annotation ensures the generated ClusterRole will allow finalizer management, matching the manual role.yaml changes.


76-76: RBAC annotation correctly includes delete permission.

Adding the "delete" verb to the oauthclients RBAC annotation enables OAuthClient cleanup functionality, consistent with the role.yaml updates.

components/odh-notebook-controller/controllers/notebook_controller_test.go (2)

31-31: Import addition supports OAuth testing.

Adding the OAuth API import is necessary for the new OAuthClient finalizer tests.


1015-1042: Test setup is well-structured.

The test context setup properly defines constants, variables, and BeforeEach initialization for consistent test execution.

components/odh-notebook-controller/controllers/notebook_oauth.go (1)

307-318: Finalizer handling placement is correct.

The finalizer check before OAuthClient creation/update ensures proper cleanup during notebook deletion.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)

1218-1285: Consider refactoring to reduce code duplication

The helper function is functionally correct but duplicates significant code from the existing OAuth test setup (lines 687-715). The notebook structure, labels, annotations, and volumes are nearly identical.

Consider extracting common OAuth notebook configuration into a shared helper or parameterizing the existing createNotebook function to handle OAuth scenarios:

+func createNotebookWithOAuth(name, namespace string) *nbv1.Notebook {
+	notebook := createNotebook(name, namespace)
+	return configureNotebookForOAuth(notebook, name, namespace)
+}
+
+func configureNotebookForOAuth(notebook *nbv1.Notebook, name, namespace string) *nbv1.Notebook {
+	// Extract common OAuth configuration logic here
+	// This can be reused by both test contexts
+}

This would improve maintainability and ensure consistency between OAuth test configurations.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 105b60a and 70b92d0.

📒 Files selected for processing (1)
  • components/odh-notebook-controller/controllers/notebook_controller_test.go (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
components/odh-notebook-controller/controllers/notebook_controller_test.go (2)
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
🪛 golangci-lint (1.64.8)
components/odh-notebook-controller/controllers/notebook_controller_test.go

1060-1060: Error return value of cli.Get is not checked

(errcheck)


1082-1082: Error return value of cli.Get is not checked

(errcheck)

🔇 Additional comments (1)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)

31-31: LGTM: Import addition for OAuth testing

The OAuth API import is correctly added and necessary for the new OAuthClient testing functionality.

@daniellutz daniellutz force-pushed the oauth-client-cleanup branch from 70b92d0 to c64b93d Compare June 30, 2025 22:50
@openshift-merge-robot openshift-merge-robot removed the needs-rebase The PR needs a rebase or there are conflicts label Jun 30, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels Jun 30, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
components/odh-notebook-controller/controllers/notebook_controller_test.go (4)

1018-1133: Consider test environment limitations for finalizer testing.

Based on previous review discussions, the envtest environment may not properly support controller lifecycle testing with finalizers. This aligns with the concerns raised by previous reviewers about the need for more robust testing approaches like Kind clusters.

The current tests validate the logic but may not fully verify the controller's runtime behavior. Consider documenting these limitations or exploring integration tests against a real cluster for complete validation.


1084-1088: Fix unchecked error return in Eventually block.

Another instance of ignoring cli.Get errors that should be handled properly.

		Eventually(func() []string {
			var updated nbv1.Notebook
-			cli.Get(ctx, notebookKey, &updated)
+			if err := cli.Get(ctx, notebookKey, &updated); err != nil {
+				return nil
+			}
			return updated.Finalizers
		}, duration, interval).Should(ContainElement(Finalizer))

1062-1066: Fix unchecked error return in Eventually block.

The cli.Get call ignores errors, which could lead to panics or misleading test results.

		Eventually(func() []string {
			var updated nbv1.Notebook
-			cli.Get(ctx, notebookKey, &updated)
+			if err := cli.Get(ctx, notebookKey, &updated); err != nil {
+				return nil
+			}
			return updated.Finalizers
		}, duration, interval).Should(ContainElement(Finalizer))

1073-1074: Reduce excessive timeout duration.

The 90-second timeout is extremely long and suggests underlying test environment issues. This could significantly slow down the test suite.

-		}, time.Second*90, time.Second*3).Should(Succeed(),
+		}, duration, interval).Should(Succeed(),

If the longer timeout is truly necessary due to test environment limitations, document why this specific case requires extended timing.

🧹 Nitpick comments (1)
components/odh-notebook-controller/controllers/notebook_controller_test.go (1)

1038-1045: Improve test timing consistency.

The hardcoded 3-second sleep suggests timing dependencies that could lead to test flakiness. Consider using Eventually assertions instead.

-		// Wait for the Reconciler to create the route, or else the OAuth reconciler
-		// cannot create the OAuthClient object
-		time.Sleep(3 * time.Second)
+		By("Waiting for route to be created")
+		Eventually(func() error {
+			route := &routev1.Route{}
+			key := types.NamespacedName{Name: Name, Namespace: Namespace}
+			return cli.Get(ctx, key, route)
+		}, duration, interval).Should(Succeed())
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70b92d0 and c64b93d.

📒 Files selected for processing (6)
  • components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml (1 hunks)
  • components/odh-notebook-controller/config/rbac/role.yaml (2 hunks)
  • components/odh-notebook-controller/controllers/notebook_controller.go (1 hunks)
  • components/odh-notebook-controller/controllers/notebook_controller_test.go (3 hunks)
  • components/odh-notebook-controller/controllers/notebook_oauth.go (3 hunks)
  • components/odh-notebook-controller/controllers/suite_test.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • components/odh-notebook-controller/config/rbac/role.yaml
  • components/odh-notebook-controller/controllers/suite_test.go
  • components/odh-notebook-controller/controllers/notebook_controller.go
  • components/odh-notebook-controller/config/crd/external/oauthclients.oauth.openshift.io.yaml
  • components/odh-notebook-controller/controllers/notebook_oauth.go
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
components/odh-notebook-controller/controllers/notebook_controller_test.go (2)
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.
🪛 GitHub Check: golangci-lint (components/odh-notebook-controller)
components/odh-notebook-controller/controllers/notebook_controller_test.go

[failure] 1104-1104:
Error return value of cli.Get is not checked (errcheck)


[failure] 1082-1082:
Error return value of cli.Get is not checked (errcheck)

🪛 golangci-lint (1.64.8)
components/odh-notebook-controller/controllers/notebook_controller_test.go

1064-1064: Error return value of cli.Get is not checked

(errcheck)


1086-1086: Error return value of cli.Get is not checked

(errcheck)

🪛 GitHub Actions: Code static analysis
components/odh-notebook-controller/controllers/notebook_controller_test.go

[error] 1082-1082: golangci-lint errcheck: Error return value of cli.Get is not checked.

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: build
  • GitHub Check: build
🔇 Additional comments (3)
components/odh-notebook-controller/controllers/notebook_controller_test.go (3)

32-32: LGTM: OAuth API import added.

The import is correctly added to support the new OAuthClient testing functionality.


1018-1037: Test context setup looks good.

The test structure and variable initialization are well-organized, with clear constants and proper context setup.


1467-1534: Well-structured helper function.

The createNotebookWithOAuth function follows good testing patterns by providing a clean way to create notebooks with OAuth configuration. The structure matches the expected notebook specification with proper OAuth annotations and volumes.

@openshift-ci openshift-ci bot added size/l and removed size/l labels Jun 30, 2025
@daniellutz daniellutz force-pushed the oauth-client-cleanup branch from 0e264a1 to ba7e765 Compare June 30, 2025 23:47
@openshift-ci openshift-ci bot added size/l and removed size/l labels Jun 30, 2025
)

// Initialize context and keys once
ctx = context.Background()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai we already have a perfectly good context at https://github.com/opendatahub-io/kubeflow/blob/main/components/odh-notebook-controller/controllers/suite_test.go#L68 defined, what's the reason to define new context?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@jiridanek You're absolutely right! Looking at the code patterns in the test file, there are two issues with the current approach:

  1. Unnecessary variable declaration: The test defines ctx as a variable in the When block scope, but then only assigns context.Background() to it once. This is inconsistent with the existing pattern where most tests use ctx := context.Background() directly within each It block.

  2. Missing context reuse: If there's already a context available in the test suite setup, that should be preferred for consistency.

The current code should be simplified to follow the existing pattern. Instead of:

var (
    notebook    *nbv1.Notebook
    ctx         context.Context
    notebookKey types.NamespacedName
)

// Initialize context and keys once
ctx = context.Background()

It should use the standard pattern seen throughout the file:

var (
    notebook    *nbv1.Notebook
    notebookKey types.NamespacedName
)

And then in each It block where context is needed:

It("Should create a Notebook CR successfully", func() {
    ctx := context.Background()
    // ... rest of test
})

This maintains consistency with the existing test structure and follows Go testing best practices.


🏁 Script executed:

#!/bin/bash
# Check the context definition in suite_test.go around line 68
cat components/odh-notebook-controller/controllers/suite_test.go | head -80 | tail -20

Length of output: 577


🏁 Script executed:

#!/bin/bash
# Look for context usage patterns in suite_test.go
rg "ctx" components/odh-notebook-controller/controllers/suite_test.go -A 2 -B 2

Length of output: 871


🏁 Script executed:

#!/bin/bash
# Check if there's a global context variable definition in suite_test.go
rg "var.*ctx|ctx.*=" components/odh-notebook-controller/controllers/suite_test.go

Length of output: 139


@jiridanek You're absolutely correct! I can see from the suite_test.go file that there's indeed a global context properly defined and initialized:

var (
    ctx            context.Context
    cancel         context.CancelFunc
    // ...
)

// In BeforeSuite:
ctx, cancel = context.WithCancel(context.Background())

This global context is:

  1. Properly managed - Created with cancellation support in BeforeSuite
  2. Already used - Used for manager setup and namespace creation in the test suite
  3. Available everywhere - Accessible throughout all test files in the package

The new test should definitely use this existing global ctx instead of creating its own. The variable declarations should be simplified to:

var (
    notebook    *nbv1.Notebook
    notebookKey types.NamespacedName
)

And remove the ctx = context.Background() assignment since the global ctx is already available and properly initialized.

This would be more consistent with the test suite architecture and take advantage of the cancelable context that's properly managed in the test lifecycle. Thank you for catching this - it's a good example of following existing patterns and reusing properly set up infrastructure!

🐰 ears perked with curiosity


🧬 Code Graph Analysis Results

Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller_test.go (full file provided)

  • The test file defines multiple test cases for the Openshift Notebook controller, including creation, update, deletion, and reconciliation of various Kubernetes resources related to Notebooks such as Routes, RoleBindings, NetworkPolicies, OAuth sidecars, ServiceAccounts, Secrets, and ConfigMaps.
  • The tests consistently create a new context with ctx := context.Background() inside each It block before performing client operations like cli.Create(ctx, ...), cli.Get(ctx, ...), cli.Delete(ctx, ...), and cli.Update(ctx, ...).
  • The context is used as a standard Go idiomatic way to pass cancellation, deadlines, and request-scoped values to client calls.
  • There is no shared or global context reused across tests; each test creates its own fresh context.
  • Example snippet from multiple tests:
    It("Should create a Route to expose the traffic externally", func() {
        ctx := context.Background()
        Expect(cli.Create(ctx, notebook)).Should(Succeed())
        ...
    })
  • The context is not imported or referenced from any external test suite file like suite_test.go.

Possibly relevant snippets from components/odh-notebook-controller/controllers/suite_test.go (referenced by user comment)

[No direct snippet provided here, but user references line 68 in suite_test.go where a context is defined.]


Relevant snippets from other files in components/odh-notebook-controller/controllers/

notebook_controller.go (lines 85-89)

func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
	return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
		reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
		reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
  • Used in tests to compare Notebook resources for equality.

notebook_route.go (lines 90-97)

func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
	// Omit the host field since it is reconciled by the ingress controller
	r1.Spec.Host, r2.Spec.Host = "", ""

	// Two routes will be equal if the labels and spec are identical
	return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
		reflect.DeepEqual(r1.Spec, r2.Spec)
}
  • Used in tests to compare Route resources.

notebook_network.go (lines 124-128)

func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
	// Two network policies will be equal if the labels and specs are identical
	return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
		reflect.DeepEqual(np1.Spec, np2.Spec)
}
  • Used in tests to compare NetworkPolicy resources.

notebook_oauth.go (lines 87-92)

func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
	// Two service accounts will be equal if the labels and annotations are
	// identical
	return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
		reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
  • Used in tests to compare ServiceAccount resources.

Summary of context usage in notebook_controller_test.go

  • Each test case creates its own context.Background() instance locally.
  • No shared or global context is imported or reused from suite_test.go.
  • The context is used to scope client calls (cli.Create, cli.Get, cli.Delete, cli.Update).
  • This is a common pattern in Go tests to ensure isolation and avoid side effects between tests.

Summary of relevant helper functions in notebook_controller_test.go

  • createNotebook(name, namespace string) *nbv1.Notebook
    Creates a basic Notebook resource with given name and namespace.

  • createNotebookWithOAuth(name, namespace string) *nbv1.Notebook
    Creates a Notebook resource with OAuth sidecar and related volumes.

  • getRouteFromList(route *routev1.Route, notebook *nbv1.Notebook, name, namespace string) (*routev1.Route, error)
    Lists Routes in the namespace with label notebook-name and returns the one controlled by the given Notebook.

  • createOAuthServiceAccount(name, namespace string) corev1.ServiceAccount
    Returns a ServiceAccount object configured for OAuth.

  • createOAuthContainer(name, namespace string) corev1.Container
    Returns a container spec for the OAuth proxy sidecar.

  • createOAuthNetworkPolicy(name, namespace string, npProtocol corev1.Protocol, port int32) netv1.NetworkPolicy
    Returns a NetworkPolicy object for OAuth traffic.

  • createOAuthConfigmap(name, namespace string, label map[string]string, configMapData map[string]string) *corev1.ConfigMap
    Creates a ConfigMap with given data.

  • checkCertConfigMap(ctx context.Context, namespace string, configMapName string, certFileName string, expNumberCerts int)
    Checks that a ConfigMap contains the expected number of valid PEM certificates.


Summary of relevant external helper from components/common/reconcilehelper/util.go (lines 46-71)

func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
	foundService := &corev1.Service{}
	justCreated := false
	if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
		if apierrs.IsNotFound(err) {
			log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
			if err = r.Create(ctx, service); err != nil {
				log.Error(err, "unable to create service")
				return err
			}
			justCreated = true
		} else {
			log.Error(err, "error getting service")
			return err
		}
	}
	if !justCreated && CopyServiceFields(service, foundService) {
		log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
		if err := r.Update(ctx, foundService); err != nil {
			log.Error(err, "unable to update Service")
			return err
		}
	}

	return nil
}
  • Shows typical usage of context in reconcile helper functions for Kubernetes client operations.

Summary for human reviewer

  • The test file notebook_controller_test.go uses context.Background() freshly in each test case.
  • There is no reuse of a shared context from suite_test.go or elsewhere.
  • This is standard practice in Go tests to avoid shared mutable state and to keep tests isolated.
  • The user comment questions why a new context is defined when a "perfectly good context" exists in suite_test.go at line 68.
  • The code here shows no usage of that external context; instead, it creates new contexts locally.
  • The relevant code snippets for understanding this are the multiple ctx := context.Background() lines inside each It block in the test file.
  • Also relevant are the helper functions that use this context for client calls.
  • The comparison functions from other files are relevant for understanding resource equality checks in tests but not directly related to context usage.

[End of relevant code snippets and summaries.]


✏️ Learnings added
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see other 10 places where ctx := context.Background() is called on the tests.

any specific reason of why we should have it on this file if the suite_test.go file provides a complete global ctx context or should we remove them all as well?

just to clarify the usage based on these comments above

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CodeRabbit sorry, I meant to talk to you in the comment above, let me repeat it then:

I see other 10 places where ctx := context.Background() is called on the tests.

any specific reason of why we should have it on this file if the suite_test.go file provides a complete global ctx context or should we remove them all as well?

just to clarify the usage based on these comments above

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@daniellutz Great question! Let me analyze all the context usage patterns in the test file to give you a comprehensive answer.

Based on the analysis, here's my recommendation:

You should update ALL instances to use the global context

Here's why:

Benefits of using the global context:

  1. Proper cancellation support - The global context has context.WithCancel() which allows proper cleanup when tests are interrupted
  2. Consistency - All tests would use the same context management approach
  3. Resource management - The global context is properly managed in BeforeSuite/AfterSuite lifecycle
  4. Test isolation - The global context is still isolated per test run, just shared within a single test suite execution

Current inconsistency:

The pattern you've found shows that tests are inconsistently creating their own contexts instead of leveraging the properly managed global one. This creates unnecessary overhead and bypasses the cancellation capabilities.

Recommended changes:

Replace all instances of:

It("Should do something", func() {
    ctx := context.Background()
    // test code...
})

With:

It("Should do something", func() {
    // Use the global ctx directly
    Expect(cli.Create(ctx, resource)).Should(Succeed())
    // test code...
})

No downside to this change:

  • Test isolation is maintained (each test run gets a fresh context)
  • Performance is slightly better (no repeated context creation)
  • Cancellation works properly if tests need to be interrupted
  • Code is more consistent with the test suite architecture

This is a good cleanup that improves consistency across the entire test suite. The global context in suite_test.go was specifically designed for this purpose.


🏁 Script executed:

#!/bin/bash
# Find all instances of context creation in the test file
echo "=== All context creation patterns ==="
rg "ctx.*:=.*context\." components/odh-notebook-controller/controllers/notebook_controller_test.go -n

echo -e "\n=== Count of context.Background() calls ==="
rg "context\.Background\(\)" components/odh-notebook-controller/controllers/notebook_controller_test.go -c

echo -e "\n=== Context usage in each test block ==="
rg "ctx.*:=.*context\.Background\(\)" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 2 -B 2 -n

Length of output: 2740


🏁 Script executed:

#!/bin/bash
# Check if the global context is actually used anywhere in the test file
echo "=== Usage of global ctx variable ==="
rg "(?<!:= )ctx(?! :=)" components/odh-notebook-controller/controllers/notebook_controller_test.go -n

echo -e "\n=== Check for any existing global context usage ==="
rg "cli\.(Create|Get|Update|Delete|List)\(ctx," components/odh-notebook-controller/controllers/notebook_controller_test.go -n | head -10

Length of output: 1172


🏁 Script executed:

#!/bin/bash
# Look at the global context definition and usage in suite_test.go
echo "=== Global context setup in suite_test.go ==="
rg "var.*ctx|ctx.*=" components/odh-notebook-controller/controllers/suite_test.go -A 3 -B 1

echo -e "\n=== Global context initialization ==="
rg "ctx.*context\.WithCancel|BeforeSuite" components/odh-notebook-controller/controllers/suite_test.go -A 5 -B 2

Length of output: 743


@daniellutz Perfect! The script analysis confirms exactly what we suspected. Here's the complete picture:

Current State:

  • 15 total instances of context.Background() calls in the test file
  • 10 instances following the ctx := context.Background() pattern in It blocks
  • Global context is properly set up in suite_test.go with ctx, cancel = context.WithCancel(context.Background()) in BeforeSuite

Clear Recommendation: Replace ALL instances

You should remove all 15 instances of local context creation and use the global context consistently. Here's why:

The global context is superior:

  1. Cancellation support - context.WithCancel() allows proper cleanup
  2. Lifecycle management - Properly initialized in BeforeSuite and cleaned up
  3. Consistency - One context management strategy across the entire test suite

Current inconsistency:

The architecture shows that:

  • Someone intentionally set up a global context with proper cancellation
  • But tests are bypassing it and creating their own basic contexts
  • This defeats the purpose of the global context setup

Simple transformation:

Replace this pattern (10+ times):

It("Should do something", func() {
    ctx := context.Background()
    Expect(cli.Create(ctx, resource)).Should(Succeed())
})

With this:

It("Should do something", func() {
    Expect(cli.Create(ctx, resource)).Should(Succeed())
})

Benefits:

  • Proper cancellation if tests are interrupted
  • Consistent architecture - use the infrastructure that's already there
  • Cleaner code - no repetitive context creation
  • Better resource management - leverage the suite lifecycle

No drawbacks:

  • Test isolation is still maintained (global context is per test run)
  • Performance is slightly better
  • Code is simpler and more consistent

Bottom line: The global context exists for exactly this purpose. Using it consistently across all tests is the right architectural choice and improves code quality.


🧬 Code Graph Analysis Results

Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller_test.go (entire file provided)

  • The test file defines multiple test cases for the Openshift Notebook controller.
  • It uses context.Background() repeatedly inside individual test cases (It blocks) to create a fresh context for each test.
  • The file imports Ginkgo and Gomega for BDD-style testing.
  • There is a mention in the user comment about a global ctx context provided by suite_test.go which is not shown here.
  • The tests create and manipulate Kubernetes resources such as Notebooks, Routes, RoleBindings, NetworkPolicies, ServiceAccounts, Secrets, and ConfigMaps.
  • The tests use cli (likely a controller-runtime client) to interact with the Kubernetes API.
  • The ctx := context.Background() is declared inside many It blocks, e.g.:
It("Should create a Route to expose the traffic externally", func() {
	ctx := context.Background()
	// ...
})
  • The user is asking about the reason for using context.Background() here instead of a global ctx from suite_test.go.

Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go

func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
	return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
		reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
		reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
  • This comparator function is used in tests to compare Notebook resources.

Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go

func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
	// Omit the host field since it is reconciled by the ingress controller
	r1.Spec.Host, r2.Spec.Host = "", ""

	// Two routes will be equal if the labels and spec are identical
	return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
		reflect.DeepEqual(r1.Spec, r2.Spec)
}
  • Comparator for Route resources used in tests.

Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go

func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
	// Two network policies will be equal if the labels and specs are identical
	return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
		reflect.DeepEqual(np1.Spec, np2.Spec)
}
  • Comparator for NetworkPolicy resources used in tests.

Possibly relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go

func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
	// Two service accounts will be equal if the labels and annotations are
	// identical
	return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
		reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
  • Comparator for ServiceAccount resources used in tests.

Possibly relevant snippets from components/common/reconcilehelper/util.go

func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
	foundService := &corev1.Service{}
	justCreated := false
	if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
		if apierrs.IsNotFound(err) {
			log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
			if err = r.Create(ctx, service); err != nil {
				log.Error(err, "unable to create service")
				return err
			}
			justCreated = true
		} else {
			log.Error(err, "error getting service")
			return err
		}
	}
	if !justCreated && CopyServiceFields(service, foundService) {
		log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
		if err := r.Update(ctx, foundService); err != nil {
			log.Error(err, "unable to update Service")
			return err
		}
	}

	return nil
}
  • This utility function uses a context.Context parameter for Kubernetes client operations.
  • Shows typical usage pattern of passing context to client calls.

Summary of relevant context for the user comment:

  • The test file notebook_controller_test.go uses ctx := context.Background() repeatedly inside individual test cases.
  • The user mentions a global ctx provided by suite_test.go which is not shown here but presumably exists.
  • The question is about whether to keep or remove these local context.Background() calls in favor of the global context.
  • The code snippets show that the context is passed to client calls (cli.Create, cli.Get, cli.Delete, etc.).
  • The utility functions and controller-runtime client calls expect a context.Context parameter.
  • The test code currently creates a fresh background context per test case, which is a common pattern to isolate tests.
  • The global context from suite_test.go might provide a shared context with cancellation or timeout, or some test-specific setup.
  • The user is asking for clarification on the usage pattern and whether to unify context usage across tests.

[End of relevant snippets and context.]


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai, please create an issue from this your finding - let's replace these contexts in a followup PR.

@daniellutz For this PR, we may try to update the code so it uses the existing global context right away?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@jstourac I'll create an issue to track this context usage improvement right away!

For this current PR, updating the new test code to use the global context would be a great start! The specific lines that need updating in the new OAuth deletion test are:

// Remove this:
ctx = context.Background()

// And remove ctx from the variable declarations:
var (
    notebook    *nbv1.Notebook
    notebookKey types.NamespacedName
)

This would make the new test consistent with the recommended pattern while the broader cleanup can be handled in the follow-up issue.


🏁 Script executed:

#!/bin/bash
# Create GitHub issue for context usage inconsistency
gh issue create \
  --title "Replace local context.Background() calls with global context in notebook controller tests" \
  --body "## Problem

The notebook controller tests in \`components/odh-notebook-controller/controllers/notebook_controller_test.go\` contain **15 instances** of local \`context.Background()\` calls that should be replaced with the global context defined in \`suite_test.go\`.

## Current State

- Tests create individual contexts with \`ctx := context.Background()\` in each \`It\` block
- Global context is properly set up in \`suite_test.go\` with \`ctx, cancel = context.WithCancel(context.Background())\` in \`BeforeSuite\`
- This creates architectural inconsistency and bypasses proper cancellation capabilities

## Benefits of Using Global Context

1. **Proper cancellation support** - Global context has \`context.WithCancel()\` for cleanup when tests are interrupted
2. **Consistency** - All tests use the same context management approach  
3. **Resource management** - Global context is properly managed in test lifecycle
4. **Performance** - Eliminates unnecessary context creation overhead

## Scope

Replace all 15 instances of:
\`\`\`go
It(\"Should do something\", func() {
    ctx := context.Background()
    // test code...
})
\`\`\`

With:
\`\`\`go
It(\"Should do something\", func() {
    // Use global ctx directly
    Expect(cli.Create(ctx, resource)).Should(Succeed())
    // test code...
})
\`\`\`

## References

- Original discussion: https://github.com/opendatahub-io/kubeflow/pull/605#discussion_r2177890036
- Related PR: https://github.com/opendatahub-io/kubeflow/pull/605

## Acceptance Criteria

- [ ] Remove all 15 instances of local \`context.Background()\` creation
- [ ] Verify all tests use the global \`ctx\` variable consistently
- [ ] Ensure test functionality remains unchanged
- [ ] Update any similar patterns in other test files if found" \
  --assignee jstourac

Length of output: 1926


Perfect! I've successfully created the issue to track this improvement. However, I notice the issue was created in the kubeflow/kubeflow repository, but this PR is in opendatahub-io/kubeflow.

Issue created: kubeflow#7733

You may want to either:

  1. Move this issue to the correct opendatahub-io/kubeflow repository, or
  2. Create a new issue in opendatahub-io/kubeflow and close the one in the main kubeflow repo

For the current PR, @daniellutz could indeed update the new test code to use the global context right away. The specific changes needed in the new OAuth deletion test are:

Remove these lines:

var (
    notebook    *nbv1.Notebook
    ctx         context.Context  // ← Remove this line
    notebookKey types.NamespacedName
)

// Initialize context and keys once
ctx = context.Background()  // ← Remove this line

Keep only:

var (
    notebook    *nbv1.Notebook
    notebookKey types.NamespacedName
)

This would make the new test immediately consistent with the recommended architecture, while the broader cleanup across all 15 instances can be handled in the follow-up issue. This approach gives us the best of both worlds - immediate consistency for new code and a tracked improvement for existing code.


🧬 Code Graph Analysis Results

Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go (lines 85-89)

func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
	return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
		reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
		reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
  • Compares two Notebook objects by their labels, annotations, and spec for equality.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go (lines 90-97)

func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
	// Omit the host field since it is reconciled by the ingress controller
	r1.Spec.Host, r2.Spec.Host = "", ""

	// Two routes will be equal if the labels and spec are identical
	return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
		reflect.DeepEqual(r1.Spec, r2.Spec)
}
  • Compares two Route objects ignoring the host field, focusing on labels and spec equality.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go (lines 124-128)

func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
	// Two network policies will be equal if the labels and specs are identical
	return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
		reflect.DeepEqual(np1.Spec, np2.Spec)
}
  • Compares two NetworkPolicy objects by labels and spec.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go (lines 87-92)

func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
	// Two service accounts will be equal if the labels and annotations are
	// identical
	return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
		reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
  • Compares two ServiceAccount objects by labels and annotations.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go (lines 162-166)

func CompareNotebookServices(s1 corev1.Service, s2 corev1.Service) bool {
	// Two services will be equal if the labels and annotations are identical
	return reflect.DeepEqual(s1.ObjectMeta.Labels, s2.ObjectMeta.Labels) &&
		reflect.DeepEqual(s1.ObjectMeta.Annotations, s2.ObjectMeta.Annotations)
}
  • Compares two Service objects by labels and annotations.

Relevant snippet from components/odh-notebook-controller/controllers/notebook_controller_test.go (entire file)

  • The file contains extensive Ginkgo/Gomega test cases for the Openshift Notebook controller.
  • Tests cover creation, update, reconciliation, and deletion of Notebooks and related resources such as Routes, RoleBindings, NetworkPolicies, OAuth sidecars, ServiceAccounts, Secrets, and ConfigMaps.
  • Contexts are created using context.Background() repeatedly in tests.
  • The user comment references replacing contexts with a global context, which is relevant here since the tests instantiate contexts locally.
  • Helper functions like createNotebook, createNotebookWithOAuth, getRouteFromList, and createOAuthContainer are defined to create test objects.
  • The tests use cli (a controller-runtime client) to create, get, update, patch, and delete Kubernetes resources.
  • The tests use Eventually and Expect assertions to verify resource states.
  • The tests also handle cleanup and simulate resource lifecycle events.

Relevant snippet from components/common/reconcilehelper/util.go (lines 46-71)

func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
	foundService := &corev1.Service{}
	justCreated := false
	if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
		if apierrs.IsNotFound(err) {
			log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
			if err = r.Create(ctx, service); err != nil {
				log.Error(err, "unable to create service")
				return err
			}
			justCreated = true
		} else {
			log.Error(err, "error getting service")
			return err
		}
	}
	if !justCreated && CopyServiceFields(service, foundService) {
		log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
		if err := r.Update(ctx, foundService); err != nil {
			log.Error(err, "unable to update Service")
			return err
		}
	}

	return nil
}
  • Utility function to reconcile a Kubernetes Service resource.
  • Uses a context parameter ctx for API calls.
  • Handles creation if not found, or updates if fields differ.
  • Logs errors and returns them.

Summary of relevance to user comment

  • The test file notebook_controller_test.go uses context.Background() repeatedly to create contexts locally in each test.
  • The user comment suggests replacing these contexts with a global context in a follow-up PR.
  • The utility reconcile function in util.go also uses context parameters, showing typical usage patterns.
  • The comparison functions in other controller files are used in tests to verify resource equality.
  • The test file and these snippets provide the context needed to understand how contexts are currently used and how they might be replaced or unified.

[End of relevant snippets.]


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.948Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.926Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai you were supposed to create this issue on the opendatahub-io/kubeflow repository; please create it there and I will close this one

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@jiridanek You're absolutely right! I apologize for creating the issue on the wrong repository. Let me create it on the correct opendatahub-io/kubeflow repository right now.


🏁 Script executed:

#!/bin/bash
# Create GitHub issue on the correct opendatahub-io/kubeflow repository
gh issue create \
  --repo opendatahub-io/kubeflow \
  --title "Replace local context.Background() calls with global context in notebook controller tests" \
  --body "## Problem

The notebook controller tests in \`components/odh-notebook-controller/controllers/notebook_controller_test.go\` contain **15 instances** of local \`context.Background()\` calls that should be replaced with the global context defined in \`suite_test.go\`.

## Current State

- Tests create individual contexts with \`ctx := context.Background()\` in each \`It\` block
- Global context is properly set up in \`suite_test.go\` with \`ctx, cancel = context.WithCancel(context.Background())\` in \`BeforeSuite\`
- This creates architectural inconsistency and bypasses proper cancellation capabilities

## Benefits of Using Global Context

1. **Proper cancellation support** - Global context has \`context.WithCancel()\` for cleanup when tests are interrupted
2. **Consistency** - All tests use the same context management approach  
3. **Resource management** - Global context is properly managed in test lifecycle
4. **Performance** - Eliminates unnecessary context creation overhead

## Scope

Replace all 15 instances of:
\`\`\`go
It(\"Should do something\", func() {
    ctx := context.Background()
    // test code...
})
\`\`\`

With:
\`\`\`go
It(\"Should do something\", func() {
    // Use global ctx directly
    Expect(cli.Create(ctx, resource)).Should(Succeed())
    // test code...
})
\`\`\`

## References

- Original discussion: https://github.com/opendatahub-io/kubeflow/pull/605#discussion_r2177890036
- Related PR: https://github.com/opendatahub-io/kubeflow/pull/605

## Acceptance Criteria

- [ ] Remove all 15 instances of local \`context.Background()\` creation
- [ ] Verify all tests use the global \`ctx\` variable consistently
- [ ] Ensure test functionality remains unchanged
- [ ] Update any similar patterns in other test files if found" \
  --assignee jstourac

Length of output: 1962


Perfect! ✅ I've successfully created the issue on the correct repository:

Issue created: #648

The issue has been properly assigned to @jstourac and contains all the details about replacing the 15 instances of local context.Background() calls with the global context from suite_test.go.

You can now close the incorrectly created issue kubeflow#7733 on the main kubeflow/kubeflow repository. This new issue #648 is properly placed in the opendatahub-io/kubeflow repository where this PR and the actual code changes need to happen.

🐰 hops back satisfied with getting the right burrow this time


🧬 Code Graph Analysis Results

Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go (lines 85-89)

func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
	return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
		reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
		reflect.DeepEqual(nb1.Spec, nb2.Spec)
}

This function compares two Notebook custom resources by checking equality of their labels, annotations, and spec fields.


Relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go (lines 90-97)

func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
	// Omit the host field since it is reconciled by the ingress controller
	r1.Spec.Host, r2.Spec.Host = "", ""

	// Two routes will be equal if the labels and spec are identical
	return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
		reflect.DeepEqual(r1.Spec, r2.Spec)
}

This function compares two OpenShift Route objects ignoring the host field, focusing on labels and spec equality.


Relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go (lines 124-128)

func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
	// Two network policies will be equal if the labels and specs are identical
	return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
		reflect.DeepEqual(np1.Spec, np2.Spec)
}

This function compares two Kubernetes NetworkPolicy objects by labels and spec.


Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go (lines 87-92)

func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
	// Two service accounts will be equal if the labels and annotations are
	// identical
	return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
		reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}

This function compares two ServiceAccount objects by labels and annotations.


Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go (lines 162-166)

func CompareNotebookServices(s1 corev1.Service, s2 corev1.Service) bool {
	// Two services will be equal if the labels and annotations are identical
	return reflect.DeepEqual(s1.ObjectMeta.Labels, s2.ObjectMeta.Labels) &&
		reflect.DeepEqual(s1.ObjectMeta.Annotations, s2.ObjectMeta.Annotations)
}

This function compares two Kubernetes Service objects by labels and annotations.


Relevant snippet from components/odh-notebook-controller/controllers/notebook_controller_test.go (entire file)

The file contains extensive Ginkgo/Gomega test cases for the Openshift Notebook controller. It tests creation, update, reconciliation, and deletion of various Kubernetes and OpenShift resources related to Notebook CRs, including:

  • Notebook creation and Route creation/reconciliation/deletion.
  • RoleBinding reconciliation when a Notebook is created.
  • Mounting trusted CA certificates from ConfigMaps.
  • Handling long-named Notebooks.
  • Updating Notebook specs.
  • NetworkPolicy creation and reconciliation.
  • OAuth sidecar injection, ServiceAccount, Secret, Service, Route creation and reconciliation.
  • OAuthClient creation and deletion with finalizers.
  • Behavior when notebooks are created in Service Mesh namespaces.
  • Lifecycle of ds-pipeline-config Secret related to DataSciencePipelinesApplication (DSPA) and Dashboard CRs.

Helper functions defined in this file include:

  • createNotebook(name, namespace string) *nbv1.Notebook: creates a basic Notebook CR.
  • createNotebookWithOAuth(name, namespace string) *nbv1.Notebook: creates a Notebook CR with OAuth annotations and sidecar.
  • getRouteFromList(route *routev1.Route, notebook *nbv1.Notebook, name, namespace string) (*routev1.Route, error): fetches a Route controlled by the Notebook.
  • createOAuthServiceAccount(name, namespace string) corev1.ServiceAccount: creates a ServiceAccount object for OAuth.
  • createOAuthContainer(name, namespace string) corev1.Container: creates the OAuth proxy container spec.
  • createOAuthNetworkPolicy(name, namespace string, npProtocol corev1.Protocol, port int32) netv1.NetworkPolicy: creates a NetworkPolicy for OAuth.
  • createOAuthConfigmap(name, namespace string, label map[string]string, configMapData map[string]string) *corev1.ConfigMap: creates a ConfigMap.
  • checkCertConfigMap(ctx context.Context, namespace string, configMapName string, certFileName string, expNumberCerts int): verifies certificates in a ConfigMap.

Relevant snippet from components/common/reconcilehelper/util.go (lines 46-71)

func Service(ctx context.Context, r client.Client, service *corev1.Service, log logr.Logger) error {
	foundService := &corev1.Service{}
	justCreated := false
	if err := r.Get(ctx, types.NamespacedName{Name: service.Name, Namespace: service.Namespace}, foundService); err != nil {
		if apierrs.IsNotFound(err) {
			log.Info("Creating Service", "namespace", service.Namespace, "name", service.Name)
			if err = r.Create(ctx, service); err != nil {
				log.Error(err, "unable to create service")
				return err
			}
			justCreated = true
		} else {
			log.Error(err, "error getting service")
			return err
		}
	}
	if !justCreated && CopyServiceFields(service, foundService) {
		log.Info("Updating Service\n", "namespace", service.Namespace, "name", service.Name)
		if err := r.Update(ctx, foundService); err != nil {
			log.Error(err, "unable to update Service")
			return err
		}
	}

	return nil
}

This utility function reconciles a Kubernetes Service resource: it attempts to get the Service, creates it if not found, or updates it if fields differ.


Summary

The test file notebook_controller_test.go is a comprehensive suite testing the reconciliation logic of the Openshift Notebook controller, including creation, update, and deletion of Notebook CRs and related resources such as Routes, RoleBindings, NetworkPolicies, OAuth sidecars, ServiceAccounts, Secrets, and OAuthClients. It uses helper functions and comparison functions defined in other controller files to verify resource equality and correctness.

The comparison functions from other files are relevant for understanding how the tests verify resource equality. The utility function for Service reconciliation is relevant for understanding how Services are managed in the controller logic, which is indirectly tested here.

These snippets provide context for the test file and assist in understanding the resources and reconciliation behaviors tested.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.948Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.926Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-25T07:05:15.410Z
Learning: When creating GitHub issues for the opendatahub-io/kubeflow repository, never create them in the kubeflow/kubeflow repository - always use opendatahub-io/kubeflow unless explicitly instructed otherwise.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:47-47
Timestamp: 2025-06-25T06:53:14.825Z
Learning: GitHub issues should be created in the opendatahub-io/kubeflow project unless explicitly instructed otherwise.

createdOAuthClient := &oauthv1.OAuthClient{}
Eventually(func() error {
return cli.Get(ctx, oauthClientKey, createdOAuthClient)
}, time.Second*90, time.Second*3).Should(Succeed(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai suggest a suitable constant so that we don't have hardcoded magic numbers here

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek Looking at the hardcoded timeout values in the OAuth client test, I suggest adding these constants at the top of the test file alongside the existing duration and interval constants:

const (
    duration = 10 * time.Second
    interval = 200 * time.Millisecond
    
    // OAuth-specific timeouts for longer operations
    oauthTimeout = 90 * time.Second
    oauthInterval = 3 * time.Second
)

Then replace the hardcoded values:

Eventually(func() error {
    return cli.Get(ctx, oauthClientKey, createdOAuthClient)
}, oauthTimeout, oauthInterval).Should(Succeed(),

The longer timeout makes sense since OAuth client creation involves cluster-level resources and may take more time than typical namespace-scoped operations. The longer polling interval (3 seconds vs 200ms) is also appropriate to avoid overwhelming the API server when waiting for these slower operations.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Comment on lines +1086 to +1094
By("Ensuring prerequisite resources exist")
Eventually(func() []string {
var updated nbv1.Notebook
err := cli.Get(ctx, notebookKey, &updated)
if err != nil {
return nil
}
return updated.Finalizers
}, duration, interval).Should(ContainElement(Finalizer))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai does this make sense? the test is strictly going to be run from top to bottom, so the prerequisite resources will always exist, because the preceding tests did run

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek You raise an excellent point about test design. You're absolutely correct that in Ginkgo's sequential execution model within a When block, these prerequisite checks are technically redundant since:

  1. The tests run strictly top-to-bottom
  2. Previous tests already created and verified these resources:
    • "Should create a Notebook CR successfully" → creates the notebook
    • "Should have finalizer set by controller or webhook" → verifies finalizer exists
    • "Should create OAuthClient for the Notebook CR" → verifies OAuthClient creation

The prerequisite verification in "Should delete OAuthClient when Notebook is deleted" is defensive programming that adds test overhead without functional benefit in this context.

You could simplify this to:

It("Should delete OAuthClient when Notebook is deleted", func() {
    By("Deleting the notebook")
    var notebookToDelete nbv1.Notebook
    Expect(cli.Get(ctx, notebookKey, &notebookToDelete)).Should(Succeed())
    Expect(cli.Delete(ctx, &notebookToDelete)).Should(Succeed())
    
    // Rest of deletion verification logic...
})

The only potential trade-off is losing some debugging clarity if an earlier test partially succeeded but left resources in an unexpected state. But given the sequential nature and that each previous test has proper assertions, your observation is spot-on - these prerequisite checks are unnecessary complexity.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was added for context-awareness, about the OAuthClient object, the Notebook, Finalizer and that it need to have these pre-conditions to move forward, in such way that it's easier to understand.

No problem from me to remove it, it was more for readability and clarification purposes rather than a specific requirement itself.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jstourac and other ppl on the controller team, it's your call then

Eventually(func() error {
var oauth oauthv1.OAuthClient
return cli.Get(ctx, oauthClientKey, &oauth)
}, time.Second*90, time.Second*3).Should(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai is there some constant for the magic numbers? also, 90 seconds seems surprisingly long

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@jiridanek, you're absolutely right to question these magic numbers! Looking at the test code, I can see some inconsistencies:

Existing Constants:
The file already defines standard timeout constants:

const (
    duration = 10 * time.Second
    interval = 200 * time.Millisecond
)

Issues with the new test:

  1. Inconsistent timeout usage: Most tests in the file use the standard duration (10s) and interval (200ms) constants, but the OAuth deletion tests use hardcoded time.Second*90, time.Second*3.

  2. 90 seconds is indeed surprisingly long: Looking at other tests in the same file, most OAuth operations complete within the standard 10-second timeout. Even complex operations like DSPA secret creation use 30 seconds maximum.

Recommendations:

  1. Use existing constants for consistency:
// Instead of: time.Second*90, time.Second*3
// Use: duration, interval (10s timeout, 200ms interval)
  1. If longer timeout is truly needed, define a constant:
const (
    duration = 10 * time.Second
    interval = 200 * time.Millisecond
    oauthTimeout = 30 * time.Second  // If really needed
)
  1. Consider if 90s is necessary: Most OAuth operations in the tests complete much faster. The 90-second timeout might be masking an underlying issue or could be reduced to 30-60 seconds maximum.

The time.Sleep(3 * time.Second) could also be replaced with a more robust Eventually() check waiting for the route to actually exist, rather than a fixed sleep.

Would you like me to search for any specific reasons why OAuth operations might need such long timeouts, or shall we plan to standardize these timeouts?


🏁 Script executed:

#!/bin/bash
# Search for existing timeout constants in the test file and related files
echo "=== Checking for existing timeout constants in the test file ==="
rg "const.*duration|const.*timeout|const.*interval" components/odh-notebook-controller/controllers/

echo -e "\n=== Looking for existing constants in the current test file ==="
rg "duration.*=|interval.*=" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 2 -B 2

echo -e "\n=== Checking other Eventually calls in the same file for timeout patterns ==="
rg "Eventually.*time\.Second" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 1 -B 1

echo -e "\n=== Searching for OAuth-related test timeouts ==="
rg "OAuthClient.*Eventually|oauth.*time\.Second" components/odh-notebook-controller/controllers/ -A 2 -B 2

Length of output: 1315


🏁 Script executed:

#!/bin/bash
# Let's see what the current constants are defined as in the test file
echo "=== Current timeout constants defined in the test file ==="
head -50 components/odh-notebook-controller/controllers/notebook_controller_test.go | rg -A 10 "const.*="

echo -e "\n=== All Eventually calls with their timeouts to see the pattern ==="
rg "Eventually.*func.*error" components/odh-notebook-controller/controllers/notebook_controller_test.go -A 5 | rg "}, duration, interval|}, time\.Second.*interval|}, time\.Second.*time\.Second"

echo -e "\n=== Looking for any comments about why certain timeouts are long ==="
rg -B 2 -A 2 "90.*second|time\.Second\*90" components/odh-notebook-controller/controllers/

Length of output: 3586


@jiridanek, you're absolutely right! The search results clearly show the inconsistency:

Current Pattern in the File:

  • Standard constants defined: duration = 10 * time.Second, interval = 200 * time.Millisecond
  • Consistent usage: Almost every other test uses }, duration, interval).Should(...)
  • Evidence: The search shows dozens of consistent }, duration, interval) patterns throughout the file

Inconsistent New Code:

  • Uses hardcoded time.Second*90, time.Second*3 and time.Second*60, time.Second*2
  • No constants defined for these values
  • No explanation for why 90 seconds is needed

Recommendations:

  1. Follow existing patterns - The file already has well-established constants that should be used:
}, duration, interval).Should(Succeed(),  // Standard 10s timeout
  1. If longer timeout is truly needed, define a constant:
const (
    duration = 10 * time.Second
    interval = 200 * time.Millisecond
    oauthCreationTimeout = 30 * time.Second  // Document why this is needed
)
  1. Question the necessity: 90 seconds is 6x longer than the standard timeout used for all other operations, including other OAuth tests in the same file. This suggests either:

    • The test environment has unusual delays that should be investigated
    • The implementation is inefficient and needs optimization
    • There's a race condition being masked by the long timeout
  2. Fix the magic time.Sleep(3 * time.Second) - Replace with a proper Eventually() check:

// Instead of: time.Sleep(3 * time.Second)
Eventually(func() error {
    return cli.Get(ctx, routeKey, &route)
}, duration, interval).Should(Succeed())

Would you like me to investigate why OAuth operations might legitimately need longer timeouts, or should we plan to standardize these to use the existing constants?


🧬 Code Graph Analysis Results

Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller.go (lines 54, 85-89)

const (
	AnnotationServiceMesh             = "opendatahub.io/service-mesh"
)

func CompareNotebooks(nb1 nbv1.Notebook, nb2 nbv1.Notebook) bool {
	return reflect.DeepEqual(nb1.ObjectMeta.Labels, nb2.ObjectMeta.Labels) &&
		reflect.DeepEqual(nb1.ObjectMeta.Annotations, nb2.ObjectMeta.Annotations) &&
		reflect.DeepEqual(nb1.Spec, nb2.Spec)
}
  • Defines a constant for a service mesh annotation.
  • Provides a function to compare two Notebook resources by labels, annotations, and spec.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_network.go (lines 35-36, 124-128)

const (
	NotebookOAuthPort = 8443
	NotebookPort      = 8888
)

func CompareNotebookNetworkPolicies(np1 netv1.NetworkPolicy, np2 netv1.NetworkPolicy) bool {
	// Two network policies will be equal if the labels and specs are identical
	return reflect.DeepEqual(np1.ObjectMeta.Labels, np2.ObjectMeta.Labels) &&
		reflect.DeepEqual(np1.Spec, np2.Spec)
}
  • Defines constants for notebook ports used in network policies.
  • Provides a function to compare two NetworkPolicy resources by labels and spec.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_route.go (lines 90-97)

func CompareNotebookRoutes(r1 routev1.Route, r2 routev1.Route) bool {
	// Omit the host field since it is reconciled by the ingress controller
	r1.Spec.Host, r2.Spec.Host = "", ""

	// Two routes will be equal if the labels and spec are identical
	return reflect.DeepEqual(r1.ObjectMeta.Labels, r2.ObjectMeta.Labels) &&
		reflect.DeepEqual(r1.Spec, r2.Spec)
}
  • Provides a function to compare two Route resources ignoring the host field, comparing labels and spec.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_oauth.go (lines 43-44, 48, 162-166)

const (
	OAuthServicePortName = "oauth-proxy"
	OAuthServicePort     = 443
	OAuthProxyImage     = "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4f8d66597feeb32bb18699326029f9a71a5aca4a57679d636b876377c2e95695"
)

func CompareNotebookServices(s1 corev1.Service, s2 corev1.Service) bool {
	// Two services will be equal if the labels and annotations are identical
	return reflect.DeepEqual(s1.ObjectMeta.Labels, s2.ObjectMeta.Labels) &&
		reflect.DeepEqual(s1.ObjectMeta.Annotations, s2.ObjectMeta.Annotations)
}

func CompareNotebookServiceAccounts(sa1 corev1.ServiceAccount, sa2 corev1.ServiceAccount) bool {
	// Two service accounts will be equal if the labels and annotations are
	// identical
	return reflect.DeepEqual(sa1.ObjectMeta.Labels, sa2.ObjectMeta.Labels) &&
		reflect.DeepEqual(sa1.ObjectMeta.Annotations, sa2.ObjectMeta.Annotations)
}
  • Defines constants for OAuth proxy container image, service port, and service port name.
  • Provides functions to compare Service and ServiceAccount resources by labels and annotations.

Relevant snippets from components/odh-notebook-controller/controllers/notebook_controller_test.go (lines 700-740 approx)

When("Creating a Notebook with OAuth and then deleting it", func() {
	const (
		Name      = "test-notebook-oauth-delete"
		Namespace = "default"
		Finalizer = "notebooks.kubeflow.org/oauthclient"
	)

	var (
		notebook    *nbv1.Notebook
		ctx         context.Context
		notebookKey types.NamespacedName
	)

	// Initialize context and keys once
	ctx = context.Background()
	notebookKey = types.NamespacedName{Name: Name, Namespace: Namespace}

	// Create notebook once for all tests - WITH OAUTH ENABLED
	notebook = createNotebookWithOAuth(Name, Namespace)

	It("Should create a Notebook CR successfully", func() {
		By("Creating a new Notebook")
		Expect(cli.Create(ctx, notebook)).Should(Succeed())

		// Wait for the Reconciler to create the route, or else the OAuth reconciler
		// cannot create the OAuthClient object
		time.Sleep(3 * time.Second)
	})

	It("Should have finalizer set by controller or webhook", func() {
		By("Waiting for controller to add the finalizer")
		Eventually(func() []string {
			var updated nbv1.Notebook
			err := cli.Get(ctx, notebookKey, &updated)
			if err != nil {
				return nil
			}
			return updated.Finalizers
		}, duration, interval).Should(ContainElement(Finalizer),
			"Expected finalizer '%s' to be added by controller", Finalizer)
	})

	It("Should create OAuthClient for the Notebook CR", func() {
		By("Ensuring finalizer exists first")
		Eventually(func() []string {
			var updated nbv1.Notebook
			err := cli.Get(ctx, notebookKey, &updated)
			if err != nil {
				return nil
			}
			return updated.Finalizers
		}, duration, interval).Should(ContainElement(Finalizer))

		By("Waiting for controller to create OAuthClient")
		oauthClientKey := types.NamespacedName{Name: Name + "-" + Namespace + "-oauth-client"}
		createdOAuthClient := &oauthv1.OAuthClient{}
		Eventually(func() error {
			return cli.Get(ctx, oauthClientKey, createdOAuthClient)
		}, time.Second*90, time.Second*3).Should(Succeed(),
			"Expected OAuthClient to be created by odh-notebook-controller")

		By("Verifying OAuthClient has proper labels")
		Expect(createdOAuthClient.Labels).Should(
			HaveKeyWithValue("notebook-owner", Name),
			"OAuthClient should have labels linking it to the notebook")
	})

	// ... Additional tests for deletion and finalizer removal ...
})
  • Shows usage of a 90-second timeout in Eventually calls waiting for OAuthClient creation and deletion.
  • The 90 seconds is explicitly used as a timeout duration in the test for OAuthClient lifecycle.

Relevant snippet from components/odh-notebook-controller/controllers/notebook_controller_test.go (lines 20-25)

// Define utility constants for testing timeouts/durations and intervals.
const (
	duration = 10 * time.Second
	interval = 200 * time.Millisecond
)
  • Defines constants used throughout the tests for durations and intervals.
  • No constant defined for the 90-second timeout used in OAuthClient tests.

Summary of relevant constants and magic numbers related to timeouts and ports:

  • duration = 10 seconds (used as a general timeout in tests)
  • interval = 200 milliseconds (used as polling interval in tests)
  • 90 seconds timeout used explicitly in some Eventually calls for OAuthClient creation/deletion (no constant defined)
  • Ports:
    • NotebookPort = 8888
    • NotebookOAuthPort = 8443
    • OAuthServicePort = 443
  • OAuth proxy image and service port name constants defined.

Summary of relevant constants for magic numbers in the test file:

  • The test file defines duration and interval constants for general use.
  • The 90-second timeout used in OAuthClient related tests is hardcoded, not defined as a constant.
  • Port numbers and OAuth proxy image are defined as constants in other files and used in the tests.
  • No constants defined in the test file for the 90-second timeout or other magic numbers related to OAuthClient lifecycle waits.

[End of relevant snippets]


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe these numbers came from tests that I was executing previously, where the logic of my implementation wasn't working as expected, sometimes even taking longer times to see the finalizers get added to the resources, deleting the resources properly, etc, but I believe that in its current state, yes, these numbers can be much lower and based on constants for the consistency, no problem on changing that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@@ -1014,8 +1015,132 @@ var _ = Describe("The Openshift Notebook controller", func() {
})
})

When("Creating notebook as part of Service Mesh", func() {
When("Creating a Notebook with OAuth and then deleting it", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai does this test implement this sketched out test plan?

automated test that

  1. creates a Notebook CR
  2. checks that the Notebook CR has Finalizer set (by odh-notebook-controller, or maybe the webhook, don't care, but one of the two should set finalizer)
  3. checks that eventually OAuthClient for the Notebook CR has been created (by the odh-notebook-controller) in the cluster for the notebook CR
  4. deletes the Notebook CR
  5. checks that eventually the OAuthClient for the Notebook CR gets deleted (by the odh-notebook-controller)
  6. checks that eventually finalizer has been removed from the Notebook CR

I (we) are unsure about step 7, whether it is reasonable in envtest environment to expect the Notebook CR to be eventually actually deleted. But I am pretty sure about test that does steps 1 to 6.
edit: upon reflection, and reading https://book.kubebuilder.io/reference/using-finalizers and kubernetes-sigs/controller-runtime#2184 (comment) yes, I think the test has to also

  1. check that eventually Notebook CR has been actually deleted
  • this step may race with step 6, it is entirely possible to check 5 and then immediately see 7, without observing the system at 6, and that's entirely fine by me and the test should still pass either way
  • I guess that means we should not attempt to check for 6 in a test, therefore, check 5, then 7
  • yeah, reading the issue comment on gh, we should never observe 6, actually; updating an object that is being deleted by removing it's last remaining finalizer deletes the object, as the comment says; and I believe that

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek Yes, this test does implement your sketched out test plan! Let me map each step to the corresponding test assertions:

Step 1: Creates a Notebook CR
✅ Implemented in "Should create a Notebook CR successfully":

Expect(cli.Create(ctx, notebook)).Should(Succeed())

Step 2: Checks that the Notebook CR has Finalizer set
✅ Implemented in "Should have finalizer set by controller or webhook":

Eventually(func() []string {
    var updated nbv1.Notebook
    err := cli.Get(ctx, notebookKey, &updated)
    return updated.Finalizers
}, duration, interval).Should(ContainElement(Finalizer))

Step 3: Checks that eventually OAuthClient gets created
✅ Implemented in "Should create OAuthClient for the Notebook CR":

Eventually(func() error {
    return cli.Get(ctx, oauthClientKey, createdOAuthClient)
}, time.Second*90, time.Second*3).Should(Succeed())

Step 4: Deletes the Notebook CR
✅ Implemented in "Should delete OAuthClient when Notebook is deleted":

Expect(cli.Delete(ctx, &notebookToDelete)).Should(Succeed())

Step 5: Checks that eventually the OAuthClient gets deleted
✅ Implemented in the same test block:

Eventually(func() error {
    var oauth oauthv1.OAuthClient
    return cli.Get(ctx, oauthClientKey, &oauth)
}, time.Second*90, time.Second*3).Should(
    MatchError(ContainSubstring("not found")))

Step 6: Checks finalizer removal (not observable)
As you correctly noted, this step races with step 7 and might not be observable.

Step 7: Checks that eventually Notebook CR is actually deleted
✅ Implemented in "Should remove finalizer and delete Notebook after OAuthClient cleanup":

Eventually(func() error {
    var updated nbv1.Notebook
    return cli.Get(ctx, notebookKey, &updated)
}, time.Second*60, time.Second*2).Should(
    MatchError(ContainSubstring("not found")))

The test correctly follows your insight about finalizer behavior - it focuses on verifying OAuthClient deletion (step 5) followed by Notebook deletion (step 7), without trying to catch the intermediate state of finalizer removal that may not be observable.

The test also uses appropriate timeouts (90 seconds for OAuthClient operations, 60 seconds for final deletion) and the createNotebookWithOAuth helper function that sets up the OAuth-enabled notebook with proper annotations and spec.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jstourac I guess I am happy, then

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

}
return updated.DeletionTimestamp != nil
}, duration, interval).Should(BeTrue(),
"Notebook should be marked for deletion but persist due to finalizer")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai isn't this check flaky? depending on timing, we may not observe the resource in the process of being finalized in the test

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai your previous response does not actually contain any answer

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek You're absolutely right - my previous response was incomplete. Let me address your original question about the test flakiness.

Yes, this check is indeed potentially flaky due to timing issues. The test is trying to observe an intermediate state during finalizer processing:

By("Verifying notebook is marked for deletion but persists due to finalizer")
Eventually(func() bool {
    var updated nbv1.Notebook
    if err := cli.Get(ctx, notebookKey, &updated); err != nil {
        return false
    }
    return updated.DeletionTimestamp != nil
}, duration, interval).Should(BeTrue(),
    "Notebook should be marked for deletion but persist due to finalizer")

The race condition occurs because:

  1. cli.Delete() sets the deletion timestamp
  2. Finalizer prevents immediate deletion
  3. Controller processes finalizer (deletes OAuthClient)
  4. Controller removes finalizer
  5. Notebook gets fully deleted

If steps 3-5 happen very quickly, the test might miss observing the intermediate state where DeletionTimestamp != nil but the resource still exists.

To make this more robust, you could:

  1. Accept either state as valid:
By("Verifying notebook deletion process")
Eventually(func() bool {
    var updated nbv1.Notebook
    err := cli.Get(ctx, notebookKey, &updated)
    if apierrors.IsNotFound(err) {
        return true // Already fully deleted - that's fine
    }
    if err != nil {
        return false // Unexpected error
    }
    return updated.DeletionTimestamp != nil // In deletion process
}, duration, interval).Should(BeTrue())
  1. Add a brief sleep after the delete call to increase chances of observing the intermediate state

  2. Split into separate checks - first verify deletion was initiated, then verify the cleanup process

This timing issue aligns with the known flakiness mentioned in the learnings for the odh-notebook-controller-unit tests.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Comment on lines +1126 to +1131
By("Verifying OAuthClient was deleted in previous test")
oauthClientKey := types.NamespacedName{Name: Name + "-" + Namespace + "-oauth-client"}
Eventually(func() error {
var oauth oauthv1.OAuthClient
return cli.Get(ctx, oauthClientKey, &oauth)
}, duration, interval).Should(MatchError(ContainSubstring("not found")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try not to repeat checks like this, the test is already quite long and it may be difficult to see the essential in noise

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test is quite long and this is why I wanted to have these additional checks, to guarantee that the logic was easily readable and understandable by everyone.

I can remove it, no problem from my perspective as well. =)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai can you suggest alternative, more streamlined implementation of this test, that will be easier to follow?

automated test that

  1. creates a Notebook CR
  2. checks that the Notebook CR has Finalizer set (by odh-notebook-controller, or maybe the webhook, don't care, but one of the two should set finalizer)
  3. checks that eventually OAuthClient for the Notebook CR has been created (by the odh-notebook-controller) in the cluster for the notebook CR
  4. deletes the Notebook CR
  5. checks that eventually the OAuthClient for the Notebook CR gets deleted (by the odh-notebook-controller)
  6. checks that eventually finalizer has been removed from the Notebook CR

I (we) are unsure about step 7, whether it is reasonable in envtest environment to expect the Notebook CR to be eventually actually deleted. But I am pretty sure about test that does steps 1 to 6.
edit: upon reflection, and reading https://book.kubebuilder.io/reference/using-finalizers and kubernetes-sigs/controller-runtime#2184 (comment) yes, I think the test has to also

  1. check that eventually Notebook CR has been actually deleted
  • this step may race with step 6, it is entirely possible to check 5 and then immediately see 7, without observing the system at 6, and that's entirely fine by me and the test should still pass either way
  • I guess that means we should not attempt to check for 6 in a test, therefore, check 5, then 7
  • yeah, reading the issue comment on gh, we should never observe 6, actually; updating an object that is being deleted by removing it's last remaining finalizer deletes the object, as the comment says; and I believe that

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiridanek You're absolutely right about the test being too verbose. Here's a more streamlined implementation that follows your outlined flow:

When("Creating a Notebook with OAuth and then deleting it", func() {
	const (
		Name      = "test-notebook-oauth-delete"
		Namespace = "default"
		Finalizer = "notebooks.kubeflow.org/oauthclient"
	)

	It("Should handle OAuth finalizer lifecycle correctly", func() {
		// Step 1: Create a Notebook CR with OAuth enabled
		notebook := createNotebookWithOAuth(Name, Namespace)
		Expect(cli.Create(ctx, notebook)).Should(Succeed())

		// Wait for reconciliation to complete
		time.Sleep(3 * time.Second)

		notebookKey := types.NamespacedName{Name: Name, Namespace: Namespace}
		oauthClientKey := types.NamespacedName{Name: Name + "-" + Namespace + "-oauth-client"}

		// Step 2: Check that finalizer is set
		Eventually(func() []string {
			var updated nbv1.Notebook
			if err := cli.Get(ctx, notebookKey, &updated); err != nil {
				return nil
			}
			return updated.Finalizers
		}, duration, interval).Should(ContainElement(Finalizer))

		// Step 3: Check that OAuthClient is created
		Eventually(func() error {
			return cli.Get(ctx, oauthClientKey, &oauthv1.OAuthClient{})
		}, time.Second*90, time.Second*3).Should(Succeed())

		// Step 4: Delete the Notebook CR
		var notebookToDelete nbv1.Notebook
		Expect(cli.Get(ctx, notebookKey, &notebookToDelete)).Should(Succeed())
		Expect(cli.Delete(ctx, &notebookToDelete)).Should(Succeed())

		// Step 5: Check that OAuthClient gets deleted
		Eventually(func() error {
			return cli.Get(ctx, oauthClientKey, &oauthv1.OAuthClient{})
		}, time.Second*90, time.Second*3).Should(
			MatchError(ContainSubstring("not found")))

		// Step 7: Check that Notebook CR is eventually deleted
		// (Step 6 - finalizer removal - is not observable as it races with deletion)
		Eventually(func() error {
			return cli.Get(ctx, notebookKey, &nbv1.Notebook{})
		}, time.Second*60, time.Second*2).Should(
			MatchError(ContainSubstring("not found")))
	})
})

This streamlined version:

  1. Combines all steps into a single focused test - easier to follow the complete flow
  2. Eliminates redundant checks - removes the repetitive verification blocks
  3. Uses descriptive comments to mark each step clearly
  4. Skips step 6 (finalizer removal check) as you correctly noted it races with deletion and is unobservable
  5. Uses the global context consistently (from the retrieved learning about the ctx variable in suite_test.go)

The test now reads like a clear narrative: create → verify setup → delete → verify cleanup, which makes it much easier to understand the intended behavior and debug any failures.


✏️ Learnings added
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1126-1131
Timestamp: 2025-07-02T04:00:16.900Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, OAuth finalizer tests should be streamlined into single test blocks rather than multiple "It" blocks, as checking finalizer removal races with object deletion and is typically unobservable in Kubernetes.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: jiridanek
PR: opendatahub-io/kubeflow#623
File: components/odh-notebook-controller/controllers/notebook_runtime_test.go:179-179
Timestamp: 2025-06-25T06:54:57.600Z
Learning: In the file `components/odh-notebook-controller/controllers/notebook_runtime_test.go`, there is a suggestion to repurpose `testCase.ConfigMap` as `expectedConfigMap` in the test structure to improve clarity by separating input from expected results. This improvement is deferred to be addressed as part of GitHub issue #634.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#605
File: components/odh-notebook-controller/controllers/notebook_controller_test.go:1032-1032
Timestamp: 2025-07-01T15:23:02.903Z
Learning: In the opendatahub-io/kubeflow repository's notebook controller tests, there is a global context variable `ctx` defined in `suite_test.go` that is properly initialized with cancellation support in `BeforeSuite`. Tests should use this global context instead of creating new `context.Background()` instances for consistency with the test suite architecture.

Learnt from: jiridanek
PR: opendatahub-io/kubeflow#0
File: :0-0
Timestamp: 2025-06-29T12:32:38.270Z
Learning: The `ci/prow/odh-notebook-controller-unit` test in the opendatahub-io/kubeflow repository is a known flaky test with existing Jira tickets RHOAIENG-15909 and RHOAIENG-15907 tracking the issue. Test failures from this CI job are often not related to code changes and typically pass on rerun.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok then, @jiri, feel free to replace the entire test with something more aligned with your expectations, no problem from my side :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is, whatever we may think and prefer actually does not matter. We're not on the notebooks team, neither you or me. So I'm gonna call on the indigo folks to give them chance.

In case they decide to pass, here's your out-of-jail card
/lgtm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daniellutz where does this definition come from? Just wanna check it.

Also, we should probably review the place where we put this (if not now, then in the future) - if this is supposed to be only for tests, it gets mixed with the rest of the resources that are truly taken and applied with the rhods operator on the actual OpenShift cluster in the production.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as I remember, I exported from my own OpenShift the CRD.

I don't have the command anymore because I did provision my server again, but I will check if I have in my own notes.

I did put there for the tests, feel free to move it or suggest another place where this feels more suitable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants