feat(controllers): add reconcile requeue result and speed up role delete finalization#279
feat(controllers): add reconcile requeue result and speed up role delete finalization#279
Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
This PR introduces a Kubernetes-style reconcile result pattern to enable fine-grained control over requeue behavior in controllers. The main motivation is to speed up role deletion by allowing immediate requeuing after the first phase of deletion (status update), rather than waiting for the full polling cycle.
Changes:
- Added
reconcile.Resulttype withRequeueandRequeueAfterfields to control requeue behavior - Updated the
Reconcilerinterface to return(reconcile.Result, error)instead of justerror - Enhanced
RoleControllerto returnRequeueAfter: 100msduring role deletion phase-1 to accelerate the physical delete
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| controllers/reconcile/result.go | New Result struct defining requeue behavior (Requeue bool, RequeueAfter duration) |
| controllers/base_controller.go | Updated processNextWorkItem to handle Result-based requeueing; modified Reconciler interface signature |
| controllers/role_controller.go | Implements fast delete by returning RequeueAfter during deletion phase-1; added roleDeleteRequeueAfter constant (100ms) |
| controllers/role_controller_test.go | Added assertion to verify RequeueAfter is returned during deletion; updated to handle new Reconcile signature |
| controllers/workspace_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/workspace_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/user_profile_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/model_registry_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/model_registry_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/model_catalog_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/model_catalog_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/image_registry_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/image_registry_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/engine_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/engine_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/endpoint_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/endpoint_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/cluster_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/cluster_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/role_assignment_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/role_assignment_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/api_key_controller.go | Updated Reconcile to return reconcile.Result{} with existing sync logic unchanged |
| controllers/api_key_controller_test.go | Updated test to capture Result return value (ignored with _) |
| controllers/base_controller_test.go | Updated mock expectations to return reconcile.Result{} in test cases |
| controllers/mocks/mock_reconciler.go | Regenerated mock to match new Reconciler interface with (Result, error) return signature |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| type Result struct { | ||
| Requeue bool |
There was a problem hiding this comment.
The reconcile.Result struct allows both Requeue and RequeueAfter to be set simultaneously, but the base controller implementation will only honor RequeueAfter in this case (lines 97-101, 105-109 in base_controller.go). Consider either: (1) documenting this precedence in the Result struct, or (2) adding validation in the Result type to prevent both fields from being set, or (3) returning an error if both are set.
| type Result struct { | |
| Requeue bool | |
| // Result captures the outcome of a reconciliation. | |
| // | |
| // If both Requeue and RequeueAfter are set, the base controller implementation | |
| // will honor RequeueAfter and ignore Requeue. | |
| type Result struct { | |
| // Requeue indicates whether to requeue immediately. | |
| // | |
| // Note: If both Requeue and RequeueAfter are set, RequeueAfter takes | |
| // precedence and the controller will honor RequeueAfter. | |
| Requeue bool | |
| // RequeueAfter specifies how long to wait before requeuing. | |
| // | |
| // When non-zero and Requeue is also true, this value takes precedence. |
|
|
||
| type Result struct { | ||
| Requeue bool | ||
| RequeueAfter time.Duration |
There was a problem hiding this comment.
The Result struct doesn't validate that RequeueAfter is non-negative. If a negative duration is provided, the behavior is undefined and may cause unexpected issues with the workqueue. Consider adding validation to ensure RequeueAfter is either zero or positive, or document that negative values are not allowed.
| RequeueAfter time.Duration | |
| RequeueAfter time.Duration // must be zero or positive; negative values are not allowed |
| if result.RequeueAfter > 0 { | ||
| bc.queue.AddAfter(key, result.RequeueAfter) | ||
| } else if result.Requeue { | ||
| bc.queue.AddRateLimited(key) |
There was a problem hiding this comment.
When reconciliation returns an error, the item is not automatically requeued unless a Result flag is set. This means errors without explicit requeue flags will cause the item to be dropped from the queue entirely. The standard pattern should be to always requeue on error (using rate limiting) regardless of the Result flags. Consider adding a fallback to ensure items are requeued with rate limiting when an error occurs and no requeue flags are set.
| bc.queue.AddRateLimited(key) | |
| bc.queue.AddRateLimited(key) | |
| } else { | |
| // Ensure items are always retried on error, even if Result does not request requeue. | |
| bc.queue.AddRateLimited(key) |
| return c.syncHandler(role) | ||
| if role.Metadata != nil && role.Metadata.DeletionTimestamp != "" && | ||
| (role.Status == nil || role.Status.Phase != v1.RolePhaseDELETED) { | ||
| return reconcile.Result{RequeueAfter: roleDeleteRequeueAfter}, c.syncHandler(role) |
There was a problem hiding this comment.
When the status update to DELETED phase fails in syncHandler (line 70-77), the Reconcile method will return both a RequeueAfter result and an error. With the current base_controller implementation, the RequeueAfter will be honored even though there's an error, which means the retry won't use rate limiting. This could lead to rapid retry loops if the error persists. Consider either: (1) only returning RequeueAfter when syncHandler succeeds, or (2) fixing the base controller to apply rate limiting when both error and RequeueAfter are present.
| return reconcile.Result{RequeueAfter: roleDeleteRequeueAfter}, c.syncHandler(role) | |
| err := c.syncHandler(role) | |
| if err != nil { | |
| // On error, do not set RequeueAfter so that the base controller's rate limiting applies. | |
| return reconcile.Result{}, err | |
| } | |
| // On success, requeue after a short delay to continue deletion processing. | |
| return reconcile.Result{RequeueAfter: roleDeleteRequeueAfter}, nil |
| if role, ok := tt.inputKey.(*v1.Role); ok && role.Metadata != nil && | ||
| role.Metadata.DeletionTimestamp != "" && | ||
| (role.Status == nil || role.Status.Phase != v1.RolePhaseDELETED) { | ||
| assert.True(t, result.RequeueAfter > 0) |
There was a problem hiding this comment.
The test assertion on lines 293-297 checks that RequeueAfter is set when a role is being deleted but not yet in DELETED phase. However, this assertion will pass even if RequeueAfter is 0, because it only checks if it's greater than 0. Since roleDeleteRequeueAfter is 100ms, this should work, but the assertion could be more precise by checking that the value equals roleDeleteRequeueAfter. Consider using assert.Equal(t, roleDeleteRequeueAfter, result.RequeueAfter) for more specific validation.
| assert.True(t, result.RequeueAfter > 0) | |
| assert.Equal(t, roleDeleteRequeueAfter, result.RequeueAfter) |
| if role, ok := tt.inputKey.(*v1.Role); ok && role.Metadata != nil && | ||
| role.Metadata.DeletionTimestamp != "" && | ||
| (role.Status == nil || role.Status.Phase != v1.RolePhaseDELETED) { | ||
| assert.True(t, result.RequeueAfter > 0) | ||
| } |
There was a problem hiding this comment.
The assertion on lines 293-297 checks for RequeueAfter behavior when a role is being deleted, but none of the test cases in this test function actually use testRoleWithDeletionTimestamp. This means the new RequeueAfter logic is not being tested here. Consider adding at least one test case that uses testRoleWithDeletionTimestamp with a phase other than DELETED to verify that RequeueAfter is properly returned.
| if role, ok := tt.inputKey.(*v1.Role); ok && role.Metadata != nil && | |
| role.Metadata.DeletionTimestamp != "" && | |
| (role.Status == nil || role.Status.Phase != v1.RolePhaseDELETED) { | |
| assert.True(t, result.RequeueAfter > 0) | |
| } |
| type Result struct { | ||
| Requeue bool | ||
| RequeueAfter time.Duration | ||
| } |
There was a problem hiding this comment.
The Result struct lacks documentation explaining its purpose and how its fields should be used. Consider adding a comment describing when to use Requeue vs RequeueAfter, and clarifying the behavior when both an error and a Result are returned. For example: "Result controls whether and how a reconciled object should be requeued. When error is nil and Requeue is true, the item is requeued immediately. When error is nil and RequeueAfter is set, the item is requeued after the specified duration. When an error is returned, these fields may influence the requeue behavior depending on the controller implementation."
Issues
DeletedChanges
controllers/reconcile/result.gotype Result { Requeue bool; RequeueAfter time.Duration }Reconcilerinterface to return(reconcile.Result, error).BaseController.processNextWorkItemto honor:RequeueRequeueAfterRoleControllerreturnsRequeueAfterafter phase-1 delete status update, so phase-2 final delete is reconciled quickly without waiting next full poll interval.controllers/mocks/mock_reconciler.goand adjusted tests for the new return signature.Why only role uses requeue now
Risk assessment
RequeueAfter), avoiding global polling interval changes.Rollback plan
RequeueAfterreturn inRoleControllerwhile keeping interface scaffolding.Reconcilersignature and processing flow.Test
go test ./controllers/...