Description
Is there an existing issue already for this feature request/idea?
- I have searched for an existing issue, and could not find anything. I believe this is a new feature request to be evaluated.
What problem is this feature going to solve? Why should it be added?
Add a new BEHAVIOR_REQUEUE value to the OperatorLifecycleResponse.Behavior enum, allowing plugins to signal that reconciliation should be retried later
without treating it as an error.
Currently, when a plugin's lifecycle hook encounters a situation where it
cannot proceed (e.g., waiting for a dependent custom resource to be created),
it can only:
- Return success (incorrect, as the operation isn't complete)
- Return an error (causes the cluster to enter a failure state)
Neither option is appropriate for transient "not ready yet" conditions.
Plugins need a way to signal "please try again later" without failing.
Describe the solution you'd like
- Add BEHAVIOR_REQUEUE = 2 to the OperatorLifecycleResponse.Behavior enum
- Add an optional requeue_after field (int32, seconds) to control requeue timing
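To make the proposal concrete, here is a minimal Go sketch of how the operator side might interpret such a response. It is an illustration only: the Behavior and LifecycleResponse types, the Decide function, and the 10-second fallback are hypothetical stand-ins rather than the generated gRPC types or the operator's actual code. Only the value 2 for BEHAVIOR_REQUEUE and the requeue_after field in seconds come from this request.

```go
package main

import "time"

// Behavior is a hypothetical stand-in for the
// OperatorLifecycleResponse.Behavior enum.
type Behavior int32

// BehaviorRequeue mirrors the proposed BEHAVIOR_REQUEUE = 2 value.
const BehaviorRequeue Behavior = 2

// LifecycleResponse is a hypothetical, trimmed-down response holding
// only the two fields discussed in this request.
type LifecycleResponse struct {
	Behavior     Behavior
	RequeueAfter int32 // seconds; 0 is treated as "not set"
}

// defaultRequeue is an assumed fallback delay, used when the plugin
// asks for a requeue without specifying requeue_after.
const defaultRequeue = 10 * time.Second

// Decision is what the reconciler would act on.
type Decision struct {
	Requeue bool
	After   time.Duration
}

// Decide maps a hook result to a reconcile decision. A real error still
// fails the reconciliation; a requeue response schedules a retry without
// reporting a failure; any other behavior proceeds as it does today.
func Decide(resp LifecycleResponse, hookErr error) (Decision, error) {
	if hookErr != nil {
		return Decision{}, hookErr
	}
	if resp.Behavior != BehaviorRequeue {
		return Decision{}, nil
	}
	after := defaultRequeue
	if resp.RequeueAfter > 0 {
		after = time.Duration(resp.RequeueAfter) * time.Second
	}
	return Decision{Requeue: true, After: after}, nil
}
```

With this shape, a plugin that is still waiting on a dependent custom resource could return the requeue behavior with requeue_after set to, say, 30, and the operator would retry in 30 seconds without marking the cluster as failed.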
Describe alternatives you've considered
- Use gRPC error codes (e.g., UNAVAILABLE)
  Rejected: Error codes are meant for actual failures, not flow control. No way to pass requeue timing.
- Convention-based error messages (e.g., REQUEUE:reason)
  Rejected: Fragile string parsing, no type safety, cannot carry structured metadata.
- Separate CheckReadiness RPC
  Rejected: Extra complexity, race conditions, and readiness is often context-dependent.
- Plugin blocks internally until ready
  Rejected: Ties up resources, can time out, and the plugin cannot coordinate with the operator's reconciliation loop.
Additional context
No response
Are you willing to actively contribute to this feature?
Yes
Code of Conduct
- I agree to follow this project's Code of Conduct