ManifestWorkReplicaSet Rollout Plugin #160
youngbupark
commented
Oct 28, 2025
- This is the initial draft for supporting a rollout plugin in the ManifestWorkReplicaSet Work Controller.
- Note: rollback will be added when we propose the MWRS automatic rollback enhancement.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: youngbupark The full list of commands accepted by this bot can be found here.
| The following service defines the contract between the Work Controller and the plugin. Each call must be idempotent, stateless, and time-bounded (≤30 s) to ensure consistent controller reconciliation. The plugin server must implement the following APIs. Helpers for implementing servers and clients will be provided in the [ocm/sdk-go](https://github.com/open-cluster-management-io/sdk-go) repository. | ||
| ```proto | ||
| // RolloutPluginService is the service for the rollout plugin. |
Here is the initial commit of gRPC server proto - open-cluster-management-io/sdk-go#154
Note: The implementation can change as we develop.
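The contract in the quoted proposal text asks for every plugin call to be time-bounded (≤30 s). A minimal Go sketch of how a caller might enforce such a bound follows. `pluginCall`, `callBounded`, `timedOut`, and `demo` are hypothetical names used only for illustration and are not part of the sdk-go change; a generated gRPC client would normally honor the context deadline on its own, so the goroutine here exists only to keep the sketch self-contained.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pluginCall is a hypothetical stand-in for a single RPC on the plugin client.
type pluginCall func(ctx context.Context) error

// callBounded runs one plugin call under a deadline so that a hung plugin
// cannot block reconciliation indefinitely.
func callBounded(parent context.Context, limit time.Duration, call pluginCall) error {
	ctx, cancel := context.WithTimeout(parent, limit)
	defer cancel()

	// Buffered so the goroutine can always send and exit, even after a timeout.
	done := make(chan error, 1)
	go func() { done <- call(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return fmt.Errorf("plugin call exceeded %s: %w", limit, ctx.Err())
	}
}

// timedOut reports whether err was caused by the deadline expiring.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// demo simulates a plugin call that needs `work` time to finish and
// runs it under the given limit.
func demo(limit, work time.Duration) error {
	return callBounded(context.Background(), limit, func(ctx context.Context) error {
		select {
		case <-time.After(work):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	})
}

func main() {
	// A fast call finishes inside the limit.
	fmt.Println("fast call error:", demo(time.Second, 10*time.Millisecond))
	// A slow call is cut off by the deadline.
	fmt.Println("slow call timed out:", timedOut(demo(10*time.Millisecond, time.Second)))
}
```

In a real controller the limit would be the 30 s named in the proposal, and a timeout would be retried on the next reconcile, which is safe only because the contract also requires calls to be idempotent.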
| observedGeneration: 1 | ||
| reason: PluginInitialized | ||
| status: "True" | ||
| type: PluginLoaded |
I wonder whether it is a good idea to expose the plugin status... simple logging in the work controller might be enough. I would like to get feedback on this.
This status might not be needed, but we would need to surface the message on the MWRS API when a gRPC call fails.
| // RolloutPluginService is the service for the rollout plugin. | ||
| service RolloutPluginService { | ||
| // Initialize initializes the plugin. | ||
| rpc Initialize(InitializeRequest) returns (InitializeResponse); |
I think we will need some clarification on error handling. What happens when a specific call fails? How would an MWRS consumer know about the failure and debug it?
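One way to answer the error-handling question, sketched under assumptions: the controller could translate a failed plugin call into a status condition so that the failure message is visible on the MWRS resource. The `Condition` struct below is a local stand-in that only mirrors the fields shown in the quoted status snippet (`type`, `status`, `reason`, `observedGeneration`) plus a message; the condition type `PluginCallSucceeded`, the reason strings, and `pluginCallCondition` are hypothetical and do not come from the proposal.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition is a local stand-in for a Kubernetes-style status condition.
// It is defined here only to keep the sketch free of external dependencies.
type Condition struct {
	Type               string
	Status             string
	Reason             string
	Message            string
	ObservedGeneration int64
}

// errUnavailable is a sample error used by the demo below.
var errUnavailable = errors.New("plugin unavailable")

// pluginCallCondition converts the outcome of one plugin RPC into a condition.
// The type and reason names are hypothetical, not taken from the proposal.
func pluginCallCondition(rpc string, generation int64, err error) Condition {
	c := Condition{
		Type:               "PluginCallSucceeded",
		ObservedGeneration: generation,
	}
	if err != nil {
		c.Status = "False"
		c.Reason = rpc + "Failed"
		// Keep the RPC name in the message so a consumer can tell which call failed.
		c.Message = fmt.Sprintf("%s: %v", rpc, err)
		return c
	}
	c.Status = "True"
	c.Reason = rpc + "Succeeded"
	return c
}

func main() {
	fmt.Printf("%+v\n", pluginCallCondition("BeginRollout", 2, nil))
	fmt.Printf("%+v\n", pluginCallCondition("BeginRollout", 2, errUnavailable))
}
```

Whether such a condition belongs on the MWRS status or only in controller logs is the open question in this thread; the sketch shows only the mechanics of carrying the error message.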
| workConfiguration: | ||
| workDriver: kube | ||
| # Plugin configuration | ||
| plugin: |
The plugin is disabled if `plugin` is not set, right?
| // If the validation completes successfully, the plugin should return an OK result. | ||
| // If the validation is still in progress, the plugin should return an INPROGRESS result. | ||
| // If the validation has failed, the plugin should return a FAILED result. | ||
| rpc ValidateRollout(RolloutPluginRequest) returns (ValidateResponse); |
When will this be called in mwrs reconciler? I think a flow on when these APIs will be called in mwrs controller will be helpful.
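Until the proposal documents the call flow, here is one possible reading of the three `ValidateRollout` outcomes, sketched in Go. The result names OK, INPROGRESS, and FAILED come from the quoted proto comments; the `Action` values and the `nextAction` mapping are assumptions about what a reconciler might do with each result, not the proposal's defined behavior.

```go
package main

import "fmt"

// ValidateResult mirrors the three outcomes named in the quoted proto comments.
type ValidateResult int

const (
	ResultOK ValidateResult = iota
	ResultInProgress
	ResultFailed
)

// Action is a hypothetical decision taken by the reconciler.
type Action string

const (
	ActionProceed Action = "proceed" // move on to the next rollout step
	ActionRequeue Action = "requeue" // check again on a later reconcile
	ActionFail    Action = "fail"    // mark the cluster rollout as failed
)

// nextAction maps a validation result to a reconciler decision.
// Unknown values are treated as still in progress, a conservative choice.
func nextAction(r ValidateResult) Action {
	switch r {
	case ResultOK:
		return ActionProceed
	case ResultInProgress:
		return ActionRequeue
	case ResultFailed:
		return ActionFail
	default:
		return ActionRequeue
	}
}

func main() {
	for _, r := range []ValidateResult{ResultOK, ResultInProgress, ResultFailed} {
		fmt.Printf("result %d -> %s\n", r, nextAction(r))
	}
}
```

Treating an unrecognized result as in progress matches the "conservative fallback" wording used elsewhere in the proposal for unknown conditions.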
| // BeginRollout is called before the manifestwork resource is applied. | ||
| // It is used to prepare the rollout. | ||
| rpc BeginRollout(RolloutPluginRequest) returns (google.protobuf.Empty); |
Does it mean any spec change on manifestwork will trigger this? What if placement changes but mw spec does not change in mwrs?
| | False | False or not set | SUCCEEDED | Succeeded | Work has been successfully applied | | ||
| | Unknown/Not set | Any | N/A | Progressing | Conservative fallback: treat as still progressing | | ||
| The following state machine shows the expected transitions between rollout statuses: |
Is this only enabled when a rollout strategy is set in the MWRS?
| # Plugin configuration | ||
| plugin: | ||
| # the image name of plugin sidecar | ||
| image: quay.io/open-cluster-management/my-rollout-plugin:v0.0.1 |
I am not sure that setting only the image is enough. If the plugin also needs to integrate with a cloud service or a mesh, it will need customized args or secrets as well.
| ## Alternatives | ||
| - Implement plugin server as a standalone service: In order to run plugin server securely, we need to enable TLS connection. Using a sidecar, we can avoid the complex network and security configuration. |
On the other hand, the plugin can be easily customized.
| - rolloutStatus: The current [cluster rollout status](https://github.com/open-cluster-management-io/sdk-go/blob/main/pkg/apis/cluster/v1alpha1/rollout.go#L23-L39) (e.g., ToApply, Progressing, Succeeded, Failed, TimeOut, Skip). | ||
| - manifestRevisionName: The name of the manifest revision applied to the cluster. | ||
| ### Configure custom plugin for work controller |
Once a plugin is enabled, what's the behavior of a normal mwrs rollout which does not need to call any plugin?
| Work->>PluginServer: (NEW) ProgressingRollout() | ||
| PluginServer-->>Work: OK | ||
| loop clusterToRollout clusters | ||
| Work->>PluginServer: (NEW) BeforeRollout() |
Typo? Should this be BeginRollout()?