operator: Restart rolling update when Statefulset changes #1075
Conversation
andrewstucki left a comment
Pretty sure that I'm following the thrust of the change (basically adding in a MarkPodsForUpdate when there's a diff in the patch calculation in the shouldUpdate method). Not super familiar with the patch diff calculation, but I'm assuming this only takes in spec-level fields? If so, this seems good to me, but just making sure this isn't going to cause an infinite loop if say it detects diffs on Status.
Overall though, it'd be really nice if we eventually moved this to the same logic as the lifecycle package (or just refactored it to use lifecycle directly), which does this diff detection by fetching and introspecting the intermediate ControllerRevision generated from the StatefulSet and associated with each pod to make sure it's up to date. That's pretty much what the built-in Kubernetes StatefulSet controller does for its own diff detection.
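For context, a minimal sketch of that revision-based staleness check (the helper name `isPodUpToDate` is hypothetical; the lifecycle package may structure this differently): compare each Pod's `controller-revision-hash` label against the StatefulSet's `Status.UpdateRevision`.

```go
package sketch

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

// isPodUpToDate reports whether a Pod was generated from the StatefulSet's
// current update revision. The built-in StatefulSet controller stamps each
// Pod with the controller-revision-hash label of the ControllerRevision it
// was created from, so a mismatch with Status.UpdateRevision means the Pod
// still has to be rolled.
func isPodUpToDate(pod *corev1.Pod, sts *appsv1.StatefulSet) bool {
	revision, ok := pod.Labels[appsv1.StatefulSetRevisionLabel]
	if !ok {
		// No revision label at all: treat the Pod as stale.
		return false
	}
	return revision == sts.Status.UpdateRevision
}
```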
chrisseto left a comment
Great PR Desc / Commit message 🎉
```go
	return nil
}

if stsChanged {
```
Maybe add a quick comment here indicating why the logic is this way so future editors don't have to reverse engineer the reasoning?
Yes, will add a comment.
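Something along these lines, perhaps (a sketch mirroring the hunk above, not necessarily the exact wording that landed):

```go
// The desired StatefulSet differs from the one currently in the cluster, so
// the upcoming Update changes the Pod template. Pods that were already
// rolled during an in-flight Node Pool upgrade were built from the old
// template, so they must be (re-)marked with the ClusterUpdate condition;
// otherwise they would keep running the stale spec (e.g. an old Redpanda
// container tag).
if stsChanged {
```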
Yes, only spec-level fields are compared; see redpanda-operator/operator/pkg/resources/statefulset_update.go, lines 628 to 631 at c59ff49.
There is no risk of an infinite loop when the StatefulSet Status changes.
I would love to do that, but given the time constraints and the lack of tests for the Cluster Custom Resource, I'd like to leave the diff detection as is.
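For reference, a minimal sketch of the spec-only diff mentioned above (assumed helper name and patch mechanics; the actual calculation at the referenced lines may differ):

```go
package sketch

import (
	"encoding/json"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

// specChanged reports whether applying the desired StatefulSet would modify
// the current one. Only Spec is marshalled, so Status churn (replica counts,
// revisions, observedGeneration) can never retrigger the update path.
func specChanged(current, desired *appsv1.StatefulSet) (bool, error) {
	currentBytes, err := json.Marshal(current.Spec)
	if err != nil {
		return false, err
	}
	desiredBytes, err := json.Marshal(desired.Spec)
	if err != nil {
		return false, err
	}
	patch, err := strategicpatch.CreateTwoWayMergePatch(currentBytes, desiredBytes, appsv1.StatefulSetSpec{})
	if err != nil {
		return false, err
	}
	return string(patch) != "{}", nil
}
```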
Force-pushed from 8e8d9c3 to e561dfe
Force-pushed from a037c0f to a4695d9
This PR enhances the detection mechanism for Node Pool rolling upgrades.

Previously, when a Node Pool rolling upgrade was detected, all Pods were marked with the `ClusterUpdate` Pod Status Condition, and the `restarting` Cluster Custom Resource Status Node Pool field was marked as `true`. If a modification to the Cluster Custom Resource occurred during an ongoing rolling upgrade (e.g., a change to the Redpanda container tag), it could trigger an update to the StatefulSet resource. In such cases, Pods that had already been restarted within the affected Node Pool were not restarted again, resulting in inconsistent state propagation (e.g., mismatched Redpanda container tags).

With this change, updates to the StatefulSet are explicitly distinguished from Node Pool rolling upgrades. When a StatefulSet update is required, all associated Pods are now marked or re-marked with the `ClusterUpdate` Pod Status Condition to ensure consistent rollout of the updated specification.
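As an illustration only (the operator's actual helpers and condition handling differ), marking or re-marking every Pod of a Node Pool with a `ClusterUpdate` condition via controller-runtime might look roughly like this:

```go
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// ClusterUpdatePodCondition is the (hypothetical) condition type used to
// flag Pods that still have to be restarted as part of a rolling update.
const ClusterUpdatePodCondition corev1.PodConditionType = "ClusterUpdate"

// markPodsForUpdate sets the ClusterUpdate condition to True on every Pod
// matching the given labels, re-marking Pods that were already rolled so
// they also pick up the new StatefulSet spec.
func markPodsForUpdate(ctx context.Context, c client.Client, namespace string, selector map[string]string) error {
	var pods corev1.PodList
	if err := c.List(ctx, &pods, client.InNamespace(namespace), client.MatchingLabels(selector)); err != nil {
		return err
	}
	for i := range pods.Items {
		pod := &pods.Items[i]
		setPodCondition(pod, corev1.PodCondition{
			Type:               ClusterUpdatePodCondition,
			Status:             corev1.ConditionTrue,
			LastTransitionTime: metav1.Now(),
			Reason:             "StatefulSetChanged",
		})
		if err := c.Status().Update(ctx, pod); err != nil {
			return err
		}
	}
	return nil
}

// setPodCondition inserts or replaces the condition of the same type.
func setPodCondition(pod *corev1.Pod, cond corev1.PodCondition) {
	for i, existing := range pod.Status.Conditions {
		if existing.Type == cond.Type {
			pod.Status.Conditions[i] = cond
			return
		}
	}
	pod.Status.Conditions = append(pod.Status.Conditions, cond)
}
```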
Force-pushed from a4695d9 to 8d93a2a