restart redpanda on changes to the cluster configuration#603
@simon0191 - looks like you need to run |
      // NB: Seed servers is excluded to avoid a rolling restart when only
      // replicas is changed.
    - dependencies = append(dependencies, RedpandaConfigFile(dot, false))
    + dependencies = append(dependencies, RedpandaConfigFile(dot, false), RedpandaClusterConfig(dot))
Mentioned in Slack but saying it here as well: this won't work as expected. It will trigger a restart before the cluster config is applied.
I think our best option is to do some hackery within the operator when useFlux is false (this is now the default in Azure, so we should be good to go).
In broad strokes, you'll want to get the cluster config's version and inject it into an annotation on the redpanda Pods.
In reconcileResources (which is reconcileDeflux in the release/2.3.x and release/2.4.x branches), you should acquire the version and set it:
    // Everything here is nullable, so you'll need to do some nasty if chaining.
    values.Statefulset.PodTemplate.Annotations["some-magic-key"] = clusterConfigVersion

I see ~2 options for getting the version. You can either make an admin client and pull it directly (1), or you could update reconcileClusterConfig to stash the cluster config version onto the Status and/or Condition (2) and then read that within reconcileResources.
Option 1 feels kinda hacky but I think it'll be the fastest way to get this done.
Option 2 "fits" into the operator a bit better IMO. Were it me, I'd update syncer.Sync to return (ClusterConfigVersion, error) and then stash the version in the Message of the cluster config condition:
    apimeta.SetStatusCondition(rp.GetConditions(), metav1.Condition{
        Type:               redpandav1alpha2.ClusterConfigSynced,
        Status:             metav1.ConditionTrue,
        ObservedGeneration: rp.Generation,
        Reason:             "ConfigSynced",
        Message:            fmt.Sprintf("ClusterConfig at Version %d", version),
    })

Then you can set the annotation to the value of Message.
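To make option 2 a bit more concrete, here is a rough, self-contained sketch of the flow: write the version into the condition Message, read it back, and set the Pod annotation with the nil checks spelled out. The `Condition`, `Values`, `Statefulset` and `PodTemplate` types below are simplified stand-ins I made up so the snippet compiles on its own; the real code would use `metav1.Condition` and the chart's values types, and the helper names (`syncedMessage`, `findMessage`, `setPodAnnotation`) and the `"ClusterConfigSynced"` type string are assumptions as well.

```go
package main

import "fmt"

// Condition is a hypothetical stand-in for metav1.Condition.
type Condition struct {
	Type    string
	Status  string
	Message string
}

// PodTemplate, Statefulset and Values are hypothetical stand-ins for the
// chart's values types, where every level may be nil.
type PodTemplate struct {
	Annotations map[string]string
}

type Statefulset struct {
	PodTemplate *PodTemplate
}

type Values struct {
	Statefulset *Statefulset
}

const (
	clusterConfigSynced     = "ClusterConfigSynced" // assumed condition type string
	configVersionAnnotation = "some-magic-key"      // placeholder key from the comment above
)

// syncedMessage builds the condition message that carries the version.
func syncedMessage(version int64) string {
	return fmt.Sprintf("ClusterConfig at Version %d", version)
}

// findMessage returns the Message of the first condition of the given type
// whose Status is "True".
func findMessage(conds []Condition, condType string) (string, bool) {
	for _, c := range conds {
		if c.Type == condType && c.Status == "True" {
			return c.Message, true
		}
	}
	return "", false
}

// setPodAnnotation allocates every nil level on the way down, then sets the key.
func setPodAnnotation(v *Values, key, value string) {
	if v.Statefulset == nil {
		v.Statefulset = &Statefulset{}
	}
	if v.Statefulset.PodTemplate == nil {
		v.Statefulset.PodTemplate = &PodTemplate{}
	}
	if v.Statefulset.PodTemplate.Annotations == nil {
		v.Statefulset.PodTemplate.Annotations = map[string]string{}
	}
	v.Statefulset.PodTemplate.Annotations[key] = value
}

func main() {
	conds := []Condition{{Type: clusterConfigSynced, Status: "True", Message: syncedMessage(7)}}
	values := &Values{}
	if msg, ok := findMessage(conds, clusterConfigSynced); ok {
		setPodAnnotation(values, configVersionAnnotation, msg)
		fmt.Println(values.Statefulset.PodTemplate.Annotations[configVersionAnnotation])
	}
}
```

Because the annotation only changes when the Message changes, the Pod template (and therefore the rollout) should only move after the config has actually been synced, which is the ordering problem described above.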
I'd strongly recommend adding a test case to RedpandaControllerSuite to make sure this works as expected.
You'll also likely want to work off the release/2.3.x branch as that's what's currently being deployed into cloud.
Closing in favor of #672
In operator v2, we're missing the capability of restarting Redpanda on cluster config changes that require restart.
As an initial implementation, this PR adds the entire cluster configuration to the statefulset-checksum-annotation so that the cluster is restarted on any config change.
Even though this approach is not ideal, configuration changes are infrequent, so this change should not be too disruptive.
A smarter implementation that checks for config properties that require a restart, based on the config schema, is on the roadmap: https://redpandadata.atlassian.net/browse/K8S-499.
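
For readers unfamiliar with the checksum-annotation pattern, the idea can be sketched roughly as follows. This is not the chart's actual code: `configChecksum` is a hypothetical helper, the flat `map[string]string` is a simplification of the real cluster configuration, and the property names in `main` are only illustrative. Keys are sorted before hashing so the digest is stable across Go's randomized map iteration, and any change to a value produces a new digest, which changes the Pod template annotation and triggers a rolling restart.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// configChecksum hashes a flat key/value config in sorted key order so the
// result does not depend on map iteration order.
func configChecksum(cfg map[string]string) string {
	keys := make([]string, 0, len(cfg))
	for k := range cfg {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// Length prefixes keep distinct key/value pairs from colliding
		// when they are concatenated.
		fmt.Fprintf(h, "%d:%s=%d:%s\n", len(k), k, len(cfg[k]), cfg[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Illustrative property names and values only.
	before := map[string]string{"example_property_a": "1", "example_property_b": "true"}
	after := map[string]string{"example_property_a": "2", "example_property_b": "true"}

	fmt.Println(configChecksum(before) == configChecksum(after))
}
```

The trade-off is the one described above: every property change alters the digest, including properties that do not need a restart.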