Skip to content

fix: Correctly handle variable overrides #1252

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jimmidyson
Copy link
Member

Currently the metamutator only passes the override values when they are
present on a machine deployment or control plane config. This commit
fixes that by merging the global variables into the overrides, resulting
in the behaviour that we always expected.

As overrides are not that commonly used yet, we have not seen this issue
but as taints, etc become more commonly used this will become an issue.

Tests added to prove the fix as well.

@jimmidyson jimmidyson requested review from faiq and mhrabovcin August 6, 2025 15:26
Copy link
Contributor

@dkoshkin dkoshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the new tests

@yanhua121
Copy link
Contributor

yanhua121 commented Aug 6, 2025

#This commit fixes that by merging the global variables into the overrides, resulting in the behaviour that we always expected.

I think this will break the worker nodes FD configuration with cluster.spec.topology.workers.machineDeployments.failureDomain with variables.overrides.workerConfig. In case of the "workerConfig" has the global configuration (supposed to be the default settings for worker nodes) with machineDetails having "cluster" and "subnets" settings, for the worker MD with FD setting, it is required to have the overrides variable "workerConfig" with machineDetails NOT have the "cluster" and "subnets" settings. But with this fix, the global setting of these two fields will appear in the MD's variable, it will fail the cluster object's webhook validation.

Since the e2e tests "[e2e-quick-start (Nutanix provider)]" which include the FD test cases passed in the PR. It seems the above described scenario is not a concern. The CAREN Cluster webhook validation only checks the MD's overridden workerConfig settings if it's set, but not the merged one. And the MD spec after patch with both FD and cluster/subnets setting is allowd.

@thunderboltsid
Copy link
Contributor

thunderboltsid commented Aug 7, 2025

@jimmidyson
Copy link
Member Author

jimmidyson commented Aug 7, 2025

@thunderboltsid
Copy link
Contributor

thunderboltsid commented Aug 8, 2025

@thunderboltsid So we 'd need to do merge in https://github.com/nutanix-cloud-native/cluster-api-runtime-extensions-nutanix/blob/main/pkg/webhook/cluster/nutanix_validator.go#L238?

Yeah, currently we validate against one or the other, now we have to validate against projected merge instead.

Currently the metamutator only passes the override values when they are
present on a machine deployment or control plane config. This commit
fixes that by merging the global variables into the overrides, resulting
in the behaviour that we always expected.

As overrides are not that commonly used yet, we have not seen this issue
but as taints, etc become more commonly used this will become an issue.

Tests added to prove the fix as well.
@jimmidyson jimmidyson force-pushed the jimmi/fix-overrides branch from b11bd79 to 97409aa Compare August 13, 2025 14:18
@jimmidyson
Copy link
Member Author

jimmidyson commented Aug 13, 2025

@thunderboltsid Please take a look at c5a0cdc.

@jimmidyson jimmidyson force-pushed the jimmi/fix-overrides branch from 97409aa to 8a4743a Compare August 13, 2025 14:45
Comment on lines +114 to +125
name: "both values are objects with nested objects with vars having nil object explicitly, merged",
args: args{
vars: map[string]apiextensionsv1.JSON{
"a": {Raw: []byte(`{"x":1,"y":{"a": 2,"b": null}}`)},
},
globalVars: map[string]apiextensionsv1.JSON{
"a": {Raw: []byte(`{"y":{"a": 2,"b":{"c": 5, "d": 6}},"z":4}`)},
},
},
want: map[string]apiextensionsv1.JSON{
"a": {Raw: []byte(`{"x":1,"y":{"a": 2,"b":{"c": 5, "d": 6}},"z":4}`)},
},
Copy link
Contributor

@thunderboltsid thunderboltsid Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to ensure that overrides have a mechanism to nullify the global vars else they can't really be considered an "override".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants