fix(preflight): ensure MDs without overrides are also checked (#1216)
We were not adding preflight checks for worker machine deployments that
did not have variable overrides set.
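
As a rough illustration of the behavior described above, the sketch below resolves the config to check for each machine deployment: the override when one is set, otherwise the cluster-level `workerConfig`. The types and function names (`WorkerConfig`, `MachineDeployment`, `effectiveWorkerConfig`, `deploymentsToCheck`) are hypothetical and are not taken from this repository; the actual change may be structured differently.

```go
package main

import "fmt"

// WorkerConfig is a hypothetical, trimmed-down stand-in for the real
// workerConfig variable; only the Prism Element cluster name is kept.
type WorkerConfig struct {
	ClusterName string
}

// MachineDeployment is a hypothetical model of one entry under
// spec.topology.workers.machineDeployments.
type MachineDeployment struct {
	Name     string
	Override *WorkerConfig // nil when no variable override is set
}

// effectiveWorkerConfig returns the override when present and otherwise
// falls back to the cluster-level workerConfig, so a machine deployment
// without overrides still yields a config to check.
func effectiveWorkerConfig(md MachineDeployment, clusterLevel *WorkerConfig) *WorkerConfig {
	if md.Override != nil {
		return md.Override
	}
	return clusterLevel
}

// deploymentsToCheck lists the names of machine deployments that end up
// with a config to run preflight checks against.
func deploymentsToCheck(mds []MachineDeployment, clusterLevel *WorkerConfig) []string {
	var names []string
	for _, md := range mds {
		if effectiveWorkerConfig(md, clusterLevel) != nil {
			names = append(names, md.Name)
		}
	}
	return names
}

func main() {
	clusterLevel := &WorkerConfig{ClusterName: "ncn-dev-sandbox-gpu"}
	mds := []MachineDeployment{
		{Name: "md-variable-override", Override: &WorkerConfig{ClusterName: "ncn-dev-sandbox-gpu"}},
		{Name: "md-no-overrides"}, // previously skipped; now falls back to clusterLevel
	}
	fmt.Println(deploymentsToCheck(mds, clusterLevel))
}
```

With the fallback in place, both deployments in the example are returned, which mirrors the intent of checking machine deployments whether or not they set overrides.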
Note: this is stacked on top of #1215, mostly for unit tests. Adding a
do-not-merge label until #1215 is merged.
**How has this been tested?**
1. Misconfigure the storage container so that it does not exist on the
cluster referenced in machine details or in the failure domain (e.g.
`ncn-dev-sandbox-gpu` doesn't have a `k8s` storage container)
2. Create a failure domain
```
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: NutanixFailureDomain
metadata:
  name: fd-2
  namespace: default
spec:
  prismElementCluster:
    type: name
    name: ncn-dev-sandbox-gpu
  subnets:
    - type: name
      name: subnet-2
```
3. Create a cluster without the changes in this PR
```
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  ...
spec:
  ...
  topology:
    class: nutanix-quick-start
    controlPlane:
      metadata: {}
      replicas: 3
    variables:
      ...
      - name: workerConfig
        value:
          nutanix:
            machineDetails:
              bootType: uefi
              cluster:
                name: ncn-dev-sandbox-gpu
                type: name
              imageLookup:
                baseOS: rocky-9.6
                format: nkp-{{.BaseOS}}-release-{{.K8sVersion}}-*
              memorySize: 4Gi
              subnets:
                - name: vlan173
                  type: name
              systemDiskSize: 40Gi
              vcpuSockets: 2
              vcpusPerSocket: 1
    version: 1.33.1
    workers:
      machineDeployments:
        - class: default-worker
          name: md-variable-override
          variables:
            overrides:
              - name: workerConfig
                value:
                  nutanix:
                    machineDetails:
                      ...
                      cluster:
                        name: ncn-dev-sandbox-gpu
                        type: name
        - class: default-worker
          failureDomain: fd-2
          name: md-variable-override-failure-domain
          variables:
            overrides:
              - name: workerConfig
                value:
                  nutanix:
                    machineDetails:
                      ...
                      cluster:
                        name: ncn-dev-sandbox-gpu
                        type: name
        - class: default-worker
          name: md-no-overrides
        - class: default-worker
          failureDomain: fd-2
          name: md-no-overrides-failure-domain
```
4. Observe pre-flight failure on only 2 machine deployments
```
The request is invalid:
* $.spec.topology.workers.machineDeployments[?@.name=="md-variable-override-failure-domain"].failureDomain: Found no Storage Containers with name "k8s" on Cluster "ncn-dev-sandbox-gpu". Create a Storage Container with this name on Cluster "ncn-dev-sandbox-gpu", and then retry.
* $.spec.topology.workers.machineDeployments[?@.name=="md-variable-override"].variables[?@.name=workerConfig].value.nutanix.machineDetails: Found no Storage Containers with name "k8s" on Cluster "ncn-dev-sandbox-gpu". Create a Storage Container with this name on Cluster "ncn-dev-sandbox-gpu", and then retry.
```
5. Create the same cluster with the changes in this PR and observe
pre-flight failure on all 4 machine deployments
```
The request is invalid:
* $.spec.topology.workers.machineDeployments[?@.name=="md-variable-override-failure-domain"].failureDomain: Found no Storage Containers with name "k8s" on Cluster "ncn-dev-sandbox-gpu". Create a Storage Container with this name on Cluster "ncn-dev-sandbox-gpu", and then retry.
* $.spec.topology.workers.machineDeployments[?@.name=="md-no-overrides"].variables[?@.name=workerConfig].value.nutanix.machineDetails: Found no Storage Containers with name "k8s" on Cluster "ncn-dev-sandbox-gpu". Create a Storage Container with this name on Cluster "ncn-dev-sandbox-gpu", and then retry.
* $.spec.topology.workers.machineDeployments[?@.name=="md-no-overrides-failure-domain"].failureDomain: Found no Storage Containers with name "k8s" on Cluster "ncn-dev-sandbox-gpu". Create a Storage Container with this name on Cluster "ncn-dev-sandbox-gpu", and then retry.
* $.spec.topology.workers.machineDeployments[?@.name=="md-variable-override"].variables[?@.name=workerConfig].value.nutanix.machineDetails: Found no Storage Containers with name "k8s" on Cluster "ncn-dev-sandbox-gpu". Create a Storage Container with this name on Cluster "ncn-dev-sandbox-gpu", and then retry.
```
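
The field paths in the output above follow a visible pattern: deployments with a `failureDomain` are reported at `.failureDomain`, and the others at the `workerConfig` variable's `machineDetails`. A minimal sketch of that pattern follows. The `fieldPath` helper is hypothetical, and the `?@.name` filter form is an assumption about how the paths are written; this is not the project's actual implementation.

```go
package main

import "fmt"

// fieldPath is a hypothetical helper that reproduces the pattern seen in
// the preflight output: the path points at failureDomain when one is set,
// and at the workerConfig machineDetails otherwise.
func fieldPath(mdName, failureDomain string) string {
	// %q wraps the machine deployment name in double quotes.
	base := fmt.Sprintf("$.spec.topology.workers.machineDeployments[?@.name==%q]", mdName)
	if failureDomain != "" {
		return base + ".failureDomain"
	}
	return base + ".variables[?@.name=workerConfig].value.nutanix.machineDetails"
}

func main() {
	fmt.Println(fieldPath("md-no-overrides", ""))
	fmt.Println(fieldPath("md-no-overrides-failure-domain", "fd-2"))
}
```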
- Message: "Failed to unmarshal topology machineDeployment variable \"workerConfig\": failed to unmarshal json: invalid character 'i' looking for beginning of object key string. Review the Cluster.", ///nolint:lll // The message is long.
- Field: "$.spec.topology.workers.machineDeployments[?@.name==\"md-0\"].variables[?@.name=workerConfig].value.nutanix.machineDetails", ///nolint:lll // The field is long.
+ Message: "Failed to unmarshal variable \"workerConfig\": failed to unmarshal json: invalid character 'i' looking for beginning of object key string. Review the Cluster.", ///nolint:lll // The message is long.
+ Field: "$.spec.topology.workers.machineDeployments[?@.name==\"md-0\"].variables[?@.name=='workerConfig'].value.nutanix",///nolint:lll // The field is long.