You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/operator-nexus/howto-cluster-runtime-upgrade-template.md
+23-30Lines changed: 23 additions & 30 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -9,7 +9,7 @@ ms.topic: how-to
9
9
ms.custom: azure-operator-nexus, template-include
10
10
---
11
11
12
-
# Cluster Runtime Upgrade Template
12
+
# Cluster runtime upgrade template
13
13
14
14
This how-to guide provides a step-by-step template for upgrading a Nexus Cluster designed to assist users in managing a reproducible end-to-end upgrade through Azure APIs and standard operating procedures. Regular updates are crucial for maintaining system integrity and accessing the latest product improvements.
15
15
@@ -111,35 +111,28 @@ If any failures occur, report the <MISE_CID>, <CORRELATION_ID>, status code, and
111
111
112
112
1. Validate the provisioning and detailed status for the CM and Cluster.
113
113
114
-
Set up the subscription, CM, and Cluster parameters:
114
+
Login to Azure CLI and select or set the `<CUSTOMER_SUB_ID>`:
115
115
```
116
-
export SUBSCRIPTION_ID=<CUSTOMER_SUB_ID>
117
-
export CM_RG=<CM_RG>
118
-
export CM_NAME=<CM_NAME>
119
-
export CLUSTER_RG=<CLUSTER_RG>
120
-
export CLUSTER_NAME=<CLUSTER_NAME>
121
-
export CLUSTER_RID=<CLUSTER_RID>
122
-
export CLUSTER_MRG=<CLUSTER_MRG>
123
-
export THRESHOLD=<DEPLOYMENT_THRESHOLD>
124
-
export PAUSE_MINS=<DEPLOYMENT_PAUSE_MINS>
116
+
az login
117
+
az account set --subscription <CUSTOMER_SUB_ID>
125
118
```
126
119
127
120
Check that the CM is in `Succeeded` for `Provisioning state`:
128
121
```
129
-
az networkcloud clustermanager show -g $CM_RG --resource-name $CM_NAME --subscription $SUBSCRIPTION_ID -o table
122
+
az networkcloud clustermanager show -g <CM_RG> --resource-name <CM_NAME> --subscription <CUSTOMER_SUB_ID> -o table
130
123
```
131
124
132
125
Check the Cluster status `Detailed status` is `Running`:
133
126
```
134
-
az networkcloud cluster show -g $CLUSTER_RG --resource-name $CLUSTER_NAME --subscription $SUBSCRIPTION_ID -o table
127
+
az networkcloud cluster show -g <CLUSTER_RG> --resource-name <CLUSTER_NAME> --subscription <CUSTOMER_SUB_ID> -o table
135
128
```
136
129
137
130
>[!Note]
138
131
> If CM `Provisioning state` isn't `Succeeded` and Cluster `Detailed status` isn't `Running` stop the upgrade until issues are resolved.
139
132
140
133
2. Check the Bare Metal Machine (BMM) status `Detailed status` is `Running`:
141
134
```
142
-
az networkcloud baremetalmachine list -g $CLUSTER_MRG --subscription $SUBSCRIPTION_ID --query "sort_by([].{name:name,kubernetesNodeName:kubernetesNodeName,location:location,readyState:readyState,provisioningState:provisioningState,detailedStatus:detailedStatus,detailedStatusMessage:detailedStatusMessage,cordonStatus:cordonStatus,powerState:powerState,kubernetesVersion:kubernetesVersion,machineClusterVersion:machineClusterVersion,machineRoles:machineRoles| join(', ', @),createdAt:systemData.createdAt}, &name)" -o table
135
+
az networkcloud baremetalmachine list -g <CLUSTER_MRG> --subscription <CUSTOMER_SUB_ID> --query "sort_by([].{name:name,kubernetesNodeName:kubernetesNodeName,location:location,readyState:readyState,provisioningState:provisioningState,detailedStatus:detailedStatus,detailedStatusMessage:detailedStatusMessage,cordonStatus:cordonStatus,powerState:powerState,kubernetesVersion:kubernetesVersion,machineClusterVersion:machineClusterVersion,machineRoles:machineRoles| join(', ', @),createdAt:systemData.createdAt}, &name)" -o table
143
136
```
144
137
145
138
Validate the following resource states for each BMM (except spare):
@@ -158,8 +151,8 @@ If any failures occur, report the <MISE_CID>, <CORRELATION_ID>, status code, and
158
151
159
152
3. Collect a profile of the tenant workloads:
160
153
```
161
-
az networkcloud virtualmachine list --sub $SUBSCRIPTION_ID --query "reverse(sort_by([?clusterId=='$CLUSTER_RID'].{name:name, createdAt:systemData.createdAt, resourceGroup:resourceGroup, powerState:powerState, provisioningState:provisioningState, detailedStatus:detailedStatus,bareMetalMachineId:bareMetalMachineIdi,CPUCount:cpuCores, EmulatorStatus:isolateEmulatorThread}, &createdAt))" -o table
162
-
az networkcloud kubernetescluster list --sub $SUBSCRIPTION_ID --query "[?clusterId=='$CLUSTER_RID'].{name:name, resourceGroup:resourceGroup, provisioningState:provisioningState, detailedStatus:detailedStatus, detailedStatusMessage:detailedStatusMessage, createdAt:systemData.createdAt, kubernetesVersion:kubernetesVersion}" -o table
154
+
az networkcloud virtualmachine list --sub <CUSTOMER_SUB_ID> --query "reverse(sort_by([?clusterId=='<CLUSTER_RID>'].{name:name, createdAt:systemData.createdAt, resourceGroup:resourceGroup, powerState:powerState, provisioningState:provisioningState, detailedStatus:detailedStatus,bareMetalMachineId:bareMetalMachineIdi,CPUCount:cpuCores, EmulatorStatus:isolateEmulatorThread}, &createdAt))" -o table
155
+
az networkcloud kubernetescluster list --sub <CUSTOMER_SUB_ID> --query "[?clusterId=='<CLUSTER_RID>'].{name:name, resourceGroup:resourceGroup, provisioningState:provisioningState, detailedStatus:detailedStatus, detailedStatusMessage:detailedStatusMessage, createdAt:systemData.createdAt, kubernetesVersion:kubernetesVersion}" -o table
163
156
```
164
157
165
158
4. Review Operator Nexus Release notes for required checks and configuration updates not included in this document.
@@ -190,20 +183,20 @@ If `updateStrategy` isn't set, the default values are as follows:
190
183
191
184
### Set a deployment threshold and wait time different than default
> If 100% threshold is required, review the BMM status reported during pre-checks and make sure all BMM are healthy before proceeding with the upgrade.
197
190
198
191
Verify update:
199
192
```
200
-
az networkcloud cluster show -n $CLUSTER_NAME -g $CLUSTER_RG --subscription $SUBSCRIPTION_ID| grep -A5 updateStrategy
193
+
az networkcloud cluster show -n <CLUSTER_NAME> -g <CLUSTER_RG> --subscription <CUSTOMER_SUB_ID>| grep -A5 updateStrategy
201
194
"updateStrategy": {
202
195
"maxUnavailable": 32767,
203
196
"strategyType": "Rack",
204
197
"thresholdType": "PercentSuccess",
205
-
"thresholdValue": $THRESHOLD,
206
-
"waitTimeMinutes": $PAUSE_MINS
198
+
"thresholdValue": <DEPLOYMENT_THRESHOLD>,
199
+
"waitTimeMinutes": <DEPLOYMENT_PAUSE_MINS>
207
200
}
208
201
```
209
202
@@ -212,7 +205,7 @@ az networkcloud cluster show -n $CLUSTER_NAME -g $CLUSTER_RG --subscription $SUB
Gather ASYNC URL and Correlation ID info for further troubleshooting if needed.
@@ -246,19 +239,19 @@ Once a compute Rack meets the success threshold, the upgrade pauses until the us
246
239
247
240
Use the following command to continue upgrade once a Compute Rack is paused after meeting the deployment threshold for the Rack:
248
241
```
249
-
az networkcloud cluster continue-update-version -g $CLUSTER_RG -n $CLUSTER_NAME$ --subscription $SUBSCRIPTION_ID
242
+
az networkcloud cluster continue-update-version -g <CLUSTER_RG> -n <CLUSTER_NAME> --subscription <CUSTOMER_SUB_ID>
250
243
```
251
244
252
245
### Monitor status of Cluster
253
246
```
254
-
az networkcloud cluster list -g $CLUSTER_RG --subscription $SUBSCRIPTION_ID -o table
247
+
az networkcloud cluster list -g <CLUSTER_RG> --subscription <CUSTOMER_SUB_ID> -o table
255
248
```
256
249
The Cluster `Detailed status` shows `Running` and the `Detailed status message` shows 'Cluster is up and running.` when the upgrade is complete.
257
250
258
251
### Monitor status of Bare Metal Machines
259
252
```
260
-
az networkcloud baremetalmachine list -g $CLUSTER_MRG --subscription $SUBSCRIPTION_ID -o table
261
-
az networkcloud baremetalmachine list -g $CLUSTER_MRG --subscription $SUBSCRIPTION_ID --query "sort_by([].{name:name,kubernetesNodeName:kubernetesNodeName,location:location,readyState:readyState,provisioningState:provisioningState,detailedStatus:detailedStatus,detailedStatusMessage:detailedStatusMessage,cordonStatus:cordonStatus,powerState:powerState,kubernetesVersion:kubernetesVersion,machineClusterVersion:machineClusterVersion,machineRoles:machineRoles| join(', ', @),createdAt:systemData.createdAt}, &name)" -o table
253
+
az networkcloud baremetalmachine list -g <CLUSTER_MRG> --subscription <CUSTOMER_SUB_ID> -o table
254
+
az networkcloud baremetalmachine list -g <CLUSTER_MRG> --subscription <CUSTOMER_SUB_ID> --query "sort_by([].{name:name,kubernetesNodeName:kubernetesNodeName,location:location,readyState:readyState,provisioningState:provisioningState,detailedStatus:detailedStatus,detailedStatusMessage:detailedStatusMessage,cordonStatus:cordonStatus,powerState:powerState,kubernetesVersion:kubernetesVersion,machineClusterVersion:machineClusterVersion,machineRoles:machineRoles| join(', ', @),createdAt:systemData.createdAt}, &name)" -o table
262
255
```
263
256
264
257
Validate the following states for each BMM (except spare):
@@ -319,12 +312,12 @@ az networkcloud baremetalmachine list -g <CLUSTER_MRG> --subscription <CUSTOMER_
319
312
az networkcloud storageappliance list -g <CLUSTER_MRG> --subscription <CUSTOMER_SUB_ID> -o table
320
313
321
314
# Tenant Workloads
322
-
az networkcloud virtualmachine list --sub $SUBSCRIPTION_ID --query "reverse(sort_by([?clusterId=='$CLUSTER_RID'].{name:name, createdAt:systemData.createdAt, resourceGroup:resourceGroup, powerState:powerState, provisioningState:provisioningState, detailedStatus:detailedStatus,bareMetalMachineId:bareMetalMachineIdi,CPUCount:cpuCores, EmulatorStatus:isolateEmulatorThread}, &createdAt))" -o table
323
-
az networkcloud kubernetescluster list --sub $SUBSCRIPTION_ID --query "[?clusterId=='$CLUSTER_RID'].{name:name, resourceGroup:resourceGroup, provisioningState:provisioningState, detailedStatus:detailedStatus, detailedStatusMessage:detailedStatusMessage, createdAt:systemData.createdAt, kubernetesVersion:kubernetesVersion}" -o table
315
+
az networkcloud virtualmachine list --sub <CUSTOMER_SUB_ID> --query "reverse(sort_by([?clusterId=='<CLUSTER_RID>'].{name:name, createdAt:systemData.createdAt, resourceGroup:resourceGroup, powerState:powerState, provisioningState:provisioningState, detailedStatus:detailedStatus,bareMetalMachineId:bareMetalMachineIdi,CPUCount:cpuCores, EmulatorStatus:isolateEmulatorThread}, &createdAt))" -o table
316
+
az networkcloud kubernetescluster list --sub <CUSTOMER_SUB_ID> --query "[?clusterId=='<CLUSTER_RID>'].{name:name, resourceGroup:resourceGroup, provisioningState:provisioningState, detailedStatus:detailedStatus, detailedStatusMessage:detailedStatusMessage, createdAt:systemData.createdAt, kubernetesVersion:kubernetesVersion}" -o table
324
317
```
325
318
326
319
> [!Note]
327
-
> IRT validation provides a complete functional test of networking and workloads across all components of the Nexus Instance. Simple validation does not provide functional tesing.
320
+
> IRT validation provides a complete functional test of networking and workloads across all components of the Nexus Instance. Simple validation does not provide functional testing.
0 commit comments