Commit 20c1c91

Update troubleshoot-reboot-reimage-replace.md
1 parent 2109f88 commit 20c1c91

1 file changed: +33 -19 lines changed

articles/operator-nexus/troubleshoot-reboot-reimage-replace.md

Lines changed: 33 additions & 19 deletions
@@ -57,24 +57,24 @@ The restart typically is the starting point for mitigating a problem.
 ```
 az networkcloud baremetalmachine power-off \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

 ***The following Azure CLI command will `start` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine start \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

 ***The following Azure CLI command will `restart` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine restart \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

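The three power commands in this hunk differ only in the action name, so they can be wrapped in one small function. The following is a minimal sketch, not part of the commit: `bmm_power` and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and the sketch assumes the `az networkcloud` extension is installed when it is run for real.

```shell
# Hypothetical helper (not from the article): wraps power-off, start, and restart.
# With DRY_RUN=1 the command is printed instead of executed.
bmm_power() {
  action=$1; name=$2; rg=$3; sub=$4
  case "$action" in
    power-off|start|restart) ;;
    *) echo "unsupported action: $action" >&2; return 2 ;;
  esac
  # Rebuild the positional parameters as the full az command line.
  set -- az networkcloud baremetalmachine "$action" \
    --name "$name" --resource-group "$rg" --subscription "$sub"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 bmm_power restart <bareMetalMachineName> <resourceGroup> <subscriptionID>` prints the restart command so it can be reviewed before it is run.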

@@ -87,31 +87,45 @@ The reimage action can be useful for troubleshooting problems by restoring the O
 A reimage action is the best practice for lowest operational risk to ensure the integrity of the BMM.

 As a best practice, make sure the BMM's workloads are drained using the cordon command, with evacuate "True", before executing the reimage command.
-<!--(PLACEHOLDER: We need to explain how a customer can identify if workloads are currently running on a BMM and the az cli command used to get this information. Ask NAKS team to provide.) -->
+
+***To identify if any workloads are currently running on a BMM, run the following command:***
+
+***For Virtual Machines:***
+```azurecli
+az networkcloud baremetalmachine show -n <nodeName> \
+  --resource-group <resourceGroup> \
+  --subscription <subscriptionID> | jq '.virtualMachinesAssociatedIds'
+```
+
+***For NAKS nodes: (requires logging into the NAKS cluster)***
+
+```
+kubectl get nodes <resourceName> -ojson | jq '.metadata.labels."topology.kubernetes.io/baremetalmachine"'
+```

 ***The following Azure CLI command will `cordon` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine cordon \
   --evacuate "True" \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

 ***The following Azure CLI command will `reimage` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine reimage \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

 ***The following Azure CLI command will `uncordon` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine uncordon \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

 ## Troubleshoot with a replace action
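The reimage hunk above describes a fixed order: cordon with evacuate "True", then reimage, then uncordon. A minimal sketch of that sequence follows. `reimage_bmm` and the `RUN` hook are hypothetical names that are not in the article, and the sketch assumes each `az` call blocks until its operation finishes, so the next step starts only after the previous one succeeds.

```shell
# Hypothetical helper (not from the article): cordon, reimage, then uncordon a BMM.
# Set RUN=echo to print each command instead of executing it.
reimage_bmm() {
  name=$1; rg=$2; sub=$3
  for step in cordon reimage uncordon; do
    if [ "$step" = "cordon" ]; then
      set -- --evacuate "True"   # drain workloads before the reimage
    else
      set --                     # no extra flags for reimage or uncordon
    fi
    # Stop at the first failing step so a failed cordon never leads to a reimage.
    ${RUN:-} az networkcloud baremetalmachine "$step" "$@" \
      --name "$name" --resource-group "$rg" --subscription "$sub" || return 1
  done
}
```

Running `RUN=echo reimage_bmm <bareMetalMachineName> <resourceGroup> <subscriptionID>` lists the three commands in order without touching the machine.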
@@ -130,8 +144,8 @@ As a best practice, first issue a `cordon` command to remove the bare metal mach
 az networkcloud baremetalmachine cordon \
   --evacuate "True" \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

 When you're performing a physical hot swappable power supply repair, a replace action is not required because the BMM host will continue to function normally after the repair.
@@ -160,21 +174,21 @@ After physical repairs are completed, perform a replace action.
 ```
 az networkcloud baremetalmachine replace \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
+  --resource-group "<resourceGroup>" \
   --bmc-credentials password=<IDRAC_PASSWORD> username=<IDRAC_USER> \
   --bmc-mac-address <IDRAC_MAC> \
   --boot-mac-address <PXE_MAC> \
   --machine-name <OS_HOSTNAME> \
   --serial-number <SERIAL_NUM> \
-  --subscription <SUBSCRIPTION_ID>
+  --subscription <subscriptionID>
 ```

 ***The following Azure CLI command will uncordon the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine uncordon \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```

 ## Summary
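The replace command in the hunk above takes two MAC addresses (`<IDRAC_MAC>` and `<PXE_MAC>`), and a mistyped value is easy to miss. The sketch below is a local sanity check that could be run before issuing the command. `is_mac` is a hypothetical helper, and the colon-separated format it checks is an assumption, because the diff does not state which formats the service accepts.

```shell
# Hypothetical pre-check (not from the article): succeed only for a
# colon-separated MAC address such as 00:1A:2B:3C:4D:5E.
is_mac() {
  printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}
```

A caller might guard the replace step with `is_mac "$IDRAC_MAC" && is_mac "$PXE_MAC" || echo "check MAC values" >&2`.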
