articles/operator-nexus/troubleshoot-reboot-reimage-replace.md
33 additions & 19 deletions
@@ -57,24 +57,24 @@ The restart typically is the starting point for mitigating a problem.
 ```
 az networkcloud baremetalmachine power-off \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```
 
 ***The following Azure CLI command will `start` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine start \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```
 
 ***The following Azure CLI command will `restart` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine restart \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```
 
 
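The three power actions in this hunk differ only in the subcommand, so they can be wrapped in one helper. The following is a minimal sketch, not part of the PR: the variable names `BMM_NAME`, `RESOURCE_GROUP`, `SUBSCRIPTION_ID`, and `DRY_RUN`, and the function name `bmm_power_action`, are hypothetical and chosen only for illustration. By default it prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: wraps the power-off, start, and restart commands shown above.
# All variable and function names here are hypothetical, not from the article.
set -eu

BMM_NAME="${BMM_NAME:-<bareMetalMachineName>}"
RESOURCE_GROUP="${RESOURCE_GROUP:-<resourceGroup>}"
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-<subscriptionID>}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command only, 0 = execute it

bmm_power_action() {
  action="$1"
  # Accept only the three actions covered by this part of the article.
  case "$action" in
    power-off|start|restart) ;;
    *) echo "unsupported action: $action" >&2; return 2 ;;
  esac
  # Build the command as positional parameters so quoting is preserved.
  set -- az networkcloud baremetalmachine "$action" \
    --name "$BMM_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --subscription "$SUBSCRIPTION_ID"
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

bmm_power_action restart
```

Setting `DRY_RUN=0` runs the command for real; keeping the dry run as the default is a deliberate choice here, because a mistyped machine name on a power action affects a physical host.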
@@ -87,31 +87,45 @@ The reimage action can be useful for troubleshooting problems by restoring the O
 A reimage action is the best practice for lowest operational risk to ensure the integrity of the BMM.
 
 As a best practice, make sure the BMM's workloads are drained using the cordon command, with evacuate "True", before executing the reimage command.
-<!--(PLACEHOLDER: We need to explain how a customer can identify if workloads are currently running on a BMM and the az cli command used to get this information. Ask NAKS team to provide.) -->
+
+***To identify if any workloads are currently running on a BMM, run the following command:***
+
+***For Virtual Machines:***
+```azurecli
+az networkcloud baremetalmachine show -n <nodeName> /
[four added lines are missing from this capture]
+***For NAKS nodes: (requires logging into the NAKS cluster)***
+
+```
+kubectl get nodes <resourceName> -ojson |jq '.metadata.labels."topology.kubernetes.io/baremetalmachine"'
+```
 
 ***The following Azure CLI command will `cordon` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine cordon \
   --evacuate "True" \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```
 
 ***The following Azure CLI command will `reimage` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine reimage \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```
 
 ***The following Azure CLI command will `uncordon` the specified bareMetalMachineName.***
 ```
 az networkcloud baremetalmachine uncordon \
   --name <bareMetalMachineName> \
   --resource-group "<resourceGroup>" \
   --subscription <subscriptionID>
 ```
 
 ## Troubleshoot with a replace action
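The reimage hunk describes an order of operations: inspect the machine, cordon it with evacuation, reimage, then uncordon. A rough sketch of that order follows, under stated assumptions: the helper names `run`, `bmm`, and `reimage_sequence` and the variables `BMM_NAME`, `RESOURCE_GROUP`, `SUBSCRIPTION_ID`, and `DRY_RUN` are hypothetical, and `--name` is used as the long form of the `-n` option in the added `show` example. That added example ends its line with `/`; in a POSIX shell the line-continuation character is `\`, which is what the sketch uses. The source does not say whether each command blocks until its operation finishes, so treat the back-to-back sequence as illustrative and confirm the machine's state before uncordoning.

```shell
#!/bin/sh
# Sketch only: the show -> cordon -> reimage -> uncordon order from the hunk above.
# Helper and variable names are hypothetical, not from the article.
set -eu

BMM_NAME="${BMM_NAME:-<bareMetalMachineName>}"
RESOURCE_GROUP="${RESOURCE_GROUP:-<resourceGroup>}"
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-<subscriptionID>}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print each command only, 0 = execute it

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 is the baremetalmachine subcommand; any further arguments are extra options.
bmm() {
  sub="$1"
  shift
  run az networkcloud baremetalmachine "$sub" "$@" \
    --name "$BMM_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --subscription "$SUBSCRIPTION_ID"
}

reimage_sequence() {
  bmm show                       # inspect the machine before draining it
  bmm cordon --evacuate "True"   # drain workloads off the machine
  bmm reimage
  bmm uncordon                   # return the machine to service
}

reimage_sequence
```

Because of `set -e`, a failing step stops the script when `DRY_RUN=0`, so a failed cordon does not proceed to the reimage.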
@@ -130,8 +144,8 @@ As a best practice, first issue a `cordon` command to remove the bare metal mach
 az networkcloud baremetalmachine cordon \
   --evacuate "True" \
   --name <bareMetalMachineName> \
-  --resource-group <CLUSTER_MRG> \
-  --subscription <SUBSCRIPTION_ID>
+  --resource-group "<resourceGroup>" \
+  --subscription <subscriptionID>
 ```
 
 When you're performing a physical hot swappable power supply repair, a replace action is not required because the BMM host will continue to function normally after the repair.
@@ -160,21 +174,21 @@ After physical repairs are completed, perform a replace action.