Commit 24c2998

Merge pull request #303091 from sushantjrao/break-glass-setup
Update howto-replace-network-devices.md
2 parents b491f71 + 2853b05 commit 24c2998

1 file changed

+76
-29
lines changed

articles/operator-nexus/howto-replace-network-devices.md

Lines changed: 76 additions & 29 deletions
Original file line number | Diff line number | Diff line change
@@ -9,9 +9,12 @@ ms.date: 08/12/2024
99
ms.custom: template-how-to, devx-track-azurecli
1010
---
1111

12-
# Replace a device in Azure Operator Nexus Network Fabric (NNF)
12+
# Replace a network device in Azure Operator Nexus Network Fabric (NNF)
1313

14-
This article describes how to replace a faulty or underperforming device in Azure Operator Nexus Network Fabric (NNF) using the RMA (Return Material Authorization) process which ensures minimal disruption and safe reintegration of the replacement hardware into the fabric.
14+
This article explains how to replace a faulty or underperforming network device in Azure Operator Nexus Network Fabric (NNF).
15+
It covers devices such as the Top of Rack (TOR) switch, Customer Edge (CE) switch, Network Packet Broker (NPB), and the Management Switch.
16+
The replacement is performed using the Return Material Authorization (RMA) process.
17+
This process is designed to minimize service disruption and safely reintegrate the new hardware into the fabric.
1518

1619
## Scenarios for device replacement
1720

@@ -25,15 +28,20 @@ Device replacement may be required in the following situations:
2528

2629
## Prerequisites
2730

28-
- Azure CLI installed and configured.
31+
To ensure a smooth and timely RMA process, verify the following prerequisites before initiating deployment:
2932

30-
- Required permissions to manage Microsoft.ManagedNetworkFabric resources.
33+
- Azure CLI is installed and properly configured
3134

32-
- Replacement device powered on and connected physically.
35+
- Permissions are granted to manage Microsoft.ManagedNetworkFabric resources
3336

34-
- Replacement device must support Zero Touch Provisioning (ZTP).
37+
- Replacement device is powered on and physically connected
3538

36-
- To ensure a smooth and timely RMA process, please verify the following before initiating deployment:
39+
- Replacement device supports Zero Touch Provisioning (ZTP)
40+
41+
- If the device is rebooting continuously because of hardware issues, power it off before initiating the RMA process. Otherwise, the device disable action can fail.
42+
43+
- Before initiating the RMA deployment, perform the following checks:
44+
3745

3846
- Interface Speed Validation
3947

@@ -42,14 +50,20 @@ Device replacement may be required in the following situations:
4250
- If the speed is below 100 Mbps, update it accordingly to prevent delays or potential timeouts during the RMA process.
4351

4452
- Device Storage Check
45-
- Ensure the device has a minimum of 2 GB of free space available.
53+
- Ensure the device has a minimum of 3 GB of free space available.
4654

47-
- This is required to successfully download and stage the necessary image files.
55+
- This action is required to successfully download and stage the necessary image files.
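
The two pre-RMA checks above could be scripted along these lines. This is a minimal sketch, not part of the product tooling: `rma_precheck` is a hypothetical helper, and it assumes the operator has already read the interface speed (in Mbps) and the free storage (in whole GB) from the device.

```shell
#!/bin/sh
# Hypothetical helper: validates the two pre-RMA values described above.
# Arguments (both assumed to be whole numbers gathered by the operator):
#   $1 = interface speed in Mbps
#   $2 = free storage on the device in GB
rma_precheck() {
    speed_mbps=$1
    free_gb=$2
    rc=0

    # A speed below 100 Mbps risks delays or timeouts during the RMA process.
    if [ "$speed_mbps" -lt 100 ]; then
        echo "FAIL: interface speed ${speed_mbps} Mbps is below 100 Mbps"
        rc=1
    fi

    # At least 3 GB is needed to download and stage the image files.
    if [ "$free_gb" -lt 3 ]; then
        echo "FAIL: only ${free_gb} GB free; at least 3 GB is required"
        rc=1
    fi

    if [ "$rc" -eq 0 ]; then
        echo "OK: prechecks passed"
    fi
    return "$rc"
}
```

For example, `rma_precheck 1000 5` reports success, while `rma_precheck 10 2` reports both failures and returns a non-zero status.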
4856

57+
## Device types supported
58+
59+
- Customer Edge (CE)
60+
- Top of Rack (TOR)
61+
- Management Switch (Mgmt Switch)
62+
- Network Packet Broker (NPB)
4963

5064
## Steps to replace a device
5165

52-
1. Disable administrative state.
66+
### Step 1: Disable administrative state
5367

5468
Use the following command to disable the administrative state of the device:
5569

@@ -60,19 +74,25 @@ az networkfabric device update-admin-state \
6074
--resource-group "resource-group-name"
6175
```
6276

63-
This action:
77+
This action sets the following states:
6478

65-
- Moves the device to a degraded state: EnabledDegraded.
79+
- Device Administrative State: Disabled
6680

67-
- Excludes the device from all control plane actions such as:
81+
- Fabric Administrative State: EnabledDegraded
6882

69-
- Certificate rotations
70-
71-
- Password rotations
72-
73-
- Fabric upgrades
83+
>[!Note]
84+
> This action is not permitted by the service if any of the following operations are in progress at the fabric level:
85+
> - Device upgrade
86+
> - Configuration push
87+
> - Secret or certificate updates
88+
> - Administrative lock
89+
> - Terminal Server (TS) reprovisioning
90+
91+
### Step 2: Update the serial number
7492

75-
2. Update the serial number.
93+
Execution conditions:
94+
- Device Administrative State must be `Disabled`
95+
- Fabric Administrative State must be `EnabledDegraded`
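
The execution conditions above could be enforced with a small guard before running the update. This is a sketch under stated assumptions: `can_update_serial` is a hypothetical helper, and the two state values are assumed to have been looked up beforehand. Only the state names themselves come from this article.

```shell
#!/bin/sh
# Hypothetical guard for Step 2; the state names come from the article.
#   $1 = Device Administrative State
#   $2 = Fabric Administrative State
can_update_serial() {
    if [ "$1" = "Disabled" ] && [ "$2" = "EnabledDegraded" ]; then
        return 0
    fi
    echo "Blocked: device=$1 fabric=$2 (need Disabled / EnabledDegraded)"
    return 1
}
```

The guard returns success only when both conditions hold, so it can gate the serial number update in a script.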
7696

7797
Once the replacement device is physically installed, update its serial number in the fabric resource:
7898

@@ -83,14 +103,26 @@ az networkfabric device update \
83103
--resource-group "resource-group-name"
84104
```
85105

86-
3. Ensure device is in ZTP Mode.
106+
Error recovery guidance:
107+
108+
- If the RMA fails because of an incorrect serial number, you can update (re-patch) the serial number again without opening a support ticket.
109+
110+
- If validation fails after device bootstrap, the system returns the status: Device Unable to Boot Up - Failed.
111+
112+
This action performs the following tasks:
113+
114+
- Updates the serial number stored in the Azure Resource Manager (ARM) resource
115+
116+
- Keeps the Device Administrative State as `Disabled` and the Fabric Administrative State as `EnabledDegraded`
117+
118+
### Step 3: Ensure device is in ZTP Mode
87119

88120
Verify that the replacement device is in ZTP mode. If not, configure the device for ZTP before continuing.
89121

90122
> [!Note]
91123
> ZTP enables automatic configuration retrieval during the RMA process.
92124
93-
4. Set RMA State.
125+
### Step 4: Initiate RMA process
94126

95127
Initiate the RMA process using the following command:
96128

@@ -101,23 +133,33 @@ az networkfabric device update-admin-state \
101133
--resource-group "resource-group-name"
102134
```
103135

104-
This will:
136+
- The Network Fabric Controller pushes all required configuration files to the replacement device. If a transient failure occurs, retry the operation until it succeeds.
137+
138+
- The device boots into its base configuration using the maintenance profile. This condition applies only to TOR and CE device types.
139+
140+
This action sets the following states:
105141

106-
- Trigger the Network Fabric Controller to push all required configuration files to the replacement device.
142+
- Device Administrative State: UnderMaintenance
107143

108-
- Retry the operation if there is transient failures until success is confirmed.
144+
- Fabric Administrative State: EnabledDegraded
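
The retry advice for Step 4 could be automated with a generic wrapper like the one below. This is a minimal sketch: `retry` is a hypothetical helper, the attempt limit and delay are illustrative choices, and the wrapper retries on any non-zero exit status because it cannot tell transient failures from permanent ones.

```shell
#!/bin/sh
# Hypothetical retry wrapper: runs the given command until it succeeds or
# the attempt limit is reached.
#   $1 = maximum attempts
#   $2 = seconds to wait between attempts
#   remaining arguments = the command to run
retry() {
    max_attempts=$1
    delay_seconds=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after ${attempt} attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay_seconds"
    done
    return 0
}
```

To use it, pass the Step 4 command and its arguments unchanged after the two numbers, for example `retry 3 60 az networkfabric device update-admin-state` followed by the same arguments as in the command above. Capping the attempts keeps a persistent fault from looping forever.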
109145

110-
5. Refresh configuration
146+
### Step 5: Refresh configuration
111147

112148
This step pushes the latest configuration to the device after it enters maintenance mode (applicable only for CE and TOR).
113149

114150
```Azure CLI
115151
az networkfabric device refresh-configuration --resource-name <resource-name> --resource-group <rg-name>
116152
```
117153

118-
This will push the latest config to the device.
154+
This action pushes the latest configuration to the device.
119155

120-
6. Enable administrative state.
156+
This action keeps the device in the following states:
157+
158+
- Device Administrative State: UnderMaintenance
159+
160+
- Fabric Administrative State: EnabledDegraded
161+
162+
### Step 6: Enable administrative state
121163

122164
Once configuration is applied successfully, bring the device back into active service:
123165

@@ -128,9 +170,14 @@ az networkfabric device update-admin-state \
128170
--resource-group "resource-group-name"
129171
```
130172

131-
This will:
173+
This action sets the following states once the device is fully healthy and synchronized with the fabric:
174+
175+
- Device Administrative State: `Enabled`
176+
177+
- Fabric Administrative State: `Enabled`
132178

133-
- Sets device state to Enabled once it's fully healthy and synchronized with the fabric.
179+
>[!Note]
180+
> If any other device in the fabric is in the `Disabled` state, the Fabric Administrative State remains `EnabledDegraded`.
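
The relationship between device states and the fabric state described in this article can be sketched as a small function. This is an illustration only: `expected_fabric_state` is a hypothetical helper, and it simplifies the rule by treating any device state other than `Enabled` (including `UnderMaintenance`, as in Steps 4 and 5) as keeping the fabric in `EnabledDegraded`.

```shell
#!/bin/sh
# Hypothetical helper: derives the expected Fabric Administrative State
# from the Device Administrative States passed as arguments.
# Simplification: any state other than Enabled keeps the fabric degraded.
expected_fabric_state() {
    for device_state in "$@"; do
        if [ "$device_state" != "Enabled" ]; then
            echo "EnabledDegraded"
            return 0
        fi
    done
    echo "Enabled"
}
```

For example, calling it with `Enabled Disabled` yields `EnabledDegraded`, which matches the note above about another device still being disabled.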
134181
135182
## Summary
136183
