Commit 7b3c257

Troubleshoot deployment validation steps
1 parent 64e3d8d commit 7b3c257

1 file changed

+59
-35
lines changed

azure-stack/hci/manage/troubleshoot-deployment.md

Lines changed: 59 additions & 35 deletions
@@ -1,15 +1,22 @@
11
---
2-
3-
foundinbuild: Deployments after August 6th, 2024
4-
fixedinbuild: Targeted within 2-weeks of August 14th, 2024
5-
engineeringid: 532173261
2+
title: Troubleshoot deployment validation issues in Azure Stack HCI, version 23H2 via Azure portal
3+
description: Learn how to troubleshoot the deployment validation failures for Azure Stack HCI, version 23H2 when deployed via the Azure portal.
4+
ms.topic: how-to
5+
ms.author: alkohli
6+
author: alkohli
7+
ms.date: 08/21/2024
68
---
79

8-
# Background
9-
A new portal validation method which calculates the ipv4Address of the host machine was rolled out 1st week of August 2024. From the PG: "If some switch is configured for the device, we find the corresponding Virtual NIC whose MAC Address matches that of the MAC Address of the Management NIC. In these customer cases, a switch was found but that switch was not configured to any Virtual NIC, hence the missing ipv4 address, because the switch was incorrectly configured for the customer device."
1010

11-
## Error message
11+
# Troubleshoot Azure portal deployment validation issues for Azure Stack HCI, version 23H2
1212

13+
[!INCLUDE [hci-applies-to-23h2](../../includes/hci-applies-to-23h2.md)]
14+
15+
This article provides guidance on how to troubleshoot deployment validation issues experienced during the deployment of your Azure Stack HCI cluster via the Azure portal.
16+
17+
## Error - deployment validation failure with not a valid IPv4 address
18+
19+
When deploying Azure Stack HCI, version 23H2 via the Azure portal, you might encounter a deployment validation failure.
1320
The "Azure Stack HCI Network - Check network requirements" validation task fails with the following error:
1421

1522
```
@@ -20,40 +27,51 @@ found error at deploymentdata.physicalnodes[0].ipv4address: The specified for
2027
Example: 192.168.0.1 or 192.168.0.1","Target":null,"Details":null}].
2128
```
2229

23-
If you go to the **Networking** tab in Azure portal deployment, within the **Network Intent** configuration, the following error could be seen:
30+
If you go to the **Networking** tab in Azure portal deployment, within the **Network Intent** configuration, you could see the following error:
2431

25-
`The selected physical network adapter is no binded to the management virtual Switch.`
32+
```
33+
The selected physical network adapter is not binded to the management virtual Switch.
34+
```
2635

2736
## Cause
2837

2938
This issue occurs on deployments triggered after August 6th, 2024. It happens only if deployment validation was run on the cluster, the validation failed, and validation was then retried.
3039

31-
The mismatch occurs for the following reason:
32-
- Validation on the device creates a VM Switch for network related tests and is deleted at the end of tests.
33-
- DeviceManagementExtension extension is not detecting this deletion of the VM Switch.
40+
The issue occurs for the following reason:
41+
42+
- Validation on the device creates a VM switch for network-related tests and deletes it at the end of the tests.
43+
- The `DeviceManagementExtension` doesn't detect the deletion of the VM switch.
3444

3545
## Recommended resolution
3646

47+
The multi-step resolution process includes the following steps:
3748

49+
- [Remove the lock from the seed node](#remove-the-lock-from-the-seed-node)
50+
- [Remove the validation error](#remove-the-validation-error)
51+
- [Clean up the Edge Device Azure Resource with incorrect VM switch information](#clean-up-the-edge-device-azure-resource-with-incorrect-vm-switch-information)
52+
- [Refresh the cloud data](#refresh-the-cloud-edgedevices-data)
53+
- [Redo the Azure portal](#redo-the-azure-portal)
54+
- [Recreate the lock on the seed node resource](#recreate-the-lock-on-the-seed-node-resource)
3855

3956
### Remove the lock from the seed node
4057

41-
> All steps below only need to be performed for the seed node.
58+
> [!NOTE]
59+
> All the steps in this section need to be performed on the seed node.
4260
4361
Follow this multi-step process to mitigate this validation error:
4462

4563
1. To remove the lock, in the Azure portal, go to the object via the resource group or within Machines - Azure Arc.
4664
1. In the left pane, go to **Settings > Locks**. You should see a lock named **DoNotDelete**. This is the automatic resource lock that is created when the node is onboarded.
4765
1. Select **Delete** against the lock.
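
The portal steps above could also be scripted with the Azure CLI. The following is a minimal sketch rather than a documented procedure. It assumes the lock is a resource-level lock named `DoNotDelete` on the Arc machine resource (type `Microsoft.HybridCompute/machines`, taken from the resource ID shown in this article). The function name `remove_seed_node_lock` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: remove the DoNotDelete lock from the seed node's Arc machine resource.
# Assumption: the lock sits on the Microsoft.HybridCompute/machines resource.
RESOURCE_TYPE="Microsoft.HybridCompute/machines"

# Hypothetical helper. Arguments: resource group, machine name, optional lock name.
remove_seed_node_lock() {
  local rg="$1" machine="$2" lock_name="${3:-DoNotDelete}"

  # List the locks that apply to the machine before deleting anything.
  az lock list --resource-group "$rg" --resource "$machine" \
    --resource-type "$RESOURCE_TYPE" --output table

  # Delete the resource-level lock by name.
  az lock delete --name "$lock_name" --resource-group "$rg" \
    --resource "$machine" --resource-type "$RESOURCE_TYPE"
}
```

For example, `remove_seed_node_lock "<rgname>" "<machinename>"` would list the locks and then delete `DoNotDelete`. Review the list output first if you aren't sure which lock applies to the seed node.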
4866

49-
If you attmept the steps in the next section without removing the lock, the Delete command will fail with the following error:
67+
If you attempt the steps in the next section without removing the lock, the **Delete** command fails with the following error:
5068

51-
```Output
69+
```
5270
Some resources failed to be deleted (run with `--verbose` for more information):
5371
/subscriptions/<subid>/resourceGroups/<rgname>/providers/Microsoft.HybridCompute/machines/<machinename>/providers/Microsoft.AzureStackHCI/edgeDevices/default
5472
```
5573

56-
Running the `--verbose` switch provides the following additional output:
74+
Here's example output when the command is run with the `--verbose` switch:
5775

5876
```Output
5977
(ScopeLocked) The scope '/subscriptions/<subid>/resourceGroups/<rgname>/providers/Microsoft.HybridCompute/machines/<machinename>/providers/Microsoft.AzureStackHCI/edgeDevices/default' cannot perform delete operation because following scope(s) are locked: '/subscriptions/<subid>/resourceGroups/<rgname>/providers/Microsoft.HybridCompute/machines/<machinename>'. Please remove the lock and try again.
@@ -71,21 +89,21 @@ With the lock removed, follow these steps to remove the validation error.
7189
Get-VMSwitch
7290
```
7391
74-
1. Check the output of the `Get-VMSwitch` command for any unexpected VM Switches, for example, the switch that gets created during the Network Validation step and has a name similar to: `"ConvergedSwitch(compute_management)"`. The name of the switch depends on the chosen network intent configuration.
92+
1. Check the output of the `Get-VMSwitch` command for any unexpected VM switches, for example, the switch that gets created during the Network Validation step and has a name similar to: `"ConvergedSwitch(compute_management)"`. The exact name of the switch depends on the chosen network intent configuration.
7593
76-
1. If a VM switch that you didn't intentionally create (like the one named above) exists, remove the switch. Run the following PowerShell command:
94+
1. If a VM switch that you didn't intentionally create exists, remove the switch. Run the following PowerShell command:
7795
7896
```PowerShell
7997
Remove-VMSwitch -Name "<VM Switch Name>" -Force
8098
```
81-
Make sure to use the VM switch name from the `Get-VMSwitch` command. If you didn't intentionally create a VM switch, the Get-VMSwitch command will have no results.
82-
- **Note 2:** If the customer did not intentionally create a VM Switch, it is likely that the output of the `Get-VMSwitch` command will have no results, this is due to the fact that the Network Validation Step cleaned this VM Switch up, and it is the DeviceManagementExtension that did not detect the cleanup, leaving the Deployment Validation in this fail state.
99+
100+
Make sure to use the VM switch name from the `Get-VMSwitch` command. If you didn't intentionally create a VM switch, the `Get-VMSwitch` command likely returns no results. In that case, the failure occurs because the Network Validation step cleaned up the VM switch, but the `DeviceManagementExtension` didn't detect the cleanup.
83101
84102
1. Confirm the device cleanup and wait for the cleanup to complete.
85103
86-
### Clean up the Edge Device Azure Resource with incorrect VM Switch information
104+
### Clean up the Edge Device Azure Resource with incorrect VM switch information
87105
88-
After the VM switch on the device is removed, clean up the Edge Device ARM resource containing the incorrect VM switch information. This ARM side cleanup can be done via AZ CLI.
106+
After the VM switch on the device is removed, clean up the Edge Device ARM resource containing the incorrect VM switch information via the Azure CLI.
89107
90108
1. On a machine with access to Azure, verify that the Azure CLI is installed, or install it: [Install Azure CLI on Windows](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows?tabs=azure-cli)
91109
- You can verify the installation by running: `az`
@@ -133,7 +151,9 @@ Message: The resource 'Microsoft.HybridCompute/machines/<machineNAME>/providers/
133151
134152
### Refresh the cloud `edgeDevices` data
135153
136-
With the ARM resource and all the unintentional VM switches removed, refresh the cloud-side `edgeDevices` data again. Follow these steps to refresh the cloud data:
154+
With the ARM resource and all the unintentional VM switches removed, refresh the cloud-side `edgeDevices` data again.
155+
156+
Follow these steps to refresh the cloud data:
137157
138158
1. Restart the `DeviceManagementService` on the seed node. Run the following PowerShell command:
139159
@@ -148,27 +168,31 @@ With the ARM resource and all the unintentional VM switches removed, refresh the
148168
149169
### Redo the Azure portal
150170
151-
With device and cloud-side data now back in sync, you can go to the Azure portal and navigate to Basics tab in the deployment UX. This should prevent any cached information from previous attempts. Follow these steps in the Azure portal:
171+
With device and cloud data now back in sync, you can go to the Azure portal and provide the deployment inputs. The previous step prevents any cached information from previous attempts from being reused.
172+
173+
Follow these steps in the Azure portal:
152174
153175
1. On the **Basics** tab, reenter your inputs for each field, starting from the top, by selecting from the dropdowns once again.
154176
155177
1. Uncheck the nodes at the bottom of the page.
156178
157179
1. Reselect the nodes, and then revalidate them.
158180
159-
1. Confirm the information on the subsequent pages.
160-
- **Note:** When you get to the Network page, you should no longer see the `"The selected physical network adapter is no binded to the management virtual Switch"` error that might have been seen previously.
161-
- When you get to the Validation page at the end, you should know if you are past the original issue, as the `"deploymentdata.physicalnodes[0].ipv4address is not a valid IPv4 address"` error should no longer immediately pop.
162-
1. If no other/different validation issues occur, the Deployment should be ready to kick off and run.
181+
1. Confirm the information on the subsequent pages. You should see the following changes:
182+
- On the **Network** page, you should no longer see the `"The selected physical network adapter is no binded to the management virtual Switch"` error that you might have seen previously.
183+
- On the **Validation** page at the end, if you are past the original issue, the `"deploymentdata.physicalnodes[0].ipv4address is not a valid IPv4 address"` error won't be displayed.
163184
185+
1. If no other validation issues occur, start the deployment.
164186
165-
> :ledger: **NOTE**
166-
>
167-
> Once the mitigation has been completed, it is **HIGHLY** recommended to recreate the lock on the resource. Steps for this are included after the mitigation steps.
187+
### Recreate the lock on the seed node resource
168188
169-
Once mitigation is successful, you should recreate the lock on the seed node resource. To do this, navigate to the object via the resource group or within 'Machines - Azure Arc'. Expand 'Settings' in the left column, then click on 'Locks'. Click '**+ Add**' at the top of the page.
189+
After the mitigation is complete, we strongly recommend that you recreate the lock on the resource.
170190
171-
For 'Lock name', enter '**DoNotDelete**'.
172-
For 'Lock type', select '**Delete**' from the drop-down.
191+
Follow these steps to recreate the lock:
173192
174-
Click 'OK' to save the lock.
193+
1. In the Azure portal, go to the object via the resource group or within **Machines - Azure Arc**.
194+
1. Expand **Settings** in the left column, then select **Locks**.
195+
1. Select **+ Add** at the top of the page.
196+
1. For **Lock name**, enter **DoNotDelete**.
197+
1. For **Lock type**, select **Delete** from the dropdown.
198+
1. Select **OK** to save the lock.
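
As an alternative to the portal steps above, the lock could be recreated with the Azure CLI. This is a minimal sketch under the same assumptions as the portal steps: the lock is named `DoNotDelete` and is applied to the Arc machine resource (`Microsoft.HybridCompute/machines`). The portal lock type **Delete** corresponds to `CanNotDelete` in the CLI. The function name `recreate_seed_node_lock` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: recreate the DoNotDelete lock on the seed node's Arc machine resource.
# Assumption: the lock belongs on the Microsoft.HybridCompute/machines resource.
RESOURCE_TYPE="Microsoft.HybridCompute/machines"

# Hypothetical helper. Arguments: resource group, machine name.
recreate_seed_node_lock() {
  local rg="$1" machine="$2"

  # The portal lock type "Delete" maps to CanNotDelete in the Azure CLI.
  az lock create --name DoNotDelete --lock-type CanNotDelete \
    --resource-group "$rg" --resource "$machine" \
    --resource-type "$RESOURCE_TYPE"
}
```

For example, `recreate_seed_node_lock "<rgname>" "<machinename>"` would restore the lock once the mitigation is complete.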
