Commit ff6377d

Sync release-hotfixes with main
2 parents 82b5b77 + 6088494 commit ff6377d

File tree

11 files changed

+124
-27
lines changed

AKS-Arc/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -181,6 +181,8 @@
181181
href: check-vm-sku.md
182182
- name: Connectivity issues with MetalLB
183183
href: load-balancer-issues.md
184+
- name: Troubleshoot general network validation errors
185+
href: network-validation-errors.md
184186
- name: Network validation error due to .local domain
185187
href: network-validation-error-local.md
186188
- name: Reference

AKS-Arc/network-validation-errors.md

Lines changed: 102 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,102 @@
1+
---
2+
title: Troubleshoot network validation errors
3+
description: Learn how to troubleshoot general network validation errors in AKS Arc.
4+
author: sethmanheim
5+
ms.author: sethm
6+
ms.topic: troubleshooting
7+
ms.date: 05/07/2025
8+
ms.reviewer: pradwivedi
9+
ms.lastreviewed: 05/06/2025
10+
11+
---
12+
13+
# Troubleshoot network validation errors
14+
15+
This article describes how to identify and resolve network validation errors that you might encounter during cluster creation. These errors are detected by pre-checks designed to surface issues early, so that you can resolve them before the cluster is created.
16+
17+
The article summarizes error codes, their potential causes, and actionable mitigation steps to help you resolve issues effectively.
18+
19+
## CloudAgentConnectivityError
20+
21+
Error: Network validation failed during cluster creation.
22+
23+
### Description
24+
25+
Detailed message: `Not able to connect to http://cloudagent.contoso.local:50000. Error returned: action failed after 5 attempts: Get "http://cloudagent.contoso.local:50000": dial tcp: lookup http://cloudagent.contoso.local: Temporary failure in name resolution`
26+
27+
The MOC cloud agent is created using one of the IP addresses from the [Management IP pool](/azure/azure-local/plan/cloud-deployment-network-considerations#management-ip-pool) on port 55000, and the control plane node VM is given IP addresses from the Arc VM logical network. This error occurs when the MOC cloud agent is not reachable from the control plane VM, or when the DNS servers specified in the Arc VM logical network are unable to resolve the MOC cloud agent FQDN.
28+
29+
### Causes of failure
30+
31+
Logical network IP addresses can't connect to management IP pool addresses, due to:
32+
33+
- Incorrect DNS server resolution.
34+
- Firewall rules between the Arc VM logical network and the cloud agent endpoint.
35+
- The logical network is in a different VLAN than the management IP pool and there's no cross-VLAN connectivity.
36+
37+
### Mitigation
38+
39+
To resolve this error, you can take the following steps:
40+
41+
- Make sure that the DNS servers specified in the Arc VM logical network can resolve the MOC cloud agent FQDN.
42+
- Make sure that the logical network IP addresses can connect to all the management IP pool addresses on the required ports. For a detailed list of ports that need to be opened, see [AKS network port and cross-VLAN requirements](aks-hci-network-system-requirements.md#network-port-and-cross-vlan-requirements).
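The connectivity part of this check can be approximated with a short script. The following is a minimal sketch, not part of the AKS Arc tooling: `can_connect` is a hypothetical helper name, and the script is assumed to run on a machine attached to the Arc VM logical network so that it takes the same network path as the control plane VM.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A False result covers both name resolution failures and blocked or
    closed ports, because both surface as OSError subclasses.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("cloudagent.contoso.local", 50000)` uses the sample host and port from the detailed error message shown earlier; substitute the values from your own error message.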
43+
44+
## InternetConnectivityError
45+
46+
Error: Network validation failed during cluster creation.
47+
48+
### Description
49+
50+
Detailed message: `Not able to connect to https://mcr.microsoft.com. Error returned: action failed after 5 attempts: Get "https://mcr.microsoft.com": dial tcp: lookup mcr.microsoft.com on <>: read udp <>: i/o timeout`.
51+
52+
This error indicates that the required URLs are not reachable from the AKS cluster control plane node VM.
53+
54+
### Causes of failure
55+
56+
- Control plane node VM has no outbound internet access.
57+
- Required URLs aren't allowed through the firewall.
58+
59+
### Mitigation
60+
61+
To resolve this error, ensure that the logical network IP addresses have outbound internet access. If there's a firewall, ensure that the [AKS required URLs](aks-hci-network-system-requirements.md#firewall-url-exceptions) are accessible from the Arc VM logical network.
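Outbound access to a required URL can be spot-checked with a script such as the following. This is a minimal sketch under stated assumptions: `check_url` is a hypothetical helper name, only the Python standard library is used, and the script is assumed to run from a machine on the Arc VM logical network. An HTTP error status still counts as reachable here, because it shows that DNS resolution and the network path both work.

```python
import urllib.error
import urllib.request


def check_url(url: str, timeout: float = 5.0) -> str:
    """Return a short status string describing whether url is reachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"reachable (HTTP {resp.status})"
    except urllib.error.HTTPError as err:
        # The server answered with an error status, so the path is open.
        return f"reachable (HTTP {err.code})"
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, timeout, or blocked traffic.
        return f"unreachable ({err})"
```

Calling `check_url("https://mcr.microsoft.com")` tests the endpoint named in the detailed error message. Repeat the call for each URL in the required URL list.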
62+
63+
## VMNotReachableError
64+
65+
Error: Network validation failed during cluster creation.
66+
67+
### Description
68+
69+
Detailed message: `VM IP : <> is not reachable from management cluster`.
70+
71+
This error indicates that the AKS cluster control plane VM is not reachable from the Arc Resource Bridge (ARB).
72+
73+
### Causes of failure
74+
75+
The Arc VM logical network is not reachable from management IP pool addresses.
76+
77+
### Mitigation
78+
79+
To resolve this error, you can take the following steps:
80+
81+
- Make sure that the management IP pool addresses can reach the logical network IP addresses.
82+
- For a detailed list of ports that need to be opened, see [AKS network port and cross-VLAN requirements](aks-hci-network-system-requirements.md#network-port-and-cross-vlan-requirements).
83+
84+
## DNSResolutionError
85+
86+
This error occurs when the DNS servers specified in the Arc VM logical network can't resolve the MOC cloud agent FQDN or the required URLs.
87+
88+
### Causes of failure
89+
90+
The DNS servers specified in the logical network can't resolve the MOC cloud agent FQDN or the required URLs.
91+
92+
### Mitigation
93+
94+
To resolve this error, verify that the DNS servers specified in the logical network can resolve both the MOC cloud agent FQDN and the required URLs.
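Name resolution can be checked in bulk with a script like the one below. This is a minimal sketch: `unresolved_names` is a hypothetical helper name, and it queries whatever resolver the local machine is configured with, so it only reflects the logical network's behavior if the machine uses the same DNS servers as the Arc VM logical network.

```python
import socket


def unresolved_names(hostnames):
    """Return the hostnames that the system resolver cannot resolve."""
    failed = []
    for name in hostnames:
        try:
            socket.getaddrinfo(name, None)
        except socket.gaierror:
            # Resolution failed: the name is unknown or DNS is unreachable.
            failed.append(name)
    return failed
```

For example, `unresolved_names(["cloudagent.contoso.local", "mcr.microsoft.com"])` uses the sample names from the error messages in this article. An empty list means every name resolved.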
95+
96+
## Contact Microsoft Support
97+
98+
If problems persist, [collect AKS cluster logs](get-on-demand-logs.md) before you [create a support request](aks-troubleshoot.md#open-a-support-request).
99+
100+
## Next steps
101+
102+
[Troubleshoot issues in AKS enabled by Azure Arc](aks-troubleshoot.md)

azure-local/concepts/firewall-requirements.md

Lines changed: 5 additions & 0 deletions
@@ -65,6 +65,11 @@ For a consolidated list of endpoints for Japan East that includes Azure Local, A
6565
For a consolidated list of endpoints for South Central US that includes Azure Local, Arc-enabled servers, ARB, and AKS, use:
6666
- [Required endpoints in South Central US for Azure Local](https://github.com/Azure/AzureStack-Tools/blob/master/HCI/SouthCentralUSEndpoints/southcentralus-hci-endpoints.md)
6767

68+
## Required firewall URLs for Azure Local in Azure Government regions
69+
70+
For a consolidated list of endpoints for US Gov Virginia that includes Azure Local, Arc-enabled servers, ARB, and AKS, use:
71+
- [Required endpoints in US Gov Virginia for Azure Local](https://github.com/CristianEdwards/AzureStack-Tools/blob/master/HCI/usgovvirginia-hci-endpoints/usgovvirginia-hci-endpoints.md)
72+
6873
## Firewall requirements for OEMs
6974

7075
Depending on the OEM you are using for Azure Local you may need to open additional endpoints in your firewall.

azure-local/concepts/software-defined-networking.md

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@ author: AnirbanPaul
55
ms.author: anpaul
66
ms.topic: conceptual
77
ms.service: azure-local
8-
ms.date: 04/17/2023
8+
ms.date: 05/06/2025
99
---
1010

1111
# Software Defined Networking (SDN) in Azure Stack HCI and Windows Server
@@ -21,7 +21,7 @@ Virtual network elements such as [Hyper-V Virtual Switch](/windows-server/virtua
2121
There are three major SDN components, and you can choose which you want to deploy: Network Controller, Software Load Balancer, and Gateway.
2222

2323
> [!NOTE]
24-
> SDN is not supported on stretched (multi-site) clusters.
24+
> SDN isn't supported on stretched (multi-site) clusters.
2525
2626
## Network Controller
2727

@@ -45,7 +45,7 @@ You have the option to [deploy SDN Network Controller using SDN Express](../mana
4545
Gateways are used for routing network traffic between a virtual network and another network, either local or remote. Gateways can be used to:
4646

4747
- Create secure site-to-site IPsec connections between SDN virtual networks and external customer networks over the internet.
48-
- Create Generic Routing Encapsulation (GRE) connections between SDN virtual networks and external networks. The difference between site-to-site connections and GRE connections is that the latter is not an encrypted connection. For more information about GRE connectivity scenarios, see [GRE Tunneling in Windows Server](/windows-server/remote/remote-access/ras-gateway/gre-tunneling-windows-server).
48+
- Create Generic Routing Encapsulation (GRE) connections between SDN virtual networks and external networks. The difference between site-to-site connections and GRE connections is that the latter isn't an encrypted connection. For more information about GRE connectivity scenarios, see [GRE Tunneling in Windows Server](/windows-server/remote/remote-access/ras-gateway/gre-tunneling-windows-server).
4949
- Create Layer 3 connections between SDN virtual networks and external networks. In this case, the SDN gateway simply acts as a router between your virtual network and the external network.
5050

5151
Gateways use [Border Gateway Protocol](/windows-server/remote/remote-access/bgp/border-gateway-protocol-bgp) to advertise GRE endpoints and establish point-to-point connections. SDN deployment creates a default gateway pool that supports all connection types. Within this pool, you can specify how many gateways are reserved on standby in case an active gateway fails.

azure-local/concepts/system-requirements.md

Lines changed: 3 additions & 3 deletions
@@ -6,7 +6,7 @@ ms.author: alkohli
66
ms.topic: how-to
77
ms.service: azure-local
88
ms.custom: references_regions
9-
ms.date: 04/29/2025
9+
ms.date: 05/06/2025
1010
---
1111

1212
# System requirements for Azure Stack HCI, version 22H2
@@ -112,7 +112,7 @@ For best results, adhere to the following requirements:
112112
An Azure Stack HCI cluster requires a reliable high-bandwidth, low-latency network connection between each server node.
113113

114114
- Verify at least one network adapter is available and dedicated for cluster management.
115-
- Verify that physical switches in your network are configured to allow traffic on any VLANs you'll use.
115+
- Verify that physical switches in your network are configured to allow traffic on any VLANs you use.
116116

117117
For physical networking considerations and requirements, see [Physical network requirements](physical-network-requirements.md).
118118

@@ -153,7 +153,7 @@ If you use Windows Admin Center to [create](../deploy/create-cluster.md) or [man
153153

154154
- If you're running Windows Admin Center on a server (instead of a local PC), use an account that's a member of the Gateway Administrators group, or the local Administrators group on the Windows Admin Center server.
155155

156-
- Verify that your Windows Admin Center management computer is joined to the same Active Directory domain in which you'll create the cluster, or joined to a fully trusted domain. The servers that you'll cluster don't need to belong to the domain yet; they can be added to the domain during cluster creation.
156+
- Verify that your Windows Admin Center management computer is joined to the same Active Directory domain in which you create the cluster, or joined to a fully trusted domain. The servers that you'll cluster don't need to belong to the domain yet; they can be added to the domain during cluster creation.
157157

158158
## Maximum supported hardware specifications
159159

azure-local/deploy/deployment-arc-register-server-permissions.md

Lines changed: 2 additions & 14 deletions
@@ -3,7 +3,7 @@ title: Register your Azure Local machines with Azure Arc and assign permissions
33
description: Learn how to Register your Azure Local machines with Azure Arc and assign permissions for deployment.
44
author: alkohli
55
ms.topic: how-to
6-
ms.date: 05/02/2025
6+
ms.date: 05/06/2025
77
ms.author: alkohli
88
ms.service: azure-local
99
ms.custom: devx-track-azurepowershell
@@ -129,7 +129,7 @@ Before you begin, make sure you complete the following prerequisites:
129129
# [Output](#tab/output)
130130
131131
Here's a sample output of a successful registration of your machines:
132-
132+
133133
```output
134134
PS C:\Users\Administrator> Invoke-AzStackHciArcInitialization -SubscriptionID $Subscription -ResourceGroup $RG -TenantID $Tenant -Region $Region -Cloud "AzureCloud" -ArmAccessToken $ARMtoken -AccountID $id
135135
>>
@@ -154,17 +154,13 @@ Before you begin, make sure you complete the following prerequisites:
154154
155155
4. After the script completes successfully on all the machines, verify that:
156156
157-
158157
1. Your machines are registered with Arc. Go to the Azure portal and then go to the resource group associated with the registration. The machines appear within the specified resource group as **Machine - Azure Arc** type resources.
159158
160159
:::image type="content" source="media/deployment-arc-register-server-permissions/arc-servers-registered-1.png" alt-text="Screenshot of the Azure Local machines in the resource group after the successful registration." lightbox="./media/deployment-arc-register-server-permissions/arc-servers-registered-1.png":::
161160
162-
163-
164161
> [!NOTE]
165162
> Once an Azure Local machine is registered with Azure Arc, the only way to undo the registration is to install the operating system again on the machine.
166163
167-
168164
## Assign required permissions for deployment
169165
170166
This section describes how to assign Azure permissions for deployment from the Azure portal.
@@ -188,17 +184,9 @@ This section describes how to assign Azure permissions for deployment from the A
188184
- **Key Vault Secrets Officer**: This permission is required to read and write secrets in the key vault used for deployment.
189185
- **Key Vault Contributor**: This permission is required to create the key vault used for deployment.
190186
- **Storage Account Contributor**: This permission is required to create the storage account used for deployment.
191-
192187
193188
1. In the right pane, go to **Role assignments**. Verify that the deployment user has all the configured roles.
194189
195-
1. In the Azure portal, go to **Microsoft Entra Roles and Administrators** and assign the **Cloud Application Administrator** role permission at the Microsoft Entra tenant level.
196-
197-
:::image type="content" source="media/deployment-arc-register-server-permissions/cloud-application-administrator-role-at-tenant.png" alt-text="Screenshot of the Cloud Application Administrator permission at the tenant level." lightbox="./media/deployment-arc-register-server-permissions/cloud-application-administrator-role-at-tenant.png":::
198-
199-
> [!NOTE]
200-
> The Cloud Application Administrator permission is temporarily needed to create the service principal. After deployment, this permission can be removed.
201-
202190
## Next steps
203191
204192
After setting up the first machine in your instance, you're ready to deploy using Azure portal:

azure-local/deploy/deployment-azure-resource-manager-template.md

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ title: Azure Resource Manager template deployment for Azure Local, version 23H2
33
description: Learn how to prepare and then deploy Azure Local instance, version 23H2 using the Azure Resource Manager template.
44
author: alkohli
55
ms.topic: how-to
6-
ms.date: 05/01/2025
6+
ms.date: 05/06/2025
77
ms.author: alkohli
88
ms.reviewer: alkohli
99
ms.service: azure-local
@@ -58,7 +58,7 @@ A Resource Manager template creates and assigns all the resource permissions req
5858
With all the prerequisite and preparation steps complete, you're ready to deploy using a known good and tested Resource Manager deployment template and corresponding parameters JSON file. Use the parameters contained in the JSON file to fill out all values, including the values generated previously.
5959
6060
> [!IMPORTANT]
61-
> In this release, make sure that all the parameters contained in the JSON value are filled out including the ones that have a null value. If there are null values, then those need to be populated or the validation fails.
61+
> In this release, make sure that all the parameters contained in the JSON value are filled out including the ones that have a null value. If there are null values, then those parameters need to be populated or the validation fails.
6262
6363
1. In the Azure portal, go to **Home** and select **+ Create a resource**.
6464

azure-local/manage/maintain-servers.md

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@ author: alkohli
55
ms.author: alkohli
66
ms.topic: how-to
77
ms.service: azure-local
8-
ms.date: 04/29/2025
8+
ms.date: 05/06/2025
99
---
1010

1111
# Failover cluster maintenance procedures
@@ -40,7 +40,7 @@ Before either shutting down or restarting a server, you should pause the server
4040

4141
*Pause server(s) for maintenance: Are you sure you want to pause server(s)? This moves workloads, such as virtual machines, to other servers in the cluster.*
4242

43-
3. Select **yes** to pause the server and initiate the drain process. The server status shows as **In maintenance, Draining**, and roles such as Hyper-V and VMs will immediately begin live migrating to other server(s) in the cluster. This can take a few minutes. No roles can be added to the server until it's resumed. When the draining process is finished, the server status shows as **In maintenance, Drain completed**. The operating system performs an automatic safety check to ensure it's safe to proceed. If there are unhealthy volumes, it stops and alerts you that it's not safe to proceed.
43+
3. Select **yes** to pause the server and initiate the drain process. The server status shows as **In maintenance, Draining**, and roles such as Hyper-V and VMs immediately begin live migrating to other server(s) in the cluster. This can take a few minutes. No roles can be added to the server until it resumes. When the draining process is finished, the server status shows as **In maintenance, Drain completed**. The operating system performs an automatic safety check to ensure it's safe to proceed. If there are unhealthy volumes, it stops and alerts you that it's not safe to proceed.
4444

4545
### Shut down the server
4646

@@ -168,7 +168,7 @@ Resume-ClusterNode –Failback Immediate
168168

169169
To do this in Failover Cluster Manager, go to **Nodes**, right-click the node, and then select **Resume** > **Fail Roles Back**.
170170

171-
Once the server has resumed, it shows as **Up** in PowerShell and Failover Cluster Manager.
171+
Once the server resumes, it shows as **Up** in PowerShell and Failover Cluster Manager.
172172

173173
### Wait for storage to resync
174174

azure-local/manage/witness.md

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ title: Set up a cluster witness
33
description: Learn how to set up a cluster witness
44
author: alkohli
55
ms.topic: how-to
6-
ms.date: 04/29/2025
6+
ms.date: 05/06/2025
77
ms.author: alkohli
88
ms.reviewer: stevenek
99
---
@@ -24,7 +24,7 @@ To learn more about cluster witnesses and quorum, see [Understanding cluster and
2424

2525
## Before you begin
2626

27-
Before you can create a cloud witness, you must have an Azure account and subscription, and register your Azure Stack HCI cluster with Azure. See the following articles for more information:
27+
Before you can create a cloud witness, you must have an Azure account and subscription, and register your Azure Stack HCI cluster with Azure. For more information, see the following articles:
2828

2929
- Make sure that port 443 is open in your firewalls and that `*.core.windows.net` is included in any firewall allowlists you're using between the cluster and Azure Storage. For details, see [Required firewall URLs](../concepts/firewall-requirements.md).
3030
- If your network uses a proxy server for internet access, you must [configure proxy settings for Azure Stack HCI](./configure-proxy-settings.md).
