Commit 3ff2fc5

Merge pull request #295895 from SoniaLopezBravo/patch-1
Adding ARM errors to IoT Operations deployment troubleshooting
2 parents ebca7df + ac2520f commit 3ff2fc5

2 files changed

+57
-27
lines changed


articles/iot-operations/troubleshoot/iot-operations-faq.yml

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -51,9 +51,7 @@ sections:
5151
- question: |
5252
Does Azure IoT Operations support Azure Private Link and private endpoints?
5353
answer: Azure IoT Operations currently does not support Azure Private Link and private endpoints.
54-
55-
5654
additionalContent: |
5755
## Related content
5856
59-
To learn more, see [IoT Operations overview](../overview-iot-operations.md) and [documentation](/azure/iot-operations/).
57+
To learn more, see [IoT Operations overview](../overview-iot-operations.md) and [documentation](/azure/iot-operations/).

articles/iot-operations/troubleshoot/troubleshoot.md

Lines changed: 56 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -1,29 +1,61 @@
11
---
22
title: Troubleshoot Azure IoT Operations
3-
description: Troubleshoot your Azure IoT Operations deployment
3+
description: Troubleshoot your Azure IoT Operations deployment and configuration
44
author: SoniaLopezBravo
55
ms.author: sonialopez
66
ms.topic: troubleshooting-general
77
ms.custom:
88
- ignite-2023
9-
ms.date: 11/01/2024
9+
ms.date: 03/07/2025
1010
---
1111

1212
# Troubleshoot Azure IoT Operations
1313

1414
This article contains troubleshooting tips for Azure IoT Operations.
1515

16-
## General deployment troubleshooting
16+
## Troubleshoot Azure IoT Operations deployment
1717

18-
For general deployment and configuration troubleshooting, you can use the Azure CLI IoT Operations *check* and *support* commands.
18+
For general deployment and configuration troubleshooting, you can use the Azure CLI IoT Operations `check` and `support` commands.
1919

2020
[Azure CLI version 2.53.0 or higher](/cli/azure/install-azure-cli) is required, with the [Azure IoT Operations extension](/cli/azure/iot/ops) installed.
2121

22-
- Use [az iot ops check](/cli/azure/iot/ops#az-iot-ops-check) to evaluate Azure IoT Operations service deployment for health, configuration, and usability. The *check* command can help you find problems in your deployment and configuration.
22+
- Use [az iot ops check](/cli/azure/iot/ops#az-iot-ops-check) to evaluate Azure IoT Operations service deployment for health, configuration, and usability. The `check` command can help you find problems in your deployment and configuration.
2323

24-
- Use [az iot ops support create-bundle](/cli/azure/iot/ops/support#az-iot-ops-support-create-bundle) to collect logs and traces to help you diagnose problems. The *support create-bundle* command creates a standard support bundle zip archive you can review or provide to Microsoft Support.
24+
- Use [az iot ops support create-bundle](/cli/azure/iot/ops/support#az-iot-ops-support-create-bundle) to collect logs and traces to help you diagnose problems. The `support create-bundle` command creates a standard support bundle zip archive you can review or provide to Microsoft Support.
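
As a minimal sketch, the two commands above can be wrapped in one helper that runs the health check and then collects a support bundle. The function name `run_aio_diagnostics` is hypothetical and not part of the Azure CLI. The sketch assumes the Azure IoT Operations extension is installed and that your current kubeconfig context points at the target cluster.

```shell
# Sketch only: run_aio_diagnostics is a hypothetical helper name.
# Assumes the Azure CLI and the Azure IoT Operations extension are installed.
run_aio_diagnostics() {
  # Evaluate the deployment for health, configuration, and usability.
  if ! az iot ops check; then
    echo "az iot ops check did not complete successfully; collecting a support bundle anyway." >&2
  fi
  # Create the standard support bundle zip archive.
  az iot ops support create-bundle
}
```

Call `run_aio_diagnostics` from a shell that is signed in with `az login`, then review the generated archive or provide it to Microsoft Support.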
2525

26-
## Secret management
26+
### You see an UnauthorizedNamespaceError error message
27+
28+
If you see the following error message, you either didn't enable the required Azure Arc custom locations feature, or you enabled it with an incorrect object ID (OID) for the custom locations resource provider (RP).
29+
30+
```output
31+
Message: Microsoft.ExtendedLocation resource provider does not have the required permissions to create a namespace on the cluster.
32+
```
33+
34+
To resolve, follow [this guidance](/azure-arc/kubernetes/custom-locations#enable-custom-locations-on-your-cluster) for enabling the custom locations feature with the correct OID.
35+
36+
### You see a MissingResourceVersionOnHost error message
37+
38+
If you see the following error message, the custom location resource associated with the deployment isn't properly configured with the API versions of the resources being projected to the cluster.
39+
40+
```output
41+
Message: The resource {resource Id} extended location {custom location resource Id} does not support the resource type {IoT Operations resource type} or api version {IoT Operations ARM API}. Please check with the owner of the extended location to ensure the host has the CRD {custom resource name} with group {api group name}.iotoperations.azure.com, plural {custom resource plural name}, and versions [{api group version}] installed.
42+
```
43+
44+
To resolve, delete any provisioned resources associated with prior deployments, including custom locations. You can use `az iot ops delete` or an alternative mechanism. Because of a potential caching issue, wait a few minutes after deletion before you redeploy Azure IoT Operations, or choose a custom location name with `az iot ops create --custom-location`.
45+
46+
### You see a LinkedAuthorizationFailed error message
47+
48+
If you see the following error message, the logged-in principal doesn't have the required permissions to deploy resources to the resource group specified in the resource sync resource ID.
49+
50+
```output
51+
Message: The client {principal Id} with object id {principal object Id} has permission to perform action Microsoft.ExtendedLocation/customLocations/resourceSyncRules/write on scope {resource sync resource Id}; however, it does not have permission to perform action(s) Microsoft.Authorization/roleAssignments/write on the linked scope(s) {resource sync resource group} (respectively) or the linked scope(s) are invalid.
52+
```
53+
54+
Deployment of resource sync rules requires the logged-in principal to have the `Microsoft.Authorization/roleAssignments/write` permission on the resource group that resources are being deployed to. This security constraint is necessary because edge-to-cloud resource hydration creates new resources in the target resource group.
55+
56+
To resolve, either elevate the principal's permissions or don't deploy resource sync rules. The current Azure IoT Operations CLI deploys resource sync rules only when you opt in with `--enable-rsync`, so omit this flag. Legacy versions of the CLI had an opt-out mechanism, `--disable-rsync-rules`.
57+
58+
## Troubleshoot Azure Key Vault secret management
2759

2860
If you see the following error message related to secret management, you need to update your Azure Key Vault contents:
2961

@@ -37,7 +69,7 @@ For help resolving this issue, please see https://go.microsoft.com/fwlink/?linki
3769

3870
This error occurs when Azure IoT Operations tries to synchronize a secret from Azure Key Vault that doesn't exist. To resolve this issue, you need to add the secret in Azure Key Vault before you create resources such as a secret provider class.
3971

40-
## Connector for OPC UA
72+
## Troubleshoot OPC UA server connections
4173

4274
An OPC UA server connection fails with a `BadSecurityModeRejected` error if the connector tries to connect to a server that only exposes endpoints with no security. There are two options to resolve this issue:
4375

@@ -50,11 +82,11 @@ An OPC UA server connection fails with a `BadSecurityModeRejected` error if the
5082

5183
- Add a secure endpoint to the OPC UA server and set up the certificate mutual trust to establish the connection.
5284

53-
## Azure IoT Layered Network Management (preview) troubleshooting
85+
## Troubleshoot Azure IoT Layered Network Management (preview)
5486

5587
The troubleshooting guidance in this section is specific to Azure IoT Operations when using the Layered Network Management component. For more information, see [How does Azure IoT Operations work in layered network?](../manage-layered-network/concept-iot-operations-in-layered-network.md).
5688

57-
### Can't install Layered Network Management on the parent level
89+
### You can't install Layered Network Management on the parent level
5890

5991
If the Layered Network Management operator install fails or you can't apply the custom resource for a Layered Network Management instance:
6092

@@ -63,7 +95,7 @@ If the Layered Network Management operator install fails or you can't apply the
6395
1. Verify the Layered Network Management operator is in the *Running and Ready* state.
6496
1. If applying the custom resource with `kubectl apply -f cr.yaml` fails, the output of the command lists the reason for the error. For example, a CRD version mismatch or a wrong entry in the CRD.
6597

66-
### Can't Arc-enable the cluster through the parent level Layered Network Management
98+
### You can't Arc-enable the cluster through the parent level Layered Network Management
6799

68100
If you repeatedly remove and onboard a cluster with the same machine, you might get an error while Arc-enabling the cluster on nested layers. For example:
69101

@@ -81,12 +113,12 @@ If your cluster is behind an outbound proxy server, please ensure that you have
81113

82114
1. Reboot the host machine.
83115

84-
### Other types of Arc-enablement failures
116+
If you still see the error, check the following:
85117

86118
1. Add the `--debug` parameter when running the `connectedk8s` command.
87-
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management packet trace](#capture-layered-network-management-packet-trace).
119+
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management to a packet trace](#you-want-to-capture-layered-network-management-to-a-packet-trace).
88120

89-
### Can't install Azure IoT Operations on the isolated cluster
121+
### You can't install Azure IoT Operations on the isolated cluster
90122

91123
You can't install Azure IoT Operations components on nested layers. For example, Layered Network Management on level 4 is running but you can't install Azure IoT Operations on level 3.
92124

@@ -99,10 +131,10 @@ You can't install Azure IoT Operations components on nested layers. For example,
99131

100132
DNS should respond with the IP address of the Layered Network Management service.
101133

102-
1. If the domain is being resolved correctly, verify the domain is added to the allowlist. For more information, see [Check the allowlist of Layered Network Management](#check-the-allowlist-of-layered-network-management).
103-
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management packet trace](#capture-layered-network-management-packet-trace).
134+
1. If the domain is being resolved correctly, verify the domain is added to the allowlist. For more information, see [Can't connect to the Azure IoT Operations service from the child level Layered Network Management](#you-cant-connect-to-the-azure-iot-operations-service-from-the-child-level-layered-network-management).
135+
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management to a packet trace](#you-want-to-capture-layered-network-management-to-a-packet-trace).
104136
105-
### A pod fails when installing Azure IoT Operations on an isolated cluster
137+
### You can install Azure IoT Operations on the isolated cluster but the pods fail to start
106138
107139
When installing the Azure IoT Operations components to a cluster, the installation starts and proceeds. However, initialization of one or a few of the components (pods) fails.
108140
@@ -124,9 +156,9 @@ When installing the Azure IoT Operations components to a cluster, the installati
124156
Warning Failed 3m14s kubelet Failed to pull image "…
125157
```
126158
127-
### Check the allowlist of Layered Network Management
159+
### You can't connect to the Azure IoT Operations service from the child level Layered Network Management
128160

129-
Layered Network Management blocks traffic if the destination domain isn't on the allowlist.
161+
Layered Network Management blocks traffic if the destination domain isn't on the allowlist. The allowlist is a list of domains that are allowed to be accessed from the child level Layered Network Management. Check the allowlist of Layered Network Management to verify if the domain is included. If the domain isn't on the allowlist, you can add it to the allowlist.
130162

131163
1. Run the following command to list the config maps.
132164

@@ -150,18 +182,18 @@ Layered Network Management blocks traffic if the destination domain isn't on the
150182

151183
1. All the allowed domains are listed in the output.
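
The final check can be sketched as a small filter over the listed domains. This is an illustration under assumptions: `domain_in_allowlist` is a hypothetical helper, and it assumes you saved the allowed domains as plain text with one domain per line. It only does exact matching, so it doesn't interpret wildcard entries.

```shell
# Sketch only: domain_in_allowlist is a hypothetical helper.
# Usage: domain_in_allowlist DOMAIN < allowlist.txt
# Assumes the allowlist is plain text with one domain per line.
# Returns 0 if DOMAIN appears as an exact line, non-zero otherwise.
domain_in_allowlist() {
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -Fxq -- "$1"
}
```

A non-zero exit status suggests the domain needs to be added to the allowlist, or that it's covered only by a wildcard entry that this sketch doesn't evaluate.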
152184

153-
### Capture Layered Network Management packet trace
185+
### You want to capture Layered Network Management to a packet trace
154186

155187
In some cases, you might suspect that the Layered Network Management instance at the parent level isn't forwarding network traffic to a particular endpoint, and that a failed connection to a required endpoint is causing an issue for the service running on your node. It's possible that the service you enabled is trying to connect to a new endpoint after an update. Or you're trying to install a new Arc extension or service that requires connection to endpoints that aren't on the default allowlist. Usually the error message contains information about the connection failure. However, if there's no clear information about the missing endpoint, you can capture the network traffic on the child node for detailed debugging.
156188
157-
#### Windows host
189+
#### [Windows host](#tab/tabid-windows)
158190
159191
1. Install Wireshark network traffic analyzer on the host.
160192
1. Run Wireshark and start capturing.
161193
1. Reproduce the installation or connection failure.
162194
1. Stop capturing.
163195
164-
#### Linux host
196+
#### [Linux host](#tab/tabid-linux)
165197
166198
1. Run the following command to start capturing:
167199
@@ -172,14 +204,14 @@ In some cases, you might suspect that Layered Network Management instance at the
172204
1. Reproduce the installation or connection failure.
173205
1. Stop capturing.
174206
175-
#### Analyze the packet trace
207+
***
176208
177209
Use Wireshark to open the trace file. Look for connection failures or unresponsive connections.
178210
179211
1. Filter the packets with the *ip.addr == [IP address]* parameter. Enter the IP address of your custom DNS service.
180212
1. Review the DNS query and response, and check whether there's a domain name that isn't on the allowlist of Layered Network Management.
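
If you prefer the command line, the same filter can be sketched with `tshark`, the terminal tool that ships with Wireshark. The helper name `show_dns_traffic`, the trace file name, and the IP address are placeholders, not values from this article. The sketch adds `&& dns` to the display filter so that only DNS packets to or from the given address are shown.

```shell
# Sketch only: show_dns_traffic is a hypothetical helper.
# Usage: show_dns_traffic TRACE_FILE DNS_SERVICE_IP
# Assumes tshark is installed alongside Wireshark.
show_dns_traffic() {
  # -r reads a saved capture; -Y applies a Wireshark display filter.
  tshark -r "$1" -Y "ip.addr == $2 && dns"
}
```

For example, `show_dns_traffic trace.pcap 10.0.0.10` lists the DNS queries and responses exchanged with a DNS service at that placeholder address.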
181213
182-
## Operations experience
214+
## Troubleshoot access to the operations experience web UI
183215
184216
To sign in to the [operations experience](https://iotoperations.azure.com) web UI, you need a Microsoft Entra ID account with at least contributor permissions for the resource group that contains your **Kubernetes - Azure Arc** instance.
185217
