articles/iot-operations/troubleshoot/troubleshoot.md
56 additions & 24 deletions
Original file line number
Diff line number
Diff line change
@@ -1,29 +1,61 @@
1
1
---
2
2
title: Troubleshoot Azure IoT Operations
3
-
description: Troubleshoot your Azure IoT Operations deployment
3
+
description: Troubleshoot your Azure IoT Operations deployment and configuration
4
4
author: SoniaLopezBravo
5
5
ms.author: sonialopez
6
6
ms.topic: troubleshooting-general
7
7
ms.custom:
8
8
- ignite-2023
9
-
ms.date: 11/01/2024
9
+
ms.date: 03/07/2025
10
10
---
11
11
12
12
# Troubleshoot Azure IoT Operations
13
13
14
14
This article contains troubleshooting tips for Azure IoT Operations.
15
15
16
-
## General deployment troubleshooting
16
+
## Troubleshoot Azure IoT Operations deployment
17
17
18
-
For general deployment and configuration troubleshooting, you can use the Azure CLI IoT Operations *check* and *support* commands.
18
+
For general deployment and configuration troubleshooting, you can use the Azure CLI IoT Operations `check` and `support` commands.
19
19
20
20
You need [Azure CLI version 2.53.0 or higher](/cli/azure/install-azure-cli) with the [Azure IoT Operations extension](/cli/azure/iot/ops) installed.
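
As a rough sketch of the version prerequisite, the following shell function compares a version string against the 2.53.0 minimum. The `version_ge` helper and the `installed` value are hypothetical and aren't part of the Azure CLI. The sketch also assumes a `sort` that supports `-V`, as GNU coreutils does.

```bash
# Hypothetical helper (not part of the Azure CLI): succeeds when version $1
# is greater than or equal to version $2. Assumes `sort -V` is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder value for illustration; substitute the version that
# `az version` reports on your machine.
installed="2.60.0"

if version_ge "$installed" "2.53.0"; then
  echo "Azure CLI version is sufficient"
else
  echo "Upgrade the Azure CLI to 2.53.0 or higher"
fi
```

A version-aware sort is used because a plain string comparison would rank 2.9.0 above 2.53.0.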
21
21
22
-
- Use [az iot ops check](/cli/azure/iot/ops#az-iot-ops-check) to evaluate Azure IoT Operations service deployment for health, configuration, and usability. The *check* command can help you find problems in your deployment and configuration.
22
+
- Use [az iot ops check](/cli/azure/iot/ops#az-iot-ops-check) to evaluate Azure IoT Operations service deployment for health, configuration, and usability. The `check` command can help you find problems in your deployment and configuration.
23
23
24
-
- Use [az iot ops support create-bundle](/cli/azure/iot/ops/support#az-iot-ops-support-create-bundle) to collect logs and traces to help you diagnose problems. The *support create-bundle* command creates a standard support bundle zip archive you can review or provide to Microsoft Support.
24
+
- Use [az iot ops support create-bundle](/cli/azure/iot/ops/support#az-iot-ops-support-create-bundle) to collect logs and traces to help you diagnose problems. The `support create-bundle` command creates a standard support bundle zip archive you can review or provide to Microsoft Support.
25
25
26
-
## Secret management
26
+
### You see an UnauthorizedNamespaceError error message
27
+
28
+
If you see the following error message, you either didn't enable the required Azure Arc custom locations feature, or you enabled it with an incorrect custom locations resource provider (RP) object ID (OID).
29
+
30
+
```output
31
+
Message: Microsoft.ExtendedLocation resource provider does not have the required permissions to create a namespace on the cluster.
32
+
```
33
+
34
+
To resolve, follow [this guidance](/azure-arc/kubernetes/custom-locations#enable-custom-locations-on-your-cluster) for enabling the custom locations feature with the correct OID.
35
+
36
+
### You see a MissingResourceVersionOnHost error message
37
+
38
+
If you see the following error message, the custom location resource associated with the deployment isn't configured with the API versions of the resources being projected to the cluster.
39
+
40
+
```output
41
+
Message: The resource {resource Id} extended location {custom location resource Id} does not support the resource type {IoT Operations resource type} or api version {IoT Operations ARM API}. Please check with the owner of the extended location to ensure the host has the CRD {custom resource name} with group {api group name}.iotoperations.azure.com, plural {custom resource plural name}, and versions [{api group version}] installed.
42
+
```
43
+
44
+
To resolve, delete any provisioned resources associated with prior deployments, including custom locations. You can use `az iot ops delete` or an alternative mechanism. Because of a potential caching issue, either wait a few minutes after deletion before you redeploy Azure IoT Operations, or choose a custom location name with `az iot ops create --custom-location`.
45
+
46
+
### You see a LinkedAuthorizationFailed error message
47
+
48
+
If you see the following error message, the logged-in principal doesn't have the required permissions to deploy resources to the resource group specified in the resource sync resource ID.
49
+
50
+
```output
51
+
Message: The client {principal Id} with object id {principal object Id} has permission to perform action Microsoft.ExtendedLocation/customLocations/resourceSyncRules/write on scope {resource sync resource Id}; however, it does not have permission to perform action(s) Microsoft.Authorization/roleAssignments/write on the linked scope(s) {resource sync resource group} (respectively) or the linked scope(s) are invalid.
52
+
```
53
+
54
+
Deployment of resource sync rules requires the logged-in principal to have the `Microsoft.Authorization/roleAssignments/write` permission on the resource group that resources are being deployed to. This security constraint is necessary because edge-to-cloud resource hydration creates new resources in the target resource group.
55
+
56
+
To resolve, either elevate the principal's permissions or don't deploy resource sync rules. The current Azure IoT Operations CLI deploys resource sync rules only when you opt in with `--enable-rsync`, so omit this flag. Legacy versions of the CLI used an opt-out mechanism, `--disable-rsync-rules`.
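
To make the three deployment errors above easier to tell apart in a log, here is a minimal sketch that maps a message to its error name by matching a distinctive substring. The substrings are taken from the messages quoted in this article. The `classify_deploy_error` function is a hypothetical helper, not something the Azure CLI provides.

```bash
# Hypothetical helper: maps a deployment error message to the error name
# used in the headings above by matching a distinctive substring.
classify_deploy_error() {
  case "$1" in
    *"required permissions to create a namespace on the cluster"*)
      echo "UnauthorizedNamespaceError" ;;
    *"does not support the resource type"*)
      echo "MissingResourceVersionOnHost" ;;
    *"Microsoft.Authorization/roleAssignments/write"*)
      echo "LinkedAuthorizationFailed" ;;
    *)
      # Anything else isn't covered by this section.
      echo "unknown" ;;
  esac
}

# Example with the first message quoted in this section.
classify_deploy_error "Message: Microsoft.ExtendedLocation resource provider does not have the required permissions to create a namespace on the cluster."
```

Substring matching is used because the real messages contain resource IDs and principal IDs that vary between deployments.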
57
+
58
+
## Troubleshoot Azure Key Vault secret management
27
59
28
60
If you see the following error message related to secret management, you need to update your Azure Key Vault contents:
29
61
@@ -37,7 +69,7 @@ For help resolving this issue, please see https://go.microsoft.com/fwlink/?linki
37
69
38
70
This error occurs when Azure IoT Operations tries to synchronize a secret from Azure Key Vault that doesn't exist. To resolve this issue, you need to add the secret in Azure Key Vault before you create resources such as a secret provider class.
39
71
40
-
## Connector for OPC UA
72
+
## Troubleshoot OPC UA server connections
41
73
42
74
An OPC UA server connection fails with a `BadSecurityModeRejected` error if the connector tries to connect to a server that only exposes endpoints with no security. There are two options to resolve this issue:
43
75
@@ -50,11 +82,11 @@ An OPC UA server connection fails with a `BadSecurityModeRejected` error if the
50
82
51
83
- Add a secure endpoint to the OPC UA server and set up the certificate mutual trust to establish the connection.
The troubleshooting guidance in this section is specific to Azure IoT Operations when using the Layered Network Management component. For more information, see [How does Azure IoT Operations work in layered network?](../manage-layered-network/concept-iot-operations-in-layered-network.md).
56
88
57
-
### Can't install Layered Network Management on the parent level
89
+
### You can't install Layered Network Management on the parent level
58
90
59
91
If the Layered Network Management operator install fails or you can't apply the custom resource for a Layered Network Management instance:
60
92
@@ -63,7 +95,7 @@ If the Layered Network Management operator install fails or you can't apply the
63
95
1. Verify the Layered Network Management operator is in the *Running and Ready* state.
64
96
1. If applying the custom resource `kubectl apply -f cr.yaml` fails, the output of this command lists the reason for the error. For example, a CRD version mismatch or a wrong entry in the CRD.
65
97
66
-
### Can't Arc-enable the cluster through the parent level Layered Network Management
98
+
### You can't Arc-enable the cluster through the parent level Layered Network Management
67
99
68
100
If you repeatedly remove and onboard a cluster with the same machine, you might get an error while Arc-enabling the cluster on nested layers. For example:
69
101
@@ -81,12 +113,12 @@ If your cluster is behind an outbound proxy server, please ensure that you have
81
113
82
114
1. Reboot the host machine.
83
115
84
-
### Other types of Arc-enablement failures
116
+
If you still see the error, check the following:
85
117
86
118
1. Add the `--debug` parameter when running the `connectedk8s` command.
87
-
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management packet trace](#capture-layered-network-management-packet-trace).
119
+
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management to a packet trace](#you-want-to-capture-layered-network-management-to-a-packet-trace).
88
120
89
-
### Can't install Azure IoT Operations on the isolated cluster
121
+
### You can't install Azure IoT Operations on the isolated cluster
90
122
91
123
You can't install Azure IoT Operations components on nested layers. For example, Layered Network Management on level 4 is running, but you can't install Azure IoT Operations on level 3.
92
124
@@ -99,10 +131,10 @@ You can't install Azure IoT Operations components on nested layers. For example,
99
131
100
132
DNS should respond with the IP address of the Layered Network Management service.
101
133
102
-
1. If the domain is being resolved correctly, verify the domain is added to the allowlist. For more information, see [Check the allowlist of Layered Network Management](#check-the-allowlist-of-layered-network-management).
103
-
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management packet trace](#capture-layered-network-management-packet-trace).
134
+
1. If the domain is being resolved correctly, verify the domain is added to the allowlist. For more information, see [Can't connect to the Azure IoT Operations service from the child level Layered Network Management](#you-cant-connect-to-the-azure-iot-operations-service-from-the-child-level-layered-network-management).
135
+
1. Capture and investigate a network packet trace. For more information, see [capture Layered Network Management to a packet trace](#you-want-to-capture-layered-network-management-to-a-packet-trace).
104
136
105
-
### A pod fails when installing Azure IoT Operations on an isolated cluster
137
+
### You can install Azure IoT Operations on the isolated cluster but the pods fail to start
106
138
107
139
When you install the Azure IoT Operations components on a cluster, the installation starts and proceeds. However, initialization of one or a few of the components (pods) fails.
108
140
@@ -124,9 +156,9 @@ When installing the Azure IoT Operations components to a cluster, the installati
124
156
Warning Failed 3m14s kubelet Failed to pull image "…
125
157
```
126
158
127
-
### Check the allowlist of Layered Network Management
159
+
### You can't connect to the Azure IoT Operations service from the child level Layered Network Management
128
160
129
-
Layered Network Management blocks traffic if the destination domain isn't on the allowlist.
161
+
Layered Network Management blocks traffic if the destination domain isn't on the allowlist, which is the list of domains that the child level Layered Network Management is allowed to access. Check whether the domain is on the allowlist and, if it isn't, add it.
130
162
131
163
1. Run the following command to list the config maps.
132
164
@@ -150,18 +182,18 @@ Layered Network Management blocks traffic if the destination domain isn't on the
150
182
151
183
1. All the allowed domains are listed in the output.
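
Once you have the allowed domains from the output, a check like the following can confirm whether a specific domain is among them. This is only a sketch: the `domain_allowed` function is a hypothetical helper, the domains are placeholders, and it assumes you saved the allowlist as a file with one domain per line. The real allowlist format might differ, for example by including wildcard entries, which this sketch doesn't handle.

```bash
# Hypothetical helper: succeeds when domain $1 appears in the allowlist file $2.
# Assumes one domain per line and exact, case-insensitive, whole-line matching.
domain_allowed() {
  grep -qixF -- "$1" "$2"
}

# Placeholder allowlist for illustration only.
allowlist="$(mktemp)"
printf '%s\n' "allowed.example.com" "other.example.com" > "$allowlist"

if domain_allowed "allowed.example.com" "$allowlist"; then
  echo "Domain is on the allowlist"
else
  echo "Domain is missing from the allowlist"
fi

rm -f "$allowlist"
```

Whole-line matching (`-x`) keeps a parent domain such as `example.com` from matching an entry for one of its subdomains.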
### You want to capture Layered Network Management to a packet trace
154
186
155
187
In some cases, you might suspect that the Layered Network Management instance at the parent level isn't forwarding network traffic to a particular endpoint, and the failed connection to that endpoint is causing an issue for the service running on your node. It's possible that the service you enabled is trying to connect to a new endpoint after an update, or that you're trying to install a new Arc extension or service that requires connection to endpoints that aren't on the default allowlist. Usually the error message includes information about the connection failure. However, if there's no clear information about the missing endpoint, you can capture the network traffic on the child node for detailed debugging.
156
188
157
-
#### Windows host
189
+
#### [Windows host](#tab/tabid-windows)
158
190
159
191
1. Install Wireshark network traffic analyzer on the host.
160
192
1. Run Wireshark and start capturing.
161
193
1. Reproduce the installation or connection failure.
162
194
1. Stop capturing.
163
195
164
-
#### Linux host
196
+
#### [Linux host](#tab/tabid-linux)
165
197
166
198
1. Run the following command to start capturing:
167
199
@@ -172,14 +204,14 @@ In some cases, you might suspect that Layered Network Management instance at the
172
204
1. Reproduce the installation or connection failure.
173
205
1. Stop capturing.
174
206
175
-
#### Analyze the packet trace
207
+
***
176
208
177
209
Use Wireshark to open the trace file. Look for connection failures or unresponsive connections.
178
210
179
211
1. Filter the packets with the *ip.addr == [IP address]* parameter. Enter the IP address of your custom DNS service.
180
212
1. Review the DNS queries and responses, and check whether any domain name isn't on the allowlist of Layered Network Management.
181
213
182
-
## Operations experience
214
+
## Troubleshoot access to the operations experience web UI
183
215
184
216
To sign in to the [operations experience](https://iotoperations.azure.com) web UI, you need a Microsoft Entra ID account with at least contributor permissions for the resource group that contains your **Kubernetes - Azure Arc** instance.