articles/iot-operations/troubleshoot/known-issues.md
---
description: Known issues for the MQTT broker, Layered Network Management (previ
author: dominicbetts
ms.author: dobett
ms.topic: troubleshooting-known-issue
ms.date: 03/24/2025
---

# Known issues: Azure IoT Operations
This article lists the known issues for Azure IoT Operations.

- If you deploy Azure IoT Operations in GitHub Codespaces, shutting down and restarting the Codespace causes a `This codespace is currently running in recovery mode due to a configuration error.` error. Currently, there's no workaround for the issue. If you need a cluster that supports shutting down and restarting, choose one of the options in [Prepare your Azure Arc-enabled Kubernetes cluster](../deploy-iot-ops/howto-prepare-cluster.md).

## Update issues

The following issues might occur when you update Azure IoT Operations.

### Helm package enters a stuck state

When you update Azure IoT Operations, the Helm package might enter a stuck state that prevents any Helm install or upgrade operations from proceeding. This condition results in the `operation in progress` error, which blocks further upgrades.

Use the following steps to resolve the issue:

1. Identify the stuck components by running the following command:

    ```sh
    helm list -n azure-iot-operations --pending
    ```

    In the output, look for the release names of components, `<component-release-name>`, that have a status of `pending-upgrade` or `pending-install`. The following components might be affected by this issue:
38
+
39
+
-`-adr`
40
+
-`-akri`
41
+
-`-connectors`
42
+
-`-mqttbroker`
43
+
-`-dataflows`
44
+
-`-schemaregistry`

1. Using the release names from step 1, retrieve the revision history of each stuck release. You need to run the following command for **each component from step 1**. For example, if the `-adr` and `-mqttbroker` components are stuck, you run the command twice, once for each component:
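
    The exact command isn't included in this excerpt; assuming the standard `helm history` subcommand, it likely takes this form:

    ```sh
    helm history <component-release-name> -n azure-iot-operations
    ```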
    Make sure to replace `<component-release-name>` with the release name of each stuck component. In the output, look for the last revision that has a status of `deployed` or `superseded` and note the revision number.

1. Using the revision number from step 2, roll back each Helm release to the last successful revision. You need to run the following command for each component, `<component-release-name>`, and its revision number, `<revision-number>`, from steps 1 and 2:
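
    The rollback command isn't included in this excerpt; assuming the standard `helm rollback` subcommand, it likely looks like this:

    ```sh
    helm rollback <component-release-name> <revision-number> -n azure-iot-operations
    ```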

1. After the rollback of each component is complete, reattempt the upgrade by running the following command:

    ```sh
    az iot ops update
    ```

    If you receive a message stating `Nothing to upgrade or upgrade complete`, force the upgrade by running the command again with explicit `--release-train` and `--version` arguments:

    ```sh
    az iot ops upgrade ....... --release-train stable --version 1.0.15
    ```

> [!IMPORTANT]
> Repeat steps 2 and 3 for each component that is stuck. Reattempt the upgrade only after all components are rolled back to their last successful revision.
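
The steps above can be condensed into a single loop. The following is a hypothetical sketch, not part of the official procedure: it assumes `helm` and `jq` are installed, that all pending releases are safe to roll back in bulk, and that `helm history -o json` reports lowercase statuses such as `deployed` and `superseded`. Verify each rollback target before running anything like this against a production cluster.

```sh
# For every pending release in the namespace, find its last good
# revision and roll back to it (hypothetical sketch; verify first).
for release in $(helm list -n azure-iot-operations --pending -q); do
  rev=$(helm history "$release" -n azure-iot-operations -o json \
    | jq -r '[.[] | select(.status == "deployed" or .status == "superseded")] | last | .revision')
  helm rollback "$release" "$rev" -n azure-iot-operations
done

# After all rollbacks complete, reattempt the upgrade:
az iot ops update
```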

## MQTT broker

- Sometimes, the MQTT broker's memory usage can become unexpectedly high due to internal certificate rotation retries. This issue results in errors like `failed to connect trace upload task to diagnostics service endpoint` in the logs. The issue is expected to be addressed in the next patch update. In the meantime, as a workaround, restart each broker pod one by one (including the diagnostic service, probe, and authentication service), making sure each backend recovers before moving on. Alternatively, [redeploy Azure IoT Operations with a higher internal certificate duration](../manage-mqtt-broker/howto-encrypt-internal-traffic.md#internal-certificates), `1500h` or more.
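
  One possible shape for that rolling restart, sketched with `kubectl` (the placeholder pod name and label selector are assumptions, not from this article):

  ```sh
  # List the broker-related pods, then delete them one at a time,
  # waiting for each replacement pod to become Ready before continuing.
  kubectl get pods -n azure-iot-operations
  kubectl delete pod <broker-pod-name> -n azure-iot-operations
  kubectl wait --for=condition=Ready pod -l <broker-pod-label> -n azure-iot-operations --timeout=300s
  ```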

```sh
kubectl delete pod aio-opc-opc.tcp-1-f95d76c54-w9v9c -n azure-iot-operations
```

1. Run `kubectl delete pod aio-dataflow-operator-0 -n azure-iot-operations` to delete the data flow operator pod. Deleting the pod clears the crash status and restarts the pod.

1. Wait for the operator pod to restart and deploy the data flow.