articles/iot-operations/troubleshoot/known-issues.md (63 additions & 3 deletions)
@@ -4,9 +4,7 @@ description: Known issues for the MQTT broker, Layered Network Management (previ
 author: dominicbetts
 ms.author: dobett
 ms.topic: troubleshooting-known-issue
-ms.custom:
-  - ignite-2023
-ms.date: 03/05/2025
+ms.date: 03/24/2025
 ---
# Known issues: Azure IoT Operations
@@ -21,6 +19,64 @@ This article lists the known issues for Azure IoT Operations.
- If you deploy Azure IoT Operations in GitHub Codespaces, shutting down and restarting the Codespace causes a `This codespace is currently running in recovery mode due to a configuration error.` issue. Currently, there's no workaround for the issue. If you need a cluster that supports shutting down and restarting, choose one of the options in [Prepare your Azure Arc-enabled Kubernetes cluster](../deploy-iot-ops/howto-prepare-cluster.md).
## Update issues
The following issues might occur when you update Azure IoT Operations.
### Helm package enters a stuck state
When you update Azure IoT Operations, the Helm package might enter a stuck state that prevents any `helm install` or `helm upgrade` operations from proceeding. This results in the following error message, which blocks further upgrades:
```output
Message: Update failed for this resource, as there is a conflicting operation in progress. Please try after sometime.
```
Follow these steps to resolve the issue.
1. Identify the stuck components by running the following command:
```sh
helm list -n azure-iot-operations --pending
```
In the output, look for the release names of components, `<component-release-name>`, that have a status of `pending-upgrade` or `pending-install`. The following components might be affected by this issue:
- `-adr`
- `-akri`
- `-connectors`
- `-mqttbroker`
- `-dataflows`
- `-schemaregistry`
1. Using the `<component-release-name>` values from step 1, retrieve the revision history of each stuck release. You need to run the following command for **each component from step 1**. For example, if the `-adr` and `-mqttbroker` components are stuck, you run the command twice, once for each component.
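A typical form of this command, using the standard Helm CLI, is the following sketch (the placeholder and exact flags are assumptions, not taken from this article):

```sh
# Show the revision history for one stuck release in the Azure IoT Operations namespace.
helm history <component-release-name> -n azure-iot-operations
```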
Make sure to replace `<component-release-name>` with the release name of each stuck component. In the output, look for the last revision that has a status of `Deployed` or `Superseded`, and note the revision number.
1. Using the **revision number from step 2**, roll back each Helm release to its last successful revision. You need to run the following command for each component, using the `<component-release-name>` and `<revision-number>` values from steps 1 and 2.
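A typical form of the rollback, using the standard Helm CLI, is the following sketch (the placeholders and exact flags are assumptions, not taken from this article):

```sh
# Roll back one stuck release to its last successful revision.
helm rollback <component-release-name> <revision-number> -n azure-iot-operations
```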
> You need to repeat steps 2 and 3 for each component that is stuck. Reattempt the upgrade only after all components are rolled back to their last successful revision.
1. After the rollback of each component is complete, reattempt the upgrade using the following command:
```sh
az iot ops update
```
If you receive a message stating `Nothing to upgrade or upgrade complete`, force the upgrade by appending the `--release-train` and `--version` arguments, for example:
```sh
az iot ops upgrade ....... --release-train stable --version 1.0.15
```
## MQTT broker
- Sometimes, the MQTT broker's memory usage can become unexpectedly high because of internal certificate rotation retries. This results in errors like `failed to connect trace upload task to diagnostics service endpoint` in the logs. The issue is expected to be addressed in the next patch update. In the meantime, as a workaround, restart each broker pod one by one (including the diagnostic service, probe, and authentication service), making sure each backend recovers before moving on. Alternatively, [redeploy Azure IoT Operations with a longer internal certificate duration](../manage-mqtt-broker/howto-encrypt-internal-traffic.md#internal-certificates), `1500h` or more.
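A minimal sketch of the pod-by-pod restart workaround, assuming the default `azure-iot-operations` namespace; the pod name is a placeholder, so check the real names in your cluster first:

```sh
# List the pods in the Azure IoT Operations namespace to find the broker, diagnostics, probe, and authentication pods.
kubectl get pods -n azure-iot-operations

# Delete one pod at a time; its controller recreates it automatically.
kubectl delete pod <broker-pod-name> -n azure-iot-operations

# Watch the replacement pod until it's Running and Ready before restarting the next one.
kubectl get pods -n azure-iot-operations -w
```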
@@ -135,3 +191,7 @@ kubectl delete pod aio-opc-opc.tcp-1-f95d76c54-w9v9c -n azure-iot-operations
If you see both log entries from the two *kubectl logs* commands, cert-manager wasn't ready or running.
1. Run `kubectl delete pod aio-dataflow-operator-0 -n azure-iot-operations` to delete the data flow operator pod. Deleting the pod clears the crash status and restarts the pod.
1. Wait for the operator pod to restart and deploy the data flow.
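If the checks above point at cert-manager not being ready or running, a quick generic way to confirm its current state is the following sketch, assuming cert-manager is installed in its default `cert-manager` namespace:

```sh
# Verify that the cert-manager pods are Running and Ready.
kubectl get pods -n cert-manager
```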