Commit eae9515

Ramit feedback
1 parent ab8eb7b commit eae9515

1 file changed: +54 -40 lines changed

articles/iot-operations/troubleshoot/known-issues.md

Lines changed: 54 additions & 40 deletions
@@ -4,7 +4,7 @@ description: Known issues for the MQTT broker, Layered Network Management (previ
 author: dominicbetts
 ms.author: dobett
 ms.topic: troubleshooting-known-issue
-ms.date: 03/19/2025
+ms.date: 03/24/2025
 ---

 # Known issues: Azure IoT Operations
@@ -19,6 +19,59 @@ This article lists the known issues for Azure IoT Operations.

 - If you deploy Azure IoT Operations in GitHub Codespaces, shutting down and restarting the Codespace causes a `This codespace is currently running in recovery mode due to a configuration error.` issue. Currently, there's no workaround for the issue. If you need a cluster that supports shutting down and restarting, choose one of the options in [Prepare your Azure Arc-enabled Kubernetes cluster](../deploy-iot-ops/howto-prepare-cluster.md).

+## Update issues
+
+The following issues might occur when you update Azure IoT Operations.
+
+### Helm package enters a stuck state
+
+When you update Azure IoT Operations, the Helm package might enter a stuck state, preventing any helm install or upgrade operations from proceeding. This results in the `operation in progress` error, blocking further upgrades.
+
+Use the following steps to resolve the issue.
+
+1. Identify the stuck components by running the following command:
+
+```sh
+helm list -n azure-iot-operations --pending
+```
+In the output, look for the release name of components, `<component-release-name>`, which have a status of `pending-upgrade` or `pending-install`. The following components might be affected by this issue:
+
+- `-adr`
+- `-akri`
+- `-connectors`
+- `-mqttbroker`
+- `-dataflows`
+- `-schemaregistry`
+
+1. Using the release name components from step 1, retrieve the revision history of the stuck release. You need to run the following command for **each component from step 1**. For example, if components `-adr` and `-mqttbroker` are stuck, you run the following command twice, once for each component:
+
+```sh
+helm history <component-release-name> -n azure-iot-operations
+```
+
+Make sure to replace `<component-release-name>` with the release name of the components that are stuck. In the output, look for the last revision that has a status of `Deployed` or `Superseded` and note the revision number.
+
+1. Using the revision number from step 2, rollback the Helm release to the last successful revision. You need to run the following command for each component, `<component-release-name>`, and its revision number, `<revision-number>`, from steps 1 and 2.
+
+```sh
+helm rollback <component-release-name> <revision-number> -n azure-iot-operations
+```
+
+1. After the rollback of each component is complete, reattempt the upgrade using the following command:
+
+```sh
+az iot ops update
+```
+
+If you receive a message stating `Nothing to upgrade or upgrade complete`, force the upgrade by appending:
+
+```sh
+az iot ops upgrade ....... --release-train stable --version 1.0.15
+```
+
+> [!IMPORTANT]
+> You need to repeat steps 2 and 3 for each component that is stuck. You reattempt the upgrade only after all components are rolled back to the last successful revision.
+
 ## MQTT broker

 - Sometimes, the MQTT broker's memory usage can become unexpectedly high due to internal certificate rotation retries. This results in errors like 'failed to connect trace upload task to diagnostics service endpoint' in the logs. The issue is expected to be addressed in the next patch update. In the meantime, as a workaround, restart each broker pod one by one (including the diagnostic service, probe, and authentication service), making sure each backend recovers before moving on. Alternatively, [redeploy Azure IoT Operations with higher internal certificate duration](../manage-mqtt-broker/howto-encrypt-internal-traffic.md#internal-certificates), `1500h` or more.
@@ -144,45 +197,6 @@ kubectl delete pod aio-opc-opc.tcp-1-f95d76c54-w9v9c -n azure-iot-operations
 1. Run `kubectl delete pod aio-dataflow-operator-0 -n azure-iot-operations` to delete the data flow operator pod. Deleting the pod clears the crash status and restarts the pod.
 1. Wait for the operator pod to restart and deploy the data flow.

-## Helm package enters a stuck state
-
-When you update Azure IoT Operations, the Helm package might enter a stuck state, preventing any helm install or upgrade operations from proceeding. This results in the `operation in progress` error, blocking further upgrades.
-
-Use the following steps to resolve the issue:
-
-1. Identify the stuck Helm release by running the following command:
-
-```sh
-helm list -n azure-iot-operations --pending
-```
-In the output, look for a release name that contains `<component>` and has a status of `pending-upgrade` or `pending-install`.
-
-1. Retrieve the revision history of the stuck release. For example, for schema registry you run:
-
-```sh
-helm history <schema-registry-release-name> -n azure-iot-operations
-```
-
-In the output, look for the last revision that has a status of `Deployed` or `Superseded` and note the revision number.
-
-1. Rollback the Helm release to the last successful revision. For example, for schema registry you run:
-
-```sh
-helm rollback <schema-registry-release-name> <revision-number> -n azure-iot-operations
-```
-
-1. After the rollback is complete, reattempt the upgrade using the following command:
-
-```sh
-az iot ops update
-```
-
-If you receive a message stating `Nothing to upgrade or upgrade complete`, force the upgrade by appending:
-
-```sh
-az iot ops upgrade ....... --release-train stable --version 1.0.15
-```
-




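Note on the added "Helm package enters a stuck state" steps: because steps 2 and 3 repeat for every stuck component, the loop might be scripted. The following is a minimal sketch, not part of this commit and not an official tool. The function names `last_good_revision` and `rollback_stuck_releases` and the `NAMESPACE` and `DRY_RUN` variables are hypothetical. It assumes `helm list --short` prints one release name per line and that `helm history` prints its default table with the revision number in the first column; the status match is case-insensitive because the casing in the output may differ from the `Deployed` and `Superseded` spelling used in the article.

```sh
# Sketch only: automates steps 1 to 3 for every pending release.
# NAMESPACE and DRY_RUN are hypothetical knobs introduced for this example.
NAMESPACE="${NAMESPACE:-azure-iot-operations}"

# Read `helm history` table output on stdin and print the last revision
# whose status is deployed or superseded. Prints nothing if none exists.
last_good_revision() {
  awk '
    $1 ~ /^[0-9]+$/ {
      for (i = 2; i <= NF; i++) {
        s = tolower($i)
        if (s == "deployed" || s == "superseded") { rev = $1; break }
      }
    }
    END { if (rev != "") print rev }
  '
}

# Roll back each pending release to its last good revision.
# With DRY_RUN=1 the rollback commands are printed instead of executed.
rollback_stuck_releases() {
  for release in $(helm list -n "$NAMESPACE" --pending --short); do
    revision="$(helm history "$release" -n "$NAMESPACE" | last_good_revision)"
    if [ -z "$revision" ]; then
      echo "skip $release: no deployed or superseded revision found" >&2
      continue
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "helm rollback $release $revision -n $NAMESPACE"
    else
      helm rollback "$release" "$revision" -n "$NAMESPACE"
    fi
  done
}
```

Running `rollback_stuck_releases` with `DRY_RUN=1` first lets you review the planned rollbacks before anything changes. As the article states, reattempt `az iot ops update` only after every stuck component has been rolled back.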
0 commit comments
