Commit e53afd3

Merge pull request #276595 from dominicbetts/aio-troubleshooting
AIO: Troubleshooting updates
2 parents 3f581fd + 65fc22f commit e53afd3

3 files changed

+92
-104
lines changed

articles/iot-operations/process-data/howto-edit-pipelines.md

Lines changed: 0 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -55,8 +55,3 @@ To create, delete or copy pipelines, use the **Pipelines** tab in the Azure IoT
5555
:::image type="content" source="media/pipelines-manage.png" alt-text="A screenshot that shows the options in the pipelines list.":::
5656

5757
This list also lets you view the provisioning state and status of your pipelines
58-
59-
## Related content
60-
61-
- [Data Processor pipeline deployment status is failed](../troubleshoot/troubleshoot.md#data-processor-pipeline-deployment-status-is-failed)
62-
- [What are configuration patterns?](concept-configuration-patterns.md)

articles/iot-operations/troubleshoot/known-issues.md

Lines changed: 61 additions & 23 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
---
22
title: "Known issues: Azure IoT Operations Preview"
3-
description: A list of known issues for Azure IoT Operations.
3+
description: Known issues for Azure IoT MQ, Layered Network Management, OPC UA Broker, OPC PLC simulator, Data Processor, and Operations portal.
44
author: dominicbetts
55
ms.author: dobett
66
ms.topic: troubleshooting-known-issue
@@ -13,11 +13,18 @@ ms.date: 05/03/2024
1313

1414
[!INCLUDE [public-preview-note](../includes/public-preview-note.md)]
1515

16-
This article contains known issues for Azure IoT Operations Preview.
16+
This article lists the known issues for Azure IoT Operations Preview.
1717

18-
## Azure IoT Operations Preview
18+
## Deploy and uninstall issues
1919

20-
- You must use the Azure CLI interactive login `az login`. If you don't, you might see an error such as _ERROR: AADSTS530003: Your device is required to be managed to access this resource_.
20+
- You must use the Azure CLI interactive login `az login` when you deploy Azure IoT Operations. If you don't, you might see an error such as _ERROR: AADSTS530003: Your device is required to be managed to access this resource_.
21+
22+
- If your deployment fails with the `"code":"LinkedAuthorizationFailed"` error, it means that you don't have **Microsoft.Authorization/roleAssignments/write** permissions on the resource group that contains your cluster.
23+
24+
To resolve this issue, either request the required permissions or make the following adjustments to your deployment steps:
25+
26+
- If deploying with an Azure Resource Manager template, set the `deployResourceSyncRules` parameter to `false`.
27+
- If deploying with the Azure CLI, include the `--disable-rsync-rules` flag with the [az iot ops init](/cli/azure/iot/ops#az-iot-ops-init) command.
2128

2229
- Uninstalling K3s: When you uninstall k3s on Ubuntu by using the `/usr/local/bin/k3s-uninstall.sh` script, you might encounter an issue where the script gets stuck on unmounting the NFS pod. A workaround for this issue is to run the following command before you run the uninstall script: `sudo systemctl stop k3s`.
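
As a rough sketch of the Azure CLI workaround for the `LinkedAuthorizationFailed` error described above, the following shell function prints an `az iot ops init` command that includes the `--disable-rsync-rules` flag. The function name `build_init_command` is hypothetical, the cluster and resource group values are placeholders, and the use of `--cluster` and `--resource-group` is an assumption. A real deployment probably needs more arguments, so check the command reference before you run the result.

```bash
# Hypothetical helper: prints (does not run) an init command with resource sync rules disabled.
# The two arguments are placeholders supplied by the caller, not values from this article.
build_init_command() {
  local cluster_name="$1"
  local resource_group="$2"
  # --disable-rsync-rules is the flag named in the workaround; the other flags are assumed.
  printf 'az iot ops init --cluster %s --resource-group %s --disable-rsync-rules\n' \
    "$cluster_name" "$resource_group"
}
```

Calling `build_init_command "<CLUSTER_NAME>" "<RESOURCE_GROUP>"` prints the command so that you can review it before running it yourself.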
2330

@@ -39,23 +46,23 @@ This article contains known issues for Azure IoT Operations Preview.
3946

4047
## Azure IoT Layered Network Management Preview
4148

42-
- If the Layered Network Management service isn't getting an IP address while running K3S on Ubuntu host, reinstall K3S without _trafeik ingress controller_ by using the `--disable=traefik` option.
49+
- If the Layered Network Management service doesn't get an IP address while running K3S on an Ubuntu host, reinstall K3S without the _Traefik ingress controller_ by using the `--disable=traefik` option.
4350

4451
```bash
4552
curl -sfL https://get.k3s.io | sh -s - --disable=traefik --write-kubeconfig-mode 644
4653
```
4754

4855
For more information, see [Networking | K3s](https://docs.k3s.io/networking#traefik-ingress-controller).
4956

50-
- If DNS queries aren't getting resolved to expected IP address while using [CoreDNS](../manage-layered-network/howto-configure-layered-network.md#configure-coredns) service running on child network level, upgrade to Ubuntu 22.04 and reinstall K3S.
57+
- If DNS queries don't resolve to the expected IP address while using the [CoreDNS](../manage-layered-network/howto-configure-layered-network.md#configure-coredns) service running on the child network level, upgrade to Ubuntu 22.04 and reinstall K3S.
5158
5259
## Azure IoT OPC UA Broker Preview
5360
54-
- All AssetEndpointProfiles in the cluster have to be configured with the same transport authentication certificate, otherwise the OPC UA Broker might exhibit random behavior. To avoid this issue when using transport authentication, configure all asset endpoints with the same thumbprint for the transport authentication certificate in the Azure IoT Operations (preview) portal.
61+
- All `AssetEndpointProfiles` in the cluster must be configured with the same transport authentication certificate, otherwise the OPC UA Broker might exhibit random behavior. To avoid this issue when using transport authentication, configure all asset endpoints with the same thumbprint for the transport authentication certificate in the Azure IoT Operations (preview) portal.
5562
56-
- If you deploy an AssetEndpointProfile into the cluster and the OPC UA Broker can't connect to the configured endpoint on the first attempt, then the OPC UA Broker never retries to connect.
63+
- If you deploy an `AssetEndpointProfile` into the cluster and the OPC UA Broker can't connect to the configured endpoint on the first attempt, then the OPC UA Broker never retries to connect.
5764

58-
As a workaround, first fix the connection problem. Then either restart all the pods in the cluster with pod names that start with "aio-opc-opc.tcp", or delete the AssetEndpointProfile and deploy it again.
65+
As a workaround, first fix the connection problem. Then either restart all the pods in the cluster with pod names that start with "aio-opc-opc.tcp", or delete the `AssetEndpointProfile` and deploy it again.
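
One possible way to script the pod restart in the workaround above is to filter the pod list by name prefix and delete the matches so that Kubernetes recreates them. This is a minimal sketch, not an official procedure: the helper name `filter_pods_by_prefix` is hypothetical, and it assumes the `pod/<name>` output format of `kubectl get pods -o name`.

```bash
# Hypothetical helper: reads "pod/<name>" lines from standard input and prints
# only the entries whose pod name starts with the given prefix.
filter_pods_by_prefix() {
  local prefix="$1"
  local line
  while IFS= read -r line; do
    # Strip the leading "pod/" before comparing against the prefix.
    case "${line#pod/}" in
      "$prefix"*) printf '%s\n' "$line" ;;
    esac
  done
}

# Example usage (assumes GNU xargs for the -r option, which skips the delete
# when no pods match):
#   kubectl get pods -n azure-iot-operations -o name \
#     | filter_pods_by_prefix "aio-opc-opc.tcp" \
#     | xargs -r kubectl delete -n azure-iot-operations
```

Running the filter on its own first lets you confirm which pods match before you delete anything.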
5966

6067
## OPC PLC simulator
6168

@@ -84,30 +91,61 @@ kubectl patch AssetEndpointProfile $ENDPOINT_NAME \
8491
done
8592
```
8693

87-
Update the OPC UA Broker cluster extension to accept untrusted server certificates with the following command:
88-
89-
```azurecli
90-
az k8s-extension update --version 0.3.0-preview --name opc-ua-broker --release-train preview --cluster-name <CLUSTER_NAME> --resource-group <RESOURCE_GROUP> --cluster-type connectedClusters --auto-upgrade-minor-version false --config opcPlcSimulation.deploy=true --config opcPlcSimulation.autoAcceptUntrustedCertificates=true
91-
```
92-
93-
> [!CAUTION]
94-
> Don't use this configuration in production or pre-production environments. The configuration lowers the security level for the OPC PLC so that it accepts connections from any client without an explicit peer certificate trust operation.
95-
9694
If the OPC PLC simulator isn't sending data to the IoT MQ broker after you create a new asset, restart the OPC PLC simulator pod. The pod name looks like `aio-opc-opc.tcp-1-f95d76c54-w9v9c`. To restart the pod, use the `k9s` tool to kill the pod, or run the following command:
9795
9896
```bash
9997
kubectl delete pod aio-opc-opc.tcp-1-f95d76c54-w9v9c -n azure-iot-operations
10098
```
10199
100+
## Azure IoT Data Processor Preview
101+
102+
- If you see deployment errors with Data Processor pods, make sure that when you created your Azure Key Vault you chose **Vault access policy** as the **Permission model**.
103+
104+
- If the data processor extension fails to uninstall, run the following commands and try the uninstall operation again:
105+
106+
```bash
107+
kubectl delete pod aio-dp-reader-worker-0 --grace-period=0 --force -n azure-iot-operations
108+
kubectl delete pod aio-dp-runner-worker-0 --grace-period=0 --force -n azure-iot-operations
109+
```
110+
111+
- If edits you make to a pipeline aren't applied to messages, run the following commands to propagate the changes:
112+
113+
```bash
114+
kubectl rollout restart deployment aio-dp-operator -n azure-iot-operations
115+
116+
kubectl rollout restart statefulset aio-dp-runner-worker -n azure-iot-operations
117+
118+
kubectl rollout restart statefulset aio-dp-reader-worker -n azure-iot-operations
119+
```
120+
121+
- A momentary loss of communication with the IoT MQ broker pods can pause the processing of data pipelines. You might also see errors such as `service account token expired`. If this happens, run the following commands:
122+
123+
```bash
124+
kubectl rollout restart statefulset aio-dp-runner-worker -n azure-iot-operations
125+
kubectl rollout restart statefulset aio-dp-reader-worker -n azure-iot-operations
126+
```
127+
128+
- If data is corrupted in the Microsoft Fabric lakehouse table that your Data Processor pipeline is writing to, make sure that no other processes are writing to the table. If you write to the Microsoft Fabric lakehouse table from multiple sources, you might see corrupted data in the table.
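
The worker restart commands in the list above could be wrapped in a small function that also waits for each rollout to finish. The following is a sketch under stated assumptions: the function name `restart_dp_workers`, the `KUBECTL` override variable, and the 300 second timeout are illustrative choices, not part of the documented workaround.

```bash
# KUBECTL can be overridden (for example, KUBECTL=echo) to preview the commands
# without touching a cluster. This variable is an assumption of the sketch.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="azure-iot-operations"

# Hypothetical helper: restarts the two Data Processor worker statefulsets named
# in the article, and waits for each rollout to complete before moving on.
restart_dp_workers() {
  local name
  for name in aio-dp-runner-worker aio-dp-reader-worker; do
    "$KUBECTL" rollout restart statefulset "$name" -n "$NAMESPACE" || return 1
    # The timeout value is arbitrary; adjust it to suit your cluster.
    "$KUBECTL" rollout status statefulset "$name" -n "$NAMESPACE" --timeout=300s || return 1
  done
}
```

Setting `KUBECTL=echo` before calling `restart_dp_workers` prints the four commands instead of running them, which is a cheap way to check the sequence first.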
129+
102130
## Azure IoT Akri Preview
103131
104-
A sporadic issue might cause the handler to restart with the following error in the logs: `opcua@311 exception="System.IO.IOException: Failed to bind to address http://unix:/var/lib/akri/opcua-asset.sock: address already in use.`.
132+
A sporadic issue might cause the `aio-opc-asset-discovery` pod to restart with the following error in the logs: `opcua@311 exception="System.IO.IOException: Failed to bind to address http://unix:/var/lib/akri/opcua-asset.sock: address already in use.`.
105133
106134
To work around this issue, use the following steps to update the **DaemonSet** specification:
107135
108-
1. Locate the **Target** custom resource provided by **orchestration.iotoperations.azure.com** that contains the deployment specifications for **aio-opc-asset-discovery**.
109-
1. In the **aio-opc-asset-discovery** component of the target file, find the `spect.components.aio-opc-asset-discovery.properties.resource.spec.template.spec.containers.env` parameter.
110-
1. Add the following environment variables:
136+
1. Locate the **target** custom resource provided by `orchestration.iotoperations.azure.com` with a name that ends with `-ops-init-target`:
137+
138+
```console
139+
kubectl get targets -n azure-iot-operations
140+
```
141+
142+
1. Edit the target configuration and find the `spec.components.aio-opc-asset-discovery.properties.resource.spec.template.spec.containers.env` parameter. For example:
143+
144+
```console
145+
kubectl edit target solid-zebra-97r6jr7rw43vqv-ops-init-target -n azure-iot-operations
146+
```
147+
148+
1. Add the following environment variables to the configuration:
111149
112150
```yml
113151
- name: ASPNETCORE_URLS
@@ -118,7 +156,7 @@ To work around this issue, use the following steps to update the **DaemonSet** s
118156
fieldPath: "status.podIP"
119157
```
120158
121-
The final specification should look like the following example:
159+
1. Save your changes. The final specification looks like the following example:
122160
123161
```yml
124162
apiVersion: orchestrator.iotoperations.azure.com/v1
