articles/azure-arc/kubernetes/troubleshooting.md
88 additions & 88 deletions
@@ -1,6 +1,6 @@
1
1
---
2
2
title: "Troubleshoot common Azure Arc-enabled Kubernetes issues"
3
-
ms.date: 03/13/2023
3
+
ms.date: 03/28/2023
4
4
ms.topic: how-to
5
5
ms.custom: devx-track-azurecli
6
6
description: "Learn how to resolve common issues with Azure Arc-enabled Kubernetes clusters and GitOps."
@@ -81,15 +81,15 @@ For more information, see [Debugging DNS Resolution](https://kubernetes.io/docs/
81
81
82
82
Issues with outbound network connectivity from the cluster may arise for different reasons. First make sure all of the [network requirements](network-requirements.md) have been met.
83
83
84
-
If you encounter this issue, and your cluster is behind an outbound proxy server, make sure you have passed proxy parameters during the onboarding of your cluster and that the proxy is configured correctly. For more information, see [Connect using an outbound proxy server](quickstart-connect-cluster.md#connect-using-an-outbound-proxy-server).
84
+
If you encounter this issue, and your cluster is behind an outbound proxy server, make sure you've passed proxy parameters during the onboarding of your cluster and that the proxy is configured correctly. For more information, see [Connect using an outbound proxy server](quickstart-connect-cluster.md#connect-using-an-outbound-proxy-server).
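As a rough sketch of what passing proxy parameters during onboarding can look like, the helper below wraps `az connectedk8s connect` with the `--proxy-https`, `--proxy-http`, and `--proxy-skip-range` parameters. The function name, the `AZ` override variable, and the sample values are illustrative assumptions and don't come from this article; check the linked quickstart for the parameters that apply to your environment.

```shell
# AZ can be overridden (for example AZ=echo) to preview the command without running it.
AZ=${AZ:-az}

# Hypothetical helper: onboard a cluster through an outbound proxy.
# $1: cluster name
# $2: resource group
# $3: proxy URL (this sketch assumes the same proxy serves HTTP and HTTPS)
# $4: comma-separated addresses and CIDR ranges that bypass the proxy
connect_with_proxy() (
  $AZ connectedk8s connect --name "$1" --resource-group "$2" \
    --proxy-https "$3" --proxy-http "$3" --proxy-skip-range "$4"
)
```

For example, `connect_with_proxy AzureArcCluster AzureArc http://proxy.example:3128 10.0.0.0/16,kubernetes.default.svc` onboards the cluster used elsewhere in this article through a placeholder proxy address.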
85
85
86
86
### Unable to retrieve MSI certificate
87
87
88
88
Problems retrieving the MSI certificate are usually due to network issues. Check to make sure all of the [network requirements](network-requirements.md) have been met, then try again.
89
89
90
90
### Insufficient cluster permissions
91
91
92
-
If the provided kubeconfig file doesn't have sufficient permissions to install the Azure Arc agents, the Azure CLI command will return an error.
92
+
If the provided kubeconfig file doesn't have sufficient permissions to install the Azure Arc agents, the Azure CLI command returns an error.
93
93
94
94
```azurecli
95
95
az connectedk8s connect --resource-group AzureArc --name AzureArcCluster
@@ -159,7 +159,7 @@ To resolve this issue, try the following steps.
3. If the certificate below isn't present, the system assigned managed identity hasn't been installed.
162
+
3. If the `azure-identity-certificate` isn't present, the system-assigned managed identity hasn't been installed.
163
163
164
164
```console
165
165
kubectl get secret -n azure-arc -o yaml | grep name:
@@ -169,7 +169,7 @@ To resolve this issue, try the following steps.
169
169
name: azure-identity-certificate
170
170
```
171
171
172
-
To resolve this issue, try deleting the Arc deployment by running the `az connectedk8s delete` command and reinstalling it. If the issue continues to happen, it could be an issue with your proxy settings. In that case, [try connecting your cluster to Azure Arc via a proxy](./quickstart-connect-cluster.md#connect-using-an-outbound-proxy-server) to connect your cluster to Arc via a proxy. Please also verify if all the [network prerequisites](network-requirements.md) have been met.
172
+
To resolve this issue, try deleting the Arc deployment by running the `az connectedk8s delete` command and reinstalling it. If the issue continues to happen, it could be an issue with your proxy settings. In that case, [try connecting your cluster to Azure Arc via a proxy](./quickstart-connect-cluster.md#connect-using-an-outbound-proxy-server). Also verify that all of the [network prerequisites](network-requirements.md) have been met.
173
173
174
174
4. If the `clusterconnect-agent` and the `config-agent` pods are running, but the `kube-aad-proxy` pod is missing, check your pod security policies. This pod uses the `azure-arc-kube-aad-proxy-sa` service account, which doesn't have admin permissions but requires the permission to mount host path.
175
175
@@ -183,7 +183,7 @@ To resolve this issue, try the following steps.
183
183
name: kube-aad-proxy-certificate
184
184
```
185
185
186
-
If the certificate is missing, [delete the deployment](quickstart-connect-cluster.md#clean-up-resources) and re-onboard with a different name for the cluster. If the problem continues, please contact support.
186
+
If the certificate is missing, [delete the deployment](quickstart-connect-cluster.md#clean-up-resources) and re-onboard with a different name for the cluster. If the problem continues, contact support.
Ensure that you have the latest helm version installed before proceeding.
198
198
This operation might take a while...
199
199
200
-
Please check if the azure-arc namespace was deployed and run 'kubectl get pods -n azure-arc' to check if all the pods are in running state. A possible cause for pods stuck in pending state could be insufficientresources on the Kubernetes cluster to onboard to arc.
200
+
Check if the azure-arc namespace was deployed, and run 'kubectl get pods -n azure-arc' to check if all the pods are in running state. A possible cause for pods stuck in pending state could be insufficientresources on the Kubernetes cluster to onboard to Azure Arc.
201
201
ValidationError: Unable to install helm release: Error: customresourcedefinitions.apiextensions.k8s.io "connectedclusters.arc.azure.com" not found
202
202
```
203
203
@@ -243,66 +243,6 @@ az extension add --name k8s-configuration
243
243
244
244
## GitOps management
245
245
246
-
### Flux v1 - General
247
-
248
-
> [!NOTE]
249
-
> Eventually Azure will stop supporting GitOps with Flux v1, so begin using [Flux v2](./tutorial-use-gitops-flux2.md) as soon as possible.
250
-
251
-
To help troubleshoot issues with `sourceControlConfigurations` resource (Flux v1), run these Azure CLI commands with `--debug` parameter specified:
252
-
253
-
```azurecli
254
-
az provider show -n Microsoft.KubernetesConfiguration --debug
255
-
az k8s-configuration create <parameters> --debug
256
-
```
257
-
258
-
### Flux v1 - Create configurations
259
-
260
-
Write permissions on the Azure Arc-enabled Kubernetes resource (`Microsoft.Kubernetes/connectedClusters/Write`) are necessary and sufficient for creating configurations on that cluster.
message: 'Error: {exit status 1} occurred while doing the operation : {Installing
295
-
the operator} on the config'
296
-
operatorPropertiesHashed: ""
297
-
publicKey: ""
298
-
retryCountPublicKey: 0
299
-
status: Installing the operator
300
-
kind: List
301
-
metadata:
302
-
resourceVersion: ""
303
-
selfLink: ""
304
-
```
305
-
306
246
### Flux v2 - General
307
247
308
248
To help troubleshoot issues with `fluxConfigurations` resource (Flux v2), run these Azure CLI commands with the `--debug` parameter specified:
@@ -322,7 +262,7 @@ For more information, see [How do I resolve `webhook does not support dry run` e
322
262
323
263
The `microsoft.flux` extension installs the Flux controllers and Azure GitOps agents into your Azure Arc-enabled Kubernetes or Azure Kubernetes Service (AKS) clusters. If the extension isn't already installed in a cluster and you create a GitOps configuration resource for that cluster, the extension will be installed automatically.
324
264
325
-
If you experience an error during installation, or if the extension is in a failed state, run a script to investigate. The cluster-type parameter can be set to `connectedClusters` for an Arc-enabled cluster or `managedClusters` for an AKS cluster. The name of the `microsoft.flux` extension will be "flux" if the extension was installed automatically during creation of a GitOps configuration. Look in the "statuses" object for information.
265
+
If you experience an error during installation, or if the extension is in a failed state, run a script to investigate. The cluster-type parameter can be set to `connectedClusters` for an Arc-enabled cluster or `managedClusters` for an AKS cluster. The name of the `microsoft.flux` extension is "flux" if the extension was installed automatically during creation of a GitOps configuration. Look in the "statuses" object for information.
* For an AKS cluster, assure that the subscription has the `Microsoft.ContainerService/AKS-ExtensionManager` feature flag enabled.
327
+
* For an AKS cluster, ensure that the subscription has the `Microsoft.ContainerService/AKS-ExtensionManager` feature flag enabled.
388
328
389
329
```azurecli
390
330
az feature register --namespace Microsoft.ContainerService --name AKS-ExtensionManager
391
331
```
392
332
393
-
* Assure that the cluster doesn't have any policies that restrict creation of the `flux-system` namespace or resources in that namespace.
333
+
* Ensure that the cluster doesn't have any policies that restrict creation of the `flux-system` namespace or resources in that namespace.
394
334
395
-
With these actions accomplished, you can either [recreate a flux configuration](./tutorial-use-gitops-flux2.md), which will install the flux extension automatically, or you can reinstall the flux extension manually.
335
+
With these actions accomplished, you can either [recreate a flux configuration](./tutorial-use-gitops-flux2.md), which installs the flux extension automatically, or you can reinstall the flux extension manually.
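One possible way to reinstall the extension manually is with `az k8s-extension create`. The sketch below assumes the extension instance is named `flux`, matching the name used when the extension is installed automatically. The function name and the `AZ` override variable are hypothetical.

```shell
# AZ can be overridden (for example AZ=echo) to preview the command without running it.
AZ=${AZ:-az}

# Hypothetical helper: install the microsoft.flux extension on a cluster.
# $1: resource group
# $2: cluster name
# $3: cluster type, connectedClusters (default) or managedClusters
reinstall_flux_extension() (
  $AZ k8s-extension create --resource-group "$1" --cluster-name "$2" \
    --cluster-type "${3:-connectedClusters}" \
    --name flux --extension-type microsoft.flux
)
```

Calling `reinstall_flux_extension <resource-group> <cluster-name> managedClusters` targets an AKS cluster instead of an Arc-enabled cluster.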
396
336
397
337
### Flux v2 - Installing the `microsoft.flux` extension in a cluster with Azure AD Pod Identity enabled
398
338
@@ -410,7 +350,7 @@ The extension status also returns as "Failed".
410
350
411
351
The extension-agent pod is trying to get its token from IMDS on the cluster in order to talk to the extension service in Azure, but the token request is intercepted by the [pod identity](../../aks/use-azure-ad-pod-identity.md).
412
352
413
-
You can fix this issue by upgrading to the latest version of the `microsoft.flux` extension. For version 1.6.1 or earlier, the workaround is to create an `AzurePodIdentityException` that will tell Azure AD Pod Identity to ignore the token requests from flux-extension pods.
353
+
You can fix this issue by upgrading to the latest version of the `microsoft.flux` extension. For version 1.6.1 or earlier, the workaround is to create an `AzurePodIdentityException` that tells Azure AD Pod Identity to ignore the token requests from flux-extension pods.
414
354
415
355
```console
416
356
apiVersion: aadpodidentity.k8s.io/v1
@@ -439,17 +379,77 @@ The controllers installed in your Kubernetes cluster with the Microsoft Flux ext
439
379
440
380
| Container Name | CPU limit | Memory limit |
441
381
| -------------- | ----------- | -------- |
442
-
| fluxconfig-agent | 50m | 150Mi |
443
-
| fluxconfig-controller | 100m | 150Mi |
444
-
| fluent-bit | 20m | 150Mi |
445
-
| helm-controller | 1000m | 1Gi |
446
-
| source-controller | 1000m | 1Gi |
447
-
| kustomize-controller | 1000m | 1Gi |
448
-
| notification-controller | 1000m | 1Gi |
449
-
| image-automation-controller | 1000m | 1Gi |
450
-
| image-reflector-controller | 1000m | 1Gi |
451
-
452
-
If you have enabled a custom or built-in Azure Gatekeeper Policy, such as `Kubernetes cluster containers CPU and memory resource limits should not exceed the specified limits`, that limits the resources for containers on Kubernetes clusters, you will need to either ensure that the resource limits on the policy are greater than the limits shown above or the `flux-system` namespace is part of the `excludedNamespaces` parameter in the policy assignment.
382
+
| fluxconfig-agent | 50 m | 150 Mi |
383
+
| fluxconfig-controller | 100 m | 150 Mi |
384
+
| fluent-bit | 20 m | 150 Mi |
385
+
| helm-controller | 1000 m | 1 Gi |
386
+
| source-controller | 1000 m | 1 Gi |
387
+
| kustomize-controller | 1000 m | 1 Gi |
388
+
| notification-controller | 1000 m | 1 Gi |
389
+
| image-automation-controller | 1000 m | 1 Gi |
390
+
| image-reflector-controller | 1000 m | 1 Gi |
391
+
392
+
If you've enabled a custom or built-in Azure Gatekeeper Policy that limits the resources for containers on Kubernetes clusters, such as `Kubernetes cluster containers CPU and memory resource limits should not exceed the specified limits`, ensure that either the resource limits on the policy are greater than the limits shown above or that the `flux-system` namespace is part of the `excludedNamespaces` parameter in the policy assignment.
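To compare a policy's limits against the controller limits in the table, a small helper like the following may be useful. It is only a sketch: the function names are hypothetical, it handles only whole-core or `m` CPU values and `Mi` or `Gi` memory values, and it assumes that a policy limit equal to the controller limit is acceptable (based on the policy's "should not exceed" wording).

```shell
# Convert a CPU quantity such as "1000m" or "2" to millicores.
cpu_millicores() (
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
)

# Convert a memory quantity such as "150Mi" or "1Gi" to MiB.
mem_mib() (
  case "$1" in
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *Mi) echo "${1%Mi}" ;;
    *)   echo "unsupported unit: $1" >&2; exit 1 ;;
  esac
)

# Succeeds when the policy limits are at least the controller limits.
# $1: policy CPU limit      $2: policy memory limit
# $3: controller CPU limit  $4: controller memory limit
policy_allows() (
  [ "$(cpu_millicores "$1")" -ge "$(cpu_millicores "$3")" ] &&
  [ "$(mem_mib "$2")" -ge "$(mem_mib "$4")" ]
)
```

For example, `policy_allows 2 2Gi 1000m 1Gi` succeeds, while `policy_allows 500m 2Gi 1000m 1Gi` fails because the policy's CPU limit is below the `helm-controller` limit.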
393
+
394
+
### Flux v1
395
+
396
+
> [!NOTE]
397
+
> We recommend [migrating to Flux v2](conceptual-gitops-flux2.md#migrate-from-flux-v1) as soon as possible. Support for Flux v1-based cluster configuration resources created prior to May 1, 2023 will end on [May 24, 2025](https://azure.microsoft.com/updates/migrate-your-gitops-configurations-from-flux-v1-to-flux-v2-by-24-may-2025/). Starting on May 1, 2023, you won't be able to create new Flux v1-based cluster configuration resources.
398
+
399
+
To help troubleshoot issues with the `sourceControlConfigurations` resource (Flux v1), run these Azure CLI commands with the `--debug` parameter specified:
400
+
401
+
```azurecli
402
+
az provider show -n Microsoft.KubernetesConfiguration --debug
403
+
az k8s-configuration create <parameters> --debug
404
+
```
405
+
406
+
#### Flux v1 - Create configurations
407
+
408
+
Write permissions on the Azure Arc-enabled Kubernetes resource (`Microsoft.Kubernetes/connectedClusters/Write`) are necessary and sufficient for creating configurations on that cluster.
message: 'Error: {exit status 1} occurred while doing the operation : {Installing
443
+
the operator} on the config'
444
+
operatorPropertiesHashed: ""
445
+
publicKey: ""
446
+
retryCountPublicKey: 0
447
+
status: Installing the operator
448
+
kind: List
449
+
metadata:
450
+
resourceVersion: ""
451
+
selfLink: ""
452
+
```
453
453
454
454
## Monitoring
455
455
@@ -511,7 +511,7 @@ This warning occurs when you use a service principal to log into Azure. The serv
511
511
az ad sp show --id bc313c14-388c-4e7d-a58e-70017303ee3b --query objectId -o tsv
512
512
```
513
513
514
-
1. Sign in into Azure CLI using the service principal. Use the `<objectId>` value from above step to enable custom locations on the cluster:
514
+
1. Sign in to Azure CLI using the service principal. Use the `<objectId>` value from the previous step to enable custom locations on the cluster:
515
515
516
516
* To enable custom locations when connecting the cluster to Arc, run the following command:
517
517
@@ -527,7 +527,7 @@ This warning occurs when you use a service principal to log into Azure. The serv
527
527
528
528
## Azure Arc-enabled Open Service Mesh
529
529
530
-
The steps below provide guidance on validating the deployment of all the Open Service Mesh (OSM) extension components on your cluster.
530
+
This section shows how to validate the deployment of all the Open Service Mesh (OSM) extension components on your cluster.
531
531
532
532
### Check OSM Controller **Deployment**
533
533
@@ -727,7 +727,7 @@ Example output:
727
727
1845
728
728
```
729
729
730
-
The number in the output indicates the number of bytes, or the size of the CA Bundle. If this is empty, 0, or a number under 1000, the CA Bundle is not correctly provisioned. Without a correct CA Bundle, the `ValidatingWebhook` will throw an error.
730
+
The number in the output indicates the number of bytes, or the size of the CA Bundle. If the output is empty, 0, or a number under 1000, the CA Bundle isn't correctly provisioned. Without a correct CA Bundle, the `ValidatingWebhook` will throw an error.
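The rule described here can be captured in a small check. This is a minimal sketch with a hypothetical function name; it takes the number printed by the previous command as its only argument and applies the 1000-byte threshold stated in the text.

```shell
# Classify a CA Bundle size in bytes.
# Prints "ok" and returns 0 when the size is 1000 or more.
# Prints "not provisioned" and returns 1 when the value is empty,
# not a whole number, 0, or under 1000.
ca_bundle_status() (
  size=$1
  case "$size" in
    ''|*[!0-9]*) echo "not provisioned"; exit 1 ;;
  esac
  if [ "$size" -lt 1000 ]; then
    echo "not provisioned"
    exit 1
  fi
  echo "ok"
)
```

With the example output shown earlier, `ca_bundle_status 1845` prints `ok`.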
731
731
732
732
### Check the `osm-mesh-config` resource
733
733
@@ -819,7 +819,7 @@ metadata:
819
819
### Check namespaces
820
820
821
821
>[!Note]
822
-
>The arc-osm-system namespace will never participate in a service mesh and will never be labeled or annotated with the key/values below.
822
+
>The arc-osm-system namespace will never participate in a service mesh and will never be labeled or annotated with the key/values shown here.
823
823
824
824
We use the `osm namespace add` command to join namespaces to a given service mesh. When a Kubernetes namespace is part of the mesh, confirm the following:
825
825
@@ -851,7 +851,7 @@ The following label must be present:
851
851
}
852
852
```
853
853
854
-
If you aren't using `osm` CLI, you could also manually add these annotations to your namespaces. If a namespace isn't annotated with `"openservicemesh.io/sidecar-injection": "enabled"`, or isn't labeled with `"openservicemesh.io/monitored-by": "osm"`, the OSM Injector will not add Envoy sidecars.
854
+
If you aren't using the `osm` CLI, you can also manually add the annotation and label to your namespaces. If a namespace isn't annotated with `"openservicemesh.io/sidecar-injection": "enabled"`, or isn't labeled with `"openservicemesh.io/monitored-by": "osm"`, the OSM Injector won't add Envoy sidecars.
855
855
856
856
>[!Note]
857
857
>After `osm namespace add` is called, only **new** pods will be injected with an Envoy sidecar. Existing pods must be restarted with the `kubectl rollout restart deployment` command.
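If you add the annotation and label by hand instead of using the `osm` CLI, the steps could look like the sketch below. The function names and the `KUBECTL` override variable are hypothetical, and the mesh name is assumed to be `osm`, matching the label value shown in this section.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands without touching a cluster.
KUBECTL=${KUBECTL:-kubectl}

# Add the annotation and label that the OSM Injector looks for.
# $1: namespace
# $2: mesh name (assumed default: osm)
mark_namespace_for_osm() (
  ns=$1
  mesh=${2:-osm}
  $KUBECTL annotate namespace "$ns" "openservicemesh.io/sidecar-injection=enabled" --overwrite &&
  $KUBECTL label namespace "$ns" "openservicemesh.io/monitored-by=$mesh" --overwrite
)

# Restart a deployment so its existing pods are recreated with an Envoy sidecar.
# $1: namespace
# $2: deployment name
restart_for_injection() (
  $KUBECTL rollout restart deployment "$2" -n "$1"
)
```

For example, run `mark_namespace_for_osm <namespace>` and then `restart_for_injection <namespace> <deployment>` for each deployment whose pods existed before the namespace joined the mesh.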