Commit 665a0c4

Merge branch 'rl-dev01' of https://github.com/ReginaLin24/SupportArticles-docs-pr into rl-dev01

2 parents: d763cf9 + 4dc9121

156 files changed (+2995 -789 lines)


.openpublishing.redirection.json

Lines changed: 4 additions & 0 deletions
@@ -13267,6 +13267,10 @@
     {
       "source_path": "support/dynamics-365/sales/the-record-could-not-be-deleted.md",
       "redirect_url": "/troubleshoot/power-platform/dataverse/working-with-solutions/the-record-could-not-be-deleted"
+    },
+    {
+      "source_path": "support/power-platform/power-automate/dataverse-cds/cds-user-cannot-access-power-automate-business-process-flows-on-demand-workflows.md",
+      "redirect_url": "/previous-versions/troubleshoot/power-platform/power-automate/cloud-flows/cds-user-cannot-access-power-automate-business-process-flows-on-demand-workflows"
     }
   ]
 }

support/azure/azure-kubernetes/connectivity/tunnel-connectivity-issues.md

Lines changed: 76 additions & 2 deletions
@@ -1,10 +1,10 @@
 ---
-title: Tunnel connectivity issues
+title: Tunnel Connectivity Issues
 description: Resolve communication issues that are related to tunnel connectivity in an Azure Kubernetes Service (AKS) cluster.
 ms.date: 03/23/2025
 ms.reviewer: chiragpa, andbar, v-leedennis, v-weizhu, albarqaw
 ms.service: azure-kubernetes-service
-keywords: Azure Kubernetes Service, AKS cluster, Kubernetes cluster, tunnels, connectivity, tunnel-front, aks-link
+keywords: Azure Kubernetes Service, AKS cluster, Kubernetes cluster, tunnels, connectivity, tunnel-front, aks-link, Konnectivity agent, Cluster Proportional Autoscaler, CPA, Resource allocation, Performance bottlenecks, Networking reliability, Azure Kubernetes troubleshooting, AKS performance issues
 #Customer intent: As an Azure Kubernetes user, I want to avoid tunnel connectivity issues so that I can use an Azure Kubernetes Service (AKS) cluster successfully.
 ms.custom: sap:Connectivity
 ---
@@ -251,6 +251,80 @@ If everything is OK within the application, you'll have to adjust the allocated
 
 You can set up a new cluster to use a Managed Network Address Translation (NAT) Gateway for outbound connections. For more information, see [Create an AKS cluster with a Managed NAT Gateway](/azure/aks/nat-gateway#create-an-aks-cluster-with-a-managed-nat-gateway).
 
+## Cause 6: Konnectivity agent performance issues with cluster growth
+
+As the cluster grows, the performance of the Konnectivity agents might degrade because of increased network traffic, more requests, or resource constraints.
+
+> [!NOTE]
+> This cause applies only to the `konnectivity-agent` pods.
+
+### Solution 6: Use the Cluster Proportional Autoscaler for the Konnectivity agent
+
+To manage scalability challenges in large clusters, we implement the Cluster Proportional Autoscaler for the Konnectivity agents. This approach aligns with industry standards and best practices, and it ensures optimal resource usage and enhanced performance.
+
+**Why this change was made**
+
+Previously, the Konnectivity agent had a fixed replica count that could create a bottleneck as the cluster grew. By implementing the Cluster Proportional Autoscaler, we enable the replica count to adjust dynamically, based on node-scaling rules, to provide optimal performance and resource usage.
+
+**How the Cluster Proportional Autoscaler works**
+
+The Cluster Proportional Autoscaler uses a ladder configuration to determine the number of Konnectivity agent replicas based on the cluster size. The ladder configuration is defined in the `konnectivity-agent-autoscaler` configmap in the `kube-system` namespace. Here's an example of the ladder configuration:
+
+```json
+"nodesToReplicas": [
+    [1, 2],
+    [100, 3],
+    [250, 4],
+    [500, 5],
+    [1000, 6],
+    [5000, 10]
+]
+```
+
+This configuration makes sure that the number of replicas scales appropriately with the number of nodes in the cluster to provide optimal resource allocation and improved networking reliability.
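
To see the ladder that's currently in effect before you override anything, a read-only check like the following sketch works; the configmap name comes from the text above:

```bash
# Print the current konnectivity-agent-autoscaler ladder (read-only).
kubectl get configmap konnectivity-agent-autoscaler -n kube-system -o yaml
```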
+
+**How to use the Cluster Proportional Autoscaler**
+
+You can override the default values by updating the `konnectivity-agent-autoscaler` configmap in the `kube-system` namespace. Here's a sample command to update the configmap:
+
+```bash
+kubectl edit configmap konnectivity-agent-autoscaler -n kube-system
+```
+
+This command opens the configmap in an editor so that you can make the necessary changes.
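
A non-interactive alternative is a sketch like the following, assuming the ladder JSON lives under a data key named `ladder` (data keys can vary, so verify with the read-only command shown earlier before patching):

```bash
# Patch the ladder in place; the "ladder" key name and the values are assumptions.
kubectl patch configmap konnectivity-agent-autoscaler -n kube-system \
  --type merge -p '{"data":{"ladder":"{\"nodesToReplicas\":[[1,2],[100,3]]}"}}'
```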
+
+**What you should check**
+
+You have to monitor for out-of-memory (OOM) kills on the nodes, because a misconfigured Cluster Proportional Autoscaler can cause insufficient memory allocation for the Konnectivity agents. This misconfiguration occurs for the following key reasons:
+
+- **High memory usage:** As the cluster grows, the memory usage of the Konnectivity agents can increase significantly, especially during peak loads or when the agents handle large numbers of connections. If the Cluster Proportional Autoscaler configuration doesn't scale the replicas appropriately, the agents might run out of memory.
+
+- **Fixed resource limits:** If the resource requests and limits for the Konnectivity agents are set too low, the agents might not have enough memory to handle the workload, which leads to OOM kills. Misconfigured Cluster Proportional Autoscaler settings can exacerbate this issue by not providing enough replicas to distribute the load.
+
+- **Cluster size and workload variability:** The CPU and memory that the Konnectivity agents need can vary widely depending on the size of the cluster and the workload. If the Cluster Proportional Autoscaler ladder configuration isn't right-sized and adaptively resized for the cluster's usage patterns, it can cause memory overcommitment and OOM kills.
+
+To identify and troubleshoot OOM kills, follow these steps:
+
+1. Check for OOM kills on the nodes:
+
+   ```bash
+   kubectl get events --all-namespaces | grep -i 'oomkill'
+   ```
+
+2. Inspect node resource usage to make sure that the nodes aren't running out of memory:
+
+   ```bash
+   kubectl top nodes
+   ```
+
+3. Review pod resource requests and limits to make sure that the Konnectivity agent pods have appropriate values set to prevent OOM kills:
+
+   ```bash
+   kubectl get pod <pod-name> -n kube-system -o yaml | grep -A5 "resources:"
+   ```
+
+4. If necessary, adjust the resource requests and limits for the Konnectivity agent pods by editing the deployment:
+
+   ```bash
+   kubectl edit deployment konnectivity-agent -n kube-system
+   ```
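
For step 4, the values to adjust live under the container's `resources` block. A minimal sketch with illustrative numbers (not AKS defaults):

```yaml
# Inside the konnectivity-agent Deployment's container spec; values are illustrative only.
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    memory: 512Mi
```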
+
 [!INCLUDE [Third-party contact disclaimer](../../../includes/third-party-contact-disclaimer.md)]
 
 [!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]

support/azure/azure-kubernetes/create-upgrade-delete/error-code-operationnotallowed-publicipcountlimitreached.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 ---
 title: Troubleshoot OperationNotAllowed or PublicIPCountLimitReached
 description: Learn how to troubleshoot the OperationNotAllowed or PublicIPCountLimitReached quota error when you try to create and deploy an Azure Kubernetes Service (AKS) cluster.
-ms.date: 10/28/2024
+ms.date: 04/03/2024
 editor: v-jsitser
-ms.reviewer: rissing, chiragpa, erbookbi, v-leedennis
+ms.reviewer: rissing, chiragpa, erbookbi, v-leedennis, dorinalecu
 ms.service: azure-kubernetes-service
 #Customer intent: As an Azure Kubernetes user, I want to troubleshoot the OperationNotAllowed or PublicIPCountLimitReached quota error code so that I can successfully create and deploy an Azure Kubernetes Service (AKS) cluster.
 ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)

support/azure/azure-kubernetes/create-upgrade-delete/error-code-requestdisallowedbypolicy.md

Lines changed: 18 additions & 13 deletions
@@ -1,11 +1,11 @@
 ---
-title: RequestDisallowedByPolicy error when deploying an AKS cluster
+title: RequestDisallowedByPolicy Error When Deploying an AKS Cluster
 description: Learn how to fix the RequestDisallowedByPolicy error when you try to create and deploy an Azure Kubernetes Service (AKS) cluster.
-ms.date: 10/12/2024
+ms.date: 03/13/2025
 editor: v-jsitser
-ms.reviewer: rissing, chiragpa, erbookbi, albarqaw, v-leedennis, v-weizhu
+ms.reviewer: rissing, chiragpa, erbookbi, albarqaw, jacobbaek, v-leedennis, v-weizhu
 ms.service: azure-kubernetes-service
-#Customer intent: As an Azure Kubernetes user, I want to troubleshoot the RequestDisallowedByPolicy error code so that I can successfully create and deploy an Azure Kubernetes Service (AKS) cluster.
+#Customer intent: As an Azure Kubernetes user, I want to troubleshoot the RequestDisallowedByPolicy error so that I can successfully create and deploy an Azure Kubernetes Service (AKS) cluster.
 ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)
 ---
 # RequestDisallowedByPolicy error when deploying an AKS cluster
@@ -22,24 +22,29 @@ When you try to deploy an AKS cluster, you receive the following error message:
 
 ## Cause
 
-For security or compliance, your subscription administrators might assign policies that limit how resources are deployed. For example, your subscription might have a policy that prevents creating public IP addresses, network security groups, user-defined routes, or route tables. The error message includes the specific reason why the cluster creation was blocked. Only you can manage the policies in your environment. Microsoft can't disable or bypass those policies.
+For security or compliance, your subscription administrators might assign policies that limit how resources are deployed. For example, your subscription might have a policy that prevents you from creating public IP addresses, network security groups, user-defined routes, or route tables. The error message includes the specific reason why the cluster creation was blocked.
+
+> [!NOTE]
+> Only you can manage the policies in your environment. Microsoft can't disable or bypass those policies.
 
 ## Solution
 
 To fix this issue, follow these steps:
 
-1. Find the policy that blocks the action. These policies are listed in the error message. The name of a policy assignment or definition is the last segment of the `id` string shown in the error message.
-
-1. If possible, change your deployment to meet the limitations of the policy, and then retry the deploy operation.
-
-1. Add an [exception to the policy](/azure/governance/policy/concepts/exemption-structure).
+1. Find the policy that blocks the action. These policies are listed in the error message.
+   The name of a policy assignment or definition is the last segment of the `id` string that's shown in the error message.
+   ```
+   # Example
+   Code: RequestDisallowedByPolicy
+   Message: Resource 'resourcegroup' was disallowed by policy. Policy identifiers: '[{"policyAssignment":{"name":"Not allowed resource types","id":"/subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Authorization/policyAssignments/00000000000000000000000"},"policyDefinition":{"name":"Not allowed resource types","id":"/subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Authorization/policyDefinitions/not-allowed-resourcetypes","version":"1.0.0"}}]'.
+   ```
 
-1. [Disable the policy](/azure/defender-for-cloud/tutorial-security-policy#disable-security-policies-and-disable-recommendations).
+1. If possible, update your deployment to comply with the policy restrictions, and then retry the deployment. Alternatively, if you have permission to update policy, [add an exemption](/azure/governance/policy/tutorials/disallowed-resources#create-an-exemption) to the policy.
 
-To get details about the policy that blocked your cluster deployment operation, see [RequestDisallowedByPolicy error with Azure resource policy](/azure/azure-resource-manager/troubleshooting/error-policy-requestdisallowedbypolicy).
+To get details about the policy that blocked your cluster deployment, see [RequestDisallowedByPolicy error with Azure resource policy](/azure/azure-resource-manager/troubleshooting/error-policy-requestdisallowedbypolicy).
 
 > [!NOTE]
-> After fixing the policy that blocks the AKS cluster creation, run the `az aks update -g MyResourceGroup -n MyManagedCluster` command to change the cluster from a failed to a success state. This will reconcile the cluster and retry the last failed operation. For more information about clusters in a failed state, see [Troubleshoot Azure Kubernetes Service clusters or nodes in a failed state](../availability-performance/cluster-node-virtual-machine-failed-state.md).
+> After you fix the policy that blocks the AKS cluster creation, run the `az aks update -g MyResourceGroup -n MyManagedCluster` command to change the cluster from a failed state to a successful state. This change reconciles the cluster and retries the last failed operation. For more information about clusters in a failed state, see [Troubleshoot Azure Kubernetes Service clusters or nodes in a failed state](../availability-performance/cluster-node-virtual-machine-failed-state.md).
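
As a companion to step 1 in the updated text: after you extract the assignment name from the `id` string, a sketch like the following looks up the blocking assignment (the all-zeros name mirrors the placeholder in the example message):

```bash
# Show the policy assignment that blocked the deployment; replace the name
# with the last segment of your own assignment "id".
az policy assignment show --name "00000000000000000000000"
```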
 
 ## More information
 
support/azure/azure-kubernetes/create-upgrade-delete/error-code-toomanyrequestsreceived-subscriptionrequeststhrottled.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 ---
 title: Troubleshoot the TooManyRequestsReceived or SubscriptionRequestsThrottled error code
 description: Learn how to troubleshoot the TooManyRequestsReceived or SubscriptionRequestsThrottled error when you try to delete an Azure Kubernetes Service (AKS) cluster.
-ms.date: 11/18/2024
+ms.date: 04/03/2025
 editor: v-jsitser
-ms.reviewer: rissing, chiragpa, edneto, v-leedennis
+ms.reviewer: rissing, chiragpa, edneto, v-leedennis, dorinalecu
 ms.service: azure-kubernetes-service
 #Customer intent: As an Azure Kubernetes user, I want to troubleshoot the TooManyRequestsReceived or SubscriptionRequestsThrottled error code so that I can successfully delete an Azure Kubernetes Service (AKS) cluster.
 ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)

support/azure/azure-kubernetes/create-upgrade-delete/pod-stuck-crashloopbackoff-mode.md

Lines changed: 18 additions & 4 deletions
@@ -1,17 +1,31 @@
 ---
 title: Pod is stuck in CrashLoopBackOff mode
 description: Troubleshoot a scenario in which a pod is stuck in CrashLoopBackOff mode on an Azure Kubernetes Service (AKS) cluster.
-ms.date: 09/07/2023
+ms.date: 04/07/2025
 author: VikasPullagura-MSFT
 ms.author: vipullag
-editor: v-jsitser
-ms.reviewer: chiragpa, nickoman, cssakscic, v-leedennis
+editor: v-jsitser, addobres
+ms.reviewer: chiragpa, nickoman, cssakscic, v-leedennis, addobres
 ms.service: azure-kubernetes-service
 ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)
 ---
 # Pod is stuck in CrashLoopBackOff mode
 
-If a pod has a `CrashLoopBackOff` status, then the pod probably failed or exited unexpectedly, and the log contains an exit code that isn't zero. There are several possible reasons why your pod is stuck in `CrashLoopBackOff` mode. Consider the following options and their associated [kubectl](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands) commands.
+If a pod has a `CrashLoopBackOff` status, the pod probably failed or exited unexpectedly, and the log contains an exit code that isn't zero. Here are several possible reasons why your pod is stuck in `CrashLoopBackOff` mode:
+
+1. **Application failure**: The application inside the container crashes shortly after it starts, often because of misconfigurations, missing dependencies, or incorrect environment variables.
+2. **Incorrect resource limits**: If the pod exceeds its CPU or memory resource limits, Kubernetes might kill the container. This issue can occur if resource requests or limits are set too low.
+3. **Missing or misconfigured ConfigMaps/Secrets**: If the application relies on configuration files or environment variables that are stored in ConfigMaps or Secrets, but those objects are missing or misconfigured, the application might crash.
+4. **Image pull issues**: If there's an issue with the image (for example, it's corrupted or has an incorrect tag), the container might not start properly and might fail repeatedly.
+5. **Init containers failing**: If the pod has init containers and one or more of them fail to run properly, the pod restarts.
+6. **Liveness/Readiness probe failures**: If liveness or readiness probes are misconfigured, Kubernetes might detect the container as unhealthy and restart it.
+7. **Application dependencies not ready**: The application might depend on services that aren't yet ready, such as databases, message queues, or other APIs.
+8. **Networking issues**: Network misconfigurations can prevent the application from communicating with necessary services, causing it to fail.
+9. **Invalid commands or arguments**: The container might be started by using an invalid `ENTRYPOINT`, command, or argument, which leads to a crash.
+
+For more information about the container status, see [Pod Lifecycle - Container states](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-states).
+
+Consider the following options and their associated [kubectl](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands) commands, as shown in the sketch after this table.
 
 | Option | kubectl command |
 |--|--|
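
Whichever cause in the new list applies, the usual first diagnostic step is to read the pod's events and the previous container instance's logs; a minimal sketch with placeholder names:

```bash
# Show events and the container's last state, including its exit code.
kubectl describe pod <pod-name> -n <namespace>

# Read logs from the previous (crashed) container instance.
kubectl logs <pod-name> -n <namespace> --previous
```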

support/azure/azure-kubernetes/create-upgrade-delete/troubleshoot-common-azure-linux-aks.md

Lines changed: 4 additions & 2 deletions
@@ -1,11 +1,11 @@
 ---
 title: Troubleshoot common issues for Azure Linux Container Host for AKS
 description: Troubleshoot commonly reported issues for Azure Linux container hosts on Azure Kubernetes Service (AKS).
-ms.date: 09/08/2023
+ms.date: 04/02/2025
 author: suhuruli
 ms.author: suhuruli
 editor: v-jsitser
-ms.reviewer: v-leedennis
+ms.reviewer: mnasser, v-weizhu
 ms.service: azure-kubernetes-service
 ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool), linux-related-content
 ---
@@ -77,6 +77,8 @@ Most commands in the Azure Linux OS, such as the process status (`ps`) command,
 | `apt-mark auto` | `tdnf install dnf mark remove` |
 | `apt-mark manual` | `dnf mark install` |
 | `apt-mark showmanual` | `dnf history userinstalled` |
+| `add-apt-repository` | Edit `/etc/yum.repos.d/*.repo` files |
+| `apt-key add` | `rpm --import` |
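
A usage sketch for the two added rows; the URL, file name, and repository details are placeholders:

```bash
# Equivalent of `apt-key add`: import a repository signing key.
sudo rpm --import https://example.com/repo-signing-key.asc

# Equivalent of `add-apt-repository`: define the repository in a .repo file.
sudo tee /etc/yum.repos.d/example.repo <<'EOF'
[example]
name=Example repository
baseurl=https://example.com/packages/
enabled=1
gpgcheck=1
EOF
```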
 
 ### Step 2: Check the Azure Linux version
 