---
title: Tunnel Connectivity Issues
description: Resolve communication issues that are related to tunnel connectivity in an Azure Kubernetes Service (AKS) cluster.
ms.date: 03/23/2025
ms.reviewer: chiragpa, andbar, v-leedennis, v-weizhu, albarqaw
ms.service: azure-kubernetes-service
keywords: Azure Kubernetes Service, AKS cluster, Kubernetes cluster, tunnels, connectivity, tunnel-front, aks-link, Konnectivity agent, Cluster Proportional Autoscaler, CPA, Resource allocation, Performance bottlenecks, Networking reliability, Azure Kubernetes troubleshooting, AKS performance issues
#Customer intent: As an Azure Kubernetes user, I want to avoid tunnel connectivity issues so that I can use an Azure Kubernetes Service (AKS) cluster successfully.
ms.custom: sap:Connectivity
---

You can set up a new cluster to use a Managed Network Address Translation (NAT) Gateway for outbound connections. For more information, see [Create an AKS cluster with a Managed NAT Gateway](/azure/aks/nat-gateway#create-an-aks-cluster-with-a-managed-nat-gateway).

## Cause 6: Konnectivity agent performance issues with cluster growth

As the cluster grows, the performance of the Konnectivity agents might degrade because of increased network traffic, more requests, or resource constraints.

> [!NOTE]
> This cause applies only to the `konnectivity-agent` pods.

### Solution 6: Cluster Proportional Autoscaler for the Konnectivity agent

To manage scalability challenges in large clusters, we implement the Cluster Proportional Autoscaler for the Konnectivity agents. This approach aligns with industry standards and best practices, and it ensures optimal resource usage and performance.

**Why this change was made**

Previously, the Konnectivity agent had a fixed replica count that could create a bottleneck as the cluster grew. By implementing the Cluster Proportional Autoscaler, we enable the replica count to adjust dynamically, based on node-scaling rules, to provide optimal performance and resource usage.

**How the Cluster Proportional Autoscaler works**

The Cluster Proportional Autoscaler uses a ladder configuration to determine the number of Konnectivity agent replicas based on the cluster size. The ladder configuration is defined in the `konnectivity-agent-autoscaler` configmap in the `kube-system` namespace. Here is an example of the ladder configuration:

```json
"nodesToReplicas": [
  [1, 2],
  [100, 3],
  [250, 4],
  [500, 5],
  [1000, 6],
  [5000, 10]
]
```

This configuration makes sure that the number of replicas scales appropriately with the number of nodes in the cluster to provide optimal resource allocation and improved networking reliability.
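
The lookup that this ladder implies can be sketched as follows. This is an illustrative snippet only (not the autoscaler's actual code): each rung applies once the cluster reaches that rung's node count, so the replica count comes from the largest threshold that the current node count meets or exceeds.

```python
# Illustrative sketch of the ladder lookup (not the autoscaler's implementation).
# Each (node_threshold, replicas) rung applies once the cluster has at least
# node_threshold nodes.
NODES_TO_REPLICAS = [(1, 2), (100, 3), (250, 4), (500, 5), (1000, 6), (5000, 10)]

def replicas_for(node_count: int) -> int:
    """Return the replica count for the largest rung whose threshold is <= node_count."""
    replicas = NODES_TO_REPLICAS[0][1]
    for threshold, count in NODES_TO_REPLICAS:
        if node_count >= threshold:
            replicas = count
        else:
            break
    return replicas
```

For example, a 300-node cluster falls on the 250-node rung, so it would run four Konnectivity agent replicas.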

**How to use the Cluster Proportional Autoscaler**

You can override the default values by updating the `konnectivity-agent-autoscaler` configmap in the `kube-system` namespace. Here is a sample command to update the configmap:

```bash
kubectl edit configmap konnectivity-agent-autoscaler -n kube-system
```

This command opens the configmap in an editor so that you can make the necessary changes.

**What you should check**

Monitor the nodes for out-of-memory (OOM) kills, because misconfiguration of the Cluster Proportional Autoscaler can cause insufficient memory allocation for the Konnectivity agents. This misconfiguration occurs for the following key reasons:

**High memory usage:** As the cluster grows, the memory usage of the Konnectivity agents can increase significantly, especially during peak loads or when the agents handle large numbers of connections. If the Cluster Proportional Autoscaler configuration doesn't scale the replicas appropriately, the agents might run out of memory.

**Fixed resource limits:** If the resource requests and limits for the Konnectivity agents are set too low, the agents might not have enough memory to handle the workload, leading to OOM kills. Misconfigured Cluster Proportional Autoscaler settings can exacerbate this issue by not providing enough replicas to distribute the load.

**Cluster size and workload variability:** The CPU and memory that the Konnectivity agents need can vary widely depending on the size of the cluster and the workload. If the Cluster Proportional Autoscaler ladder configuration isn't right-sized and adaptively resized for the cluster's usage patterns, it can cause memory overcommitment and OOM kills.
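
When you compare requests and limits against observed usage, it helps to normalize Kubernetes memory quantities to bytes. Here's a minimal sketch (the function name is ours, and it handles only the common binary and decimal suffixes, not the full Kubernetes quantity grammar):

```python
# Minimal sketch: convert common Kubernetes memory quantities (e.g. "128Mi",
# "1Gi", "500M") to bytes so requests, limits, and usage can be compared.
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
         "K": 1000, "M": 1000**2, "G": 1000**3}

def to_bytes(quantity: str) -> int:
    # Try the longest suffixes first so "Mi" isn't misread as "M".
    for suffix in sorted(UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * UNITS[suffix])
    return int(quantity)  # plain byte count, e.g. "134217728"
```

With a helper like this, a request of "64Mi" against observed usage of "128Mi" is immediately visible as a 2x shortfall.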

To identify and troubleshoot OOM kills, follow these steps:

1. Check for OOM kills on your nodes:

   ```bash
   kubectl get events --all-namespaces | grep -i 'oomkill'
   ```

2. Inspect node resource usage to verify that the nodes aren't running out of memory:

   ```bash
   kubectl top nodes
   ```

3. Review the pod resource requests and limits to make sure that the Konnectivity agent pods have appropriate values set to prevent OOM kills:

   ```bash
   kubectl get pod <pod-name> -n kube-system -o yaml | grep -A5 "resources:"
   ```

4. If necessary, adjust the resource requests and limits for the Konnectivity agent pods by editing the deployment:

   ```bash
   kubectl edit deployment konnectivity-agent -n kube-system
   ```
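
The event check in step 1 can also be done programmatically against `kubectl get events --all-namespaces -o json` output. Here's a hedged sketch (the function name is ours) that collects the objects whose event reason mentions OOM:

```python
def find_oom_events(events: dict) -> list:
    """Return 'namespace/name' for every event whose reason mentions OOM.

    `events` is the parsed JSON from:
        kubectl get events --all-namespaces -o json
    """
    hits = []
    for item in events.get("items", []):
        if "oom" in item.get("reason", "").lower():
            obj = item.get("involvedObject", {})
            hits.append(f"{obj.get('namespace', '?')}/{obj.get('name', '?')}")
    return hits
```

Unlike the grep pipeline, this approach keeps the namespace and object name structured, so the results can feed alerting or further kubectl queries.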

[!INCLUDE [Third-party contact disclaimer](../../../includes/third-party-contact-disclaimer.md)]

[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]