---
title: Troubleshoot High Memory Consumption in Disk-Intensive Applications
description: Helps identify and resolve excessive memory usage due to Linux kernel behaviors on Kubernetes pods.
ms.date: 04/16/2025
ms.reviewer: claudiogodoy
ms.service: azure-kubernetes-service
ms.custom: sap:Node/node pool availability and performance
---
# Troubleshoot high memory consumption in disk-intensive applications

Disk input and output operations are costly, and most operating systems implement caching strategies for reading and writing data to the filesystem. The [Linux kernel](https://www.kernel.org/doc) typically uses strategies such as the [page cache](https://www.kernel.org/doc/gorman/html/understand/understand013.html) to improve overall performance. The primary goal of the page cache is to keep data that's read from the filesystem in memory so that future read operations can be served without going back to the disk.
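
As a minimal illustration of this caching behavior (not specific to Kubernetes, and assuming a Linux shell where the `dd` and `free` utilities are available), you can watch the `buff/cache` value grow after a large file is read:

```console
$ free -m                                            # note the buff/cache column
$ dd if=/dev/zero of=/tmp/bigfile bs=1M count=1024   # create a 1 GiB test file
$ cat /tmp/bigfile > /dev/null                       # read it so the kernel caches the pages
$ free -m                                            # buff/cache grows by roughly the file size
```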

When disk-intensive applications perform frequent filesystem operations, high memory consumption might occur. This article helps you identify and resolve high memory consumption that's caused by Linux kernel behaviors on Kubernetes pods.

## Prerequisites

- A tool to connect to the Kubernetes cluster, such as the kubectl tool. To install kubectl by using the [Azure CLI](/cli/azure/install-azure-cli), run the [az aks install-cli](/cli/azure/aks#az-aks-install-cli) command, as shown in the following example.
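
For example, assuming that the Azure CLI is already installed and that `<RESOURCE_GROUP>` and `<CLUSTER_NAME>` are placeholders for your own cluster, the following commands install kubectl and download the cluster credentials:

```console
$ az aks install-cli                                                                # installs kubectl
$ az aks get-credentials --resource-group <RESOURCE_GROUP> --name <CLUSTER_NAME>   # merges the cluster credentials into ~/.kube/config
```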

## Symptoms

The following table outlines the common symptoms of memory saturation:

| Symptom | Description |
| --- | --- |
| [Working set](https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/#memory) metric too high | This issue occurs when there's a significant difference between the working set metric reported by the [Kubernetes Metrics API](https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/#metrics-server) and the amount of memory that the application actually allocates. |
| Out-of-memory (OOM) kill | This issue indicates that memory issues exist on your pod. |
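
To check whether a container in the pod was recently OOM-killed, you can inspect its last termination state. The following is a quick check (the output shown is only an example):

```console
$ kubectl describe pod <POD_NAME> | grep -A 3 "Last State"
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
```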

## Troubleshooting checklist

### Step 1: Inspect pod working set

1. Identify which pod is consuming excessive memory by following the guide [Troubleshoot memory saturation in AKS clusters](identify-memory-saturation-aks.md).
2. Use the following [kubectl top pods](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_top/) command to show the [working set](https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/#memory) reported by the [Kubernetes Metrics API](https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/#metrics-server):

    ```console
    $ kubectl top pods -A | grep -i "<DEPLOYMENT_NAME>"
    NAME                            CPU(cores)   MEMORY(bytes)
    my-deployment-fc94b7f98-m9z2l   1m           344Mi
    ```
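
    To cross-check this value against what the main application process itself reports, you can read its resident set size from `/proc`. This is a rough comparison that assumes the container image includes `grep` and that the application runs as PID 1; the value shown is only an example:

    ```console
    $ kubectl exec <POD_NAME> -- grep VmRSS /proc/1/status
    VmRSS:      5340 kB
    ```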

### Step 2: Inspect pod memory statistics

Inspect the memory statistics of the pod's [cgroup](https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html) by following these steps:

1. Connect to the pod:

    ```console
    $ kubectl exec <POD_NAME> -it -- bash
    ```

2. List the memory-related files in the cgroup statistics directory:

    ```console
    $ ls /sys/fs/cgroup | grep -e memory.stat -e memory.current
    memory.current memory.stat
    ```

    - `memory.current`: The total amount of memory currently used by the cgroup and its descendants.
    - `memory.stat`: A breakdown of the cgroup's memory footprint into different memory types, together with type-specific details and other information about the state and past events of the memory management system.

3. All the values listed in these files are in bytes. Get an overview of how memory consumption is distributed in the pod:

    ```console
    $ cat /sys/fs/cgroup/memory.current
    10645012480
    $ cat /sys/fs/cgroup/memory.stat
    anon 5197824
    inactive_anon 5152768
    active_anon 8192
    ...
    file 10256240640
    active_file 32768
    inactive_file 10256207872
    ...
    slab 354682456
    slab_reclaimable 354554400
    slab_unreclaimable 128056
    ...
    ```

`cAdvisor` uses `memory.current` and `inactive_file` to compute the working set metric. You can replicate the calculation by using the following formula:

```sh
working_set = (memory.current - inactive_file) / 1048576
            = (10645012480 - 10256207872) / 1048576
            ≈ 371 MiB
```
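
If you want to compute the same value inside the pod without copying numbers by hand, the following one-liner is a small sketch (it assumes cgroup v2 is mounted at `/sys/fs/cgroup`, as in the listing above):

```console
$ awk -v current="$(cat /sys/fs/cgroup/memory.current)" '$1 == "inactive_file" { printf "working_set: %.0f MiB\n", (current - $2) / 1048576 }' /sys/fs/cgroup/memory.stat
working_set: 371 MiB
```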

### Step 3: Determine kernel vs. application memory consumption

The following table describes some of the memory segments:

| Segment | Description |
|---|---|
| anon | Amount of memory used in anonymous mappings. Most language runtimes allocate application memory from this segment. |
| file | Amount of memory used to cache filesystem data, including tmpfs and shared memory. |
| slab | Amount of memory used for storing in-kernel data structures. |

Because most language runtimes allocate from the anon segment, that's where application memory usually shows up. In this example, `anon` accounts for only 5,197,824 bytes (about 5 MiB), which is nowhere near the total reported by the working set metric.

The `slab` segment, on the other hand, is used by the kernel and accounts for 354,682,456 bytes (about 338 MiB), which is almost all the memory reported by the working set metric for this pod.
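
To pull just these three segments out of `memory.stat` for a quick comparison, you can filter the file from inside the pod (the values are the same ones shown in Step 2):

```console
$ grep -E "^(anon|file|slab) " /sys/fs/cgroup/memory.stat
anon 5197824
file 10256240640
slab 354682456
```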

### Step 4: Drop the node caches

> [!NOTE]
> This step might lead to availability and performance issues. Avoid running it in a production environment.

1. Get the node that's running the pod:

    ```console
    $ kubectl get pod -A -o wide | grep "<POD_NAME>"
    NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
    my-deployment-fc94b7f98-m9z2l   1/1     Running   0          37m   10.244.1.17   aks-agentpool-26052128-vmss000004   <none>           <none>
    ```

2. Create a debugging pod by using the [kubectl debug](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_debug/) command, and then open a session on the node's host filesystem:

    ```console
    $ kubectl debug node/<NODE_NAME> -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0
    $ chroot /host
    ```

3. Drop the kernel caches:

    ```console
    echo 1 > /proc/sys/vm/drop_caches
    ```
| 128 | + |
| 129 | +4. Verify if the command in the previous step causes the effect by repeating [Step 1](#step-1-inspect-pod-working-set) and [Step 2](#step-2-inspect-pod-memory-statistics): |
| 130 | + |
| 131 | + ```console |
| 132 | + $ kubectl top pods -A | grep -i "<DEPLOYMENT_NAME>" |
| 133 | + NAME CPU(cores) MEMORY(bytes) |
| 134 | + my-deployment-fc94b7f98-m9z2l 1m 4Mi |
| 135 | + |
| 136 | + $ kubectl exec <POD_NAME> -it -- cat /sys/fs/cgroup/memory.stat |
| 137 | + anon 4632576 |
| 138 | + file 1781760 |
| 139 | + ... |
| 140 | + slab_reclaimable 219312 |
| 141 | + slab_unreclaimable 173456 |
| 142 | + slab 392768 |
| 143 | + ``` |
| 144 | + |
| 145 | +If you observe a significant decrease in both working set and slab memory segment, you are experiencing the issue where a great amount of pod's memory is used by the Kernel. |
| 146 | + |
| 147 | +## Workaround: Set appropriate memory limits and requests |
| 148 | + |
| 149 | +The only effective workaround for high memory consumption on Kubernetes pods is to set realistic resource [limits and requests](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits). For example: |
| 150 | + |
| 151 | +```ymal |
| 152 | +resources: |
| 153 | + requests: |
| 154 | + memory: 30Mi |
| 155 | + limits: |
| 156 | + memory: 60Mi |
| 157 | +``` |
| 158 | + |
| 159 | +By configuring appropriate memory limits and requests in the Kubernetes or specification, you can ensure that Kubernetes manages memory allocation more efficiently, mitigating the impact of excessive kernel-level caching on pod memory usage. |
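
If you prefer not to edit the manifest, you can apply the same values to an existing deployment by using the `kubectl set resources` command (the deployment name `my-deployment` is only a placeholder):

```console
$ kubectl set resources deployment my-deployment --requests=memory=30Mi --limits=memory=60Mi
```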

> [!NOTE]
> Misconfigured pod memory limits can lead to problems such as OOM-Killed errors.

## References

- [Learn more about Azure Kubernetes Service (AKS) best practices](/azure/aks/best-practices)
- [Monitor your Kubernetes cluster performance with Container insights](/azure/azure-monitor/containers/container-insights-analyze)

[!INCLUDE [Third-party information disclaimer](../../../includes/third-party-disclaimer.md)]

[!INCLUDE [Third-party contact information disclaimer](../../../includes/third-party-contact-disclaimer.md)]

[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]