---
title: Identify containers causing high disk I/O latency in AKS clusters
description: Learn how to identify which containers and pods are causing high disk I/O latency in your Azure Kubernetes Service clusters to easily troubleshoot issues using the open source project Inspektor Gadget.
ms.date: 07/16/2025
ms.author: burakok
ms.reviewer: burakok, mayasingh, blanquicet
ms.service: azure-kubernetes-service
ms.custom: sap:Node/node pool availability and performance
---

# Troubleshoot high disk I/O latency in AKS clusters

Disk I/O latency can severely impact the performance and reliability of workloads running in Azure Kubernetes Service (AKS) clusters. This article shows how to use the open source project [Inspektor Gadget](https://aka.ms/ig-website) to identify which containers and pods are causing high disk I/O latency in AKS.

Inspektor Gadget provides eBPF-based gadgets that help you observe and troubleshoot disk I/O issues in Kubernetes environments.

## Symptoms

You might suspect disk I/O latency issues when you observe the following behaviors in your AKS cluster:

- Applications become unresponsive during file operations
- [Azure portal metrics](/azure/aks/monitor-aks-reference#supported-metrics-for-microsoftcomputevirtualmachines) (`Data Disk Bandwidth Consumed Percentage` and `Data Disk IOPS Consumed Percentage`) or other system monitoring shows high disk utilization with low throughput (you can also query these metrics from the command line, as sketched after this list)
- Database operations take significantly longer than expected
- Pod logs show file system operation errors or timeouts
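
For example, you can pull the same metrics with the Azure CLI. The following is a minimal sketch; the resource ID is a placeholder that you need to point at the virtual machine scale set (or VM) backing your node pool, which lives in the cluster's node resource group:

```console
# Placeholder resource ID: substitute your subscription, node resource group
# (typically MC_<resource-group>_<cluster>_<region>), and scale set name
az monitor metrics list \
  --resource "/subscriptions/<subscription-id>/resourceGroups/<node-resource-group>/providers/Microsoft.Compute/virtualMachineScaleSets/<vmss-name>" \
  --metric "Data Disk IOPS Consumed Percentage" "Data Disk Bandwidth Consumed Percentage" \
  --aggregation Average \
  --interval PT5M
```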

## Prerequisites

- The Kubernetes [kubectl](https://kubernetes.io/docs/reference/kubectl/overview/) command-line tool. To install kubectl by using [Azure CLI](/cli/azure/install-azure-cli), run the [az aks install-cli](/cli/azure/aks#az-aks-install-cli) command.
- Access to your AKS cluster with sufficient permissions to run privileged pods.
- The open source project [Inspektor Gadget](../logs/capture-system-insights-from-aks.md#what-is-inspektor-gadget) for eBPF-based observability. For more information, see [How to install Inspektor Gadget in an AKS cluster](../logs/capture-system-insights-from-aks.md#how-to-install-inspektor-gadget-in-an-aks-cluster). A typical installation flow is sketched after this list.
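
The following is a minimal installation sketch, assuming you install the `kubectl gadget` plugin through [krew](https://krew.sigs.k8s.io/); see the installation guide linked above for other options and version requirements:

```console
# Install the kubectl-gadget plugin through krew (assumes krew is already set up)
kubectl krew install gadget

# Deploy Inspektor Gadget to the cluster, then confirm its pods are running
# (the deployment lands in the gadget namespace by default)
kubectl gadget deploy
kubectl get pods -n gadget
```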

> [!NOTE]
> The `top_blockio` gadget requires kernel version 6.5 or later. You can verify your AKS node kernel version by running `kubectl get nodes -o wide` to see the kernel version in the KERNEL-VERSION column.

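If you prefer to list only the kernel version for each node, a jsonpath query such as the following also works. This is a convenience sketch that relies on the standard `.status.nodeInfo.kernelVersion` field of the node object:

```console
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kernelVersion}{"\n"}{end}'
```
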
## Troubleshooting checklist

### Step 1: Profile disk I/O latency with `profile_blockio`

The [`profile_blockio`](https://aka.ms/ig-profile-blockio) gadget gathers information about block device I/O usage and periodically generates a histogram distribution of I/O latency. This helps you visualize disk I/O performance and identify latency patterns, and it gives you evidence to confirm or rule out disk I/O as the cause of the symptoms you're observing.

```console
kubectl gadget run profile_blockio --node <node-name>
```

> [!NOTE]
> The `profile_blockio` gadget requires that you specify a node with the `--node` parameter. You can get node names by running `kubectl get nodes`.

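To compare several nodes quickly, you can run the gadget against each node in turn. The loop below is a minimal sketch that assumes a bash shell and that your Inspektor Gadget version supports the `--timeout` flag (in seconds); if it doesn't, drop the flag and stop each run with Ctrl+C after collecting enough samples:

```console
for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
  echo "=== Block I/O latency on $node ==="
  # --timeout is assumed here; remove it and interrupt manually if your version doesn't support it
  kubectl gadget run profile_blockio --node "$node" --timeout 30
done
```
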
**Baseline example** (empty cluster with minimal activity):

```
latency
      µs             : count    distribution
       0 -> 1        : 0        |                                        |
       1 -> 2        : 0        |                                        |
       2 -> 4        : 0        |                                        |
       4 -> 8        : 0        |                                        |
       8 -> 16       : 0        |                                        |
      16 -> 32       : 0        |                                        |
      32 -> 64       : 70       |                                        |
      64 -> 128      : 22       |                                        |
     128 -> 256      : 6        |                                        |
     256 -> 512      : 16       |                                        |
     512 -> 1024     : 1017     |*********                               |
    1024 -> 2048     : 2205     |********************                    |
    2048 -> 4096     : 2740     |**************************              |
    4096 -> 8192     : 1128     |**********                              |
    8192 -> 16384    : 708      |******                                  |
   16384 -> 32768    : 4211     |****************************************|
   32768 -> 65536    : 129      |*                                       |
   65536 -> 131072   : 185      |*                                       |
  131072 -> 262144   : 402      |***                                     |
  262144 -> 524288   : 112      |*                                       |
  524288 -> 1048576  : 0        |                                        |
 1048576 -> 2097152  : 0        |                                        |
 2097152 -> 4194304  : 0        |                                        |
 4194304 -> 8388608  : 0        |                                        |
 8388608 -> 16777216 : 0        |                                        |
16777216 -> 33554432 : 0        |                                        |
33554432 -> 67108864 : 0        |                                        |
```

**High disk I/O stress example** (with `stress-ng --hdd 10 --io 10` running to simulate I/O load):

```
latency
      µs             : count    distribution
       0 -> 1        : 0        |                                        |
       1 -> 2        : 0        |                                        |
       2 -> 4        : 0        |                                        |
       4 -> 8        : 0        |                                        |
       8 -> 16       : 42       |                                        |
      16 -> 32       : 236      |                                        |
      32 -> 64       : 558      |*                                       |
      64 -> 128      : 201      |                                        |
     128 -> 256      : 147      |                                        |
     256 -> 512      : 62       |                                        |
     512 -> 1024     : 2660     |******                                  |
    1024 -> 2048     : 6376     |***************                         |
    2048 -> 4096     : 8374     |********************                    |
    4096 -> 8192     : 3912     |*********                               |
    8192 -> 16384    : 2099     |*****                                   |
   16384 -> 32768    : 16703    |****************************************|
   32768 -> 65536    : 1718     |****                                    |
   65536 -> 131072   : 5758     |*************                           |
  131072 -> 262144   : 9552     |**********************                  |
  262144 -> 524288   : 6778     |****************                        |
  524288 -> 1048576  : 347      |                                        |
 1048576 -> 2097152  : 16       |                                        |
 2097152 -> 4194304  : 0        |                                        |
 4194304 -> 8388608  : 0        |                                        |
 8388608 -> 16777216 : 0        |                                        |
16777216 -> 33554432 : 0        |                                        |
33554432 -> 67108864 : 0        |                                        |
```

**Interpreting the results**: To identify which node is under I/O pressure, compare the baseline and stress scenarios:

- **Baseline**: Most operations (4,211) fall in the 16-32 ms range, which is typical for normal system activity
- **Under stress**: Significantly more operations land in higher latency ranges (9,552 operations in 131-262 ms and 6,778 in 262-524 ms)
- **Performance degradation**: The stress test shows operations extending into the 500 ms-2 s range, indicating disk saturation
- **Concerning signs**: Look for high counts above 100 ms (100,000 µs), which may indicate disk performance issues

### Step 2: Find top disk I/O consumers with `top_blockio`

The [`top_blockio`](https://aka.ms/ig-top-blockio) gadget provides a periodic list of the containers with the highest disk I/O operations. You can optionally limit the tracing to the node you identified in Step 1. This gadget requires kernel version 6.5 or later (available on [Azure Linux Container Host clusters](/azure/aks/use-azure-linux)).

```console
kubectl gadget run top_blockio --namespace <namespace> --sort -bytes [--node <node-name>]
```

Sample output:

```
K8S.NODE                     K8S.NAMESPACE K8S.PODNAME K8S.CONTAINERNAME COMM   PID  TID  MAJOR MINOR BYTES     US        IO    RW
aks-nodepool1-…99-vmss000000                                                    0    0    8     0     173707264 153873788 11954 write
aks-nodepool1-…99-vmss000000                                                    0    0    8     0     352256    85222     36    read
aks-nodepool1-…99-vmss000000 default       stress-hdd  stress-hdd        stress 324… 324… 8     0     131072    4450      1     write
aks-nodepool1-…99-vmss000000 default       stress-hdd  stress-hdd        stress 324… 324… 8     0     131072    3651      1     write
aks-nodepool1-…99-vmss000000 default       stress-hdd  stress-hdd        stress 324… 324… 8     0     4096      4096      1     write
```

From the output, you can identify containers with an unusually high number of bytes read from or written to disk (`BYTES` column), time spent on read/write operations in microseconds (`US` column), or number of I/O operations (`IO` column), which may indicate high disk activity. In this example, there's significant write activity (about 173 MB) with considerable time spent (about 154 seconds in total).
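
If time spent matters more to your investigation than bytes, you can sort by the `US` column instead. This sketch assumes the sort field name matches the lowercase column name, which is the usual Inspektor Gadget convention:

```console
kubectl gadget run top_blockio --namespace <namespace> --sort -us [--node <node-name>]
```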

> [!NOTE]
> Empty K8S.NAMESPACE, K8S.PODNAME, and K8S.CONTAINERNAME fields can occur during kernel-space-initiated operations or high-volume I/O. You can still use the `top_file` gadget for detailed process information when these fields are empty.

### Step 3: Identify files causing high disk activity with `top_file`

The [`top_file`](https://aka.ms/ig-top-file) gadget periodically reports read/write activity by file, helping you identify which specific processes and containers are causing high disk activity.

```console
kubectl gadget run top_file --namespace <namespace> --max-entries 20 --sort -wbytes_raw,-rbytes_raw
```

Sample output:

```
K8S.NODE                     K8S.NAMESPACE K8S.PODNAME K8S.CONTAINERNAME COMM   PID   TID   READS WRITES FILE           T RBYTES WBYTES
aks-nodepool1-…99-vmss000000 default       stress-hdd  stress-hdd        stress 49258 49258 0     17     /stress.ADneNJ R 0 B    18 MB
aks-nodepool1-…99-vmss000000 default       stress-hdd  stress-hdd        stress 49254 49254 0     20     /stress.LEbDOb R 0 B    21 MB
aks-nodepool1-…99-vmss000000 default       stress-hdd  stress-hdd        stress 49252 49252 0     18     /stress.eMOjmP R 0 B    19 MB
aks-nodepool1-…99-vmss000000 default       stress-hdd  stress-hdd        stress 49264 49264 0     22     /stress.fLHpBC R 0 B    23 MB
...
```

This output shows which files are being accessed most frequently, helping you pinpoint the specific files that a given process is reading or writing the most. In this example, the stress-hdd pod is creating multiple temporary files with significant write activity (18-23 MB each).
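
To focus on the suspect workload only, you can narrow the trace with the standard Kubernetes filtering flags. The pod name below comes from this example, and the sketch assumes your kubectl-gadget version exposes the `--podname` filter:

```console
kubectl gadget run top_file --namespace default --podname stress-hdd --sort -wbytes_raw
```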

### Root cause analysis workflow

By combining all three gadgets, you can trace disk latency issues from symptoms to root cause:

1. **`profile_blockio`** confirms that high disk latency exists on a given node (high counts in the 100 ms+ ranges)
2. **`top_blockio`** shows which processes are generating the most disk I/O (173 MB of writes with about 154 seconds of total time spent)
3. **`top_file`** reveals the specific files and commands causing the issue (the stress command creating /stress.* files)

This complete visibility allows you to:

- **Identify the problematic pod**: `stress-hdd` pod in the `default` namespace
- **Find the specific process**: `stress` command with PIDs 49258, 49254, and so on
- **Locate the problematic files**: Multiple `/stress.*` temporary files of 18-23 MB each
- **Understand the I/O pattern**: Heavy write operations creating temporary files

With this information, you can take targeted action rather than making broad system changes.
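
For example, in this scenario you could inspect and then remove or reconfigure the offending workload. The pod, namespace, and container names are taken from the example above:

```console
# Review the workload's spec, recent events, and logs before acting
kubectl describe pod stress-hdd --namespace default
kubectl logs stress-hdd --namespace default --container stress-hdd

# If the workload is expendable (as a stress test is), remove it to relieve the disk pressure
kubectl delete pod stress-hdd --namespace default
```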

## Next steps

Based on the results from these gadgets, you can take the following actions:

- **High latency in `profile_blockio`**: Investigate the underlying disk performance, and if the workload needs faster disks, consider using [storage optimized nodes](/azure/virtual-machines/sizes/overview#storage-optimized), for example by adding a storage optimized node pool as sketched after this list
- **High I/O operations in `top_blockio`**: Review application logic to optimize disk access patterns or implement caching
- **Specific files in `top_file`**: Analyze whether files can be moved to faster storage or cached, or whether the application logic can be optimized
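
The following Azure CLI sketch adds a storage optimized node pool. The resource group, cluster, and pool names are placeholders, and `Standard_L8s_v3` is only one possible size; choose one that matches your workload and regional availability:

```console
# Placeholder names: substitute your resource group, cluster, and preferred pool name and size
az aks nodepool add \
  --resource-group <resource-group> \
  --cluster-name <cluster-name> \
  --name storagepool \
  --node-count 1 \
  --node-vm-size Standard_L8s_v3
```

After the pool is ready, schedule the disk-intensive workload onto it, for example by using a node selector or taints and tolerations.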

## Related content

- [Inspektor Gadget documentation](https://inspektor-gadget.io/docs/latest/gadgets/)
- [How to install Inspektor Gadget in an AKS cluster](../logs/capture-system-insights-from-aks.md#how-to-install-inspektor-gadget-in-an-aks-cluster)
- [Troubleshoot high memory consumption in disk-intensive applications](high-memory-consumption-disk-intensive-applications.md)

[!INCLUDE [Third-party information disclaimer](../../../includes/third-party-disclaimer.md)]
[!INCLUDE [Third-party contact information disclaimer](../../../includes/third-party-contact-disclaimer.md)]
[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]