
Commit f551a15

Merge remote-tracking branch 'upstream/main' into AB#7246-Troubleshooting-and-Adjusting-Concurrent-Connection-Limits-in-SSH
2 parents 73eff3f + 57ea80b commit f551a15

File tree

9 files changed: +466 −3 lines changed

Teams/teams-rooms-and-devices/teams-rooms-known-issues-windows.md

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,7 @@ appliesto:
   - Microsoft Teams
 search.appverid:
   - MET150
-ms.date: 10/02/2025
+ms.date: 10/07/2025
 ---
 # Known issues with Teams Rooms on Windows

@@ -49,6 +49,7 @@ ms.date: 10/02/2025
 | --- | --- | --- |
 |During a Coordinated meeting, when the meeting volume is changed by using a room remote, the speaker on a Surface Hub or Teams Rooms device turns on.|For a trusted device such as a Surface Hub or Teams Rooms device that is set up to automatically join a Coordinated meeting when the primary device joins, the speaker turns on when a room remote is used to change the meeting volume. This issue occurs regardless of whether the audio settings on the device are enabled or disabled.|Turn off proximity join and room remote capabilities on the trusted devices that automatically join a Coordinated meeting.|
 |The central part of the console on a Teams Rooms device doesn't respond to touch and mouse input.|On some Microsoft Teams Rooms devices such as the Crestron Dell Optiplex 7080 that use a 4K monitor connected as the front-of-room display, the central portion of the display intermittently stops responding to touch and mouse controls.<br/><br/>Despite this issue, the Teams Rooms app is functional and accepts input from a connected keyboard.|Contact Microsoft Support for assistance to work around this issue.|
+|Poor audio quality from a Teams Rooms device.|You might experience any of the following audio quality issues from a USB microphone or speaker that's connected to your Teams Rooms device: <ul><li>Echo sent to remote participants on the call.</li><li>Low audio quality sent to remote participants on the call.</li><li>Decrease in audio quality as the call progresses.</li><li>Low volume in speakers within the room.</li></ul><br/>This issue might occur when a setting in the audio driver configuration that enhances audio quality is enabled.|To fix the audio quality issue, disable the audio enhancement setting for the attached Teams-certified USB audio device by using the following steps:<ol><li>Select **Settings** on the Teams Rooms device console.</li><li>Select **Windows Settings**, and sign in to the Teams Rooms device as an administrator.</li><li>Open **Control Panel**.</li><li>Select **Hardware and Sound** > **Sound**.</li><li>On the **Playback** tab, select the affected audio device, and then select **Properties**.</li><li>Select the **Advanced** tab.</li><li>In the **Signal Enhancements** section, uncheck **Enable audio enhancements**, and then select **OK**.</li><li>Select the **Recording** tab and repeat steps 5 through 7 for the same audio device.</li><li>Select **OK**, and then close **Control Panel**.</li><li>Restart the Teams Rooms device.</li></ol>|

 ## Limitations

support/azure/azure-kubernetes/availability-performance/identify-high-disk-io-latency-containers-aks.md

Lines changed: 201 additions & 0 deletions
@@ -0,0 +1,201 @@
---
title: Identify containers causing high disk I/O latency in AKS clusters
description: Learn how to identify which containers and pods are causing high disk I/O latency in your Azure Kubernetes Service clusters so that you can troubleshoot issues by using the open source project Inspektor Gadget.
ms.date: 07/16/2025
ms.author: burakok
ms.reviewer: burakok, mayasingh, blanquicet
ms.service: azure-kubernetes-service
ms.custom: sap:Node/node pool availability and performance
---
# Troubleshoot high disk I/O latency in AKS clusters

Disk I/O latency can severely affect the performance and reliability of workloads running in Azure Kubernetes Service (AKS) clusters. This article shows how to use the open source project [Inspektor Gadget](https://aka.ms/ig-website) to identify which containers and pods are causing high disk I/O latency in AKS.

Inspektor Gadget provides eBPF-based gadgets that help you observe and troubleshoot disk I/O issues in Kubernetes environments.

## Symptoms

You might suspect disk I/O latency issues when you observe the following behaviors in your AKS cluster:

- Applications become unresponsive during file operations.
- [Azure portal metrics](/azure/aks/monitor-aks-reference#supported-metrics-for-microsoftcomputevirtualmachines) (`Data Disk Bandwidth Consumed Percentage` and `Data Disk IOPS Consumed Percentage`) or other system monitoring shows high disk utilization with low throughput.
- Database operations take significantly longer than expected.
- Pod logs show file system operation errors or timeouts.
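
As a quick check, you can also query those disk metrics from the command line. The following sketch assumes you have the resource ID of a node pool's virtual machine scale set; the metric name is one of the two listed above:

```console
# Query data disk IOPS consumption for a node pool scale set
# (replace <scale-set-resource-id> with your VMSS resource ID).
az monitor metrics list \
  --resource <scale-set-resource-id> \
  --metric "Data Disk IOPS Consumed Percentage" \
  --aggregation Average \
  --interval PT5M
```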

## Prerequisites

- The Kubernetes [kubectl](https://kubernetes.io/docs/reference/kubectl/overview/) command-line tool. To install kubectl by using [Azure CLI](/cli/azure/install-azure-cli), run the [az aks install-cli](/cli/azure/aks#az-aks-install-cli) command.
- Access to your AKS cluster with sufficient permissions to run privileged pods.
- The open source project [Inspektor Gadget](../logs/capture-system-insights-from-aks.md#what-is-inspektor-gadget) for eBPF-based observability. For more information, see [How to install Inspektor Gadget in an AKS cluster](../logs/capture-system-insights-from-aks.md#how-to-install-inspektor-gadget-in-an-aks-cluster).

> [!NOTE]
> The `top_blockio` gadget requires kernel version 6.5 or later. You can verify your AKS node kernel version by running `kubectl get nodes -o wide` and checking the KERNEL-VERSION column.
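
For example, to list just the kernel version for each node, a `custom-columns` query works (a minimal sketch; the output column names are arbitrary):

```console
# List each node with its kernel version only
kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion
```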

## Troubleshooting checklist

### Step 1: Profile disk I/O latency with `profile_blockio`

The [`profile_blockio`](https://aka.ms/ig-profile-blockio) gadget gathers information about block device I/O usage and periodically generates a histogram distribution of I/O latency. This helps you visualize disk I/O performance and identify latency patterns. You can use this information as evidence to confirm or rule out disk I/O issues as the cause of the symptoms you're seeing.

```console
kubectl gadget run profile_blockio --node <node-name>
```

> [!NOTE]
> The `profile_blockio` gadget requires that you specify a node by using the `--node` parameter. You can get node names by running `kubectl get nodes`.

**Baseline example** (empty cluster with minimal activity):

```
latency
        µs               : count    distribution
         0 -> 1          : 0        |                                        |
         1 -> 2          : 0        |                                        |
         2 -> 4          : 0        |                                        |
         4 -> 8          : 0        |                                        |
         8 -> 16         : 0        |                                        |
        16 -> 32         : 0        |                                        |
        32 -> 64         : 70       |                                        |
        64 -> 128        : 22       |                                        |
       128 -> 256        : 6        |                                        |
       256 -> 512        : 16       |                                        |
       512 -> 1024       : 1017     |*********                               |
      1024 -> 2048       : 2205     |********************                    |
      2048 -> 4096       : 2740     |**************************              |
      4096 -> 8192       : 1128     |**********                              |
      8192 -> 16384      : 708      |******                                  |
     16384 -> 32768      : 4211     |****************************************|
     32768 -> 65536      : 129      |*                                       |
     65536 -> 131072     : 185      |*                                       |
    131072 -> 262144     : 402      |***                                     |
    262144 -> 524288     : 112      |*                                       |
    524288 -> 1048576    : 0        |                                        |
   1048576 -> 2097152    : 0        |                                        |
   2097152 -> 4194304    : 0        |                                        |
   4194304 -> 8388608    : 0        |                                        |
   8388608 -> 16777216   : 0        |                                        |
  16777216 -> 33554432   : 0        |                                        |
  33554432 -> 67108864   : 0        |                                        |
```

**High disk I/O stress example** (with `stress-ng --hdd 10 --io 10` running to simulate I/O load):

```
latency
        µs               : count    distribution
         0 -> 1          : 0        |                                        |
         1 -> 2          : 0        |                                        |
         2 -> 4          : 0        |                                        |
         4 -> 8          : 0        |                                        |
         8 -> 16         : 42       |                                        |
        16 -> 32         : 236      |                                        |
        32 -> 64         : 558      |*                                       |
        64 -> 128        : 201      |                                        |
       128 -> 256        : 147      |                                        |
       256 -> 512        : 62       |                                        |
       512 -> 1024       : 2660     |******                                  |
      1024 -> 2048       : 6376     |***************                         |
      2048 -> 4096       : 8374     |********************                    |
      4096 -> 8192       : 3912     |*********                               |
      8192 -> 16384      : 2099     |*****                                   |
     16384 -> 32768      : 16703    |****************************************|
     32768 -> 65536      : 1718     |****                                    |
     65536 -> 131072     : 5758     |*************                           |
    131072 -> 262144     : 9552     |**********************                  |
    262144 -> 524288     : 6778     |****************                        |
    524288 -> 1048576    : 347      |                                        |
   1048576 -> 2097152    : 16       |                                        |
   2097152 -> 4194304    : 0        |                                        |
   4194304 -> 8388608    : 0        |                                        |
   8388608 -> 16777216   : 0        |                                        |
  16777216 -> 33554432   : 0        |                                        |
  33554432 -> 67108864   : 0        |                                        |
```
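
If you want to generate a comparable load in a test cluster and watch the histogram shift yourself, a throwaway stress pod is one option. This is a sketch only; the container image name is an assumption, and any image that ships `stress-ng` works:

```console
# Run a disposable pod that hammers the disk for five minutes
# (the image is hypothetical; substitute any image containing stress-ng).
kubectl run stress-hdd --image=ghcr.io/colinianking/stress-ng --restart=Never \
  --command -- stress-ng --hdd 10 --io 10 --timeout 300s
```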

**Interpreting the results**: To identify which node has I/O pressure, compare the baseline and stress scenarios:

- **Baseline**: Most operations (4,211 count) fall in the 16-32 ms range, which is typical for normal system activity.
- **Under stress**: Significantly more operations land in higher latency ranges (9,552 operations in 131-262 ms, 6,778 in 262-524 ms).
- **Performance degradation**: The stress test shows operations extending into the 500 ms-2 s range, indicating disk saturation.
- **Concerning signs**: Look for high counts above 100 ms (100,000 µs), which may indicate disk performance issues.

### Step 2: Find top disk I/O consumers with `top_blockio`

The [`top_blockio`](https://aka.ms/ig-top-blockio) gadget provides a periodic list of the containers with the highest disk I/O operations. Optionally, you can limit tracing to the node that you identified in Step 1. This gadget requires kernel version 6.5 or later (available on [Azure Linux Container Host clusters](/azure/aks/use-azure-linux)).

```console
kubectl gadget run top_blockio --namespace <namespace> --sort -bytes [--node <node-name>]
```

Sample output:

```
K8S.NODE                      K8S.NAMESPACE  K8S.PODNAME  K8S.CONTAINERNAME  COMM    PID   TID   MAJOR  MINOR  BYTES      US         IO     RW
aks-nodepool1-…99-vmss000000                                                         0     0     8      0      173707264  153873788  11954  write
aks-nodepool1-…99-vmss000000                                                         0     0     8      0      352256     85222      36     read
aks-nodepool1-…99-vmss000000  default        stress-hdd   stress-hdd         stress  324…  324…  8      0      131072     4450       1      write
aks-nodepool1-…99-vmss000000  default        stress-hdd   stress-hdd         stress  324…  324…  8      0      131072     3651       1      write
aks-nodepool1-…99-vmss000000  default        stress-hdd   stress-hdd         stress  324…  324…  8      0      4096       4096       1      write
```

From the output, you can identify containers with an unusually high number of bytes read or written to disk (`BYTES` column), time spent on read/write operations (`US` column), or number of I/O operations (`IO` column), any of which may indicate high disk activity. In this example, there's significant write activity (roughly 173 MB) with considerable time spent (about 154 seconds in total).

> [!NOTE]
> Empty K8S.NAMESPACE, K8S.PODNAME, and K8S.CONTAINERNAME fields can occur during kernel-space-initiated operations or high-volume I/O. You can still use the `top_file` gadget for detailed process information when these fields are empty.
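
If time spent matters more to your workload than bytes transferred, sorting on the time column is a reasonable variation. This sketch assumes the sort field is named `us`, matching the `US` column in the output; check the gadget's help output if the field name differs:

```console
# Sort by time spent in block I/O instead of bytes transferred
# (the field name "us" is assumed from the US column above).
kubectl gadget run top_blockio --namespace <namespace> --sort -us
```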

### Step 3: Identify files causing high disk activity with `top_file`

The [`top_file`](https://aka.ms/ig-top-file) gadget periodically reports read/write activity by file, helping you identify which processes and containers are causing high disk activity.

```console
kubectl gadget run top_file --namespace <namespace> --max-entries 20 --sort -wbytes_raw,-rbytes_raw
```

Sample output:

```
K8S.NODE                      K8S.NAMESPACE  K8S.PODNAME  K8S.CONTAINERNAME  COMM    PID    TID    READS  WRITES  FILE            T  RBYTES  WBYTES
aks-nodepool1-…99-vmss000000  default        stress-hdd   stress-hdd         stress  49258  49258  0      17      /stress.ADneNJ  R  0 B     18 MB
aks-nodepool1-…99-vmss000000  default        stress-hdd   stress-hdd         stress  49254  49254  0      20      /stress.LEbDOb  R  0 B     21 MB
aks-nodepool1-…99-vmss000000  default        stress-hdd   stress-hdd         stress  49252  49252  0      18      /stress.eMOjmP  R  0 B     19 MB
aks-nodepool1-…99-vmss000000  default        stress-hdd   stress-hdd         stress  49264  49264  0      22      /stress.fLHpBC  R  0 B     23 MB
...
```

This output shows which files are being accessed most frequently, helping you pinpoint the specific files that a given process reads from or writes to the most. In this example, the stress-hdd pod is creating multiple temporary files with significant write activity (18-23 MB each).
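
Once you've identified the pod, standard kubectl commands confirm where it runs and how it's configured before you decide on remediation. For example:

```console
# Find the node the offending pod is scheduled on
kubectl get pod stress-hdd -n default -o wide

# Review its spec and recent events for clues (volumes, limits, restarts)
kubectl describe pod stress-hdd -n default
```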

### Root cause analysis workflow

By combining all three gadgets, you can trace disk latency issues from symptoms to root cause:

1. **`profile_blockio`** identifies that disk latency exists on a given node (high counts in the 100 ms+ ranges).
2. **`top_blockio`** shows which processes are generating the most disk I/O (173 MB of writes with 154 seconds of total time spent).
3. **`top_file`** reveals the specific files and commands causing the issue (the `stress` command creating `/stress.*` files).

This complete visibility allows you to:

- **Identify the problematic pod**: the `stress-hdd` pod in the `default` namespace.
- **Find the specific process**: the `stress` command with PIDs 49258, 49254, and so on.
- **Locate the problematic files**: multiple `/stress.*` temporary files of 18-23 MB each.
- **Understand the I/O pattern**: heavy write operations creating temporary files.

With this information, you can take targeted action rather than making broad system changes.
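
In this walkthrough, the targeted action is simply removing the synthetic load; in a real cluster, it might instead mean tuning the application, adding caching, or moving hot files to faster storage:

```console
# Remove the pod that was generating the disk load
kubectl delete pod stress-hdd -n default
```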

## Next steps

Based on the results from these gadgets, you can take the following actions:

- **High latency in `profile_blockio`**: Investigate the underlying disk performance. If the workload needs better disk performance, consider using [storage optimized nodes](/azure/virtual-machines/sizes/overview#storage-optimized).
- **High I/O operations in `top_blockio`**: Review application logic to optimize disk access patterns or implement caching.
- **Specific files in `top_file`**: Analyze whether files can be moved to faster storage or cached, or whether application logic can be optimized.
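
As a sketch of the first action, you can add a storage optimized node pool and schedule the disk-heavy workload onto it (the pool name and VM size are placeholders; pick a size from the Lsv3 family that fits your workload):

```console
# Add a storage optimized (Lsv3-series) node pool for disk-heavy workloads
az aks nodepool add \
  --resource-group <resource-group> \
  --cluster-name <cluster-name> \
  --name lsv3pool \
  --node-vm-size Standard_L8s_v3 \
  --node-count 2
```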

## Related content

- [Inspektor Gadget documentation](https://inspektor-gadget.io/docs/latest/gadgets/)
- [How to install Inspektor Gadget in an AKS cluster](../logs/capture-system-insights-from-aks.md#how-to-install-inspektor-gadget-in-an-aks-cluster)
- [Troubleshoot high memory consumption in disk-intensive applications](high-memory-consumption-disk-intensive-applications.md)

[!INCLUDE [Third-party information disclaimer](../../../includes/third-party-disclaimer.md)]
[!INCLUDE [Third-party contact information disclaimer](../../../includes/third-party-contact-disclaimer.md)]
[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]

support/azure/azure-kubernetes/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -57,6 +57,8 @@ items:
     href: availability-performance/cluster-node-virtual-machine-failed-state.md
   - name: Identify containers facing high CPU pressure and throttling
     href: availability-performance/troubleshoot-node-cpu-pressure-psi.md
+  - name: Identify nodes and containers creating high disk latency
+    href: availability-performance/identify-high-disk-io-latency-containers-aks.md
   - name: Identify memory saturation in AKS clusters
     href: availability-performance/identify-memory-saturation-aks.md
   - name: Identify nodes and containers utilizing high CPU

support/azure/azure-storage/blobs/connectivity/storage-use-azcopy-troubleshoot.md

Lines changed: 1 addition & 1 deletion
@@ -121,7 +121,7 @@ If you receive an error message that states that your parameters aren't recognized
 
 Also, make sure to use built-in help messages by using the `-h` switch together with any command (for example, `azcopy copy -h`). See [Get command help](/azure/storage/common/storage-use-azcopy-v10?toc=/azure/storage/blobs/toc.json#get-command-help). To view the same information online, see [azcopy copy](/azure/storage/common/storage-ref-azcopy-copy?toc=/azure/storage/blobs/toc.json).
 
-To help you understand commands, we provide an education tool that's located in the [AzCopy command guide](https://azcopyvnextrelease.z22.web.core.windows.net/). This tool demonstrates the most popular AzCopy commands along with the most popular command flags. To find example commands, see [Transfer data](/azure/storage/common/storage-use-azcopy-v10?toc=/azure/storage/blobs/toc.json#transfer-data). If you have a question, try searching through existing [GitHub issues](https://github.com/Azure/azure-storage-azcopy/issues) first to see whether it was already answered.
+To find example commands, see [Transfer data](/azure/storage/common/storage-use-azcopy-v10?toc=/azure/storage/blobs/toc.json#transfer-data). If you have a question, try searching through existing [GitHub issues](https://github.com/Azure/azure-storage-azcopy/issues) first to see whether it was already answered.
 
 ## Conditional access policy error

support/mem/intune/device-configuration/factory-reset-protection-emails-not-enforced.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ If **Factory reset protection emails** is set to **Not configured** (default), I
 > [!NOTE]
 > **Android 15** introduced FRP hardening. Some OEMs previously skipped FRP in certain paths. As of Android 15, FRP enforcement now aligns with Google’s intended design.
 
-We recommend that you set the **Factory reset** value to **Block** to prevent users from using the factory reset option in the device settings.
+We recommend that you set the **Factory reset** value to **Block** to prevent users from using the factory reset option in the device settings. This setting is only available for fully managed and dedicated devices.
 
 :::image type="content" source="media/factory-reset-protection-emails-not-enforced/factory-reset.png" alt-text="Screenshot of Factory reset options.":::