kubectl get --raw /metrics | grep "restrict-bad-client"
```
### Solution 3c: Use the API server resource intensive listing detector in the Azure portal
> **New:** Azure Kubernetes Service now provides a built-in analyzer to help you identify agents making resource-intensive LIST calls, which are a leading cause of API server and etcd performance issues.
**How to access the detector:**
1. Open your AKS cluster in the Azure portal.
2. Go to **Diagnose and solve problems**.
3. Select **Cluster and Control Plane Availability and Performance**.
4. Select **API server resource intensive listing detector**.
This detector analyzes recent API server activity and highlights the agents or workloads that generate large or frequent LIST calls. It also summarizes the potential impact, such as request timeouts, increased 408/503 errors, node instability, health probe failures, and OOM kills in the API server or etcd.
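
If you want to cross-check the detector's findings from the command line, the API server's `/metrics` endpoint (already used in the previous solution) reports request counts per verb and resource. The following sketch is one way to surface the busiest LIST targets; it assumes that your account is allowed to read `/metrics`, and the counts are only a rough proxy for LIST volume, not for the memory cost of each call.

```bash
# Count LIST requests per resource, as recorded by the API server itself.
# High counts point to the resources that are being listed most aggressively.
kubectl get --raw /metrics \
  | grep 'apiserver_request_total' \
  | grep 'verb="LIST"' \
  | sort -k 2 -nr \
  | head -n 15
```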
#### How to interpret the detector output
- **Summary:**
  Indicates whether resource-intensive LIST calls were detected and describes the possible impact on your cluster.
- **Analysis window:**
  Shows the 30-minute window that was analyzed, together with peak memory and CPU usage.
- **Read types:**
  Explains whether LIST calls were served from the API server cache (preferred) or had to be fetched from etcd (most impactful). See the example below.
- **Charts and tables:**
  Identify which agents, namespaces, or workloads generate the most resource-intensive LIST calls.
> Only successful LIST calls are counted. Failed or throttled calls are excluded.
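
The **Read types** distinction reflects how a client constructs its LIST requests. As a rough illustration (a cluster-wide Pod LIST is used here as an assumed example, and the exact behavior can vary by Kubernetes version), the following raw requests show the difference between a cache-served and an etcd-served read:

```bash
# Served from the API server watch cache (preferred); the result may be slightly stale.
kubectl get --raw '/api/v1/pods?resourceVersion=0'

# No resourceVersion: a consistent read that typically has to be fetched from etcd,
# which is the most resource-intensive form of LIST on large collections.
kubectl get --raw '/api/v1/pods'

# Paginated LIST: bounds the size of each response, which limits memory pressure.
kubectl get --raw '/api/v1/pods?limit=500'
```

If a workload that you control turns out to be the source, switching it to watch-based informers, cache-served reads, or paginated LIST calls usually reduces the load significantly.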
The analyzer also provides actionable recommendations directly in the Azure portal, tailored to the detected patterns, to help you remediate and optimize your cluster.
> **Note:**
> The API server resource intensive listing detector is available to all users with access to the AKS resource in the Azure portal. No special permissions or prerequisites are required.
>
> After identifying the offending agents and applying the above recommendations, you can further use [Priority and Fairness](https://kubernetes.io/docs/concepts/cluster-administration/flow-control/) to throttle or isolate problematic clients.
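
For example, to verify that an existing Priority and Fairness configuration is actually matching and throttling the client you identified, you can inspect the flow-control objects and the API server's flow-control metrics. This is only a verification sketch; the metric shown is a standard kube-apiserver metric, but the exact set that's exposed can vary by Kubernetes version.

```bash
# List the flow-control objects that are currently defined in the cluster.
kubectl get flowschemas,prioritylevelconfigurations

# Check how many requests have been rejected, broken down by FlowSchema and priority level.
kubectl get --raw /metrics | grep 'apiserver_flowcontrol_rejected_requests_total'
```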
## Cause 4: A custom webhook might cause a deadlock in API server pods
A custom webhook, such as Kyverno, might be causing a deadlock within API server pods.