Commit 06bf26c

Merge pull request #262879 from MGoedtel/task169315
Control plan metrics content for AKS
2 parents e76fc91 + 7068946 commit 06bf26c

4 files changed

+354
-0
lines changed

articles/aks/TOC.yml

Lines changed: 4 additions & 0 deletions
@@ -646,8 +646,12 @@
646646
href: events.md
647647
- name: Monitor kube-audit events
648648
href: monitor-apiserver.md
649+
- name: Monitor control plane metrics
650+
href: monitor-control-plane-metrics.md
649651
- name: Monitor reference
650652
href: monitor-aks-reference.md
653+
- name: Control plane metrics reference
654+
href: control-plane-metrics-default-list.md
651655
- name: View the kubelet logs
652656
href: kubelet-logs.md
653657
- name: View container data real-time
Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
1+
---
2+
title: List of control plane metrics in Azure Monitor managed service for Prometheus (preview)
3+
description: This article describes the minimal ingestion profile metrics for Azure Kubernetes Service (AKS) control plane metrics.
4+
ms.topic: conceptual
5+
ms.date: 01/31/2024
6+
ms.reviewer: aritraghosh
7+
---
8+
9+
# Minimal ingestion profile for control plane metrics in Managed Prometheus
10+
11+
The Azure Monitor metrics add-on collects many Prometheus metrics by default. The `Minimal ingestion profile` setting reduces ingestion volume by collecting only the metrics used by default dashboards, default recording rules, and default alerts. This article describes how this setting is configured for control plane metrics and lists the metrics collected by default when `minimal ingestion profile` is enabled.
12+
13+
> [!NOTE]
14+
> For add-on based collection, the `Minimal ingestion profile` setting is enabled by default. The discussion here focuses on control plane metrics. For the current set of default targets and metrics, see the [default targets and metrics list][azure-monitor-prometheus-metrics-scrape-config-minimal].
15+
16+
The following targets are **enabled/ON** by default. You don't have to provide any scrape job configuration for them, because the metrics add-on scrapes them automatically:
17+
18+
- `controlplane-apiserver` (job=`controlplane-apiserver`)
19+
- `controlplane-etcd` (job=`controlplane-etcd`)
20+
21+
The following targets are available to scrape, but scraping is **disabled/OFF** by default. You don't have to provide any scrape job configuration for these targets, but you do need to turn scraping **ON** for them by using the [ama-metrics-settings-configmap][ama-metrics-settings-configmap-github] under the `default-scrape-settings-enabled` section.
22+
23+
- `controlplane-cluster-autoscaler`
24+
- `controlplane-kube-scheduler`
25+
- `controlplane-kube-controller-manager`
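
As a rough sketch of what turning these targets on might look like, the fragment below shows only the `default-scrape-settings-enabled` section of the configmap. The `key = value` layout inside the section and the `kube-system` namespace are assumptions based on the linked sample configmap, so verify them against that file. All other sections are omitted here and should be kept from the sample when you apply a real configmap.

```yaml
# Sketch only: a partial ama-metrics-settings-configmap.
# The section layout and namespace are assumed from the linked sample file.
kind: ConfigMap
apiVersion: v1
metadata:
  name: ama-metrics-settings-configmap
  namespace: kube-system
data:
  # The first two targets are ON by default and are listed for completeness.
  # The last three are OFF by default; setting them to true starts scraping.
  default-scrape-settings-enabled: |-
    controlplane-apiserver = true
    controlplane-etcd = true
    controlplane-cluster-autoscaler = true
    controlplane-kube-scheduler = true
    controlplane-kube-controller-manager = true
```

Enabling only the targets you need keeps ingestion volume lower, which is the purpose of the minimal ingestion profile.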
26+
27+
> [!NOTE]
28+
> The default scrape frequency for all default targets and scrapes is `30 seconds`. You can override it for each target by using the [ama-metrics-settings-configmap][ama-metrics-settings-configmap-github] under the `default-targets-scrape-interval-settings` section.
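
An interval override for the two default ON targets could look roughly like the fragment below. The quoted `"<number>s"` value format and the presence of per-target control plane keys in this section are assumptions, so check the linked sample configmap before relying on them. The `60s` value is only an illustration.

```yaml
# Sketch only: one section of ama-metrics-settings-configmap; other sections omitted.
# Value format and key names are assumed, not confirmed by this article.
data:
  default-targets-scrape-interval-settings: |-
    controlplane-apiserver = "60s"
    controlplane-etcd = "60s"
```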
29+
30+
### Minimal ingestion for default ON targets
31+
32+
The following metrics are allow-listed with `minimalingestionprofile=true` for default **ON** targets. They're collected by default, because these targets are scraped by default.
33+
34+
**controlplane-apiserver**
35+
36+
- `apiserver_request_total`
37+
- `apiserver_cache_list_fetched_objects_total`
38+
- `apiserver_cache_list_returned_objects_total`
39+
- `apiserver_flowcontrol_demand_seats_average`
40+
- `apiserver_flowcontrol_current_limit_seats`
41+
- `apiserver_request_sli_duration_seconds_bucket`
42+
- `apiserver_request_sli_duration_seconds_sum`
43+
- `apiserver_request_sli_duration_seconds_count`
44+
- `process_start_time_seconds`
45+
- `apiserver_request_duration_seconds_bucket`
46+
- `apiserver_request_duration_seconds_sum`
47+
- `apiserver_request_duration_seconds_count`
48+
- `apiserver_storage_list_fetched_objects_total`
49+
- `apiserver_storage_list_returned_objects_total`
50+
- `apiserver_current_inflight_requests`
51+
52+
**controlplane-etcd**
53+
54+
- `etcd_server_has_leader`
55+
- `rest_client_requests_total`
56+
- `etcd_mvcc_db_total_size_in_bytes`
57+
- `etcd_mvcc_db_total_size_in_use_in_bytes`
58+
- `etcd_server_slow_read_indexes_total`
59+
- `etcd_server_slow_apply_total`
60+
- `etcd_network_client_grpc_sent_bytes_total`
61+
- `etcd_server_heartbeat_send_failures_total`
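
The following fragment is a hedged sketch of where the `minimalingestionprofile` flag might sit in the configmap. The section name `default-targets-metrics-keep-list` and the pipe-separated regex value format are assumptions that this article doesn't confirm, and `apiserver_longrunning_requests` is only an illustrative upstream metric that isn't part of the lists above.

```yaml
# Sketch only: one section of ama-metrics-settings-configmap; other sections omitted.
# Section name and value format are assumptions; verify against the linked sample file.
data:
  default-targets-metrics-keep-list: |-
    controlplane-apiserver = "apiserver_longrunning_requests"
    controlplane-etcd = ""
    minimalingestionprofile = true
```

Under these assumptions, an empty string leaves a target on its allow-listed metrics only, and a non-empty value would add the named metrics to that set.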
62+
63+
### Minimal ingestion for default OFF targets
64+
65+
The following metrics are allow-listed with `minimalingestionprofile=true` for default **OFF** targets. These metrics aren't collected by default. You can turn **ON** scraping for these targets by setting `default-scrape-settings-enabled.<target-name>=true` in the [ama-metrics-settings-configmap][ama-metrics-settings-configmap-github] under the `default-scrape-settings-enabled` section.
66+
67+
**controlplane-kube-controller-manager**
68+
69+
- `workqueue_depth`
70+
- `rest_client_requests_total`
71+
- `rest_client_request_duration_seconds`
72+
73+
**controlplane-kube-scheduler**
74+
75+
- `scheduler_pending_pods`
76+
- `scheduler_unschedulable_pods`
77+
- `scheduler_queue_incoming_pods_total`
78+
- `scheduler_schedule_attempts_total`
79+
- `scheduler_preemption_attempts_total`
80+
81+
**controlplane-cluster-autoscaler**
82+
83+
- `rest_client_requests_total`
84+
- `cluster_autoscaler_last_activity`
85+
- `cluster_autoscaler_cluster_safe_to_autoscale`
86+
- `cluster_autoscaler_failed_scale_ups_total`
87+
- `cluster_autoscaler_scale_down_in_cooldown`
88+
- `cluster_autoscaler_scaled_up_nodes_total`
89+
- `cluster_autoscaler_unneeded_nodes_count`
90+
- `cluster_autoscaler_unschedulable_pods_count`
91+
- `cluster_autoscaler_nodes_count`
92+
- `cloudprovider_azure_api_request_errors`
93+
- `cloudprovider_azure_api_request_duration_seconds_bucket`
94+
- `cloudprovider_azure_api_request_duration_seconds_count`
95+
96+
> [!NOTE]
97+
> The CPU and memory usage metrics for all control plane targets aren't exposed, regardless of the profile.
98+
99+
## References
100+
101+
- [Kubernetes Upstream metrics list][kubernetes-metrics-instrumentation-reference]
102+
103+
- [Cluster autoscaler metrics list][kubernetes-metrics-autoscaler-reference]
104+
105+
## Next steps
106+
107+
- [Learn more about control plane metrics in Managed Prometheus](monitor-control-plane-metrics.md)
108+
109+
<!-- EXTERNAL LINKS -->
110+
[ama-metrics-settings-configmap-github]: https://github.com/Azure/prometheus-collector/blob/89e865a73601c0798410016e9beb323f1ecba335/otelcollector/configmaps/ama-metrics-settings-configmap.yaml
111+
[kubernetes-metrics-instrumentation-reference]: https://kubernetes.io/docs/reference/instrumentation/metrics/
112+
113+
[kubernetes-metrics-autoscaler-reference]: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/metrics.md
114+
115+
<!-- INTERNAL LINKS -->
116+
[azure-monitor-prometheus-metrics-scrape-config-minimal]: ../azure-monitor/containers/prometheus-metrics-scrape-configuration-minimal.md