|
38 | 38 | - [Resource Quota](#resource-quota)
|
39 | 39 | - [Affected Components](#affected-components)
|
40 | 40 | - [Instrumentation](#instrumentation)
|
41 |
| - - [<code>kubelet_container_resize_requests_total</code>](#kubelet_container_resize_requests_total) |
42 |
| - - [<code>kubelet_pod_resize_sli_duration_seconds</code>](#kubelet_pod_resize_sli_duration_seconds) |
43 |
| - - [<code>kubelet_pod_pending_resize_total</code>](#kubelet_pod_pending_resize_total) |
44 |
| - - [<code>kubelet_pod_in_progress_resize_total</code>](#kubelet_pod_in_progress_resize_total) |
| 41 | + - [<code>kubelet_container_resize_attempts_total</code>](#kubelet_container_resize_attempts_total) |
| 42 | + - [<code>kubelet_pod_resize_duration_seconds</code>](#kubelet_pod_resize_duration_seconds) |
| 43 | + - [<code>kubelet_pod_pending_resizes</code>](#kubelet_pod_pending_resizes) |
| 44 | + - [<code>kubelet_pod_in_progress_resizes</code>](#kubelet_pod_in_progress_resizes) |
45 | 45 | - [<code>kubelet_pod_deferred_resize_accepted_total</code>](#kubelet_pod_deferred_resize_accepted_total)
|
46 | 46 | - [Static CPU & Memory Policy](#static-cpu--memory-policy)
|
47 | 47 | - [Future Enhancements](#future-enhancements)
|
@@ -891,51 +891,52 @@ Other components:
|
891 | 891 |
|
892 | 892 | The kubelet will record the following metrics:
|
893 | 893 |
|
894 |
| -#### `kubelet_container_resize_requests_total` |
| 894 | +#### `kubelet_container_resize_attempts_total` |
895 | 895 |
|
896 |
| -This metric tracks the total number of resize requests observed by the Kubelet, counted at the container level. |
897 |
| -A single pod update changing multiple containers will be considered separate resize requests. |
| 896 | +This metric tracks the total number of resize attempts observed by the Kubelet, counted at the container level. |
| 897 | +A single pod update changing multiple containers will be considered separate resize attempts. |
898 | 898 |
|
899 | 899 | Labels:
|
900 |
| -- `resource_type` - what type of resource is being resized. Possible values: `cpu_limits`, `cpu_requests` `memory_limits`, or `memory_requests`. If more than one of these resource types is changing in the resize request, |
901 |
| -we increment the counter multiple times, once for each. This means that a single container update changing multiple |
902 |
| -resource types will be considered multiple requests for this metric. |
903 |
| -- `operation_type` - whether the resize is an increase or a decrease. Possible values: `increase`, `decrease`, `add`, or `remove`. |
| 900 | +- `resource` - which resource is being resized. Possible values: `cpu` or `memory`. If more than one resource is changing in the resize request, we increment the counter multiple times, once for each. |
| 901 | +- `type` - which field of the resource is being resized. Possible values: `limits` or `requests`. If both are changing in the resize request, we increment the counter multiple times, once for each. |
| 902 | +- `operation` - whether the resize is an increase or a decrease. Possible values: `increase`, `decrease`, `add`, or `remove`. |
904 | 903 | - `namespace` - the namespace of the pod.
|
905 | 904 |
|
906 | 905 | This metric is recorded as a counter.
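
As an illustration of the counting rule described for `kubelet_container_resize_attempts_total`, here is a minimal sketch in Python (not the kubelet's Go implementation). The function name `resize_attempt_labels`, the dictionary layout, and the integer units are all hypothetical, chosen only to show how one container update could fan out into several counter increments, one per changed `resource`/`type` pair.

```python
from collections import Counter


def resize_attempt_labels(old, new, namespace):
    """Return one (resource, type, operation, namespace) tuple per changed value.

    `old` and `new` are hypothetical dicts of the form
    {"requests": {"cpu": 500}, "limits": {"memory": 512}} with integer
    quantities (units are an assumption, e.g. millicores and MiB).
    """
    labels = []
    for rtype in ("requests", "limits"):
        for resource in ("cpu", "memory"):
            before = old.get(rtype, {}).get(resource)
            after = new.get(rtype, {}).get(resource)
            if before == after:
                continue  # unchanged values are not counted
            if before is None:
                operation = "add"
            elif after is None:
                operation = "remove"
            elif after > before:
                operation = "increase"
            else:
                operation = "decrease"
            labels.append((resource, rtype, operation, namespace))
    return labels


# One container update that raises the CPU request and adds a memory limit.
attempts = Counter()
old = {"requests": {"cpu": 500, "memory": 256}, "limits": {"cpu": 1000}}
new = {"requests": {"cpu": 750, "memory": 256}, "limits": {"cpu": 1000, "memory": 512}}
for key in resize_attempt_labels(old, new, "default"):
    attempts[key] += 1
```

In this sketch the single update produces two increments, which matches the statement that a container update changing multiple resources is counted multiple times.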
|
907 | 906 |
|
908 |
| -#### `kubelet_pod_resize_sli_duration_seconds` |
909 |
| -This metric tracks the latency between when the kubelet accepts a resize request and when it finishes actuating the request. More precisely, this metric tracks the total amount of time that the PodResizeInProgress condition is present on a pod. |
| 907 | +#### `kubelet_pod_resize_duration_seconds` |
| 908 | +This metric tracks the duration of [doPodResizeAction](https://github.com/kubernetes/kubernetes/blob/92de70895830ea1a9c2c6554bdab4cbee7ce867d/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L699), which |
| 909 | +is responsible for actuating the resize. |
910 | 910 |
|
911 | 911 | Labels:
|
912 | 912 | - `namespace` - the namespace of the pod.
|
913 | 913 |
|
914 |
| -This metric is recorded as a gauge. |
| 914 | +This metric is recorded as a histogram. |
915 | 915 |
|
916 |
| -#### `kubelet_pod_pending_resize_total` |
| 916 | +#### `kubelet_pod_pending_resizes` |
917 | 917 |
|
918 |
| -This metric tracks the total count of pods that the kubelet marks as pending. This will make it |
| 918 | +This metric tracks the current count of pods that the kubelet marks as pending. This will make it |
919 | 919 | easier for us to see which of the current limitations users are running into the most.
|
920 | 920 |
|
921 | 921 | Labels:
|
922 | 922 | - `reason` - why the resize is pending. Possible values: `infeasible` or `deferred`.
|
923 |
| -- `message` - more details about why the resize is pending. Although a more detailed "message" will be provided in the `PodResizePending` |
| 923 | +- `reason_detail` - more details about why the resize is pending. Although a more detailed "message" will be provided in the `PodResizePending` |
924 | 924 | condition in the pod, we limit this label to only the following possible values to keep cardinality low:
|
925 | 925 | - `guaranteed_pod_cpu_manager_static_policy` - In-place resize is not supported for Guaranteed Pods alongside CPU Manager static policy.
|
926 | 926 | - `guaranteed_pod_memory_manager_static_policy` - In-place resize is not supported for Guaranteed Pods alongside Memory Manager static policy.
|
927 | 927 | - `static_pod` - In-place resize is not supported for static pods.
|
928 | 928 | - `swap_limitation` - In-place resize is not supported for containers with swap.
|
929 |
| - - `node_capacity` - The node doesn't have enough capacity for this resize request. |
| 929 | + - `insufficient_node_allocatable` - The node doesn't have enough capacity for this resize request. |
930 | 930 | - `namespace` - the namespace of the pod.
|
931 | 931 |
|
932 | 932 | This list of possible reasons may shrink or grow depending on limitations that are added or removed in the future.
|
933 | 933 |
|
934 | 934 | This metric is recorded as a gauge.
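
To make the gauge semantics and the bounded label set concrete, the following is a rough sketch in Python, not the kubelet's actual code. The function `pending_resize_gauge` and the pod dictionaries are hypothetical, and the pairings of `reason` with `reason_detail` in the sample data are illustrative only. The sketch recomputes the gauge from the current set of pending pods and rejects any `reason_detail` outside the fixed list, which is what keeps cardinality low.

```python
from collections import Counter

ALLOWED_REASONS = {"infeasible", "deferred"}
ALLOWED_REASON_DETAILS = {
    "guaranteed_pod_cpu_manager_static_policy",
    "guaranteed_pod_memory_manager_static_policy",
    "static_pod",
    "swap_limitation",
    "insufficient_node_allocatable",
}


def pending_resize_gauge(pending_pods):
    """Count currently pending pods per (reason, reason_detail, namespace)."""
    gauge = Counter()
    for pod in pending_pods:
        if pod["reason"] not in ALLOWED_REASONS:
            raise ValueError(f"unknown reason: {pod['reason']}")
        if pod["reason_detail"] not in ALLOWED_REASON_DETAILS:
            # Free-form condition messages must not leak into labels.
            raise ValueError(f"unknown reason_detail: {pod['reason_detail']}")
        gauge[(pod["reason"], pod["reason_detail"], pod["namespace"])] += 1
    return gauge


# Hypothetical snapshot of pods with a pending resize.
pending = [
    {"namespace": "default", "reason": "infeasible", "reason_detail": "static_pod"},
    {"namespace": "default", "reason": "infeasible", "reason_detail": "static_pod"},
    {"namespace": "batch", "reason": "deferred", "reason_detail": "insufficient_node_allocatable"},
]
gauge = pending_resize_gauge(pending)
```

Because the value is derived from the current snapshot, it goes down again when a pod stops being pending, unlike a counter.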
|
935 | 935 |
|
936 |
| -#### `kubelet_pod_in_progress_resize_total` |
| 936 | +#### `kubelet_pod_in_progress_resizes` |
937 | 937 |
|
938 |
| -This metric tracks the total count of resize requests that the kubelet marks as in progress. |
| 938 | +This metric tracks the total count of resize requests that the kubelet marks as in progress, meaning that |
| 939 | +the resources have been allocated but not yet actuated. |
939 | 940 |
|
940 | 941 | Labels:
|
941 | 942 | - `namespace` - the namespace of the pod.
|