Commit a1d83ed

committed
review comments
1 parent 0390966 commit a1d83ed

1 file changed

+21
-20
lines changed
  • keps/sig-node/1287-in-place-update-pod-resources

keps/sig-node/1287-in-place-update-pod-resources/README.md

@@ -38,10 +38,10 @@
3838
- [Resource Quota](#resource-quota)
3939
- [Affected Components](#affected-components)
4040
- [Instrumentation](#instrumentation)
41-
- [<code>kubelet_container_resize_requests_total</code>](#kubelet_container_resize_requests_total)
42-
- [<code>kubelet_pod_resize_sli_duration_seconds</code>](#kubelet_pod_resize_sli_duration_seconds)
43-
- [<code>kubelet_pod_pending_resize_total</code>](#kubelet_pod_pending_resize_total)
44-
- [<code>kubelet_pod_in_progress_resize_total</code>](#kubelet_pod_in_progress_resize_total)
41+
- [<code>kubelet_container_resize_attempts_total</code>](#kubelet_container_resize_attempts_total)
42+
- [<code>kubelet_pod_resize_duration_seconds</code>](#kubelet_pod_resize_duration_seconds)
43+
- [<code>kubelet_pod_pending_resizes</code>](#kubelet_pod_pending_resizes)
44+
- [<code>kubelet_pod_in_progress_resizes</code>](#kubelet_pod_in_progress_resizes)
4545
- [<code>kubelet_pod_deferred_resize_accepted_total</code>](#kubelet_pod_deferred_resize_accepted_total)
4646
- [Static CPU &amp; Memory Policy](#static-cpu--memory-policy)
4747
- [Future Enhancements](#future-enhancements)
@@ -891,51 +891,52 @@ Other components:
891891

892892
The kubelet will record the following metrics:
893893

894-
#### `kubelet_container_resize_requests_total`
894+
#### `kubelet_container_resize_attempts_total`
895895

896-
This metric tracks the total number of resize requests observed by the Kubelet, counted at the container level.
897-
A single pod update changing multiple containers will be considered separate resize requests.
896+
This metric tracks the total number of resize attempts observed by the Kubelet, counted at the container level.
897+
A single pod update changing multiple containers will be considered separate resize attempts.
898898

899899
Labels:
900-
- `resource_type` - what type of resource is being resized. Possible values: `cpu_limits`, `cpu_requests` `memory_limits`, or `memory_requests`. If more than one of these resource types is changing in the resize request,
901-
we increment the counter multiple times, once for each. This means that a single container update changing multiple
902-
resource types will be considered multiple requests for this metric.
903-
- `operation_type` - whether the resize is an increase or a decrease. Possible values: `increase`, `decrease`, `add`, or `remove`.
900+
- `resource` - what resource. Possible values: `cpu`, or `memory`. If more than one of these resource types is changing in the resize request, we increment the counter multiple times, once for each.
901+
- `type` - what type of resource is being resized. Possible values: `limits`, or `requests`. If more than one of these resource types is changing in the resize request, we increment the counter multiple times, once for each.
902+
- `operation` - whether the resize is an increase or a decrease. Possible values: `increase`, `decrease`, `add`, or `remove`.
904903
- `namespace` - the namespace of the pod.
905904

906905
This metric is recorded as a counter.
907906

908-
#### `kubelet_pod_resize_sli_duration_seconds`
909-
This metric tracks the latency between when the kubelet accepts a resize request and when it finshes actuating the request. More precisely, this metric tracks the total amount of time that the PodResizeInProgress condition is present on a pod.
907+
#### `kubelet_pod_resize_duration_seconds`
908+
This metric tracks the duration of [doPodResizeAction](https://github.com/kubernetes/kubernetes/blob/92de70895830ea1a9c2c6554bdab4cbee7ce867d/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L699), which
909+
is responsible for actuating the resize.
910910

911911
Labels:
912912
- `namespace` - the namespace of the pod.
913913

914-
This metric is recorded as a gauge.
914+
This metric is recorded as a histogram.
915915

916-
#### `kubelet_pod_pending_resize_total`
916+
#### `kubelet_pod_pending_resizes`
917917

918-
This metric tracks the total count of pods that the kubelet marks as pending. This will make it
918+
This metric tracks the current count of pods that the kubelet marks as pending. This will make it
919919
easier for us to see which of the current limitations users are running into the most.
920920

921921
Labels:
922922
- `reason` - why the resize is pending. Possible values: `infeasible` or `deferred`.
923-
- `message` - more details about why the resize is pending. Although a more detailed "message" will be provided in the `PodResizePending`
923+
- `reason_detail` - more details about why the resize is pending. Although a more detailed "message" will be provided in the `PodResizePending`
924924
condition in the pod, we limit this label to only the following possible values to keep cardinality low:
925925
- `guaranteed_pod_cpu_manager_static_policy` - In-place resize is not supported for Guaranteed Pods alongside CPU Manager static policy.
926926
- `guaranteed_pod_memory_manager_static_policy` - In-place resize is not supported for Guaranteed Pods alongside Memory Manager static policy.
927927
- `static_pod` - In-place resize is not supported for static pods.
928928
- `swap_limitation` - In-place resize is not supported for containers with swap.
929-
- `node_capacity` - The node doesn't have enough capacity for this resize request.
929+
- `insufficient_node_allocatable` - The node doesn't have enough capacity for this resize request.
930930
- `namespace` - the namespace of the pod.
931931

932932
This list of possible reasons may shrink or grow depending on limitations that are added or removed in the future.
933933

934934
This metric is recorded as a gauge.
935935

936-
#### `kubelet_pod_in_progress_resize_total`
936+
#### `kubelet_pod_in_progress_resizes`
937937

938-
This metric tracks the total count of resize requests that the kubelet marks as in progress.
938+
This metric tracks the total count of resize requests that the kubelet marks as in progress, meaning that
939+
the resources have been allocated but not yet actuated.
939940

940941
Labels:
941942
- `namespace` - the namespace of the pod.
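
The revised label scheme for `kubelet_container_resize_attempts_total` (`resource`, `type`, `operation`) implies one counter increment per changed resource/type pair. The Go sketch below illustrates that fan-out only. It is not the kubelet's implementation: `Quantities`, `ContainerResources`, `ResizeLabels`, `operationFor`, and `resizeAttemptLabels` are hypothetical names, quantities are simplified to plain integers, and the mapping of `add`/`remove` to "value was previously unset"/"value is now unset" is an assumption, since the text lists those values without defining them.

```go
package main

import "fmt"

// Quantities maps a resource name ("cpu" or "memory") to an amount in
// arbitrary integer units. A missing key means the value is unset.
// Hypothetical simplification of the real resource quantity types.
type Quantities map[string]int64

// ContainerResources is a hypothetical stand-in for a container's
// requests and limits.
type ContainerResources struct {
	Requests Quantities
	Limits   Quantities
}

// ResizeLabels is one label combination for which the counter would be
// incremented once (the namespace label is omitted for brevity).
type ResizeLabels struct {
	Resource  string // "cpu" or "memory"
	Type      string // "requests" or "limits"
	Operation string // "increase", "decrease", "add", or "remove"
}

// operationFor classifies the change to a single resource value.
// Assumption: "add" means previously unset, "remove" means now unset.
// The second return value is false when nothing changed.
func operationFor(prev, next Quantities, resource string) (string, bool) {
	p, hadPrev := prev[resource]
	n, hasNext := next[resource]
	switch {
	case !hadPrev && !hasNext:
		return "", false
	case !hadPrev:
		return "add", true
	case !hasNext:
		return "remove", true
	case n > p:
		return "increase", true
	case n < p:
		return "decrease", true
	default:
		return "", false
	}
}

// resizeAttemptLabels returns one entry per changed resource/type pair,
// mirroring "we increment the counter multiple times, once for each".
func resizeAttemptLabels(prev, next ContainerResources) []ResizeLabels {
	var out []ResizeLabels
	for _, resource := range []string{"cpu", "memory"} {
		if op, changed := operationFor(prev.Requests, next.Requests, resource); changed {
			out = append(out, ResizeLabels{Resource: resource, Type: "requests", Operation: op})
		}
		if op, changed := operationFor(prev.Limits, next.Limits, resource); changed {
			out = append(out, ResizeLabels{Resource: resource, Type: "limits", Operation: op})
		}
	}
	return out
}

func main() {
	prev := ContainerResources{
		Requests: Quantities{"cpu": 500, "memory": 256},
		Limits:   Quantities{"cpu": 1000},
	}
	next := ContainerResources{
		Requests: Quantities{"cpu": 750, "memory": 256},
		Limits:   Quantities{"cpu": 1000, "memory": 512},
	}
	// Each printed line corresponds to one counter increment.
	for _, l := range resizeAttemptLabels(prev, next) {
		fmt.Printf("resource=%s type=%s operation=%s\n", l.Resource, l.Type, l.Operation)
	}
}
```

In this example a single container update that raises the CPU request and introduces a memory limit yields two increments, one per changed resource/type pair, which matches the per-label counting described for the metric.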
