Commit 67e3eaa

quota-monitoring:update KEP following john's comments
1 parent 8c45cad commit 67e3eaa

3 files changed

+18
-11
lines changed

keps/prod-readiness/sig-node/1029.yaml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -2,4 +2,4 @@ kep-number: 1029
22
alpha:
33
approver: "@deads2k"
44
beta:
5-
approver: "@deads2k"
5+
approver: "@johnbelamaric"

keps/sig-node/1029-ephemeral-storage-quotas/README.md

Lines changed: 15 additions & 8 deletions
@@ -759,8 +759,8 @@ filesystem walk for better performance and accuracy.
759759
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
760760

761761
Yes, but only for newly created pods.
762-
- Existed Pods: If the pod was created with enforcing quota, disable the feature gate
763-
will not change the running pod.
762+
- Existing Pods: If the pod was created with enforcing quota, the pod will not use the enforcing
763+
quota after the feature gate is disabled.
764764
- Newly Created Pods: After setting the feature gate to false, the newly created pod
765765
will not use the enforcing quota.
766766

@@ -798,9 +798,10 @@ If LocalStorageCapacityIsolationFSQuotaMonitoring is turned on but LocalStorageC
798798

799799
* **How can an operator determine if the feature is in use by workloads?**
800800

801-
- A cluster-admin can set kubelet on each node. If the feature gate is disabled, workloads on that node will not use it.
802-
For example, run `xfs_quota -x -c 'report -h' /dev/sdc` to check quota settings in the device.
803-
Check `spec.containers[].resources.limits.ephemeral-storage` of each container.
801+
- In the kubelet metrics, an operator can check the histogram metric `kubelet_volume_metric_collection_duration_seconds`
802+
with the label `metric_source="fsquota"`. If no sample has `metric_source="fsquota"`, the feature is most likely disabled.
803+
- However, there is currently no direct way to tell whether a particular workload uses this feature; see the
804+
methods below for how to check fsquota settings on a node.
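
The metric check above can be sketched as follows. This is only an illustration, assuming the kubelet `/metrics` output has already been fetched as Prometheus text; the helper name `fsquota_in_use` is hypothetical, and only the metric name and the `metric_source` label are taken from the KEP text.

```python
import re

# Metric name as given in the KEP text.
METRIC = "kubelet_volume_metric_collection_duration_seconds"

# Matches histogram sample lines of METRIC and captures the label set.
_SAMPLE = re.compile(
    r"^" + re.escape(METRIC) + r"(?:_bucket|_sum|_count)?\{([^}]*)\}",
    re.MULTILINE,
)


def fsquota_in_use(metrics_text: str) -> bool:
    """Return True if any sample of METRIC has metric_source="fsquota".

    Hypothetical helper: metrics_text is assumed to be the plain-text
    body of the kubelet /metrics endpoint.
    """
    for match in _SAMPLE.finditer(metrics_text):
        labels = match.group(1)
        if re.search(r'(?:^|,)\s*metric_source="fsquota"', labels):
            return True
    return False
```

A `False` result only suggests that quota-based monitoring is not active on that node; it does not say anything about an individual workload.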
804805

805806
* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
806807

@@ -818,7 +819,12 @@ the health of the service?**
818819
* **Are there any missing metrics that would be useful to have to improve observability of this feature? **
819820

820821
- Yes, there are no histogram metrics for each volume. The above metric was grouped by volume types because
821-
the cost for every volume is too expensive.
822+
the cost for every volume is too expensive. As a result, users cannot tell directly from the metrics
823+
whether a workload uses the feature. A cluster-admin can check the kubelet configuration on each node. If the
824+
feature gate is disabled, workloads on that node will not use it.
825+
For example, run `xfs_quota -x -c 'report -h' /dev/sdc` to check the quota settings on the device.
826+
Then check `spec.containers[].resources.limits.ephemeral-storage` of each container and compare the two.
827+
822828

823829
### Dependencies
824830
* **Does this feature depend on any specific services running in the cluster? **
@@ -872,8 +878,9 @@ details). For now, we leave it here.
872878

873879
###### What steps should be taken if SLOs are not being met to determine the problem?
874880

875-
- Restart kubelet and wait for 1 minute to make the SLOs clear.(The volume stats checking interval is determined by kubelet flag `volumeStatsAggPeriod`(default 1m).)
876-
881+
If the metrics show problems, check the kubelet log and the quota directory with the commands below.
882+
- Warning logs are emitted ([after the # is merged](https://github.com/kubernetes/kubernetes/pull/107490)) if volume calculation takes longer than 1 second.
883+
- If quota is enabled, you can find the volume information and the processing time with `time repquota -P /var/lib/kubelet -s -v`.
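
As a complement to the log check, the histogram itself can be used to estimate how many collections exceeded the 1 second mark. The sketch below is an assumption-laden illustration: it presumes the histogram exposes a bucket boundary `le="1"`, and the helper name `slow_collections` is hypothetical.

```python
import re

# Metric name as given in the KEP text.
METRIC = "kubelet_volume_metric_collection_duration_seconds"


def slow_collections(metrics_text, source="fsquota", threshold="1"):
    """Count collections slower than `threshold` seconds for one metric_source.

    Assumes (not confirmed by the KEP) that a bucket with le=threshold
    exists. Returns None if the bucket or the total count is missing.
    """
    within = total = None
    for line in metrics_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        m = re.match(r"(\w+)\{([^}]*)\}\s+(\S+)", line)
        if not m:
            continue
        name, raw_labels, value = m.groups()
        labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels))
        if labels.get("metric_source") != source:
            continue
        if name == METRIC + "_bucket" and labels.get("le") == threshold:
            within = float(value)  # cumulative count at or under threshold
        elif name == METRIC + "_count":
            total = float(value)  # all observations
    if within is None or total is None:
        return None
    return int(total - within)
```

A non-zero result is a hint to look at the kubelet log and at the `repquota` timing on that node.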
877884

878885
## Implementation History
879886

keps/sig-node/1029-ephemeral-storage-quotas/kep.yaml

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,8 @@ approvers:
1414
- "@derekwaynecarr"
1515
editor: TBD
1616
creation-date: 2018-09-06
17-
last-updated: 2022-03-01
18-
status: implemented
17+
last-updated: 2022-06-20
18+
status: implementable
1919
latest-milestone: "1.25"
2020
stage: "alpha"
2121
milestone:
