Commit 92cdcf6

Merge pull request kubernetes#2941 from verult/csi-fsgroup-beta-prr
KEP-2317: PRR questionnaire for DelegateFSGroupToCSI beta
2 parents 9959c04 + 42e06f1 commit 92cdcf6

3 files changed: +51 −35 lines
Lines changed: 2 additions & 0 deletions

@@ -1,3 +1,5 @@
 kep-number: 2317
 alpha:
   approver: "@deads2k"
+beta:
+  approver: "@deads2k"

keps/sig-storage/2317-fsgroup-on-mount/README.md

Lines changed: 44 additions & 32 deletions
@@ -175,7 +175,7 @@ you need any help or guidance.
 
 * **How can this feature be enabled / disabled in a live cluster?**
   - [x] Feature gate (also fill in values in `kep.yaml`)
-    - Feature gate name: MountWithFSGroup
+    - Feature gate name: DelegateFSGroupToCSIDriver
     - Components depending on the feature gate:
       - Kubelet
   - [ ] Other
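As a usage sketch (not part of the diff): the gate renamed above would be enabled through the standard kubelet `featureGates` mechanism. The field names below follow the stock `KubeletConfiguration` API; only the gate name comes from this KEP.

```yaml
# KubeletConfiguration fragment (sketch): enabling the renamed feature gate.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  DelegateFSGroupToCSIDriver: true
```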
@@ -238,26 +238,27 @@ _This section must be completed when targeting beta graduation to a release._
 
 * **What are the SLIs (Service Level Indicators) an operator can use to determine
 the health of the service?**
-  - [ ] Metrics
-    - Metric name:
-    - [Optional] Aggregation method:
-    - Components exposing the metric:
+  - [x] Metrics
+    - Metric name: csi_operations_seconds
+    - [Optional] Aggregation method: filter by `method_name=NodeStageVolume|NodePublishVolume`, `driver_name` (CSI driver name), `grpc_status_code`.
+    - Components exposing the metric: kubelet
   - [ ] Other (treat as last resort)
     - Details:
+
+  The `csi_operations_seconds` metric reports a latency histogram of kubelet-initiated CSI gRPC calls by gRPC status code. Filtering by `NodeStageVolume` and `NodePublishVolume` gives latency data for the gRPC calls that include FSGroup operations for drivers with the `VOLUME_MOUNT_GROUP` capability, but analyzing driver logs is necessary to further isolate a problem to this feature.
+
+  An SLI isn't necessary for the kubelet logic, since it just passes the FSGroup parameter to the CSI driver.
 
 * **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
-  At a high level, this usually will be in the form of "high percentile of SLI
-  per day <= X". It's impossible to provide comprehensive guidance, but at the very
-  high level (needs more precise definitions) those may be things like:
-  - per-day percentage of API calls finishing with 5XX errors <= 1%
-  - 99% percentile over day of absolute value from (job creation time minus expected
-    job creation time) for cron job <= 10%
-  - 99,9% of /health requests per day finish with 200 code
+
+  For a particular CSI driver, per-day percentage of gRPC calls with `method_name=NodeStageVolume|NodePublishVolume` returning error status codes (as defined by the CSI spec) <= 1%.
+
+  Latency SLO would be specific to each driver.
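As an illustrative sketch (not part of the diff), the per-day error-rate SLO above could be evaluated from `csi_operations_seconds` sample counts scraped from kubelet. The metric and label names come from the KEP text; the sample counts below are hypothetical.

```python
# Sketch: checking the per-day error-rate SLO from csi_operations_seconds
# counts. Label names (method_name, grpc_status_code) come from the KEP;
# the counts are made-up one-day totals for one CSI driver.

samples = [
    # (method_name, grpc_status_code, count_over_one_day)
    ("NodeStageVolume", "OK", 980),
    ("NodeStageVolume", "Internal", 5),
    ("NodePublishVolume", "OK", 1000),
    ("NodePublishVolume", "DeadlineExceeded", 15),
]

# Restrict to the two FSGroup-relevant CSI calls, as the SLI describes.
relevant = [s for s in samples if s[0] in ("NodeStageVolume", "NodePublishVolume")]
total = sum(count for _, _, count in relevant)
errors = sum(count for _, code, count in relevant if code != "OK")
error_pct = 100.0 * errors / total

# SLO from the KEP: per-day error percentage <= 1%.
print(f"error rate: {error_pct:.2f}%, SLO met: {error_pct <= 1.0}")
```

In a real deployment this filtering would be done in the monitoring system rather than by hand, but the arithmetic is the same.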
 
 * **Are there any missing metrics that would be useful to have to improve observability
 of this feature?**
-  Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
-  implementation difficulties, etc.).
+
+  https://github.com/kubernetes/kubernetes/issues/98667 as mentioned above; aiming to implement this as part of beta.
 
 ### Dependencies
 
@@ -282,37 +283,33 @@ _For GA, this section is required: approvers should be able to confirm the
 previous answers based on experience in the field._
 
 * **Will enabling / using this feature result in any new API calls?**
-  Describe them, providing:
-  - API call type (e.g. PATCH pods)
-  - estimated throughput
-  - originating component(s) (e.g. Kubelet, Feature-X-controller)
-  focusing mostly on:
-  - components listing and/or watching resources they didn't before
-  - API calls that may be triggered by changes of some Kubernetes resources
-    (e.g. update of object X triggers new updates of object Y)
-  - periodic API calls to reconcile state (e.g. periodic fetching state,
-    heartbeats, leader election, etc.)
+
+  No.
 
 * **Will enabling / using this feature result in introducing new API types?**
-  Describe them, providing:
-  - API type
-  - Supported number of objects per cluster
-  - Supported number of objects per namespace (for namespace-scoped objects)
+
+  No.
 
 * **Will enabling / using this feature result in any new calls to the cloud
 provider?**
 
+  No.
+
 * **Will enabling / using this feature result in increasing size or count of
 the existing API objects?**
-  Describe them, providing:
-  - API type(s):
-  - Estimated increase in size: (e.g., new annotation of size 32B)
-  - Estimated amount of new objects: (e.g., new Object X for every existing Pod)
+
+  No.
 
 * **Will enabling / using this feature result in increasing time taken by any
 operations covered by [existing SLIs/SLOs]?**
   Think about adding additional work or introducing new steps in between
   (e.g. need to do X to start a container), etc. Please describe the details.
+
+  Depending on the driver's implementation of applying FSGroup, latency for the following SLI may increase:
+
+  "Startup latency of schedulable stateful pods, excluding time to pull images, run init containers, provision volumes (in delayed binding mode) and unmount/detach volumes (from previous pod if needed), measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes"
+
+  Compared to the existing recursive `chown` and `chmod` strategy, this operation will likely improve pod startup latency in the most common case.
 
 * **Will enabling / using this feature result in non-negligible increase of
 resource usage (CPU, RAM, disk, IO, ...) in any components?**
@@ -321,6 +318,8 @@ resource usage (CPU, RAM, disk, IO, ...) in any components?**
 volume), significant amount of data sent and/or received over network, etc.
 Think through this both in small and large cases, again with respect to the
 [supported limits].
+
+  Not in Kubernetes components. CSI drivers may vary in their implementation and may increase resource usage.
 
 ### Troubleshooting
 
@@ -332,6 +331,8 @@ _This section must be completed when targeting beta graduation to a release._
 
 * **How does this feature react if the API server and/or etcd is unavailable?**
 
+  This feature is part of the volume mount path in kubelet and does not add extra communication with the API server, so it does not introduce new failure modes when the API server or etcd is unavailable.
+
 * **What are other known failure modes?**
   For each of them, fill in the following information by copying the below template:
   - [Failure mode brief description]
@@ -343,9 +344,20 @@ _This section must be completed when targeting beta graduation to a release._
     levels that could help debug the issue?
     Not required until feature graduated to beta.
   - Testing: Are there any tests for failure mode? If not, describe why.
+
+  In addition to existing Kubernetes volume and CSI failure modes:
+
+  - Driver fails to apply FSGroup (due to a driver error).
+    - Detection: the SLI above, in conjunction with the metric in https://github.com/kubernetes/kubernetes/issues/98667 to determine whether this feature is being used.
+    - Mitigations: Revert the CSI driver to a version without the issue, or avoid specifying an FSGroup in the pod's security context, if possible.
+    - Diagnostics: Depends on the driver. In general, look for FSGroup-related messages in `NodeStageVolume` and `NodePublishVolume` logs.
+    - Testing: Will add an e2e test with a test driver (csi-driver-host-path) simulating an FSGroup failure.
+
 
 * **What steps should be taken if SLOs are not being met to determine the problem?**
 
+  Inspect the CSI driver log for `NodeStageVolume` and/or `NodePublishVolume` errors.
+
 [supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
 [existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
 
keps/sig-storage/2317-fsgroup-on-mount/kep.yaml

Lines changed: 5 additions & 3 deletions
@@ -2,6 +2,7 @@ title: Provide fsgroup of pod to CSI driver on mount
 kep-number: 2317
 authors:
   - "@gnufied"
+  - "@verult"
 owning-sig: sig-storage
 participating-sigs:
 status: implementable
@@ -17,20 +18,21 @@ see-also:
 replaces:
 
 # The target maturity stage in the current dev cycle for this KEP.
-stage: alpha
+stage: beta
 
 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.22"
+latest-milestone: "v1.23"
 
 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
   alpha: "v1.22"
+  beta: "v1.23"
 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
 feature-gates:
-  - name: MountWithFSGroup
+  - name: DelegateFSGroupToCSIDriver
     components:
       - kubelet
 disable-supported: true
