
Commit 5ea3f1e

Merge pull request kubernetes#1830 from pohly/generic-inline-volumes
generic ephemeral volumes: PRR and beta review
2 parents aacf9db + e669a14 commit 5ea3f1e

3 files changed

+139
-98
lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
kep-number: 1698
2+
beta:
3+
approver: "@wojtek-t"

keps/sig-storage/1698-generic-ephemeral-volumes/README.md

Lines changed: 130 additions & 96 deletions
@@ -1,63 +1,3 @@
1-
<!--
2-
**Note:** When your KEP is complete, all of these comment blocks should be removed.
3-
4-
To get started with this template:
5-
6-
- [ ] **Pick a hosting SIG.**
7-
Make sure that the problem space is something the SIG is interested in taking
8-
up. KEPs should not be checked in without a sponsoring SIG.
9-
- [ ] **Create an issue in kubernetes/enhancements**
10-
When filing an enhancement tracking issue, please ensure to complete all
11-
fields in that template. One of the fields asks for a link to the KEP. You
12-
can leave that blank until this KEP is filed, and then go back to the
13-
enhancement and add the link.
14-
- [ ] **Make a copy of this template directory.**
15-
Copy this template into the owning SIG's directory and name it
16-
`NNNN-short-descriptive-title`, where `NNNN` is the issue number (with no
17-
leading-zero padding) assigned to your enhancement above.
18-
- [ ] **Fill out as much of the kep.yaml file as you can.**
19-
At minimum, you should fill in the "title", "authors", "owning-sig",
20-
"status", and date-related fields.
21-
- [ ] **Fill out this file as best you can.**
22-
At minimum, you should fill in the "Summary", and "Motivation" sections.
23-
These should be easy if you've preflighted the idea of the KEP with the
24-
appropriate SIG(s).
25-
- [ ] **Create a PR for this KEP.**
26-
Assign it to people in the SIG that are sponsoring this process.
27-
- [ ] **Merge early and iterate.**
28-
Avoid getting hung up on specific details and instead aim to get the goals of
29-
the KEP clarified and merged quickly. The best way to do this is to just
30-
start with the high-level sections and fill out details incrementally in
31-
subsequent PRs.
32-
33-
Just because a KEP is merged does not mean it is complete or approved. Any KEP
34-
marked as a `provisional` is a working document and subject to change. You can
35-
denote sections that are under active debate as follows:
36-
37-
```
38-
<<[UNRESOLVED optional short context or usernames ]>>
39-
Stuff that is being argued.
40-
<<[/UNRESOLVED]>>
41-
```
42-
43-
When editing KEPS, aim for tightly-scoped, single-topic PRs to keep discussions
44-
focused. If you disagree with what is already in a document, open a new PR
45-
with suggested changes.
46-
47-
One KEP corresponds to one "feature" or "enhancement", for its whole lifecycle.
48-
You do not need a new KEP to move from beta to GA, for example. If there are
49-
new details that belong in the KEP, edit the KEP. Once a feature has become
50-
"implemented", major changes should get new KEPs.
51-
52-
The canonical place for the latest set of instructions (and the likely source
53-
of this file) is [here](/keps/NNNN-kep-template/README.md).
54-
55-
**Note:** Any PRs to move a KEP to `implementable` or significant changes once
56-
it is marked `implementable` must be approved by each of the KEP approvers.
57-
If any of those approvers is no longer appropriate then changes to that list
58-
should be approved by the remaining approvers and/or the owning SIG (or
59-
SIG Architecture for cross cutting KEPs).
60-
-->
611
# KEP-1698: generic ephemeral inline volumes
622

633
<!-- toc -->
@@ -103,28 +43,14 @@ SIG Architecture for cross cutting KEPs).
10343

10444
## Release Signoff Checklist
10545

106-
<!--
107-
**ACTION REQUIRED:** In order to merge code into a release, there must be an
108-
issue in [kubernetes/enhancements] referencing this KEP and targeting a release
109-
milestone **before the [Enhancement Freeze](https://git.k8s.io/sig-release/releases)
110-
of the targeted release**.
111-
112-
For enhancements that make changes to code or processes/procedures in core
113-
Kubernetes i.e., [kubernetes/kubernetes], we require the following Release
114-
Signoff checklist to be completed.
115-
116-
Check these off as they are completed for the Release Team to track. These
117-
checklist items _must_ be updated for the enhancement to be released.
118-
-->
119-
12046
Items marked with (R) are required *prior to targeting to a milestone / release*.
12147

12248
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
123-
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
124-
- [ ] (R) Design details are appropriately documented
125-
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
49+
- [X] (R) KEP approvers have approved the KEP status as `implementable`
50+
- [X] (R) Design details are appropriately documented
51+
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
12652
- [X] (R) Graduation criteria is in place
127-
- [ ] (R) Production readiness review completed
53+
- [X] (R) Production readiness review completed
12854
- [ ] Production readiness review approved
12955
- [ ] "Implementation History" section is up-to-date for milestone
13056
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
@@ -306,11 +232,21 @@ directly. Cluster administrators must be made aware of this. If this
306232
does not fit their security model, they can disable the feature
307233
through the feature gate that will be added for the feature.
308234

309-
In addition, with a new
235+
In addition, with a new `ephemeral` value for
310236
[`FSType`](https://github.com/kubernetes/kubernetes/blob/1fb0dd4ec5134014e466509163152112626d52c3/pkg/apis/policy/types.go#L278-L309)
311237
it will be possible to limit the usage of this volume source via the
312238
[PodSecurityPolicy
313239
(PSP)](https://kubernetes.io/docs/concepts/policy/pod-security-policy/#volumes-and-file-systems).
240+
If a PSP exists, `FSType` either has to include `all` or `ephemeral`
241+
for this feature to be allowed. If no PSP exists, the feature is
242+
allowed.
243+
244+
Adding that new value is an API change for PSP because it changes
245+
validation. When the feature is disabled, validation must tolerate
246+
this new value in updates of existing PSP objects that already contain
247+
the value, but must not allow it when creating a new PSP or updating a
248+
PSP that does not already contain the value. When the feature is
249+
enabled, validation must allow this value on any create or update.
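
The validation rule described above can be sketched as follows. This is a minimal illustration, not the actual Kubernetes validation code: `validateFSTypes`, `contains`, and the plain string slices are hypothetical stand-ins, and `oldTypes` is assumed to be nil on create.

```go
package main

import (
	"errors"
	"fmt"
)

// fsTypeEphemeral is the new PSP FSType value described above.
const fsTypeEphemeral = "ephemeral"

// contains reports whether list includes value.
func contains(list []string, value string) bool {
	for _, v := range list {
		if v == value {
			return true
		}
	}
	return false
}

// validateFSTypes is a hypothetical helper, not the real apiserver code.
// oldTypes is nil on create and holds the stored volume list on update.
func validateFSTypes(newTypes, oldTypes []string, featureEnabled bool) error {
	if !contains(newTypes, fsTypeEphemeral) {
		return nil // nothing feature-specific to check
	}
	if featureEnabled {
		return nil // enabled: allowed on any create or update
	}
	if contains(oldTypes, fsTypeEphemeral) {
		return nil // disabled: tolerate a value that is already stored
	}
	return errors.New(`FSType "ephemeral" is not allowed while the feature is disabled`)
}

func main() {
	// Create with the feature disabled: rejected.
	fmt.Println(validateFSTypes([]string{"ephemeral"}, nil, false))
	// Update of an object that already contains the value: tolerated.
	fmt.Println(validateFSTypes([]string{"ephemeral"}, []string{"ephemeral"}, false))
	// Create with the feature enabled: allowed.
	fmt.Println(validateFSTypes([]string{"ephemeral"}, nil, true))
}
```

Checking the stored object before rejecting keeps existing PSP objects updatable after the feature gate is turned off again.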
314250

315251
The normal namespace quota for PVCs in a namespace still applies, so
316252
even if users are allowed to use this new mechanism, they cannot use
@@ -445,10 +381,12 @@ automatically enable late binding for PVCs which are owned by a pod.
445381
- Gather feedback from developers and surveys
446382
- Errors emitted as pod events
447383
- Decide whether `CSIVolumeSource` (in beta at the moment) should be
448-
merged with `EphemeralVolumeSource`
384+
merged with `EphemeralVolumeSource`: no, instead the goal is
385+
to [rename `CSIVolumeSource`](https://github.com/kubernetes/enhancements/issues/596#issuecomment-726185967)
449386
- Decide whether in-tree ephemeral volume sources, like EmptyDir (GA
450387
already), should also be added to EphemeralVolumeSource for the sake of API
451-
consistency
388+
consistency: [no](https://docs.google.com/document/d/1yAe3SPPosgC_QgmnY7oJTmZYWrqLrii1oA4de67DEcw/edit),
389+
this just causes API churn without tangible benefits
452390
- Tests are in Testgrid and linked in KEP
453391

454392
#### Beta -> GA Graduation
@@ -497,77 +435,173 @@ version will prevent pods from starting.
497435
Pods that got stuck will work again.
498436

499437
* **Are there any tests for feature enablement/disablement?**
500-
Yes, unit tests for the apiserver and kubelet.
501438

502-
### Rollout, Upgrade and Rollback Planning
439+
Yes, unit tests for the apiserver, kube-controller-manager and kubelet cover scenarios
440+
where the feature is disabled or enabled. Tests for transitions
441+
between these states will be added before beta.
503442

504-
Will be added before the transition to beta.
443+
### Rollout, Upgrade and Rollback Planning
505444

506445
* **How can a rollout fail? Can it impact already running workloads?**
507446

447+
A rollout could fail because the implementation turns out to be
448+
faulty. Such bugs may cause unexpected shutdowns of kube-scheduler,
449+
kube-apiserver, kube-controller-manager and kubelet. For the API
450+
server, broken support for the new volume type may also show up as 5xx
451+
error codes for any object that embeds a `VolumeSource` (Pod,
452+
StatefulSet, DaemonSet, etc.).
453+
454+
Already running workloads should not be affected unless they depend on
455+
these components at runtime and bugs cause unexpected shutdowns.
456+
508457
* **What specific metrics should inform a rollback?**
509458

459+
One indicator is unexpected restarts of the cluster control plane
460+
components. Another is an increase in the number of pods that fail to
461+
start. In both cases further analysis of logs and pod events is needed
462+
to determine whether errors are related to this feature.
463+
510464
* **Were upgrade and rollback tested? Was upgrade->downgrade->upgrade path tested?**
511465

466+
Not yet, but will be done manually before transition to beta.
467+
512468
* **Is the rollout accompanied by any deprecations and/or removals of features,
513469
APIs, fields of API types, flags, etc.?**
514470

515-
### Monitoring requirements
471+
No.
516472

517-
Will be added before the transition to beta.
473+
### Monitoring requirements
518474

519475
* **How can an operator determine if the feature is in use by workloads?**
520476

477+
There will be pods which have a non-nil
478+
`VolumeSource.Ephemeral.VolumeClaimTemplate`.
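
As a rough sketch of that check, assuming simplified stand-in types rather than the real `k8s.io/api/core/v1` structs (the type names `pod`, `volume`, `ephemeralSource`, and `claimTemplate` are hypothetical), an operator tool could flag pods like this:

```go
package main

import "fmt"

// Minimal stand-ins for the relevant API fields; the real types carry
// many more fields. All type names here are hypothetical.
type claimTemplate struct{}

type ephemeralSource struct {
	VolumeClaimTemplate *claimTemplate
}

type volume struct {
	Name      string
	Ephemeral *ephemeralSource
}

type pod struct {
	Name    string
	Volumes []volume
}

// usesGenericEphemeralVolume reports whether any volume of the pod has a
// non-nil Ephemeral.VolumeClaimTemplate.
func usesGenericEphemeralVolume(p pod) bool {
	for _, v := range p.Volumes {
		if v.Ephemeral != nil && v.Ephemeral.VolumeClaimTemplate != nil {
			return true
		}
	}
	return false
}

func main() {
	pods := []pod{
		{Name: "plain", Volumes: []volume{{Name: "cache"}}},
		{Name: "scratch-user", Volumes: []volume{{
			Name:      "scratch",
			Ephemeral: &ephemeralSource{VolumeClaimTemplate: &claimTemplate{}},
		}}},
	}
	for _, p := range pods {
		fmt.Printf("%s uses generic ephemeral volumes: %v\n", p.Name, usesGenericEphemeralVolume(p))
	}
}
```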
479+
480+
521481
* **What are the SLIs (Service Level Indicators) an operator can use to
522482
determine the health of the service?**
523483

484+
The service here is the Kubernetes control plane. Overall health and
485+
performance can be observed by measuring the pod creation rate for
486+
pods using generic ephemeral inline volumes. Such [a
487+
SLI](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/pod_startup_latency.md)
488+
is defined for pods without volumes and is work in progress for pods with
489+
volumes.
490+
491+
For kube-controller-manager, a metric that exposes the usual work
492+
queue metrics data (like queue length) will be made available.
493+
Furthermore, a count of PVC creation attempts will be added, labeled
494+
with the result (successful vs. error code). A non-zero count of attempts
495+
with "already exists" will indicate that there were conflicts with
496+
manually created PVCs.
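
The idea of a result-labeled attempt counter can be illustrated with a small standalone sketch. The metric names are not decided in this KEP, so the label values `success` and `already_exists` and the `attemptCounter` type are purely hypothetical, and a real controller would presumably use the Kubernetes metrics libraries instead of a hand-rolled map.

```go
package main

import (
	"fmt"
	"sync"
)

// attemptCounter counts PVC creation attempts per result label.
// It is a hypothetical stand-in for a labeled counter metric.
type attemptCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newAttemptCounter() *attemptCounter {
	return &attemptCounter{counts: make(map[string]int)}
}

// record increments the counter for one result label.
func (c *attemptCounter) record(result string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[result]++
}

// count returns the number of attempts recorded for a result label.
func (c *attemptCounter) count(result string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[result]
}

func main() {
	attempts := newAttemptCounter()
	attempts.record("success")
	attempts.record("success")
	attempts.record("already_exists")

	// A non-zero "already_exists" count hints at conflicts with manually created PVCs.
	if n := attempts.count("already_exists"); n > 0 {
		fmt.Printf("%d PVC creation attempt(s) hit an existing PVC\n", n)
	}
}
```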
497+
498+
TODO: list metrics names here and in kep.yaml
499+
524500
* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
525501

502+
The goal is to achieve the same pod creation rate for pods using
503+
generic ephemeral inline volumes as for pods that use PVCs which get
504+
created separately. To make this comparable, the storage class should
505+
use late binding.
506+
507+
This will need further discussion before going to GA.
508+
526509
* **Are there any missing metrics that would be useful to have to improve
527-
observability if this feature?**
510+
observability of this feature?**
528511

529-
### Dependencies
512+
No.
530513

531-
Will be added before the transition to beta.
514+
### Dependencies
532515

533516
* **Does this feature depend on any specific services running in the cluster?**
534517

535-
### Scalability
518+
A dynamic provisioner from some kind of storage system is needed:
536519

537-
Will be added before the transition to beta.
520+
* Volume provisioner
521+
* Usage description:
522+
* Impact of its outage on the feature: pods that use generic inline volumes
523+
provided by the storage system will not be able to start
524+
* Impact of its degraded performance or high-error rates on the
525+
feature: slower pod startup
526+
527+
### Scalability
538528

539529
* **Will enabling / using this feature result in any new API calls?**
540530

531+
Enabling will not change anything.
532+
533+
Using the feature in a pod will lead to one PVC creation per inline
534+
volume, followed by garbage collection of those PVCs when the pod
535+
terminates.
536+
541537
* **Will enabling / using this feature result in introducing new API types?**
542538

539+
No.
540+
543541
* **Will enabling / using this feature result in any new calls to cloud
544542
provider?**
545543

544+
Enabling the feature doesn't. Using it will cause new calls to cloud
545+
providers, but the amount is exactly the same as without this feature:
546+
for each per-pod volume, a PVC has to be created (either manually or
547+
using this feature) and a volume needs to be provisioned in a storage
548+
backend. When a pod terminates, that volume needs to be deleted again.
549+
546550
* **Will enabling / using this feature result in increasing size or count
547551
of the existing API objects?**
548552

553+
Enabling it will not change existing objects. Using it in a pod spec
554+
will increase the size by one `PersistentVolumeClaimTemplate` per
555+
inline volume and cause one PVC to be created for each inline volume.
556+
549557
* **Will enabling / using this feature result in increasing time taken by any
550558
operations covered by [existing SLIs/SLOs][]?**
551559

560+
There is a SLI for [scheduling of pods without
561+
volumes](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/pod_startup_latency.md)
562+
with a corresponding SLO. Those are not expected to be affected.
563+
564+
A SLI for scheduling of pods with volumes is work in progress. The SLO
565+
for it will depend on the specific storage driver.
566+
552567
* **Will enabling / using this feature result in non-negligible increase of
553568
resource usage (CPU, RAM, disk, IO, ...) in any components?**
554569

555-
### Troubleshooting
570+
Potentially in kube-scheduler and kube-controller-manager, but mostly only if
571+
the feature is actually used. Merely enabling it will cause the new controller
572+
in kube-controller-manager to check new pods for the new volume type, which
573+
should be fast. In kube-scheduler the feature adds an additional case to
574+
switch statements that check for persistent volume sources.
556575

557-
Will be added before the transition to beta.
576+
### Troubleshooting
558577

559578
* **How does this feature react if the API server and/or etcd is unavailable?**
560579

580+
Pods will not start and volumes for them will not get provisioned.
581+
561582
* **What are other known failure modes?**
562583

584+
As [explained
585+
above](#preventing-accidental-collision-with-existing-pvcs), the PVC
586+
that needs to be created for a pod may conflict with an already
587+
existing PVC that was created independently of the pod. In such a
588+
case, the pod will not be able to start until that independent PVC is
589+
deleted. This scenario will be exposed as events for the pod by
590+
kube-controller-manager.
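
The conflict case can be sketched as below. This assumes that the PVC for an inline volume is named `<pod name>-<volume name>` and that ownership is tracked through an owner UID; the `claim` type and the `ensureClaim` helper are hypothetical simplifications, not the controller's actual code.

```go
package main

import "fmt"

// claim is a simplified stand-in for a PVC.
type claim struct {
	Name     string
	OwnerUID string // empty for a PVC created independently of any pod
}

// claimName derives the PVC name for one inline volume (assumed naming scheme).
func claimName(podName, volumeName string) string {
	return podName + "-" + volumeName
}

// ensureClaim reports whether a PVC must be created for the volume. It
// returns an error when a PVC with that name exists but is not owned by
// the pod, which is the conflict described above.
func ensureClaim(existing map[string]claim, podName, podUID, volumeName string) (create bool, err error) {
	name := claimName(podName, volumeName)
	c, found := existing[name]
	if !found {
		return true, nil // no PVC yet: create it
	}
	if c.OwnerUID != podUID {
		return false, fmt.Errorf("PVC %q exists but is not owned by pod %q", name, podName)
	}
	return false, nil // already created for this pod
}

func main() {
	existing := map[string]claim{
		"web-scratch": {Name: "web-scratch"}, // created manually, no owner
	}
	if _, err := ensureClaim(existing, "web", "uid-1", "scratch"); err != nil {
		// In the controller this would surface as an event on the pod.
		fmt.Println("conflict:", err)
	}
}
```

Deleting the independent PVC resolves the conflict, because the next attempt then finds no PVC under that name.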
591+
592+
If the storage system fails to provision volumes, then this will be
593+
exposed as events for the PVC and (depending on the storage system)
594+
may also show up in metrics data.
595+
563596
* **What steps should be taken if SLOs are not being met to determine the problem?**
564597

565-
[supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
566-
[existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
598+
SLOs only exist for pods which don't use the new feature. If those are
599+
somehow affected, then error messages in the kube-scheduler and kube-controller-manager
600+
output may provide additional information.
567601

568602
## Implementation History
569603

570-
- Kubernetes 1.19: alpha (tentative)
604+
- Kubernetes 1.19: alpha
571605

572606
## Drawbacks
573607

keps/sig-storage/1698-generic-ephemeral-volumes/kep.yaml

Lines changed: 6 additions & 2 deletions
@@ -11,10 +11,14 @@ reviewers:
1111
- "@jsafrane"
1212
approvers:
1313
- "@saad-ali"
14-
stage: alpha
15-
latest-milestone: "v1.19"
14+
prr-approvers:
15+
- "@wojtek-t"
16+
stage: beta
17+
latest-milestone: "v1.21"
1618
milestone:
1719
alpha: "v1.19"
20+
beta: "v1.21"
21+
stable: "v1.23"
1822
feature-gates:
1923
- name: GenericEphemeralVolumes
2024
components:
