Commit 1f9a626

committed
Update KEP README
1 parent 1c07cd2 commit 1f9a626

1 file changed

+238
-36
lines changed
  • keps/sig-storage/2589-csi-migration-portworx

@@ -1,20 +1,68 @@
1-
# In-tree Storage Plugin to CSI Migration - Portworx Design Doc
1+
# KEP-2589: In-tree Storage Plugin to CSI Migration - Portworx Design Doc
22

3-
## Table of Contents
43

54
<!-- toc -->
5+
- [Release Signoff Checklist](#release-signoff-checklist)
66
- [Summary](#summary)
7-
- [New Feature Gates](#new-feature-gates)
8-
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
9-
- [Implementation History](#implementation-history)
10-
- [Design details](#design-details)
7+
- [Motivation](#motivation)
8+
- [Goals](#goals)
9+
- [Non-Goals](#non-goals)
10+
- [Proposal](#proposal)
11+
- [User Stories (Optional)](#user-stories-optional)
12+
- [Story 1](#story-1)
13+
- [Story 2](#story-2)
14+
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
15+
- [Risks and Mitigations](#risks-and-mitigations)
16+
- [Design Details](#design-details)
1117
- [Test Plan](#test-plan)
1218
- [Prerequisite testing updates](#prerequisite-testing-updates)
1319
- [Unit tests](#unit-tests)
1420
- [Integration tests](#integration-tests)
1521
- [e2e tests](#e2e-tests)
22+
- [Graduation Criteria](#graduation-criteria)
23+
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
24+
- [Version Skew Strategy](#version-skew-strategy)
25+
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
26+
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
27+
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
28+
- [Monitoring Requirements](#monitoring-requirements)
29+
- [Dependencies](#dependencies)
30+
- [Scalability](#scalability)
31+
- [Troubleshooting](#troubleshooting)
32+
- [Implementation History](#implementation-history)
33+
- [Drawbacks](#drawbacks)
34+
- [Alternatives](#alternatives)
35+
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
1636
<!-- /toc -->
1737

38+
## Release Signoff Checklist
39+
40+
41+
Items marked with (R) are required *prior to targeting to a milestone / release*.
42+
43+
- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
44+
- [x] (R) KEP approvers have approved the KEP status as `implementable`
45+
- [x] (R) Design details are appropriately documented
46+
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
47+
- [x] e2e Tests for all Beta API Operations (endpoints)
48+
- [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
49+
- [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
50+
- [x] (R) Graduation criteria is in place
51+
- [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
52+
- [x] (R) Production readiness review completed
53+
- [ ] (R) Production readiness review approved
54+
- [x] "Implementation History" section is up-to-date for milestone
55+
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
56+
- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
57+
58+
<!--
59+
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
60+
-->
61+
62+
[kubernetes.io]: https://kubernetes.io/
63+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
64+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
65+
[kubernetes/website]: https://git.k8s.io/website
1866

1967
## Summary
2068

@@ -24,50 +72,48 @@ This document present as a vendor specific KEP for the parent KEP
2472
This inherits all the contents from its parent KEP. It will introduce two new feature gates to be
2573
used as described in its parent KEP. For all other contents, please refer to the parent KEP.
2674

27-
### New Feature Gates
75+
## Motivation
76+
77+
Currently, Portworx volume provisioning happens through the Portworx in-tree driver. As part of the parent KEP [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration), the Portworx in-tree driver needs to be deprecated in favor of the Portworx CSI driver.
78+
79+
### Goals
80+
81+
- Migrate the Portworx in-tree plugin to CSI.
82+
83+
84+
### Non-Goals
85+
86+
- This doesn't target the core in-tree to CSI migration code in k/k.
87+
88+
## Proposal
89+
90+
The in-tree to CSI migration feature is already in place in k/k. We only need to enable the vendor-specific feature gates for it to work for each vendor.
91+
92+
93+
### Risks and Mitigations
94+
95+
- The Portworx CSI driver must already be deployed in the cluster before this feature is enabled.
96+
97+
## Design Details
98+
99+
The in-tree to CSI migration feature is already in place in k/k. We only need to enable the vendor-specific feature gates for it to work for each vendor. Below are the feature gates we need to enable:
28100

29101
- CSIMigrationPortworx
30102
- As described in [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration),
31103
when this feature flag and `CSIMigration` are both enabled, the in-tree volume
32-
plugin `kubernetes.io/portworx-volume` will be redirected to use the corresponding CSI driver. From a
33-
user perspective, nothing will be noticed.
104+
plugin `kubernetes.io/portworx-volume` will be redirected to use the corresponding CSI driver. From a user perspective, the change is transparent.
34105
- InTreePluginPortworxUnregister
35106
- This flag is technically not part of the CSI Migration design, but it is related and helps with
36107
CSI Migration. As the name suggests, when this flag is enabled, Kubernetes will not register
37108
`kubernetes.io/portworx-volume` as one of the in-tree storage plugin provisioners. This flag can also
38109
be used on its own, independently of the CSI Migration features.
39110
- However, when the `InTreePluginPortworxUnregister`, `CSIMigrationPortworx` and `CSIMigration` feature
40111
flags are all enabled at the same time, the kube-controller-manager will skip the feature flag checking
41-
on kubelet and treat Portworx CSI migration as already complete. And directly redirect traffic to CSI
42-
driver for all portworx related operations.
43-
44-
45-
## Production Readiness Review Questionnaire
46-
47-
Please refer to the [CSI Migration Production Readiness Review Questionnaire](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration#production-readiness-review-questionnaire).
48-
49-
## Implementation History
50-
51-
Major milestones in the life cycle of a KEP should be tracked in `Implementation History`.
52-
53-
- 2021-09-08 KEP created
54-
55-
Major milestones for Portworx in-tree plugin CSI migration:
56-
57-
- 1.23
58-
- Portworx CSI migration to Alpha
59-
- 1.25
60-
- Portworx CSI migration to Beta, off by default
61-
- 1.31
62-
- Portworx CSI migration to Beta, on by default
63-
- 1.33
64-
- Portworx CSI migration to Stable
65-
66-
## Design details
112+
on kubelet and treat Portworx CSI migration as already complete, redirecting traffic directly to the CSI driver for all Portworx-related operations.
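
The interaction of the three gates described above can be summarized in a small decision function. This is a simplified sketch of the behavior as this document describes it, not the actual logic in k/k. The gate names come from this KEP; the `FeatureGates` class, the `portworx_gate_effects` function and the keys of the returned dictionary are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureGates:
    """State of the three gates named in this KEP (illustrative container)."""
    csi_migration: bool = False
    csi_migration_portworx: bool = False
    in_tree_plugin_portworx_unregister: bool = False


def portworx_gate_effects(gates: FeatureGates) -> dict:
    """Restate the rules from the text for `kubernetes.io/portworx-volume`.

    Hypothetical helper: it models the description above, nothing more.
    """
    migrated = gates.csi_migration and gates.csi_migration_portworx
    return {
        # Operations are redirected to the CSI driver only when both
        # CSIMigration and CSIMigrationPortworx are enabled.
        "redirected_to_csi": migrated,
        # The in-tree plugin is not registered when the unregister gate
        # is on, independently of the migration gates.
        "in_tree_registered": not gates.in_tree_plugin_portworx_unregister,
        # With all three gates on, kube-controller-manager skips the
        # kubelet feature flag check and treats migration as complete.
        "skip_kubelet_check": migrated
        and gates.in_tree_plugin_portworx_unregister,
    }
```

For example, `portworx_gate_effects(FeatureGates(True, True, True))` reports redirection to CSI, no in-tree registration, and the kubelet check skipped, while enabling only `CSIMigration` leaves the in-tree plugin registered and in use.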
67113

68114
### Test Plan
69115

70-
I/we understand the owners of the involved components may require updates to
116+
[x] I/we understand the owners of the involved components may require updates to
71117
existing tests to make this code solid enough prior to committing the changes necessary
72118
to implement this enhancement.
73119

@@ -90,3 +136,159 @@ N/A
90136
##### e2e tests
91137

92138
- `sig-storage` `Driver: portworx-volume` To ensure the correctness of the implementation, I/we have manually run the e2e tests [located in the main k8s repository](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/storage/drivers/in_tree.go). Test results are attached to the pull requests.
139+
140+
141+
### Graduation Criteria
142+
143+
* Alpha in 1.23 provided all tests are passing.
144+
* All functionality is guarded by alpha `CSIMigrationPortworx` feature gate.
145+
* Beta in 1.31 with design validated by customer deployments
146+
(non-production)
147+
* Manual tests with in-tree Portworx volumes
148+
* GA in 1.33, with `CSIMigrationPortworx` feature gate graduating to GA.
149+
150+
151+
### Upgrade / Downgrade Strategy
152+
153+
N/A
154+
155+
### Version Skew Strategy
156+
157+
N/A
158+
159+
## Production Readiness Review Questionnaire
160+
161+
### Feature Enablement and Rollback
162+
163+
###### How can this feature be enabled / disabled in a live cluster?
164+
165+
- [x] Feature gate (also fill in values in `kep.yaml`)
166+
- Feature gate name: CSIMigrationPortworx
167+
- Components depending on the feature gate: kubelet, A/D controller
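
Kubernetes components accept feature gates through the `--feature-gates` flag as a comma-separated list of `Name=true|false` pairs. The sketch below renders such a flag value from a mapping; the helper name `render_feature_gates` is hypothetical, and which components receive the flag (the kubelet and the component hosting the A/D controller, per the list above) depends on how the cluster is deployed.

```python
def render_feature_gates(gates: dict) -> str:
    """Render a mapping of gate name -> bool as a `--feature-gates` flag.

    Hypothetical helper for illustration; gate names are sorted so the
    output is deterministic.
    """
    pairs = (
        f"{name}={'true' if enabled else 'false'}"
        for name, enabled in sorted(gates.items())
    )
    return "--feature-gates=" + ",".join(pairs)


# Gates named in this KEP; the values shown are an example configuration.
flag = render_feature_gates(
    {"CSIMigrationPortworx": True, "InTreePluginPortworxUnregister": False}
)
```

Disabling the feature again amounts to rendering the same flag with `CSIMigrationPortworx` set to `False` and restarting the affected components.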
168+
169+
###### Does enabling the feature change any default behavior?
170+
171+
It will switch the control plane volume operations from the in-tree driver to the CSI driver.
172+
173+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
174+
175+
Yes. Disabling the feature gate will revert to the existing behavior of using the in-tree driver.
176+
177+
###### What happens if we reenable the feature if it was previously rolled back?
178+
If reenabled, any subsequent CSI operations will go through the Portworx CSI driver.
179+
180+
###### Are there any tests for feature enablement/disablement?
181+
We will need to create unit tests that enable this feature.
182+
183+
184+
### Rollout, Upgrade and Rollback Planning
185+
186+
187+
###### How can a rollout or rollback fail? Can it impact already running workloads?
188+
No, a rollout should not impact running workloads, since the default behavior
189+
of using the in-tree driver remains unchanged.
190+
191+
192+
###### What specific metrics should inform a rollback?
193+
No known rollback criteria.
194+
195+
196+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
197+
198+
The upgrade->downgrade->upgrade path should work fine, as it is already handled in the parent KEP [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration).
199+
200+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
201+
202+
It deprecates the use of the in-tree Portworx driver.
203+
204+
### Monitoring Requirements
205+
206+
###### How can an operator determine if the feature is in use by workloads?
207+
Check the `migrated-plugins` annotation on the `CSINode` object, which lists the plugins for which the in-tree to CSI migration feature is turned on.
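
As a minimal sketch of that check, the function below inspects a `CSINode` object that has already been parsed into a dictionary (for example from `kubectl get csinode <node> -o json`). It assumes the full annotation key is `storage.alpha.kubernetes.io/migrated-plugins` and that its value is a comma-separated list of in-tree plugin names; the text above only names the annotation as `migrated-plugins`, so treat both as assumptions. The function name is hypothetical.

```python
# Assumed full annotation key; the KEP text only says `migrated-plugins`.
MIGRATED_PLUGINS_ANNOTATION = "storage.alpha.kubernetes.io/migrated-plugins"
PORTWORX_IN_TREE_PLUGIN = "kubernetes.io/portworx-volume"


def is_portworx_migrated(csinode: dict) -> bool:
    """Return True if the CSINode lists the Portworx in-tree plugin as migrated.

    `csinode` is the JSON representation of a CSINode object as a dict.
    """
    annotations = csinode.get("metadata", {}).get("annotations") or {}
    raw = annotations.get(MIGRATED_PLUGINS_ANNOTATION, "")
    # Assumed value format: comma-separated plugin names.
    plugins = {name.strip() for name in raw.split(",") if name.strip()}
    return PORTWORX_IN_TREE_PLUGIN in plugins


# Hand-written sample object, not output captured from a real cluster.
sample_csinode = {
    "metadata": {
        "name": "node-1",
        "annotations": {
            MIGRATED_PLUGINS_ANNOTATION: "kubernetes.io/portworx-volume",
        },
    }
}
```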
208+
209+
###### How can someone using this feature know that it is working for their instance?
210+
211+
- [x] Other (treat as last resort)
212+
- Details:
213+
The PV object will have a `migrated-to` annotation on it.
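
A comparable sketch for PVs: given PersistentVolume objects parsed into dictionaries (for example from `kubectl get pv -o json`), list those that carry the annotation. The full key `pv.kubernetes.io/migrated-to` and the sample driver name `pxd.portworx.com` are assumptions, since the text above only mentions `migrated-to`; the function names are hypothetical.

```python
from typing import Optional

# Assumed full annotation key; the KEP text only says `migrated-to`.
MIGRATED_TO_ANNOTATION = "pv.kubernetes.io/migrated-to"


def migrated_to_driver(pv: dict) -> Optional[str]:
    """Return the value of the migrated-to annotation, or None if absent."""
    annotations = pv.get("metadata", {}).get("annotations") or {}
    return annotations.get(MIGRATED_TO_ANNOTATION)


def migrated_pv_names(pvs: list) -> list:
    """Return the names of PVs that carry the migrated-to annotation."""
    return [
        pv["metadata"]["name"]
        for pv in pvs
        if migrated_to_driver(pv) is not None
    ]


# Hand-written sample objects; the driver name is an assumed example value.
sample_pvs = [
    {
        "metadata": {
            "name": "pv-a",
            "annotations": {MIGRATED_TO_ANNOTATION: "pxd.portworx.com"},
        }
    },
    {"metadata": {"name": "pv-b"}},
]
```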
214+
215+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
216+
- No increase in failure rates when mounting a volume created using the in-tree driver.
217+
218+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
219+
220+
- [ ] Other (treat as last resort)
221+
- Details:
222+
We can use the SLIs for parent KEP [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration), if any.
223+
224+
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
225+
226+
No additional metrics needed.
227+
228+
### Dependencies
229+
230+
###### Does this feature depend on any specific services running in the cluster?
231+
232+
It requires the Portworx CSI driver to be already deployed in the cluster.
233+
234+
### Scalability
235+
236+
###### Will enabling / using this feature result in any new API calls?
237+
238+
There will be no new API calls.
239+
240+
###### Will enabling / using this feature result in introducing new API types?
241+
242+
There are no new API types.
243+
244+
245+
###### Will enabling / using this feature result in any new calls to the cloud provider?
246+
247+
There should be no new calls to the cloud providers.
248+
249+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
250+
251+
There will be no increase in size or count of existing API objects.
252+
253+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
254+
255+
No
256+
257+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
258+
259+
No
260+
261+
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
262+
263+
No
264+
265+
### Troubleshooting
266+
267+
###### How does this feature react if the API server and/or etcd is unavailable?
268+
CSI operations such as mounting a volume will fail.
269+
270+
###### What are other known failure modes?
271+
272+
## Implementation History
273+
274+
- 2021-09-08 KEP created
275+
276+
Major milestones for Portworx in-tree plugin CSI migration:
277+
278+
- 1.23
279+
- Portworx CSI migration to Alpha
280+
- 1.25
281+
- Portworx CSI migration to Beta, off by default
282+
- 1.31
283+
- Portworx CSI migration to Beta, on by default
284+
- 1.33
285+
- Portworx CSI migration to Stable
286+
287+
## Drawbacks
288+
289+
N/A
290+
291+
## Alternatives
292+
293+
N/A
294+
