Commit 2e2f530

Merge pull request kubernetes#5163 from gohilankit/portworx-csi-migration-ga-1-33
KEP-2589: Update KEP README
2 parents 715bda6 + 0ed2442 commit 2e2f530

1 file changed

+239
-36
lines changed
  • keps/sig-storage/2589-csi-migration-portworx

@@ -1,20 +1,63 @@
1-
# In-tree Storage Plugin to CSI Migration - Portworx Design Doc
1+
# KEP-2589: In-tree Storage Plugin to CSI Migration - Portworx Design Doc
22

3-
## Table of Contents
43

54
<!-- toc -->
5+
- [Release Signoff Checklist](#release-signoff-checklist)
66
- [Summary](#summary)
7-
- [New Feature Gates](#new-feature-gates)
8-
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
9-
- [Implementation History](#implementation-history)
10-
- [Design details](#design-details)
7+
- [Motivation](#motivation)
8+
- [Goals](#goals)
9+
- [Non-Goals](#non-goals)
10+
- [Proposal](#proposal)
11+
- [Risks and Mitigations](#risks-and-mitigations)
12+
- [Design Details](#design-details)
1113
- [Test Plan](#test-plan)
1214
- [Prerequisite testing updates](#prerequisite-testing-updates)
1315
- [Unit tests](#unit-tests)
1416
- [Integration tests](#integration-tests)
1517
- [e2e tests](#e2e-tests)
18+
- [Graduation Criteria](#graduation-criteria)
19+
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
20+
- [Version Skew Strategy](#version-skew-strategy)
21+
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
22+
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
23+
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
24+
- [Monitoring Requirements](#monitoring-requirements)
25+
- [Dependencies](#dependencies)
26+
- [Scalability](#scalability)
27+
- [Troubleshooting](#troubleshooting)
28+
- [Implementation History](#implementation-history)
29+
- [Drawbacks](#drawbacks)
30+
- [Alternatives](#alternatives)
1631
<!-- /toc -->
1732

33+
## Release Signoff Checklist
34+
35+
36+
Items marked with (R) are required *prior to targeting to a milestone / release*.
37+
38+
- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
39+
- [x] (R) KEP approvers have approved the KEP status as `implementable`
40+
- [x] (R) Design details are appropriately documented
41+
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
42+
- [x] e2e Tests for all Beta API Operations (endpoints)
43+
- [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
44+
- [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
45+
- [x] (R) Graduation criteria is in place
46+
- [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
47+
- [x] (R) Production readiness review completed
48+
- [x] (R) Production readiness review approved
49+
- [x] "Implementation History" section is up-to-date for milestone
50+
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
51+
- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
52+
53+
<!--
54+
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
55+
-->
56+
57+
[kubernetes.io]: https://kubernetes.io/
58+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
59+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
60+
[kubernetes/website]: https://git.k8s.io/website
1861

1962
## Summary
2063

@@ -24,50 +67,48 @@ This document present as a vendor specific KEP for the parent KEP
2467
This inherits all the contents from its parent KEP. It will introduce two new feature gates to be
2568
used as described in its parent KEP. For all other contents, please refer to the parent KEP.
2669

27-
### New Feature Gates
70+
## Motivation
71+
72+
Currently, Portworx volume provisioning happens through the Portworx in-tree driver. As part of the parent KEP [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration), the Portworx in-tree driver logic needs to be "migrated" to use the Portworx CSI driver instead.
73+
74+
### Goals
75+
76+
- Migrate the Portworx in-tree plugin to CSI.
77+
78+
79+
### Non-Goals
80+
81+
- This doesn't target the core in-tree to CSI migration code in k/k.
82+
83+
## Proposal
84+
85+
The in-tree to CSI migration feature is already in place in k/k. We only need to enable the Portworx-specific feature gates for it to work with the Portworx driver.
86+
87+
88+
### Risks and Mitigations
89+
90+
- The Portworx CSI driver must already be deployed before this feature is enabled.
91+
92+
## Design Details
93+
94+
The in-tree to CSI migration feature is already in place in k/k. We only need to enable the vendor-specific feature gates for it to work for each vendor. The feature gates to enable are:
2895

2996
- CSIMigrationPortworx
3097
- As described in [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration),
3198
when this feature flag and the `CSIMigration` flag are enabled at the same time, the in-tree volume
32-
plugin `kubernetes.io/portworx-volume` will be redirected to use the corresponding CSI driver. From a
33-
user perspective, nothing will be noticed.
99+
plugin `kubernetes.io/portworx-volume` will be redirected to use the corresponding CSI driver. From a user's perspective, nothing changes.
34100
- InTreePluginPortworxUnregister
35101
- This flag technically is not part of CSI Migration design. But it happens to be related and helps with
36102
CSI Migration. The name speaks for itself: when this flag is enabled, Kubernetes will not register the
37103
`kubernetes.io/portworx-volume` as one of the in-tree storage plugin provisioners. This flag can also
38104
be used on its own, independently of the CSI Migration features.
39105
- However, when all `InTreePluginPortworxUnregister`, `CSIMigrationPortworx` and `CSIMigration` feature
40106
flags are enabled at the same time, the kube-controller-manager will skip the feature flag checking
41-
on kubelet and treat Portworx CSI migration as already complete. And directly redirect traffic to CSI
42-
driver for all portworx related operations.
43-
44-
45-
## Production Readiness Review Questionnaire
46-
47-
Please refer to the [CSI Migration Production Readiness Review Questionnaire](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration#production-readiness-review-questionnaire).
48-
49-
## Implementation History
50-
51-
Major milestones in the life cycle of a KEP should be tracked in `Implementation History`.
52-
53-
- 2021-09-08 KEP created
54-
55-
Major milestones for Portworx in-tree plugin CSI migration:
56-
57-
- 1.23
58-
- Portworx CSI migration to Alpha
59-
- 1.25
60-
- Portworx CSI migration to Beta, off by default
61-
- 1.31
62-
- Portworx CSI migration to Beta, on by default
63-
- 1.33
64-
- Portworx CSI migration to Stable
65-
66-
## Design details
107+
on kubelet and treat Portworx CSI migration as already complete, redirecting traffic directly to the CSI driver for all Portworx-related operations.
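
The gate interactions described in the list above can be summarised as a small truth function. The following Go sketch is an illustration only and is not taken from the kube-controller-manager or kubelet code; the `Gates` and `Outcome` types and the `Evaluate` function are hypothetical names introduced here, and the logic simply restates the prose of this section.

```go
package main

import "fmt"

// Gates mirrors the three feature gates named in this section.
// The struct itself is hypothetical; it is not a Kubernetes type.
type Gates struct {
	CSIMigration                   bool
	CSIMigrationPortworx           bool
	InTreePluginPortworxUnregister bool
}

// Outcome captures the behaviour this section attributes to a gate combination.
type Outcome struct {
	// InTreeRegistered: kubernetes.io/portworx-volume is registered as an in-tree plugin.
	InTreeRegistered bool
	// RedirectedToCSI: in-tree volume operations are redirected to the CSI driver.
	RedirectedToCSI bool
	// SkipKubeletCheck: kube-controller-manager treats migration as complete
	// without checking the feature flag state on the kubelet.
	SkipKubeletCheck bool
}

// Evaluate restates the rules from the prose; it is a sketch, not the real implementation.
func Evaluate(g Gates) Outcome {
	migrated := g.CSIMigration && g.CSIMigrationPortworx
	return Outcome{
		InTreeRegistered: !g.InTreePluginPortworxUnregister,
		RedirectedToCSI:  migrated,
		SkipKubeletCheck: migrated && g.InTreePluginPortworxUnregister,
	}
}

func main() {
	all := Gates{CSIMigration: true, CSIMigrationPortworx: true, InTreePluginPortworxUnregister: true}
	fmt.Printf("%+v\n", Evaluate(all))
}
```

Note that `InTreePluginPortworxUnregister` affects registration on its own, which matches the statement that the flag can be used independently of the migration gates.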
67108

68109
### Test Plan
69110

70-
I/we understand the owners of the involved components may require updates to
111+
[x] I/we understand the owners of the involved components may require updates to
71112
existing tests to make this code solid enough prior to committing the changes necessary
72113
to implement this enhancement.
73114

@@ -90,3 +131,165 @@ N/A
90131
##### e2e tests
91132

92133
- `sig-storage` `Driver: portworx-volume` To ensure implementation correctness, I/we have manually run the e2e tests [located in the main k8s repository](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/storage/drivers/in_tree.go). Test results are attached to the pull requests.
134+
135+
136+
### Graduation Criteria
137+
138+
* Alpha in 1.23 provided all tests are passing.
139+
* All functionality is guarded by alpha `CSIMigrationPortworx` feature gate.
140+
* Portworx CSI migration to Beta, off by default in 1.25. e2e test results are provided in the PR.
141+
* Beta in 1.31 with design validated by customer deployments
142+
(non-production)
143+
* Manual testing with in-tree Portworx volumes should be passing.
144+
* GA in 1.33, with `CSIMigrationPortworx` feature gate graduating to GA.
145+
146+
147+
### Upgrade / Downgrade Strategy
148+
149+
When the `CSIMigrationPortworx` feature gate is enabled and customers are not using the Portworx security feature, upgrade and downgrade work without any changes to cluster objects or configurations.
150+
On downgrade, the cluster reverts to the existing behavior of using the in-tree driver.
151+
152+
With the Portworx security feature enabled, customers must add annotations to in-tree PVs that name the CSI secret and its namespace. The kubelet or CSI sidecar containers use these annotations (via `csi-translation-lib`) to pass the secret contents to the Portworx CSI driver for operations on in-tree PVs. The required annotations will be documented in the Portworx documentation.
153+
The downgrade will work without any changes.
154+
155+
### Version Skew Strategy
156+
157+
N/A
158+
159+
## Production Readiness Review Questionnaire
160+
Please refer to the [CSI Migration Production Readiness Review Questionnaire](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration#production-readiness-review-questionnaire).
161+
162+
### Feature Enablement and Rollback
163+
164+
###### How can this feature be enabled / disabled in a live cluster?
165+
166+
- [x] Feature gate (also fill in values in `kep.yaml`)
167+
- Feature gate name: CSIMigrationPortworx
168+
- Components depending on the feature gate: kubelet, A/D controller
169+
170+
###### Does enabling the feature change any default behavior?
171+
172+
It switches control plane volume operations from the in-tree driver to the CSI driver.
173+
174+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
175+
176+
Yes. Disabling the feature gate reverts to the existing behavior of using the in-tree driver.
177+
178+
###### What happens if we reenable the feature if it was previously rolled back?
179+
If reenabled, any subsequent CSI operations will go through the Portworx CSI driver.
180+
181+
###### Are there any tests for feature enablement/disablement?
182+
We will need to create unit tests that enable this feature.
183+
184+
185+
### Rollout, Upgrade and Rollback Planning
186+
187+
188+
###### How can a rollout or rollback fail? Can it impact already running workloads?
189+
No, a rollout should not impact running workloads, since the default behavior
190+
remains the same: using the in-tree driver.
191+
192+
193+
###### What specific metrics should inform a rollback?
194+
No known rollback criteria.
195+
196+
197+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
198+
199+
The upgrade->downgrade->upgrade path should work, as it is already handled in the parent KEP [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration).
200+
201+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
202+
203+
It deprecates the use of the in-tree Portworx driver.
204+
205+
### Monitoring Requirements
206+
207+
###### How can an operator determine if the feature is in use by workloads?
208+
Check the `migrated-plugins` annotation on the `CSINode` object, which lists the plugins for which the in-tree to CSI migration feature is turned on.
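
As a hedged illustration of this check, the Go sketch below parses a `CSINode` annotation map and reports whether the Portworx in-tree plugin is listed. It assumes the full annotation key is `storage.alpha.kubernetes.io/migrated-plugins` and that its value is a comma-separated list of in-tree plugin names; both are assumptions not stated in this KEP. The function name `PluginMigratedOnNode` is hypothetical, and the annotations are passed in as a plain map rather than fetched through a Kubernetes client.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed full key of the CSINode annotation referred to as `migrated-plugins`.
const migratedPluginsKey = "storage.alpha.kubernetes.io/migrated-plugins"

// In-tree plugin name used throughout this KEP.
const portworxInTreePlugin = "kubernetes.io/portworx-volume"

// PluginMigratedOnNode reports whether plugin appears in the comma-separated
// migrated-plugins annotation of a CSINode object (hypothetical helper).
func PluginMigratedOnNode(csiNodeAnnotations map[string]string, plugin string) bool {
	if plugin == "" {
		return false
	}
	for _, p := range strings.Split(csiNodeAnnotations[migratedPluginsKey], ",") {
		if strings.TrimSpace(p) == plugin {
			return true
		}
	}
	return false
}

func main() {
	// Sample annotations; in a real cluster these would come from the CSINode object.
	annotations := map[string]string{
		migratedPluginsKey: "kubernetes.io/portworx-volume",
	}
	fmt.Println(PluginMigratedOnNode(annotations, portworxInTreePlugin))
}
```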
209+
210+
###### How can someone using this feature know that it is working for their instance?
211+
212+
- [x] Other (treat as last resort)
213+
- Details:
214+
The PV object will have a `migrated-to` annotation on it.
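
A minimal sketch of that check follows, under two assumptions that this KEP does not spell out: that the full annotation key is `pv.kubernetes.io/migrated-to`, and that the Portworx CSI driver name is `pxd.portworx.com`. The helper name `MigratedToPortworxCSI` is hypothetical, and the PV annotations are supplied as a plain map instead of being read from the API server.

```go
package main

import "fmt"

// Assumed full key of the PV annotation referred to as `migrated-to`.
const migratedToKey = "pv.kubernetes.io/migrated-to"

// Assumed Portworx CSI driver name; verify against the deployed driver.
const portworxCSIDriver = "pxd.portworx.com"

// MigratedToPortworxCSI reports whether a PV's annotations mark it as
// migrated to the Portworx CSI driver (hypothetical helper).
func MigratedToPortworxCSI(pvAnnotations map[string]string) bool {
	return pvAnnotations[migratedToKey] == portworxCSIDriver
}

func main() {
	// Sample annotations; in a real cluster these would come from the PV object.
	pv := map[string]string{migratedToKey: portworxCSIDriver}
	fmt.Println(MigratedToPortworxCSI(pv))
}
```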
215+
216+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
217+
- No increased failure rates during mounting a volume created using in-tree driver.
218+
219+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
220+
221+
- [ ] Other (treat as last resort)
222+
- Details:
223+
We can use the SLIs for parent KEP [CSI Migration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/625-csi-migration), if any.
224+
225+
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
226+
227+
No additional metrics needed.
228+
229+
### Dependencies
230+
231+
###### Does this feature depend on any specific services running in the cluster?
232+
233+
The Portworx CSI driver must already be deployed in the cluster.
234+
235+
### Scalability
236+
237+
###### Will enabling / using this feature result in any new API calls?
238+
239+
There will be no new API calls.
240+
241+
###### Will enabling / using this feature result in introducing new API types?
242+
243+
There are no new API types.
244+
245+
246+
###### Will enabling / using this feature result in any new calls to the cloud provider?
247+
248+
There should be no new calls to the cloud providers.
249+
250+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
251+
252+
There will be no increase in size or count of existing API objects.
253+
254+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
255+
256+
No
257+
258+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
259+
260+
No
261+
262+
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
263+
264+
No
265+
266+
### Troubleshooting
267+
268+
###### How does this feature react if the API server and/or etcd is unavailable?
269+
CSI operations such as mounting a volume will fail.
270+
271+
###### What are other known failure modes?
272+
273+
## Implementation History
274+
275+
- 2021-09-08 KEP created
276+
277+
Major milestones for Portworx in-tree plugin CSI migration:
278+
279+
- 1.23
280+
- Portworx CSI migration to Alpha
281+
- 1.25
282+
- Portworx CSI migration to Beta, off by default
283+
- 1.31
284+
- Portworx CSI migration to Beta, on by default
285+
- 1.33
286+
- Portworx CSI migration to Stable
287+
288+
## Drawbacks
289+
290+
N/A
291+
292+
## Alternatives
293+
294+
N/A
295+
