Skip to content

Commit ec8a0d2

Browse files
authored
Merge pull request kubernetes#2738 from bswartz/populators-api-redesign
Redesign volume populators API
2 parents 0c8d92e + fa233c7 commit ec8a0d2

File tree

2 files changed

+122
-76
lines changed

2 files changed

+122
-76
lines changed

keps/sig-storage/1495-volume-populators/README.md

Lines changed: 118 additions & 72 deletions
Original file line numberDiff line numberDiff line change
@@ -80,19 +80,38 @@ problem is that current API validation logic uses a white list of object types,
8080
feature gates for each object type. This approach won't scale to generic populators, which
8181
will by their nature be too numerous and varied.
8282

83-
This proposal recommends that we relax validation (in core Kubernetes) on the `DataSource`
84-
field to allow arbitrary object types to be data sources, and rely on new controllers to
85-
perform the validation on those data sources and provide feedback to users. We will introduce
86-
a new CRD to register valid datasource kinds, and individual volume populators will
87-
create a CR for each kind of data source that it understands.
83+
For historical reasons (not intentional), the validation of the `DataSource` field caused
84+
unsupported objects to be ignored, rather than rejected, and for the request to yield an
85+
empty volume, as if no data source had been requested. It's not possible to update this
86+
behavior in a backwards-compatible way, which forces us to use a different mechanism for
87+
populators.
88+
89+
This proposal recommends that we replace the `DataSource` field with a new `DataSourceRef`
90+
field with expanded semantics. The existing `DataSource` field will be deprecated and
91+
maintained only for backwards compatibility. The new `DataSourceRef` field will allow
92+
objects other than existing data sources (PVC and VolumeSnapshots), and crucially will
93+
never ignore inputs.
94+
95+
Because volume populators will be implemented as new out-of-tree controllers, the in-tree
96+
validation of data sources will be limited to rejecting definitely-invalid values. All
97+
potentially-valid data sources will result in successful PVC creation, followed by either
98+
the PVC binding to a volume with the desired contents, or events explaining why that hasn't
99+
happened yet.
100+
101+
We will introduce a new CRD to register valid datasource kinds, and individual volume
102+
populators will create a CR for each kind of data source that it understands. A new
103+
`volume-data-source-validator` controller will generate events on PVCs with data sources
104+
that don't match any registered volume populator.
88105

89106
## Motivation
90107

91108
### Goals
92109

110+
- Don't break any existing behavior, even obviously unintentional behavior that currently
111+
works.
93112
- Enable users to create pre-populated volumes in a manner consistent with current practice
94113
- Enable developers to innovate with new an interesting data sources
95-
- Avoid changing existing workflows and breaking existing tools with new behavior
114+
- Minimize change to existing workflows and breaking existing tools with new behavior
96115
- Ensure that users continue to get accurate error messages and feedback about volume
97116
creation progress, whether using existing methods of volume creation or specifying a
98117
data source associated with a volume populator.
@@ -110,11 +129,32 @@ create a CR for each kind of data source that it understands.
110129

111130
## Proposal
112131

113-
Validation for the `DataSource` field will be relaxed so that any API object (except
114-
core API objects other than PVCs) may be specified. For objects that are not valid data
115-
source, rather than relying the the API server to reject them or ignore them (the prior
116-
behavior was to ignore them by clearing the field) allow them and provide feedback to
117-
API users with events.
132+
Introduce a new field on PVCs called `DataSourceRef`. This new field operates like
133+
`DataSource` except that it never ignores input -- contents are always accepted or the
134+
entire PVC is rejected if the contents are deemed invalid.
135+
136+
Deprecate the `DataSource` field, but continue to support it in a backwards-compatible
137+
way. In particular:
138+
139+
1. If `DataSource` and `DataSourceRef` are both unset; accept
140+
1. If `DataSource` is set valid (PVC or VolumeSnapshot) AND `DataSourceRef` is not set;
141+
set `DataSourceRef` from `DataSource`
142+
1. If `DataSource` is set valid AND `DataSourceRef` is set the same; accept
143+
1. If `DataSource` is set valid AND `DataSourceRef` is set differently; reject
144+
1. If `DataSource` is set invalid (anything but PVC or VolumeSnapshot) and `DataSourceRef`
145+
is not set; wipe `DataSource` (as today)
146+
1. If `DataSource` is set invalid and `DataSourceRef` is set; reject
147+
1. If `DataSource` is not set and `DataSourceRef` is set invalid (any core object other
148+
than PVC); reject
149+
1. If `DataSource` is not set and `DataSourceRef` is set valid (PVC or any non-core
150+
object); set `DataSource` from `DataSourceRef` and accept
151+
152+
Note that in the last case, we allow setting the CRD data sources, which enables
153+
volume populators to work. This is the only way that the `DataSource` field could get
154+
filled in with something not currently valid, like a CRD. Existing data
155+
sources continue to work, either through the old interface or the new interface. New
156+
data sources only work through the new interface, and additional validation is
157+
performed after the PVC is created.
118158

119159
A new controller will be introduced called `data-source-validator` which will watch PVCs
120160
objects and generate events on objects with data sources that are not considered valid. To
@@ -137,11 +177,10 @@ sourceKind:
137177
```
138178

139179
In this model there will be 4 sources of feedback to the end user:
140-
1. The API server will reject core objects other than PVCs with it current behavior, which
141-
is to accept the PVC but to blank out the data source. This is backwards compatible, and
180+
1. The API server will reject core objects other than PVCs. This is backwards compatible, and
142181
appropriate because we don't expect any existing core object (other than a PVC) to be a
143182
valid data source.
144-
1. For data sources that refer to a unknown group/kind, the data-source-validator will emit
183+
1. For data sources that refer to a unknown group/kind, the `volume-data-source-validator` will emit
145184
an event on the PVC informing the user that the PVC is currently not being acted on, so
146185
the user can fix any errors, and so the user doesn't wait forever wondering why the PVC
147186
is not binding. Alternatively, the event might remind the user that they forgot to
@@ -205,14 +244,19 @@ restoring from various kinds of things.
205244

206245
### Implementation Details/Notes/Constraints
207246

208-
As noted above, the proposal is extremely simple -- just remove the validation on
209-
the `DataSource` field. This raises the question of WHAT will happen when users
247+
As noted above, the proposal recommends a modest change to the PVC API which
248+
creates a new more flexible field to specify data sources without breaking any
249+
existing behaviors. Mostly the new `DataSourceRef` will never ignore user input,
250+
so that arbitrary CRDs can be used as data sources.
251+
This raises the question of WHAT will happen when users
210252
put new things in that field, and HOW populators will actually work with so small
211253
a change.
212254

213255
It's first important to note that only consumers of the `DataSource` field today
214256
are the various dynamic provisioners, most notably the external-provisioner CSI
215-
sidecar. If the external-provisioner sidecar sees a data source it doesn't
257+
sidecar. Due to the API change, these consumers will need to switch to consume the
258+
`DataSourceRef` field, but will otherwise continue to function mostly the same.
259+
Currently, if the external-provisioner sidecar sees a data source it doesn't
216260
understand, it simply ignores the request, which is both important for forward
217261
compatibility, and also perfect for the purposes of a data populator. This allows
218262
developers to add new types of data sources that the dynamic provisioners will
@@ -236,21 +280,18 @@ common controller logic should be reusable.
236280

237281
### Risks and Mitigations
238282

239-
Clearly there is concern that bad things might happen if we don't restrict
240-
the contents of the `DataSource` field, otherwise the validation wouldn't
241-
have been added. The main risk that I'm aware of is that badly-coded dynamic
242-
provisioners might crash if they see something they don't understand.
243-
Fortunately, the external-provisioner sidecar correctly handles this case,
244-
and so would any other dynamic provisioner designed with forward compatibility
245-
in mind.
246-
247-
Removing validation of the field relinquishes control over what kind of
248-
data sources are okay, and gives developers the freedom to decide. The biggest
249-
problem this leads to is that users might attempt to use a data source that's
250-
not supported (on a particular cluster), and they won't get any feedback
251-
telling them that their request will never succeed. This is not unlike a
252-
situation where a storage class refers to a provisioner that doesn't exist,
253-
but it's something that will need to be solved eventually.
283+
The replacement of `DataSource` with `DataSourceRef` greatly reduces risk
284+
relative to earlier designs. It leaves existing behavior the same and
285+
requires all clients to understand and use the new fields to enable the
286+
new functionality. The main risk is that if we eventually change our minds
287+
about this functionality, we've introduced a new field that we might
288+
need to carry into the future.
289+
290+
We are not concerned about backwards compatibility problems, as any existing
291+
request should continue to behave the same, and existing controllers will
292+
continue to work the same. Nevertheless we need to be careful about
293+
implementing the admission control logic that ensures backwards compatibility
294+
and we need good tests to cover all of the possible cases.
254295

255296
Security issues are hard to measure, because any security issues would be the
256297
result of badly designed data populators that failed to put appropriate
@@ -273,18 +314,27 @@ the uses and implications of any populators they chose to install.
273314

274315
### Test Plan
275316

276-
The test for this feature gate is to create a PVC with a data source
277-
that's not a PVC or VolumeSnapshot, and verify that the data source reference
278-
becomes part of the PVC API object. Any very simple CRD would be okay
279-
for this purpose. We would expect such a PVC to be ignored by existing
280-
dynamic provisioners.
281-
282-
To test the data-source-validator, we need to check the following cases:
283-
- Creation of a PVC with no datasource causes no events.
284-
- Creation of a PVC with a VolumeSnapshot or PVC datasource causes no events.
285-
- Creation of a PVC with a CRD datasource that's not registered by any
317+
To test the API changes to PVC, we need test cases for each of the data source
318+
situations outlined above, focusing especially on backwards compatibility.
319+
- Create PVC with `DataSource` and `DataSourceRef` both unset; except success and both unset
320+
- Create PVC with `DataSource` is set to PVC and `DataSourceRef` not set; expect success and `DataSourceRef` contains PVC
321+
- Create PVC with `DataSource` is set to VolumeSnapshot and `DataSourceRef` not set; expect success and `DataSourceRef` contains VolumeSnapshot
322+
- Create PVC with `DataSource` is set valid AND `DataSourceRef` is set the same; expect success
323+
- Create PVC with `DataSource` is set valid AND `DataSourceRef` is set differently; expect error
324+
- Create PVC with `DataSource` is set to Pod and `DataSourceRef` is not set; expect success and both unset
325+
- Create PVC with `DataSource` is set to CRD and `DataSourceRef` is not set; expect success and both unset
326+
- Create PVC with `DataSource` is set invalid and `DataSourceRef` valid; expect error
327+
- Create PVC with `DataSource` is not set and `DataSourceRef` is set to Pod, expect error
328+
- Create PVC with `DataSource` is not set and `DataSourceRef` is set to PVC; expect success and `DataSource` contains PVC
329+
- Create PVC with `DataSource` is not set and `DataSourceRef` is set to VolumeSnapshot; expect success and `DataSource` contains VolumeSnapshot
330+
- Create PVC with `DataSource` is not set and `DataSourceRef` is set to CRD; expect success and `DataSource` contains CRD
331+
332+
To test the `volume-data-source-validator`, we need to check the following cases:
333+
- Creation of a PVC with no `DataSourceRef` causes no events.
334+
- Creation of a PVC with a VolumeSnapshot or PVC `DataSourceRef` causes no events.
335+
- Creation of a PVC with a CRD `DataSourceRef` that's not registered by any
286336
volume populator causes UnrecognizedDataSourceKind events.
287-
- Creation of a PVC with a CRD datasource that's registered by a
337+
- Creation of a PVC with a CRD `DataSourceRef` that's registered by a
288338
volume populator causes no events.
289339

290340
### Graduation Criteria
@@ -300,8 +350,8 @@ To test the data-source-validator, we need to check the following cases:
300350
any other required functionality for data population is working.
301351
- We will need to see several implementations of working data populators that
302352
solve real world problems implemented in the community.
303-
- Data-source-validator and associated VolumePopulator CRD should be released
304-
with versions available.
353+
- `volume-data-source-validator` and associated VolumePopulator CRD should be released
354+
with beta versions available.
305355

306356
#### Beta -> GA Graduation
307357

@@ -311,16 +361,14 @@ To test the data-source-validator, we need to check the following cases:
311361

312362
### Upgrade / Downgrade Strategy
313363

314-
Data sources are only considered at provisioning time -- once the PVC becomes
315-
bound, the `DataSource` field becomes merely a historical note.
316-
317-
On upgrade, there are no potential problems because this change merely
318-
relaxes an existing limitation.
364+
On upgrade, a new field is added, the contents of which can be reliably
365+
determined for existing objects. New objects could put new contents into
366+
that field, but the admission controllers will keep the old field in sync
367+
with it for backwards compatibility.
319368

320-
On downgrade, there is a potential for unbound (not yet provisioned) PVCs to
321-
have data sources that never would have been allowed on the lower version. In
322-
this case we might want to revalidate the field and possibly wipe it out on
323-
downgrade.
369+
On downgrade, simply dropping the new field would be safe because any
370+
clients accessing the new alpha field would have caused the
371+
backwards-compatible values to be set in the existing field.
324372

325373
### Version Skew Strategy
326374

@@ -345,28 +393,25 @@ _This section must be completed when targeting alpha to a release._
345393
of a node? no
346394

347395
* **Does enabling the feature change any default behavior?**
348-
Before this feature, kube API server would silently drop any PVC data source
349-
that it didn't recognize, as if the client never populated the field. After
350-
enabling this feature, the API server allows any object to be specified.
351-
The data-source-validator controller will emit events for PVCs if the data
352-
source is not recognized as one which is be handled by another controller.
353-
396+
A new field is introduced on PVCs, but existing clients can safely ignore it.
397+
New clients that use the new field can cause PVCs to be created which can be
398+
used by volume populators. Clients using the new field can cause the existing
399+
`DataSource` field to contain values not otherwise possible, but clients
400+
already cope with this by ignore unknown values, which is exactly the intended
401+
behvior.
402+
354403
* **Can the feature be disabled once it has been enabled (i.e. can we roll back
355404
the enablement)?**
356-
Turning off the feature gate should causes the API server to go back to dropping
357-
data sources for new PVCs. An admin would also want to remove any data populator
358-
controllers, and the data-source-validator controller, and the VolumePopulator
359-
CRD. Lastly, and unbound PVCs with data sources should be deleted, as they
360-
will never bind.
405+
Yes, dropping or ignoring the new field returns the system to it's previous
406+
state. Worst case, some PVCs which were trying to use the new field might need
407+
to be deleted because they will never have anything happen to them after the
408+
feature is disabled.
361409

362-
Alternatively, just disable the feature gate and remove the populator controllers,
363-
but leave the data-source-validator controller and the VolumePopulator CRD,
364-
and it will generate events on any PVCs which need to be deleted. Then it
365-
could be left up to end users to delete their own PVCs which will never bind.
366-
367410
* **What happens if we reenable the feature if it was previously rolled back?**
368-
PVC which previously had data sources that were invalided by the rollback
369-
would be empty volumes. Users would need to delete and re-create those PVCs.
411+
As long as the alpha field was not dropped, things will go back to how they
412+
were. If the alpha field is dropped, and then re-added, the same PVCs which
413+
were defunct after the disablement will need to be deleted, but nothing else
414+
bad will happen.
370415

371416
* **Are there any tests for feature enablement/disablement?**
372417
Not at this time.
@@ -505,6 +550,7 @@ resource usage (CPU, RAM, disk, IO, ...) in any components?**
505550
- KEP updated to new format September 2020
506551
- Webhook replaced with controller in December 2020
507552
- KEP updated Feb 2021 for v1.21
553+
- Resign with new `DataSourceRef` field in May 2021 for v1.22, still alpha
508554

509555
## Alternatives
510556

keps/sig-storage/1495-volume-populators/kep.yaml

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ participating-sigs:
77
- sig-api-machinery
88
status: implementable
99
creation-date: 2019-12-03
10-
last-updated: 2021-02-01
10+
last-updated: 2021-05-13
1111
reviewers:
1212
- "@thockin"
1313
- "@saad-ali"
@@ -21,11 +21,11 @@ prr-approvers:
2121
replaces:
2222
- "/keps/sig-storage/20200120-generic-data-populators.md"
2323
stage: beta
24-
latest-milestone: "v1.18"
24+
latest-milestone: "v1.22"
2525
milestone:
2626
alpha: "v1.18"
27-
beta: "v1.21"
28-
stable: "v1.22"
27+
beta: "v1.23"
28+
stable: "v1.24"
2929
feature-gates:
3030
- name: AnyVolumeDataSource
3131
components:

0 commit comments

Comments
 (0)