@@ -267,48 +267,61 @@ delivered to a subset of kube-apiservers, this is not a problem, because:
should still be stored there (unless there is heavy churn of those objects
which is the case that doesn't suffer from this problem)

-
The POC PR can be found in: https://github.com/kubernetes/kubernetes/pull/92472

### Risks and Mitigations

- <!--
- What are the risks of this proposal, and how do we mitigate? Think broadly.
- For example, consider both security and how this will impact the larger
- Kubernetes ecosystem.
-
- How will security be reviewed, and by whom?
-
- How will UX be reviewed, and by whom?
-
- Consider including folks who also work outside the SIG or subproject.
- -->
+ The biggest risk is bugs in the implementation. To mitigate this, the
+ implementation will be hidden behind the `EfficientWatchResumption` feature
+ gate, and the necessary tests will be added and/or extended (details below).

## Design Details

### Test Plan

- TODO: Fill in before making `Implementable`.
+ - unit tests for logic enhancing resource version tracking in reflector
+ - unit tests for newly added watch cache logic
+ - integration test for sending bookmark on kube-apiserver shutdown
+ - integration test proving that the resource version that
+   kube-apiserver can serve from cache eventually progresses when objects of
+   other types are being added/updated/deleted;
+   this test should store events (or another type) in a separate etcd cluster
+   (to test split-etcd backend mode) and ensure no RV leaks across etcd clusters
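As an illustration of the first test-plan item, the bookkeeping such a unit test would exercise can be sketched roughly as follows. This is a minimal model and not the actual reflector code: the `Event` and `tracker` types and the `handle` method are hypothetical stand-ins invented for this sketch. It assumes that a bookmark event advances the last known resource version without changing the local store.

```go
package main

import "fmt"

// EventType mirrors the watch event kinds relevant here. These names are
// illustrative stand-ins, not the real apimachinery types.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Bookmark EventType = "BOOKMARK"
)

// Event is a simplified watch event that carries only a resource version.
type Event struct {
	Type            EventType
	ResourceVersion string
}

// tracker is a hypothetical stand-in for the reflector's bookkeeping.
type tracker struct {
	lastSyncRV string // resource version to resume the watch from
	stored     int    // number of events applied to the local store
}

// handle records the resource version of every event, but applies only
// non-bookmark events to the store.
func (t *tracker) handle(e Event) {
	if e.Type != Bookmark {
		t.stored++
	}
	t.lastSyncRV = e.ResourceVersion
}

func main() {
	t := &tracker{}
	events := []Event{
		{Type: Added, ResourceVersion: "100"},
		{Type: Modified, ResourceVersion: "105"},
		{Type: Bookmark, ResourceVersion: "250"},
	}
	for _, e := range events {
		t.handle(e)
	}
	// The bookmark moved the resume point to 250 without touching the store.
	fmt.Println(t.lastSyncRV, t.stored)
}
```

A test along these lines would assert that the resume point follows the bookmark while the store only reflects real object changes.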

### Graduation Criteria

- TODO: Fill in before making `Implementable`.
+ Alpha should provide basic functionality covered by the tests described above.

#### Alpha -> Beta Graduation

- TODO: Fill in before making `Implementable`.
+ - Appropriate metrics are agreed on and implemented
+ - An ad-hoc manual rolling upgrade of kube-apiservers in a 5k-node cluster
+   does not force node components to re-list the resources
+   they watch

#### Beta -> GA Graduation

- TODO: Fill in before making `Implementable`.
+ - Enabled in Beta for at least two releases without complaints
+ - The rolling-upgrade test of kube-apiservers in a 5k-node cluster is
+   automated and running periodically

### Upgrade / Downgrade Strategy

- TODO: Fill in before making `Implementable`.
+ Kubernetes can be safely upgraded/downgraded, as the implementation
+ is purely in memory:
+ - if etcd doesn't support frequent enough progress notify events,
+   we won't get the expected benefits (the problems may not be addressed),
+   but there will also be no unexpected consequences
+ - enabling the feature may only result in additional watch bookmark
+   events for clients, which they explicitly opt in to anyway
+ - disabling the feature reverts the behavior of the watchcache being
+   synced to resource versions of objects of different types; however, given
+   that the initialization happens at "now" anyway, the time won't
+   go back
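To make the opt-in mentioned above concrete: a client asks for bookmark events per watch request through the `allowWatchBookmarks` query parameter of the Kubernetes API. The sketch below only builds such a request URL. The `watchURL` helper, the host name, and the resource path are assumptions chosen for illustration; a real client would normally go through client-go rather than assemble URLs by hand.

```go
package main

import (
	"fmt"
	"net/url"
)

// watchURL builds a watch request URL that opts in to bookmark events.
// The helper itself is hypothetical; host and resourcePath are placeholders.
func watchURL(host, resourcePath, resourceVersion string) string {
	q := url.Values{}
	q.Set("watch", "true")
	q.Set("allowWatchBookmarks", "true") // explicit client opt-in
	q.Set("resourceVersion", resourceVersion)
	u := url.URL{
		Scheme:   "https",
		Host:     host,
		Path:     resourcePath,
		RawQuery: q.Encode(), // Encode sorts parameters by key
	}
	return u.String()
}

func main() {
	fmt.Println(watchURL("kube-apiserver.example:6443", "/api/v1/pods", "250"))
}
```

Clients that never set the parameter receive no bookmark events, so enabling the feature should not change what they observe.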

### Version Skew Strategy

- TODO: Fill in before making `Implementable`.
+ n/a - watch bookmarks don't have any frequency guarantees

## Production Readiness Review Questionnaire

@@ -319,33 +332,23 @@ TODO: Fill in before making `Implementable`.
_This section must be completed when targeting alpha to a release._

* **How can this feature be enabled / disabled in a live cluster?**
-   - [ ] Feature gate (also fill in values in `kep.yaml`)
-     - Feature gate name:
-     - Components depending on the feature gate:
-   - [ ] Other
-     - Describe the mechanism:
-     - Will enabling / disabling the feature require downtime of the control
-       plane?
-     - Will enabling / disabling the feature require downtime or reprovisioning
-       of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).
+   - [x] Feature gate (also fill in values in `kep.yaml`)
+     - Feature gate name: EfficientWatchResumption
+     - Components depending on the feature gate: kube-apiserver

* **Does enabling the feature change any default behavior?**
-   Any change of default behavior may be surprising to users or break existing
-   automations, so be extremely careful here.
+   No.

* **Can the feature be disabled once it has been enabled (i.e. can we roll back
  the enablement)?**
-   Also set `disable-supported` to `true` or `false` in `kep.yaml`.
-   Describe the consequences on existing workloads (e.g., if this is a runtime
-   feature, can it break the existing applications?).
+   Yes. Once disabled, the watchcache (and watch bookmark events) will no longer
+   be updated with resource versions of objects of other types.

* **What happens if we reenable the feature if it was previously rolled back?**
+   The expected behavior will be restored.

* **Are there any tests for feature enablement/disablement?**
-   The e2e framework does not currently support enabling or disabling feature
-   gates. However, unit tests in each component dealing with managing data, created
-   with and without the feature, are necessary. At the very least, think about
-   conversion tests if API types are being modified.
+   No.

### Rollout, Upgrade and Rollback Planning

@@ -498,6 +501,7 @@ _This section must be completed when targeting beta graduation to a release._
## Implementation History

2020-06-30: KEP Proposed.
+ 2020-08-04: KEP marked as implementable.

## Drawbacks